Editor: 施信荣, 2016-12-07
© 2015 Chevron U.S.A. Inc. All rights reserved.

Earth Data Science in the Era of Big Data and Compute
Earth Data Science: Foundations and Principles
A Chevron Perspective

Scott Hills
Chevron Energy Technology Company
Board on Earth Sciences and Resources Meeting
29 April 2015

Big Data Definitions

• Diversity of opinion
  For example, Gil Press (2014)1:
  - Lists 12 selected definitions, and references a separate "compilation of big data definitions from 40+ thought leaders"
  - Notes the "first documented use of the term 'big data' appeared in a 1997 paper by scientists at NASA" (Cox and Ellsworth, 1997)2
• Preferred
  "'Big data' is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." (Laney and Douglas. Gartner.)3
  - In other words, it's relative to current capabilities.

1 http://www.forbes.com/sites/gilpress/2014/09/03/12-big-data-definitions-whats-yours/2/ (Retrieved 15 April 2015)
2 http://dl.acm.org/citation.cfm?id=266989.267068
3 http://www.gartner.com/resId=2057415 (Retrieved 21 June 2012)

The Big Picture of Subsurface Work

Is Information Discovery a Big Data Problem?

• High Volume
  - GeoRef geoscience publication database (as of 4/20/2015): ~22 million publications, 1990-2013, of which ~11.8 million are from ~3,500 journals
• High Velocity
  - GeoRef database: >37,000 publications/month, 1990-1999; >100,000 publications/month, 2000-2013
• High Variety
  - Digital documents, data, models, software
  - References to physical documents, specimens, tools/equipment
• Novel Processing
  - Documentation with semantically-enabled metadata
  - Automated metadata enrichment using text analytics (digital documents)
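As a rough illustration of what "automated metadata enrichment using text analytics" could look like in its simplest form, the sketch below tags a document's metadata record with concepts found in its text. The vocabulary, the tag names, and the `keywords` field are all hypothetical, chosen for illustration only; the slides do not describe a specific implementation, and a real system would likely use a proper ontology and more robust text analytics than phrase matching.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: surface phrase -> concept tag.
VOCABULARY = {
    "sandstone": "lithology:sandstone",
    "shale": "lithology:shale",
    "seismic": "method:seismic",
    "well log": "data:well-log",
    "well logs": "data:well-log",
}


def enrich_metadata(record, text, vocabulary=VOCABULARY, min_count=1):
    """Return a copy of `record` with concept tags found in `text`
    appended to its (hypothetical) 'keywords' field."""
    lowered = text.lower()
    counts = Counter()
    for phrase, tag in vocabulary.items():
        # Whole-phrase match, so "well log" does not also match "well logs".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[tag] += len(re.findall(pattern, lowered))

    found = sorted(tag for tag, n in counts.items() if n >= min_count)

    enriched = dict(record)
    existing = list(record.get("keywords", []))
    enriched["keywords"] = existing + [t for t in found if t not in existing]
    return enriched


if __name__ == "__main__":
    record = {"title": "Basin report", "keywords": ["basin"]}
    text = "Seismic data and well logs show a thick shale above sandstone."
    print(enrich_metadata(record, text)["keywords"])
```

The original record is left unmodified, so enrichment can be re-run or reviewed before the tags are written back to a catalog.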

The Current State: Metadata for Discovery of Distributed Resources

Multiple communities pursuing use of standard metadata as a means to a common vision, e.g.,
• U.S. Open Data Policy
  - USGS: ScienceBase (www.sciencebase.gov), Science Data Catalog (data.usgs.gov)
  - NASA Land Processes DAAC (lpdacc.usgs.gov)
  - NOAA: NGDC (www.ngdc.noaa.gov), NODC (www.nodc.noaa.gov)
• US Geoscience Information Network (USGIN) (www.usgin.org)
• Project Mercury (mercury.ornl.gov)
• Open Geoportal Project (www.opengeoportal.org)
• U. of Utah - Energy & Geoscience Institute (EGI) (egi.utah.edu) iCORDS Project (icordsgeo.org)
• Energistics (www.energistics.org)
  - ISO TC 211 Class A Liaison organization
  - Energy Industry Profile of ISO 19115-1 published 2014
  - Partnered with USGIN, and working closely with USGS, EGI/iCORDS

None yet using semantic tags or analytics for automated enrichment

An Extended Vision: Metadata for Discovery of Diverse, Distributed Resources

[Diagram. Labels: Search Index w/structured metadata; External Metadata Catalog; Partner & Subscription Delivered Metadata; Structured resources; Unstructured resources; Application Managed Metadata; Internally Harvested Metadata; External Metadata (Commercial, Gov't & Academic: e.g., AAPG, EGI, USGIN, USGS). Legend: Metadata exchange via EIP* standard.]

*EIP = Energy Industry Profile of ISO 19115-1 (Used with permission, 2015)
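To make the idea of a single search index fed by several metadata streams more concrete, here is a minimal sketch that merges records from multiple sources and builds a keyword lookup over them. The record fields (`id`, `title`, `keywords`) and source names are assumptions made for this example; they are a deliberately simplified stand-in and not the actual EIP / ISO 19115-1 schema.

```python
from collections import defaultdict


def build_index(sources):
    """Merge metadata records from several sources into one catalog.

    `sources` maps a source name to a list of records; each record is a
    dict with hypothetical fields 'id', 'title' and 'keywords'.
    Returns (catalog, inverted), where `inverted` maps keyword -> record ids.
    """
    catalog = {}
    for source_name, records in sources.items():
        for rec in records:
            entry = catalog.setdefault(
                rec["id"],
                {"title": rec["title"], "keywords": set(), "sources": set()},
            )
            # Records sharing an id are treated as the same resource.
            entry["keywords"].update(k.lower() for k in rec.get("keywords", []))
            entry["sources"].add(source_name)

    inverted = defaultdict(set)
    for rec_id, entry in catalog.items():
        for keyword in entry["keywords"]:
            inverted[keyword].add(rec_id)
    return catalog, dict(inverted)


def search(inverted, *keywords):
    """Return sorted ids of resources tagged with all given keywords."""
    sets = [inverted.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*sets)) if sets else []


if __name__ == "__main__":
    sources = {
        "external_catalog": [
            {"id": "doc-1", "title": "Basin study", "keywords": ["Shale", "seismic"]},
        ],
        "internal_harvest": [
            {"id": "doc-1", "title": "Basin study", "keywords": ["well log"]},
            {"id": "doc-2", "title": "Core photos", "keywords": ["shale"]},
        ],
    }
    catalog, inverted = build_index(sources)
    print(search(inverted, "shale"))
    print(search(inverted, "shale", "seismic"))
```

Keeping the set of contributing sources on each entry is one way to preserve provenance when the same resource is described by both an external catalog and an internal harvest.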

Information Integration for Wetware Analysis and Synthesis

An Example from Health/Life Sciences: Utopia Documents1

[Figure. Callouts: Selected article term; Manual term search; Definitions from Wikipedia & the PDB; Editor term annotation panel; Document metadata; Live links to citations.]
