Changes between Version 2 and Version 3 of ESGF

Oct 28, 2014 2:59:26 PM (8 years ago)



  • ESGF

    v2 v3  
    11== See Also ==
     4= ESGF =
     6[[Image(architecture.png, 40%, align=right, margin=10)]]The Earth System Grid Federation (ESGF[[FootNote(]]) is a spontaneous collaboration of groups, agencies and institutions around the world that are dedicated to the development and operation of a long-term system for the management, access and analysis of climate data. ESGF's primary goal is to facilitate advancements in Earth System Science. Some of the challenges that ESGF is committed to addressing include[[FootNote(]]:
     7    * The enormous scale of the data holdings, moving from petabytes to exabytes.
     8    * Support for both model output and a wide variety of observational data
     9    * The distributed nature of the data archives, which are geographically distributed and autonomously operated
     10    * The need to enable users to access and analyze data with a wide variety of client tools - not just web browsers, but also rich desktop clients, libraries and toolkits
     11    * The need to harmonize and federate multiple local access policies
     14ESGF is not a directly funded organization. The current core contributors to the project work for various agencies around the world, including:
     16{{{#!th colspan=4
     28[[Image(doe.svg, 100px)]]
     32[[Image(nasa.svg, 100px)]]
     36[[Image(noaa.svg, 100px)]]
     40[[Image(nsf.png, 100px)]]
     44[[Image(IS-ENES2.png, 120px)]]
     48[[Image(NCI_logo.png, 120px)]]
     51== ESGF Architecture ==
     52The ESGF architecture is based on the peer-to-peer (P2P) paradigm[[FootNote(]]: a system of autonomous, distributed Nodes that interoperate through common acceptance of federation protocols and trust agreements. The system is composed of multiple sites (called “Nodes”) that are geographically distributed around the world but can interoperate because they have adopted a common set of services, protocols and APIs. Nodes exchange information about their data holdings and services, and trust each other to register users and make access control decisions.
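Because every Node adopts the same services and APIs, a client can in principle send the same request to any of them. The sketch below illustrates this with a search query. The hostnames are hypothetical, and the `/esg-search/search` path and the `type`, `limit` and `project` parameters are assumptions that this page does not state, so check them against the Node you actually use.

```python
from urllib.parse import urlencode

# Hypothetical Node hostnames, used only for illustration.
NODES = ["esgf-node-a.example.org", "esgf-node-b.example.org"]


def build_search_url(host, **facets):
    """Build a search URL for one Node.

    The endpoint path and the default parameters are assumptions,
    not taken from this page.
    """
    params = {"type": "Dataset", "limit": 10}
    params.update(facets)
    # Sort the parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"https://{host}/esg-search/search?{query}"


# The same query is built for every Node; only the host differs.
urls = [build_search_url(host, project="CMIP5") for host in NODES]
for url in urls:
    print(url)
```

No request is sent here. The point is only that the query part of the URL is identical for each Node, which is what a shared API makes possible.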
     54Data and metadata are managed and stored independently at each Node. Internally, each ESGF Node is composed of a set of services and applications that collectively enable data and metadata access and user management. The software components are logically grouped into four areas of functionality so that ESGF can be installed modularly.
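The four areas of functionality, detailed in the table that follows, can be summarized as a small data structure. This is a minimal sketch, not part of the ESGF software: the `NodeType` class, the short keys (`data`, `index`, `idp`, `compute`) and the `services_for` helper are hypothetical names chosen for illustration, and the service lists contain only the components named on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    key: str         # hypothetical short name, for illustration only
    purpose: str     # summary taken from the table on this page
    services: tuple  # components named on this page (may be partial)


NODE_TYPES = (
    NodeType("data", "secure data publication and access",
             ("Data Publisher", "THREDDS Data Server", "GridFTP server",
              "OpenID Relying Party", "Authorization Service")),
    NodeType("index", "indexing and searching metadata",
             ("Indexing Service", "Search Service", "Apache Solr",
              "Web Portal UI", "Dashboard")),
    NodeType("idp", "user authentication and delivery of user attributes",
             ("OpenID Provider", "MyProxy Server",
              "Attribute Service", "Registration Service")),
    NodeType("compute", "data analysis and visualization",
             ("Live Access Server (LAS)", "Ferret")),
)


def services_for(selected_keys):
    """Return the services implied by a modular choice of node types."""
    by_key = {node.key: node for node in NODE_TYPES}
    services = []
    for key in selected_keys:
        services.extend(by_key[key].services)  # KeyError if key is unknown
    return services


# Example: a site that only publishes data and runs an index.
print(services_for(["data", "index"]))
```

A site that installs only some of the node types gets only the corresponding services, which is the sense in which the installation is modular.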
     56{{{#!td rowspan=17 width=500% style="border: 0px" align=center
     57[[Image(nodo_esgf.png, 80%, border=0)]]
     59[[Image(ESGF_System.jpeg, 100%, border=0)]]
     61{{{#!td valign=top
     62{{{#!th rowspan=5  style="background: #99CCFF; border: 0px"
     63Data node
     65{{{#!td colspan=2
     66Includes services for secure data publication and access
     69{{{#!td align=center
     70Data Publisher
     73Generates the metadata catalogs. Scans data stored on a Data Node and makes it available through the system. \\Extracts metadata from the directory structure and filenames, and from the content of the files themselves, and then generates THREDDS/XML catalogs.
     76{{{#!td align=center
     77THREDDS Data Server
     80Provides access to the ESGF data and metadata. Developed by Unidata.
     83{{{#!td align=center
     84GridFTP server
     87Serves data using an FTP-based protocol that provides high-performance, securely authenticated, and reliable data transfer. Developed by Globus.
     90{{{#!td align=center
     91OpenID Relying Party \\and \\Authorization Service
     94Ensure proper authentication and authorization.
     97{{{#!th rowspan=6  style="background: #FFFF66"
     98Index node
     100{{{#!td colspan=2
     101Contains services for indexing and searching metadata, currently implemented using Apache Solr as the back-end server
     104{{{#!td align=center
     105Indexing Service
     108Parses the metadata content available at a repository (located by its URL) and ingests it into the back-end metadata store. At present, ESGF parses metadata only from
     109THREDDS catalogs.
     112{{{#!td align=center
     113Search Service
     116Queries the indexed metadata content and retrieves matching results that include descriptive information as well as all the available data access points (e.g., HTTP, GridFTP, OPeNDAP, and LAS). \\Clients invoke the search service through its REST API.
     119{{{#!td align=center
     120Apache Solr
     123Apache Solr is the underlying search engine. Solr is a popular web application used by many commercial web sites. It features high-performance text and faceted searching, geospatial and temporal querying, and partitioning of searchable metadata across multiple local indexes (cores) and distributed servers (shards).
     126{{{#!td align=center
     127Web Portal UI
     130The web UI exposes some of the ESGF services through the browser: user account management, collection and file-level search and discovery, the dashboard service, the LAS visualization engine, and the CIM viewer.
     133{{{#!td align=center
     137The Dashboard is the distributed monitoring system of ESGF. It is responsible for collecting historical information about the status of the federation.
     140{{{#!th rowspan=4  style="background: #B2FF66"
     141Identity provider node \\ (IdP node)
     143{{{#!td colspan=2
     144Allows user authentication and secure delivery of user attributes
     147{{{#!td align=center
     148OpenID Provider
     151Allows users to register and authenticate with the system, including Single-Sign-On functionality for browser-based access throughout the federation.
     154{{{#!td align=center
     155!MyProxy Server
     158The !MyProxy server, developed by NCSA, issues short-term certificates that client libraries and toolkits can use to authenticate the user during a data product request.
     161{{{#!td align=center
     162Attribute Service \\and\\ Registration Service
     165Make user attributes available to trusted clients. When authorization is required, the local Authorization Service enforces the Node security policies by querying the particular Attribute Service in the federation that manages the configured access control group (e.g. CMIP5 Research, CMIP5 Commercial, CORDEX Research, ...).
     168{{{#!th rowspan=5  style="background: #FF66FF"
     169Compute node
     171{{{#!td colspan=2
     172Contains higher-level services for data analysis and visualization.
     175{{{#!td align=center
     176Live Access Server (LAS)\\+\\Ferret
     179The Live Access Server (LAS), developed by NOAA/PMEL, is an analysis and visualization engine that allows users to request advanced data and imaging products from multiple ESGF Nodes at once. \\It can be configured with a pluggable visualization engine such as Ferret (the default), NCL or CDAT.