Welcome to ESGF Software and Services
The Earth System Grid Federation (ESGF) is a collaboration that develops, deploys and maintains software infrastructure for the management, dissemination, and analysis of model output and observational data. Below are the different data access interfaces and software tools. You can install and configure all the tools or a subset depending on your needs.
ESGF Software Stack: for the Node Administrator
ESGF Data and Index/Identity Node
- The ESGF Data Node software stack enables sites hosting Earth system data to make it available to the community over several transfer protocols, including HTTP(S). Index nodes make hosted data searchable once it has been published to the index, and they include a search API and a web frontend. Identity nodes manage user accounts. Together, these services constitute a “Full” ESGF installation. The nodes are installed with the Ansible automation platform, using our esgf-ansible collection of playbooks.
- Use case:
- I want to install a data and/or index/IdP node software stack using the current architecture
- I want to upgrade my existing node software stack to the latest supported service versions
- New and returning installations:
- Whether or not you have installed and administered an ESGF node before, please read the following document on ESGF policies, because it should influence which type of installation you choose:
- Requirements, Setup and Usage documentation
- Basic Prerequisite:
- The ESGF software stack requires a Red Hat Enterprise Linux or CentOS 7 distribution, and administrators must have full sudo privileges for root access.
- The services are meant to run on webserver-grade hardware. For data nodes, the storage holding the data you want to share must be mounted on the node.
- See the main documentation site for more information
- Source repository on GitHub
- Issues: (bug reporting)
- Installation email list:
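The search API that index nodes provide can be queried over HTTP. The following is a minimal sketch, not a definitive client: it assumes an index node at the hypothetical host `esgf-index.example.org` that serves a RESTful search endpoint at `/esg-search/search`, and the facet names and values (`project`, `variable`) are illustrative only. Check your index node's documentation for the endpoint and facets it actually supports.

```python
from urllib.parse import urlencode


def build_search_url(index_host, limit=10, **facets):
    """Build a dataset search URL for an ESGF index node.

    index_host is a hypothetical hostname. The /esg-search/search path and
    the query parameters are assumptions about a typical index node.
    """
    params = {
        "type": "Dataset",                  # search for datasets, not files
        "format": "application/solr+json",  # ask for a JSON response
        "limit": limit,                     # maximum number of results
    }
    params.update(facets)
    # Sort the parameters so the same query always yields the same URL.
    query = urlencode(sorted(params.items()))
    return f"https://{index_host}/esg-search/search?{query}"


url = build_search_url("esgf-index.example.org", project="CMIP6", variable="tas")
print(url)
```

The function only builds the URL, so it can be tried without network access; the resulting URL can then be fetched with any HTTP client.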
ESGF Docker (beta)
- ESGF Docker is the deployment mechanism for the next generation ESGF architecture, and can be tested concurrently with the production platform.
- Use case
- I want to test install the next-generation architecture
- Main page: includes installation instructions
User Interface (CoG) Frontend
- See the CoG README for instructions to access the Admin and Developers Guide:
ESGF data publisher
ESG publisher (esg-publisher)
- The ESG Publisher (esgcet Python package) enables data publishers to push references to the data held on their site’s data node to an ESGF index, where they are used for search and retrieval.
- Use Cases:
- I want to publish a new dataset to ESGF
- I want to update an existing dataset that I published on ESGF
- I want to retract/delete a dataset that I published from ESGF
- Main Page: (user documentation)
- Publishers to ESGF must have an existing Data Node installed at their site.
- Installation: (Python 3 recommended)
- Next generation publisher: (v5 Alpha version)
- This version is compatible with the current and next-generation ESGF architectures.
- The Next-gen (v5) Publisher can be run external to the Data Node, but the data to be published must be locally accessible on your Linux file system.
- Publication working team mailing list:
For Data Preparation, our collaborators at IPSL provide the Pre-publication Tools for a number of ongoing ESGF data projects.
CDAT
- Description: CDAT is a powerful and complete front-end to a rich set of visual-data exploration and analysis capabilities well suited for data analysis problems.
- Use Cases:
- I want to perform data analysis of multi-dimensional gridded climate and simulation data
- I want to visualize data through graphical plots of gridded data
- Main Page: https://github.com/CDAT/cdat/wiki
- Installation: https://github.com/CDAT/cdat/wiki/install
ESGF Compute end-user API (esgf-compute-api)
- Description: The esgf-compute-api is a Python package designed to interact with the ESGF Compute Node’s Web Processing Service (ECN WPS). It provides access to primitive operations (subset, min, max, etc.) that are executed using remote resources.
- Use Cases
- I want to retrieve a subset of the data.
- I want to execute compute operations on data using remote resources.
- Main Page:
ESGF Compute Node Web Processing Service (ECN WPS)
- Description: The ECN WPS is a scalable compute service exposed to users through a WPS interface. The compute backend is Xarray-based and scales on a Kubernetes cluster.
- Use Cases:
- I want to host a compute service near data.
- Main Page:
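Because the compute service is exposed through a WPS interface, a client would typically start by asking it which processes it offers. The sketch below builds such requests under stated assumptions: that the service accepts the key-value (GET) encoding of OGC WPS 1.0.0, that it is reachable at the hypothetical endpoint `https://compute.example.org/wps`, and that a process with the hypothetical identifier `subset` exists. None of these is confirmed by the ECN WPS documentation here.

```python
from urllib.parse import urlencode


def wps_request_url(endpoint, request="GetCapabilities", identifier=None):
    """Build a WPS key-value request URL.

    Assumes the OGC WPS 1.0.0 GET encoding; endpoint and identifier are
    hypothetical values supplied by the caller.
    """
    params = {"service": "WPS", "request": request}
    if request != "GetCapabilities":
        # DescribeProcess and Execute require an explicit version in WPS 1.0.0.
        params["version"] = "1.0.0"
    if identifier is not None:
        params["identifier"] = identifier
    return f"{endpoint}?{urlencode(params)}"


endpoint = "https://compute.example.org/wps"  # hypothetical endpoint
caps_url = wps_request_url(endpoint)
describe_url = wps_request_url(endpoint, "DescribeProcess", identifier="subset")
print(caps_url)
print(describe_url)
```

For real workloads, the esgf-compute-api package described above is the intended client; this sketch only illustrates what a WPS request looks like on the wire.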
Misc software and documentation
CMIP6 administrators and publishers
PCMDI has produced a Data Node Managers guide specifically for CMIP6 data: https://pcmdi.llnl.gov/CMIP6/Guide/dataManagers.html
- https://github.com/ESGF/sproket Sproket is a command-line tool for data search and download that lets you specify search criteria and download the matching data files in a single command.
- http://prodiguer.github.io/synda/index.html From IPSL, Synda is an automated download service for managing large numbers of replica copies of ESGF datasets. It is suited to operation by server administrators.