Community Data Portal (CDP) accomplishments

The NCAR Community Data Portal (https://cdp.ucar.edu/) is an NCAR Cyberinfrastructure Strategic Initiative aimed at developing a central institutional gateway to the large and diverse data holdings of UCAR, NCAR, and UOP. The ultimate goal of the project is to provide a state-of-the-art data portal whose functionality spans data search and discovery, catalog and metadata browsing, reliable high-performance data download, and server-side data processing, including aggregation, subsetting, analysis, and visualization.

CDP home page

During 2005, CDP staff collaborated with data providers within and outside UCAR to make several new datasets accessible through the CDP:

  • CME (Carbon in the Mountains Experiment) data from numerous disparate observational platforms (airborne and ground-based)

  • WACCM-3 (Whole Atmosphere Community Climate Model) case studies

  • ACD model source code and analysis tools, including MOZART (Model for OZone And Related chemical Tracers) versions 2 and 4, TUV (Tropospheric Ultraviolet and Visible radiation model), SOCRATES, and GeoV

  • DAYMET (Daily Surface Weather Data and Climatological Summaries)

  • WRF (Weather Research and Forecasting model) forecasts for Hurricane Katrina

  • HIAPER (High-performance Instrumented Airborne Platform for Environmental Research) aircraft test flight data

  • EOL data holdings for several field campaigns

Ongoing data publishing efforts include case studies from University of Oklahoma collaborators (starting with radar data for Hurricane Isabel in 2003), data from the CIPDSS (Critical Infrastructure Protection Decision Support System) project, and high-level data services for ERA-40 data managed by DSS (Data Support Section). The CDP currently provides access to more than 900 datasets spanning a wide variety of disciplines, platforms, and formats.

During 2005, the CDP software infrastructure was upgraded and expanded in many respects, with particular emphasis on maximizing the reliability, scalability, and uniformity of the services provided. The major innovations relative to the previous year were:

  • An extensive framework for Access Control to data and services was developed and deployed, which allows selective authorization based on user groups and roles. This framework, developed in direct response to requirements from data providers, allows web-based management of groups by local administrators, supports potential sharing of user information among federated portals, and interoperates with the UCAS and DSS authentication protocols.

  • Several applications were developed to facilitate the process of publishing data to the CDP: a Catalog Crawler that generates data catalogs from listings of local and network-accessible directory trees, a web-based application for in-context editing of metadata by the data providers, and an Indexer application that makes the content of the catalogs and their metadata searchable by users.

  • A new architecture based on the OAI (Open Archives Initiative) protocol was set up to exchange metadata records with partner data centers BADC (British Atmospheric Data Centre) and GCMD (NASA Global Change Master Directory), and to allow CDP records to be harvested by other institutions such as the University of Michigan and Yahoo. This system allows data indexed by the CDP to be found by users of a wide variety of scientific data centers and commercial search engines.

  • The CDP data download capabilities were enhanced by supporting DataMover-Lite, a Java client-side application developed by ESG collaborators at LBNL that is intended to allow easy, one-click downloads of multiple files from the data portal (including files that were previously retrieved from the NCAR MSS).
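The OAI-based metadata exchange described above can be illustrated with a minimal sketch of a harvesting client. It assumes the standard OAI-PMH 2.0 request form (a `ListRecords` verb with a `metadataPrefix`, followed by `resumptionToken` requests for subsequent pages). The base URL, the record identifiers, and the sample response are hypothetical; the report does not describe the actual CDP endpoint or its client code.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    When a resumption token is given it is sent as the only argument
    besides the verb, as the protocol requires.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hypothetical, heavily trimmed response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    base = "https://portal.example.org/oai"  # hypothetical endpoint
    print(list_records_url(base))
    ids, token = parse_list_records(SAMPLE_RESPONSE)
    print(ids, token)
    print(list_records_url(base, resumption_token=token))
```

A harvester would repeat the request with each returned token until the response carries no token, at which point all records have been retrieved.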

Some of the most fruitful collaborations during the past year are noted below:

  • CDP staff worked together with the NCAR GIS initiative to develop and deploy the NCAR GIS portal (http://www.gisclimatechange.org/), which serves CCSM IPCC data to the GIS community.

  • CDP staff collaborated intensively with EOL toward the final goal of enabling the CDP data services for all the extensive data holdings managed by EOL. A first concrete result of this collaboration was the dynamic generation of THREDDS catalogs from the EOL metadata database, which can then be viewed and browsed through the CDP.

  • Prototype work was initiated with the CISM (Center for Integrated Space Weather Modeling) project to set up a network of interconnected SRB (Storage Resource Broker, developed by UCSD under NSF sponsorship) servers at NCAR, BU, and other institutions to allow easy access to and distribution of Space Physics and Space Weather data.

  • CDP staff continued collaborating with WMO (World Meteorological Organization) toward the establishment of a WIS (WMO Information System) prototype. In collaboration with DSS, metadata from three major NCAR datasets (CCSM, ERA-40, and NCEP) was converted to the WMO specification and made available for integration in the WMO prototype data portal.

  • Research was undertaken in collaboration with CGD, ACD, Unidata, and ATD to enable analysis and visualization clients such as IDV (Integrated Data Viewer) to directly access multiple access-restricted data sources hosted on the CDP. This access paradigm promises to significantly enhance the way scientists access and analyze data during a field campaign and, more generally, the way data is accessed, combined, and shared within a geoscientific project of any kind.

IDV-based access to CDP
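The dynamic generation of THREDDS catalogs from a metadata database, noted in the EOL collaboration above, can be sketched roughly as follows. This is an illustration only: the record fields (`id`, `title`, `path`), the service definition, and the sample rows are hypothetical, and the output uses a simplified subset of the THREDDS InvCatalog 1.0 schema rather than the catalogs actually produced for EOL.

```python
import xml.etree.ElementTree as ET

# Namespace of the THREDDS InvCatalog 1.0 schema.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"


def build_catalog(name, records, service_base="/data/"):
    """Render metadata records as a minimal THREDDS catalog (XML string).

    `records` is an iterable of dicts with hypothetical keys
    'id', 'title', and 'path', standing in for rows of a metadata database.
    """
    catalog = ET.Element("catalog", {"xmlns": THREDDS_NS, "name": name})
    # One plain HTTP download service shared by all datasets.
    ET.SubElement(
        catalog,
        "service",
        {"name": "http", "serviceType": "HTTPServer", "base": service_base},
    )
    for rec in records:
        ET.SubElement(
            catalog,
            "dataset",
            {
                "name": rec["title"],
                "ID": rec["id"],
                "urlPath": rec["path"],
                "serviceName": "http",
            },
        )
    return ET.tostring(catalog, encoding="unicode")


# Hypothetical rows, as they might come back from a metadata query.
SAMPLE_RECORDS = [
    {"id": "campaign-a.radar", "title": "Campaign A radar", "path": "campaign-a/radar.nc"},
    {"id": "campaign-a.sonde", "title": "Campaign A soundings", "path": "campaign-a/sonde.nc"},
]

if __name__ == "__main__":
    print(build_catalog("Example field campaign holdings", SAMPLE_RECORDS))
```

Generating the catalog on request, rather than storing it as a static file, keeps the browsable view in step with the underlying metadata database.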

In addition, CDP staff participated in the annual GO-ESSP (Global Organization for Earth System Science Portals) meeting, which included participants from centers and universities worldwide, and held preliminary discussions with engineers from UC-Irvine about collaborating on improved server-side data processing capabilities based on the netCDF Operators.

Finally, the CDP project was instrumental in generating new funding opportunities that will further enhance the NCAR cyberinfrastructure capabilities. Newly funded or recently started projects include VSTO (Virtual Solar-Terrestrial Observatory, in collaboration with HAO and Stanford University), Chronopolis (Data Preservation over Space and Time, in collaboration with UCSD and UMD), and Earth System Curator (in collaboration with ESG and ESMF).

FY2005 Annual Report