CISL Annual Report

Research Data Archive

 
 

Figure 1. Service and growth metrics for the Research Data Archive (RDA) during FY 2006. 1a) The number of unique users, separated by access pathway: the NCAR MSS, publicly available web servers, and one-time special requests prepared for individual users. 1b) The amount of data provided, separated by access pathway. 1c) The amount of data in the archive on the MSS, showing the annual growth. 1d) The amount of data on the public servers (a high-demand subset of the RDA), showing the annual growth.

Figures 1a-b indicate the RDA's significance to the community, and Figures 1c-d show the annual progress in adding valued content to the RDA.

The Research Data Archive (RDA) is a key part of the NCAR strategic priority to create an Earth System Knowledge Environment. It provides an information resource through a large collection of datasets that support scientific studies in climate, weather, Earth system modeling, and, increasingly, other related geosciences. RDA activities can be viewed from two perspectives: user data services and archive content development. In FY 2006, over 5,000 unique users were provided 95 TB of data through three primary access pathways: the NCAR MSS, public servers on the web, and one-time special requests prepared for individuals (Figures 1a-b). The web pathway serves the largest user group, while the MSS pathway transfers the largest amount of data. A simple measure of content development is archive growth. The RDA now holds 74 TB, with 6.0 TB (9%) added during FY 2006 (Figure 1c). Global and regional atmospheric reanalyses and operational analyses are the largest contributors to this growth. The most in-demand datasets from the RDA are provided to users through publicly available web servers. In FY 2006, the data available from the web servers grew 60% and now totals 12.1 TB (Figure 1d).
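The growth figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes each percentage is measured against the size at the start of FY 2006, which the report does not state explicitly, and the function and variable names are hypothetical.

```python
def growth_percent(current_tb: float, previous_tb: float) -> float:
    """Percentage growth relative to the previous size (assumed baseline)."""
    return 100.0 * (current_tb - previous_tb) / previous_tb


# Figures quoted in the report for FY 2006.
rda_now_tb = 74.0         # total RDA size on the MSS
rda_added_tb = 6.0        # data added during FY 2006
web_now_tb = 12.1         # data on the public web servers
web_growth_pct = 60.0     # reported growth of the web-server holdings

# Implied archive size at the start of the year, and the growth it gives.
rda_prev_tb = rda_now_tb - rda_added_tb
rda_growth_pct = growth_percent(rda_now_tb, rda_prev_tb)

# Implied web-server holdings at the start of the year (assumption:
# the 60% figure is relative to that starting size).
web_prev_tb = web_now_tb / (1.0 + web_growth_pct / 100.0)

print(f"RDA start-of-year size: {rda_prev_tb:.1f} TB")
print(f"RDA growth: {rda_growth_pct:.1f}%")
print(f"Web servers start-of-year size: {web_prev_tb:.2f} TB")
```

Under that assumption, the 6.0 TB added to a 68 TB starting archive rounds to the reported 9%, and the web-server holdings would have started the year at roughly 7.6 TB.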

As a whole, the RDA is constantly changing. Curation extends and adds to existing datasets, while stewardship improves the documentation, creates systematic organization, applies data quality assurance and verification, and develops access for users. Many routine tasks and background infrastructure developments are necessary to maintain the RDA. The major new activities for FY 2007 include:

  • Initiate operational data collection, archiving, and user access through the Community Data Portal for TIGGE (THORPEX [The Observing System Research and Predictability Experiment] Interactive Grand Global Ensemble).

  • Add the Japanese 25-Year Reanalysis (JRA-25) to the RDA to continue supporting user access to atmospheric reanalysis datasets. This action is pending approval from the Japan Meteorological Agency.

  • Add a global ocean reanalysis and continuing operational analysis.

  • Continue to aggressively expand the amount of data available through the web servers (both the advanced Community Data Portal and the traditional server). This includes longer time series of routinely collected operational model and observed data, and new datasets relevant to large user groups.

RDA maintenance and development within CISL are supported entirely by NSF Core funding.