Research Data Archive (RDA) plans

The Research Data Archive is dynamic. Curation extends and adds to existing datasets, while stewardship improves the documentation, creates systematic organization, applies data quality assurance and verification, and develops data access for users. All activities are based on one simple principle: acquire the necessary data and make it readily available for scientific research. Many routine tasks and background infrastructure developments are necessary to maintain the RDA. The major new activities involve TIGGE, ERA-40 user access, ICOADS, and the RDA upper-air observation datasets.
Each new major activity is briefly described below.

CISL will collaborate with ECMWF and the China Meteorological Administration (CMA) to establish identical international data repositories for TIGGE. This involves receiving, archiving, and providing access to ensemble forecasts from potentially nine international centers. The activity poses several challenges: the data volumes are relatively high (200 GB/day), the most current data must be quickly available through the CDP, and secure long-term archives and services need to be implemented. The FY2006 goals, which are also the first-phase TIGGE program objectives, are to achieve as much as practically possible by leveraging existing capabilities and infrastructure. Out-year planning includes expanding services under program support.

User access to ERA-40 data will continue to improve in FY2006 through the development of new data access methods and the enhancement of existing ones. The CDP has recently deployed an access control framework that provides user access control in the portal environment and makes it possible to serve restricted-access data such as ERA-40. Appropriate THREDDS catalogs will be created to document a new netCDF-formatted collection of ERA-40 data in the RDA, and individual data files and custom subsets of this collection will be served to users via the CDP. The existing ERA-40 user access on the RDA server (dss.ucar.edu) will also be enhanced so that users can obtain GRIB-formatted subsets based on various selections (e.g., date and time, parameter, and level). These new services will first target the lower-volume, most-used ERA-40 products. In the out years, other products will be served in a similar manner, with service expansion guided by the successes and lessons of earlier years.

The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) is a globally recognized reference collection of marine surface data.
The reference period will be extended, giving global ocean coverage for 1784-2004. This work includes preparing new data, reprocessing old data, blending in research-quality inputs and removing matching real-time data, creating monthly summary products at 1° and 2° resolution, and providing service through a newly designed interface to all archive files and user-selected subsets. ICOADS has been developed with NOAA over the past 20 years and is a significant contribution to U.S. IOOS and international GEOS observing systems. With national and international recognition and promotion, this project is expected to continue for the foreseeable future.

An ambitious plan has been devised, and work has begun, to provide systematic access to the RDA upper-air observation datasets. In the form of numerous diverse individual datasets, the RDA holds one of the world's most complete collections of upper-air observations from in situ instruments. The current data organization is not optimal for user access and support. This project will correct the deficiencies in steps: reducing the many diverse historical data formats to one; sorting all the pieces into the same order; developing consistently formatted inventories for cross-checking observations between pieces and for driving user-determined subset selection; upgrading station histories and libraries to provide the best estimate of station elevation and location over time; and ultimately offering users choices of data level and output format. Myriad data problems will have to be solved or accommodated throughout this development. The original archives will be preserved and, to the extent possible, user products will be derived from the basic archives, with software providing the final data integration and merging. The first version offering public access will be online in FY2006; further refinements, enhancements, and extensions will be added in future years.
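
The inventory step in the upper-air plan can be illustrated with a minimal sketch. It assumes reports have already been reduced to one common record form, and it is not the RDA implementation: the `Sounding` fields, the function names, the station identifiers, and the dataset names are all hypothetical. It shows how a consistently ordered inventory could support both cross-checking observations between pieces and user-determined subset selection.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sounding:
    """One upper-air report in a common form (hypothetical fields)."""
    station_id: str
    time: datetime
    source: str  # name of the originating dataset ("piece")


def build_inventory(soundings):
    """Map (station_id, time) to the sorted list of sources holding that report.

    Keys are inserted in sorted order, so every inventory built this way
    lists its entries in the same station-then-time order.
    """
    by_key = defaultdict(set)
    for s in soundings:
        by_key[(s.station_id, s.time)].add(s.source)
    return {key: sorted(by_key[key]) for key in sorted(by_key)}


def overlaps(inventory):
    """Return the entries held by more than one source (cross-check candidates)."""
    return {key: srcs for key, srcs in inventory.items() if len(srcs) > 1}


def select(inventory, station_id, start, end):
    """Return inventory keys for one station with start <= time <= end."""
    return [
        key for key in inventory
        if key[0] == station_id and start <= key[1] <= end
    ]


# Made-up sample records; station and dataset names are placeholders.
soundings = [
    Sounding("STN001", datetime(1990, 1, 1, 12), "dataset_b"),
    Sounding("STN001", datetime(1990, 1, 1, 0), "dataset_a"),
    Sounding("STN001", datetime(1990, 1, 1, 12), "dataset_a"),
    Sounding("STN002", datetime(1990, 1, 1, 0), "dataset_b"),
]
inventory = build_inventory(soundings)
duplicates = overlaps(inventory)
subset = select(inventory, "STN001", datetime(1990, 1, 1, 6), datetime(1990, 1, 2))
```

In this sketch the inventory holds only keys and source names, not the observations themselves, which is consistent with preserving the original archives and deriving user products from them at request time.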