Community Data Portal (CDP) plans
In the next year and beyond, the broad goals of the Community
Data Portal project are to broaden the data services offered
through the portal and to increase the size and diversity of the
data holdings, while fostering interoperability and integration
with the data management and access systems currently maintained
by different groups within UCAR, NCAR and UOP.
To this end, the CDP will continue collaboration with several
projects and groups within and outside NCAR, in particular:
With DSS to offer high-level data services (aggregation,
subsetting, and ultimately analysis) for highly requested datasets
such as ERA-40 and NCEP.
With EOL, both to enable access to existing observational
datasets and to develop a new paradigm for data infrastructure
and support during NSF-funded field campaigns (proposed joint
development of a Virtual Operations Center for Field Experiments
in Atmospheric Science).
With WMO (World Meteorological Organization) to establish
NCAR as a primary U.S. DCPC (Data Collection or Production Center)
within the WIS (WMO Information System) global network.
With CISM (Center for Integrated Space Weather Modeling) to
enable public and restricted access through a single data portal to
Space Physics and Space Weather datasets stored at participating
universities and institutions with underlying data management
and transfer provided by SRB (Storage Resource Broker).
With UC-Irvine to provide advanced server-side data
processing capabilities, backed by the netCDF Operators (NCO), for
datasets hosted on the CDP.
With the NCAR GIS initiative to serve additional data
types (CCSM ocean, land, and ice datasets) through the NCAR
GIS portal.
With all other data providers who express a desire to
leverage the CDP infrastructure to access and share data with a
restricted team of collaborators or with the community at large.
On the technical side, CDP staff will be involved in the following
major development areas:
Sub-portals and branding: the creation of a general framework
such that sub-portals for a specific division (e.g. EOL), scientific
theme (e.g. Global Warming), or project (e.g. Mirage) can be easily
created and maintained to expose a specific branding and
functionality while fully leveraging the broad portfolio
of the CDP data services.
Data publishing: definition, development, and support of a
machine-negotiated API for ingesting data and metadata into the
CDP system.
OAI: further development of the OAI infrastructure for
exchanging metadata with partner institutions: establishing
production-level services, possibly broadening the number of
partners, and feeding metadata records to the Google search engine.
Visualization: initial exploration of server-side visualization
of CDP data holdings through NCL and PyNGL, delivered either via
the portal interface or via standalone web services clients.
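As an illustration of the metadata exchange mentioned in the OAI item above, the following is a minimal sketch of how a partner might harvest records, assuming the exchange uses the standard OAI-PMH ListRecords verb with the oai_dc metadata format. The base URL, the record identifiers, and the helper names are hypothetical and are not taken from the CDP itself; the sample response is hand-written so that the sketch runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (base_url is hypothetical)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_identifiers(xml_text):
    """Return the record identifiers found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    path = f".//{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"
    return [el.text for el in root.findall(path)]


# Hand-written sample response with made-up identifiers.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:ds-1</identifier></header></record>
    <record><header><identifier>oai:example.org:ds-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(record_identifiers(SAMPLE))
```

A production harvester would also need to follow the protocol's resumption tokens for large result sets, which this sketch omits.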
The CDP and RDA datasets currently reside on a high-speed file
system shared between the two systems over a storage area network
(SAN). The storage space of 20 TB will be expanded over the next
year to more than 50 TB. The additional space will accommodate
growth in the RDA, data from other providers in the community, and
testbed space for CPG activities. Further details about the SAN
development plan, including the need for data policies, appear in
"Storage Area Network (SAN) deployment plans for CDP/ESG".
FY2005 Annual Report