CISL 2007 Annual Report

Chronopolis: Federated digital preservation across space and time

 
 
Chronopolis organization

This image represents a conceptual architecture for the Chronopolis Digital Preservation Environment. Through a combination of advanced data management technologies, geographic replication, and policy instruments, Chronopolis is aimed at providing cyberinfrastructure for national-scale digital information preservation.

 

There is a critical and growing need to organize, preserve, and make accessible the increasing number of digital holdings that represent vital intellectual capital, much of which is precious and irreplaceable. Chronopolis is a strategic collaboration among the San Diego Supercomputer Center (SDSC, lead organization), NCAR/CISL, the University of California Library System, and the University of Maryland. It is aimed at developing national-scale digital preservation infrastructure that has the potential to broadly serve any community with digital assets: science, engineering, humanities, and more.

This new effort encompasses studying viable models and effective systems that facilitate establishing standard reference datasets, preserving collections that evolve over time, and establishing preservation resources "of last resort" for digital assets that might become lost. Digital collections that must persist for 100 or more years are one important focus of this activity. Approaching this problem requires a special synthesis of relationships and capabilities: scientists, librarians, curators, computer scientists, and long-term distributed cyberinfrastructure.

The problem spans the gamut of academic scientific disciplines, historical collections, and digital library content. Although the new capabilities developed in Chronopolis are broadly useful, we expect them to become powerful services that we can offer to the Earth System sciences community through, for example, our Community Data Portal (CDP). This activity addresses NCAR's strategic goal to "Provide robust, accessible, and innovative information services and tools."

In FY2007, we deployed additional core Chronopolis infrastructure on our computational systems and on the TeraGrid, integrating and testing end-to-end capabilities that include archival functions on the NCAR MSS. Under the terms of last year's Memorandum of Understanding with SDSC, we began replicating some of our unique observational and reanalysis datasets, and these activities have made good use of our new TeraGrid connections.

We also received a positive response from the National Digital Information Infrastructure and Preservation Program (NDIIPP) regarding our expressed interest in developing an NDIIPP-supported Chronopolis pilot project. We are in the final phases of establishing a 1- to 1.5-year contract with NDIIPP under which NCAR will co-develop core preservation infrastructure for the effort. Our pilot project is framed initially as an R&D activity aimed at prototyping a preservation environment for the California Digital Library's (CDL) Web-at-Risk project and the Inter-university Consortium for Political and Social Research's (ICPSR) data archive.

In FY2008, we expect to begin our Chronopolis pilot project; the first steps will include building and integrating the preservation system. Once the prototype system is populated and operating, we will simulate a "disaster" and investigate the performance and effectiveness of resurrecting a lost, multi-terabyte data resource. We will also continue our efforts to broaden our support base and associated scope in the upcoming year.

CISL is engaging in Chronopolis as an important strategic thrust, supporting it through a combination of NSF Core funding and NCAR's Cyberinfrastructure Strategic Initiative (CSI).