While some of the data stored on the NCAR MSS originate from field experiments and observations, the bulk of the data is generated by global climate-simulation models and other earth-science models that run on supercomputers. SCD therefore faces an increasing demand to archive data from ever-faster supercomputers. Essentially, the faster the supercomputer, the more data there are to be archived. Ever-greater demands for archiving data will result from the growing use of coupled atmospheric/oceanic simulation models.
The NCAR Mass Storage System has evolved over the last fifteen years. Prior to late 1989, mass storage at NCAR consisted strictly of off-line, manual-mount media. In November 1989, the first STK Powderhorn Automated Cartridge System (ACS, or "silo") was acquired, commencing a new era of mass storage at NCAR. The following figure shows the various technologies that have been used to store critical datasets throughout NCAR's history.
![]()
During FY2000, the NCAR Mass Storage System grew from 6,850,168 files with a total of 212.04 TB to 8,329,517 files with a total of 274.02 TB. This was an average net growth rate of 5.165 TB per month during FY2000.
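The monthly growth figure above follows directly from the two year-end totals. The following is a minimal sketch of that arithmetic, assuming FY2000 spans twelve months; the constant and function names are hypothetical and chosen only for illustration.

```python
# Year-end totals quoted in the text above (TB stored and file counts).
START_TB, END_TB = 212.04, 274.02
START_FILES, END_FILES = 6_850_168, 8_329_517
MONTHS_IN_FISCAL_YEAR = 12  # assumption: FY2000 covers twelve months


def net_growth_per_month(start, end, months=MONTHS_IN_FISCAL_YEAR):
    """Average net growth per month between two year-end totals."""
    return (end - start) / months


tb_per_month = net_growth_per_month(START_TB, END_TB)
files_per_month = net_growth_per_month(START_FILES, END_FILES)

print(f"Net growth: {tb_per_month:.3f} TB per month")
print(f"Net file growth: {files_per_month:,.0f} files per month")
```

The same function applies to any pair of year-end totals, so it can also be used to compare growth across earlier fiscal years.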
The NCAR MSS provides direct storage-device access via a High-Performance Data Fabric (HPDF). The data fabric consists of host computer High Performance Parallel Interface (HiPPI) channel interfaces, non-blocking HiPPI switches capable of supporting multiple bi-directional 100 MB/sec data transfers, and protocol converters that connect the HiPPI data fabric to the IBM-style device control units. The data fabric provides data paths directly between the MSS storage devices and the client compute servers. To utilize this data fabric, SCD has written a file-transport type of interface to enable users to copy files between their host systems and the MSS. The data fabric can support 14 independent file-transfer operations between the storage devices and the compute servers, with 4 transfers sustaining 3 MB/sec each and 10 transfers sustaining 10 MB/sec each, for an aggregate total of 112 MB/sec.

HiPPI technology continues to be deployed only in a niche market. It has not shown signs of spreading into the commodity marketplace, and as a result the cost of HiPPI technology has remained high and the number of HiPPI vendors is dwindling. The lack of availability and support of HiPPI technology is becoming a critical issue to the continued operation of the MSS. Replacement technologies are on the horizon, but they are not yet widely available nor are they functional enough to immediately replace HiPPI. Promising replacement technologies are Fibre Channel and Network Attached Storage Devices. Fibre-Channel-attached RAID units are available today at extremely attractive costs. Over the next few years, the number and types of available Fibre-Channel-attached devices are expected to grow and include tape storage. Once tape devices can be Fibre-Channel attached, SCD will evaluate the replacement of our HiPPI fabric with Fibre Channel.
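The aggregate data-fabric rate quoted earlier in this section (14 concurrent transfers totaling 112 MB/sec) can be checked with a short sketch. The data structure and function name below are hypothetical; only the transfer counts and per-transfer rates come from the text.

```python
# Each entry: (number of concurrent transfers, sustained MB/sec per transfer),
# taken from the figures quoted in the text.
TRANSFER_CLASSES = [(4, 3), (10, 10)]


def aggregate_bandwidth(classes):
    """Sum of count * rate over all transfer classes, in MB/sec."""
    return sum(count * rate for count, rate in classes)


total_transfers = sum(count for count, _ in TRANSFER_CLASSES)
total_mb_per_sec = aggregate_bandwidth(TRANSFER_CLASSES)

print(f"{total_transfers} transfers, {total_mb_per_sec} MB/sec aggregate")
```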
Network Attached Storage Devices (NASD) is another emerging technology that is being closely tracked by SCD. Today a handful of vendors supply Network File System (NFS)-based NASD devices. Some vendors are developing "local-disk"-attached NASD products using Fibre Channel and HiPPI connections. SCD's current strategy is to deploy a Fibre Channel infrastructure and add NASD to it at a later time. The end result will be the decommissioning of our HiPPI fabric and ESCON and BMX storage devices, and the wholesale replacement of those older technologies with new, vendor-supported (and hopefully standards-based) technologies.
The NCAR MSS currently uses two levels of storage: online and offline. The most frequently accessed data are kept on the fastest storage media, the online storage devices: 180 GB of IBM 3390 Model 3 disks and five StorageTek Powderhorn ACSs. The Powderhorn ACSs use StorageTek 9840 as well as StorageTek SD-3 (Redwood) technology and together provide a total online capacity of approximately 1 petabyte. Lower in the storage hierarchy is a 3490E offline cartridge tape library holding 142,000 cartridges, which can be staged using one of the 16 external, manually mounted IBM 3490E cartridge drives. These media are in read-only mode, and a migration off them was started in FY2000. StorageTek 9840 and SD-3 drives have been added to the offline storage level to provide secondary copies of the Powderhorn-resident files.

"Data ooze" refers to the massive task of transferring tens of terabytes of data from old media to modern media before the equipment that uses the old media becomes obsolete. The task by itself is straightforward; however, it must be handled as a background task while the processing and storage components of the system remain fully dedicated to supplying prompt, 24-hour-per-day service to users. When the ooze is complete, the total capacity of the offline archive (assuming no reduction in the offline archive's available floor space in the SCD machine room) will exceed 1 petabyte. A data ooze to StorageTek 9840 technology started in spring 2000. To date, the StorageTek 4490 media have been oozed to 9840 media, and the StorageTek 4490 drives have been decommissioned.
Expansion of the MSS storage hierarchy is planned over the next five years through the introduction of new tape technologies and new ACSs, and through the integration of a front-end file server with its own hierarchical storage management (HSM) system to offload active and temporary data. The MSS archive will become a back-end store for the file server, accessed only by the front-end HSM. A single global name space will be provided for all data managed by SCD. Evaluation of HSM solutions will continue in FY2000.
Another important capability of the NCAR MSS is the ability to import and export data to and from external portable media. Importing data involves copying data from portable media to the MSS data archive, while exporting data involves copying data from the MSS data archive to portable media. Import/export allows users to bring data with them to NCAR as well as take data away. Import also allows data from field experiments to be copied to the NCAR MSS archive.

Options to exchange data with smaller satellite storage systems are being investigated. Using this technique, data generated at NCAR could be transferred to remote sites for further analysis. The NCAR SCD storage model would thus be geographically distributed, rather than centrally located and administered.
In addition to 3480 and 3490E cartridge tapes and 9-track round tapes, the NCAR MSS also offers import/export to single- and double-density 8mm Exabyte cartridge tapes. The deployment of an MSS-IV Import/Export server in FY2000 made it possible to support many more device types, such as CD-ROM, DAT, and newer Exabyte media.
In FY2000, SCD made the following improvements to the NCAR MSS:

Y2K compliance:
Modifications and tests were made to the MSS to ensure Y2K compliance. The MSS successfully survived the passing of 1 Jan 2000 and 29 Feb 2000 without incident.

HPDF upgrades:
All remaining S-BUS parallel HiPPI interfaces were converted to serial HiPPI in FY2000. Two new HiPPI switches were integrated into the HPDF, and two older switches that are no longer supported by the vendor were decommissioned. A proof-of-concept serial HiPPI project was completed that demonstrated that high-speed MSS connectivity could be extended from the Mesa Lab to the Foothills Lab using dark optical fibre from the BRAN project. Plans were made to deploy 2 MSS HiPPI host connections at the Foothills Lab in early FY2001.

MSS hosts:
The Compaq ES40 Cluster HiPPI interface was debugged and integrated into the MSS.

New tape technology:
After beta testing in FY1999, the StorageTek 9840 tape drives were integrated into the MSS in FY2000. The 9840 technology replaced the StorageTek 4490 and IBM 3490E drives. The StorageTek 9940 tape drive was beta tested in FY2000. The 9940 uses the 9840 technology, but the 9940 cartridge capacity (60 GB) is three times that of the 9840 cartridge (20 GB), making the 9940 a viable replacement for the StorageTek SD-3 media. Integration and production deployment of the StorageTek 9940 tape drive will be completed in FY2001, after which the data ooze off the SD-3 media will begin.

MSS services:
Web-enabled MSS services debuted in FY2000 with the deployment of the MSS Accounting Tool and a proof-of-concept MSS FTP server.
NCAR Mass Storage System growth during FY2000 continued at a flat rate despite the introduction of a 160-node IBM SP and a 4-node Compaq ES40 cluster into the SCD computing environment. The amount of new data injected into the MSS grew, but the amount of data purged also grew, which resulted in a flat net growth rate. The addition of this new supercomputing power effectively doubled the average number of GFLOPS being delivered to user applications. Projecting this growth into the future, it is clear that new storage paradigms and user education will be required; without them, growth will become untenable within just three to five years.

The following table compares year-end statistics for FY1997, FY1998, FY1999, and FY2000 with projected statistics for year-end FY2005. The FY2005 estimates assume a flat budget for supercomputing, historical data storage trends at NCAR, and Moore's Law growth in computer performance per unit cost. Even with the most optimistic vendor projections for storage densities and costs, these estimates indicate that the NCAR MSS would require between one and two dozen ACSs and that the annual MSS budget would exceed that for supercomputers.
MSS growth statistics and expectations

| Statistic | eFY1997 | eFY1998 | eFY1999 | eFY2000 | eFY2005 [4] |
|---|---|---|---|---|---|
| Total storage (TB) | 110 | 150 | 212 | 273 | 5,700 |
| Total files (x10^6) | 3.9 | 5.1 | 6.9 | 8.3 | 190 |
| Net growth (TB per month) at eFY | 3.0 | 5.0 | 5.0 | 5.2 | 220 |
| Data read/written (TB per month) | 16 [1] | 20 | 20 | 25 | 500 |
| Data migrated internally (TB per month) | 16 | 20 | 20 | 25 | 500 |
| Manual tape mounts (number per month) | 60,000 | 37,000 | 28,000 | 18,000 | 1,000 [5] |
| Robotic tape mounts (number per month) | 50,000 | 37,000 | 46,000 | 54,000 | 900,000 [5] |
| Offline cartridge count | 165,000 [2] | 169,000 [3] | 169,000 | 142,000 | 85,000 [6] |
| GFLOPS on NCAR computing floor | ~10 | ~20 | ~36 | ~83 | ~1,000 |

Notes:
1. 16 TB per month = 5 MB/sec
2. All on IBM 3490 cartridge media
3. Mixture of 166,700 3490 cartridge media and 2,300 SD-3 cartridge media
4. Projected assuming a flat computational budget through 2005
5. Assumes one copy of all data is under robotic mount control
6. Assumes size of existing offline archive will decrease as media densities increase and existing data is oozed to those higher density media
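The eFY2005 projection in the table implies steep compound growth. As a rough illustration only (the report does not state how the projection was derived), the sketch below computes the constant annual growth factor that would carry total storage from the eFY2000 value to the projected eFY2005 value, along with the corresponding doubling time. The function names are hypothetical, and constant year-over-year growth is an assumption made purely for this example.

```python
import math

# Total storage (TB) from the table: eFY2000 actual and eFY2005 projection.
STORAGE_EFY2000_TB = 273
STORAGE_EFY2005_TB = 5700
YEARS = 5


def implied_annual_growth(start, end, years):
    """Constant yearly multiplier that turns `start` into `end` after `years`."""
    return (end / start) ** (1 / years)


def doubling_time_years(annual_factor):
    """Years needed to double at a constant annual growth factor."""
    return math.log(2) / math.log(annual_factor)


factor = implied_annual_growth(STORAGE_EFY2000_TB, STORAGE_EFY2005_TB, YEARS)

print(f"Implied growth factor: {factor:.2f}x per year")
print(f"Implied doubling time: {doubling_time_years(factor):.2f} years")
```

Under these assumptions, the projection corresponds to storage growing by a bit over 80 percent per year, which is consistent with the text's concern that growth at this pace would outstrip the storage budget.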
A description of the four-year MSS roadmap is provided on the web at:
http://www.scd.ucar.edu/hps/NCAR-MSS-Strategic-Plan.html

An MSS roadmap presentation appears at:
http://www.scd.ucar.edu/dig/atlas/mss/mss/index.htm

Key issues to be addressed over the next four years include:
- Providing web-based MSS tools and interfaces
- Providing MSS interfaces for higher level data portals and data servers
- Managing MSS data growth
- Integrating new storage technologies to keep pace with the projected growth in computing cycles