Storage Area Network (SAN) Deployment for CDP and ESG
During the summer of 2004, the Distributed Systems Group (DSG) became part of a joint SCD and University of Colorado (CU) effort to evaluate high-performance shared file systems. The effort was driven by the need to share common data among diverse operating systems at speeds exceeding those achievable with the Network File System (NFS). In this shared file system project, DSG was charged with setting up a testbed storage area network (SAN) to host a shared file system, the Quantum/ADIC StorNext File System (SNFS), across the attached servers. SNFS was chosen because it works well in a heterogeneous operating system environment such as SCD's and does not depend on any specific hardware vendor for components such as switches and storage units. CU would investigate more Linux-specific shared file systems such as Lustre and IBM's GPFS.
There was a pressing need for the Community Data Portal (CDP) system and the main Data Support Section (DSS) server to share large data sets and provide them to the user community. Both of these servers were running the Sun Solaris operating system, so the initial test was set up to determine how Sun servers interacted with SNFS over a SAN. SCD's Mass Storage System Group supported this project by running file transfer rate benchmarks on SNFS and comparing the results with the speeds of directly attached storage (DAS) units. SNFS ran at speeds comparable to the DAS systems, and it was decided to put a large shared file system into production between the CDP and DSS servers. During FY 2006, a DSS Sun server was added to the SAN to provide computational support for DSS datasets, and a dedicated Sun server was added to handle the TIGGE project, bringing the total number of servers sharing the SAN to four. Storage was expanded to more than 50 TB of available space.
This project supports the NCAR strategic priority of "Developing and providing advanced services and tools."
For FY 2006, datasets on the SAN include:
Many activities for the CDP/ESG (Earth System Grid) shared file system are planned for FY 2007. The ESG project was moved off the dataportal server and set up as a gateway, called datagrid, during the first phase of integrating NCAR into the national TeraGrid. The datagrid server will be attached to the SAN as soon as it passes its security analysis and should be online in early FY 2007. The amount of data housed for ESG will be expanded to more than 80 TB by the end of the year.
The SAN is made possible by NSF Core funds, including CSL funding.