Storage Area Network (SAN) Deployment for CDP and ESG
The Storage Area Network (SAN) allows 60 TB of data to be shared among
different nodes over a high-speed Fibre Channel switch and network.
Because the file system on the SAN is shared, changes that a
computational server (e.g. bison) makes to a dataset are immediately
available to every other attached node. The user community can access
these datasets through the Community Data Portal (CDP) and the Data
Support Section (DSS) server huron, using a web interface that lists
the datasets available for download. We plan to attach a TeraGrid
server to the SAN in FY 2007.
During the summer of 2004, the Distributed Systems Group (DSG)
became part of a joint SCD and University of Colorado (CU) effort
to evaluate high-performance shared file systems. The effort was
driven by the need to share common data between diverse operating
systems at speeds exceeding those of the Network File System (NFS).
In this shared file system project, DSG was charged with setting up
a testbed storage area network (SAN) that would host a shared file
system, the Quantum/ADIC StorNext FileSystem (SNFS), across the
attached servers. SNFS was chosen because it works well in a
heterogeneous operating system environment such as SCD's, and it
does not depend on any specific hardware vendor for components such
as switches and storage units. CU would investigate more
Linux-specific shared file systems, such as Lustre and IBM's GPFS.
There was a pressing need for the Community Data Portal (CDP) system
and the main Data Support Section (DSS) server to share and provide
large data sets to the user community. Both of these servers were
running the Sun Solaris operating system. As a result, the initial test
was set up to determine how Sun servers interacted with SNFS over a
SAN. SCD's Mass Storage System Group supported this project by
benchmarking file transfer rates on SNFS and comparing the results
with those of directly attached storage (DAS) units. SNFS ran at
speeds comparable to the DAS systems, so it was decided to put a
large shared file system into production between the CDP and DSS
servers. During FY 2006, a DSS Sun server was added to the SAN for
computational support for DSS datasets, and a dedicated Sun server
was added to handle the TIGGE project, bringing the total number
of servers sharing the SAN to four. Storage was expanded to more
than 50 TB of available space.
This project supports the NCAR strategic priority of "Developing and
providing advanced services and tools."
For FY 2006, datasets on the SAN include:
- ECMWF ERA-40 Reanalysis Data (ERA40)
- NCEP North American Regional Reanalysis Data (NARR)
- International Comprehensive Ocean-Atmosphere Data Set (ICOADS)
- CME (Carbon in the Mountains Experiment), a collaboration between
CGD, EOL, ACD, NASA, NOAA, and several universities
- ACD model and visualization client code, including MOZART (Model
for OZone And Related chemical Tracers) and TUV (Tropospheric
Ultraviolet and Visible radiation model)
- HIAPER test flight data
- University of Oklahoma Hurricane Isabel case study
- WACCM (Whole Atmosphere Community Climate Model) data, a
collaboration between ACD, HAO, and CGD
- WRF (Weather Research and Forecasting model) forecast for
Hurricane Katrina
Many activities for the CDP/ESG (Earth System Grid) shared file
system are planned for FY 2007. The ESG project was moved off the
dataportal server and set up as a gateway, called datagrid, during
the first phase of integrating NCAR into the national TeraGrid. The
datagrid server will be attached to the SAN as soon as it passes its
security analysis, and it should be online in early FY 2007. The
amount of data housed for ESG will be expanded to more than 80 TB by
the end of the year.
The SAN is made possible by NSF Core funds, including CSL funding.