SCD Research: Computational Science
This diagram shows the newly
created experimental environment, its components, and how it relates
to the NCAR production environment. The objective of this experimental
environment is to provide the infrastructure required to track, evaluate,
and adopt new technologies. This capability supports NCAR's leadership
in high performance computing and helps it influence the design of
coming generations of computing and information systems.
SCD's Computational Science Section (renamed at end-FY 2006 as the
CISL Computer Science Section) is responsible for tracking and evaluating
new computing technologies, making early adoption decisions, and
performing systems research. Section members are actively pursuing
research in the following areas:
- High-performance computing
- Grid computing
- Experimental systems
- Linux clusters
- Experimental networking and evaluation of high-performance
interconnects
- System and network performance analysis
- High-performance file systems and archival storage systems
- Parallel algorithms and architectures
- Model development
Research results from the past year include our successes in model
development, high-performance computing, Grid computing, and
experimental systems.
One area of strategic importance to the Section is Linux cluster
research. In collaboration with the University of Colorado, we have
evaluated various aspects of cluster design, including parallel file
systems, diskless compute nodes, scalable management systems, and
cluster interconnect solutions. One result of this research is
a better understanding of how to deploy cost-effective computational
clusters that can accommodate the NCAR scientific workload while
minimizing ongoing management overhead and complexity.
One essential infrastructure component required by the high-performance
computing community is archival storage, as evidenced by NCAR's >3-PB
Mass Storage System. Magnetic tape is a well-established archival storage
medium, but tape systems have limited read and write throughput, require
tape retrieval queue time, and risk storing all of the important data in
one place. To address these limitations, we are building a reliable and
high-performance file system for archival storage using low-density
parity-check (LDPC) codes. The advantage of moving to an LDPC scheme
based on an open software infrastructure is that it allows us to
leverage emerging storage solutions. To date it has been shown that
Tornado encoding schemes can be designed so that they are significantly
more fault tolerant than either RAID or mirrored systems, and that by
using cooperatively selected Tornado Code graphs to build a geographically
distributed data stewarding system, one can obtain overall system fault
tolerance exceeding that of its constituent storage sites or of site
replication strategies.
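The recovery idea behind graph-based erasure codes can be illustrated with
a minimal Python sketch. It is a toy, not the Section's implementation: the
check graph `GRAPH`, the block contents, and the function names are all
hypothetical, and real Tornado codes use much larger, carefully designed
sparse graphs. Each parity block is the XOR of the data blocks listed in
one check, and a "peeling" decoder repeatedly solves any check that has
exactly one missing block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data, graph):
    """Return the data blocks followed by one XOR parity block per check."""
    return list(data) + [xor_blocks([data[i] for i in check]) for check in graph]


def peel_decode(stored, graph, k):
    """Fill in missing blocks (None) using checks that have one unknown.

    Blocks that cannot be recovered are left as None.
    """
    blocks = list(stored)
    progress = True
    while progress:
        progress = False
        for j, check in enumerate(graph):
            members = list(check) + [k + j]  # data indices plus this parity
            missing = [m for m in members if blocks[m] is None]
            if len(missing) == 1:
                known = [blocks[m] for m in members if m != missing[0]]
                blocks[missing[0]] = xor_blocks(known)
                progress = True
    return blocks


# Hypothetical example: 4 data blocks protected by 3 parity checks.
DATA = [b"ncar", b"scd_", b"mss_", b"tape"]
GRAPH = [(0, 1, 2), (1, 2, 3), (0, 3)]

stored = encode(DATA, GRAPH)
stored[1] = None  # simulate the loss of two data blocks
stored[3] = None
recovered = peel_decode(stored, GRAPH, k=len(DATA))
print(recovered[:len(DATA)] == DATA)
```

In this sketch the seven stored blocks could be placed on different devices
or sites. Which loss patterns are recoverable depends entirely on the
choice of graph, which is why graph selection matters for the fault
tolerance described above.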
These efforts support NCAR's strategic priorities of "Providing
capability and capacity supercomputing to the community," "Developing
and providing advanced services and tools," and "Creating an Earth
system knowledge environment." These SCD research projects and programs
are supported by NSF Core funding, with other support as indicated by
the individual reports in this document.