CISL 2007 Annual Report

Message from CISL Director Al Kellie

I am proud to present the FY2007 Annual Report of NCAR's Computational and Information Systems Laboratory (CISL). As you will find in this report, CISL has completed another productive year and is well prepared to continue its research and service mission to the atmospheric and related sciences. Our plans for the future are organized by these five broad initiatives:

  • Replace NCAR's computing facilities to advance the leading edge of geosciences simulation
  • Procure, secure, and deploy robust and highly capable cyberinfrastructure that can support such advances
  • Strengthen and extend our research arm that enables geoscientists to accomplish more
  • Develop and improve tools and research environments that extend the reach of our communities of researchers
  • Ensure that our work remains meaningful to others in academia, government, industry, and the general public

Specifically, our top priorities for FY2008 include:

  • Pursuing the expansion of the geosciences computing facility with our partners in Wyoming
  • Increasing our support of science with Breakthrough Computational Campaigns and TeraGrid-enabled applications
  • Stimulating rising talent in the computer sciences, applied mathematics, and statistics communities with research challenges in geosciences simulation
  • Releasing and improving powerful new knowledge-based tools for production use by our constituents
  • Increasing the relevance of our work to science communities and society

The following overview points to highlights of CISL's 2007 report via links from the images.

Space weather

Space weather is one of the many science drivers for NCAR's discipline-specific petascale supercomputing center. This visualization of a coronal mass ejection impacting the Earth's magnetosphere illustrates some of the significant challenges in this field that will benefit from petascale computing capability. These include first-principles modeling of solar convection and its contribution to the 22-year solar cycle and, crucially, modeling the emergence of magnetic flux from the solar convection zone and the conditions that lead to solar flares and coronal mass ejections. (Image courtesy of Michael Wiltberger, NCAR HAO.)

Evolution of a salinity

These visualizations come from a large-scale simulation conducted as part of the Breakthrough Science initiative. Shown here are double diffusive convective motions organized by ambient shear into alternating planar regions of rising and sinking fluid that contain opposed horizontal flow patterns. This process depends heavily on the vast difference between the molecular diffusivities of heat and salt in water, which renders the simulation a computational grand challenge. Using 320,000 processor hours on blueice, these simulations were the first to take full account of that difference and evolve to a fully turbulent state. They are therefore the first simulations of the important and largely unexplored natural phenomenon of saltwater turbulence. (Image courtesy of W.D. Smyth, Oregon State University.)

Solar physics research uses

The cover of the 2007 TeraGrid Science Highlights brochure showcases a visualization of giant cell convection patterns beneath the surface of the sun. These processes, revealed by a recently developed model that allows scientists to examine inner workings of the sun that are hidden from any current observational technique, are being explored by researchers at the University of Colorado and NCAR using terabytes of data that reside at the Pittsburgh Supercomputing Center and the San Diego Supercomputer Center. This new ability to explore remote data via the TeraGrid, using NCAR's TeraGrid network node and VAPOR software, holds potential to significantly advance U.S. scientists' ability to rapidly pursue research questions that demand large-scale resources. (Image courtesy of Mark Miesch, NCAR HAO.)

Model calibration technique

Generated as part of research being conducted in IMAGe, this figure displays contours of the posterior distribution of optimal calibration parameter values for the LFM model of the magnetosphere for a 1997 geomagnetic storm. A space-filling statistical design was used to choose a collection of values of the parameters (blue dots), and the resulting model runs were used to fit a statistical model to the surface representing the discrepancy between the LFM model output and satellite data from the date of the storm. The goal of this research is not only to improve the model and our understanding of the magnetosphere, but also to do so in a manner that uses computational resources most efficiently.

ESMF data assimilation output

This image shows some of the first ultra-high-resolution results (2/3-degree longitude by 1/2-degree latitude) from the GEOS-5 atmospheric general circulation model coupled via ESMF to a sophisticated NASA stratospheric chemistry package (STRAT-CHEM). The GEOS-5 modeling and data assimilation system developed at NASA's Global Modeling and Assimilation Office consists of over 24 ESMF gridded components that can be coupled through the framework for a variety of applications, from atmospheric reanalysis and weather prediction to coupled climate modeling. The coupling of GEOS-5 and STRAT-CHEM enables scientists to perform calculations with a variety of interactions between the chemistry and the atmospheric radiation, large-scale dynamics, and sub-grid parameterizations.

2007 SIParCS interns

These interns worked at NCAR in the mutually beneficial SIParCS program. The CISL-based SIParCS program challenges students in applied mathematics and computational science to help solve real-world problems associated with CISL's mission to support the atmospheric and related sciences. The students gain valuable work experience, and CISL is cultivating a skilled workforce for future supercomputing centers.


Ready to break ground for the NCAR Supercomputing Center:
CISL, with the support of the geosciences modeling community, various advisory bodies, and the NSF, has established a partnership with the University of Wyoming, the State of Wyoming, and Wyoming businesses to develop a modern, energy-efficient NCAR Supercomputing Center (NSC) in Cheyenne, Wyoming.

This vision for the NSC is well aligned with the NSF strategic plan and the NSF vision for cyberinfrastructure. For example, CISL has been prototyping future NSC operations through our TeraGrid integration work. The NSC project is driven by science and is being proposed in direct response to the exploding demand for both capability and capacity HPC resources needed by Earth system science researchers. Regional climate simulations of the future with resolutions approaching 10-20 kilometers will require scaling up to petascale resources.

Breakthrough Science initiative:
NCAR and NSF created a new level of supercomputing resource allocation early in FY2007. Before NCAR's newest and most capable supercomputer was released for production use, NSF program managers in OCE and EAR and the CISL HPC Advisory Panel (CHAP) invited a small number of researchers with records of successfully using large numbers of processor hours to run very large simulations. Because of its potential for discoveries through simulation, this initiative is named Breakthrough Science (BTS).

Six of the eight projects successfully used their large allocations, consuming almost 3 million processor hours on blueice. At the end of FY2007, two of the six projects had submitted papers for publication, and three others had multiple papers in progress. Because of the BTS successes, the NCAR Executive Committee and CHAP both decided to continue this practice of allocating large amounts of computer time to a single project at a time.

The BTS outcome is significant because it shows that CISL can provide the necessary resources and that university researchers are ready and able to use very large allocations effectively for their geosciences research.

TeraGrid computing resources enter production phase:
NCAR's IBM Blue Gene/L (BG/L) supercomputer, named frost, became an operational TeraGrid resource on August 1, 2007, and it is expected to provide 4.5 million CPU hours annually to the TeraGrid research community. In addition to providing computational resources, NCAR is testing experimental systems and services on the TeraGrid. These include the wide-area versions of parallel file systems from IBM and Cluster File Systems, as well as a remote data visualization capability based on the VAPOR tool, an open source application developed by NCAR, the University of California at Davis, and Ohio State University under the sponsorship of the National Science Foundation.

The NCAR TeraGrid integration effort in FY2007 shifted from the equipment acquisition and deployment phase that dominated FY2006 to a new phase characterized by security hardening of TeraGrid components; deployment, testing, and migration of CTSS software; and integration of accounting software. During this period, testing of the storage cluster capabilities expanded to experimentation with the capabilities of grid technologies to support wide-area parallel file systems and distributed scientific visualization workflows. The effort culminated on August 1, 2007, with the successful deployment of the frost resource on the TeraGrid.

Integrating math, statistics, and geosciences:
IMAGe, the math institute housed within CISL, strives to bring mathematical models and tools to bear on fundamental problems in the geosciences, and to be a center of activity for the mathematical and geophysical communities. Each year, IMAGe focuses on a particular area of the geosciences or applied mathematics that has an impact on NCAR's scientific mission and develops a series of workshops plus a summer school devoted to that theme. In FY2007, the Theme-of-the-Year focused on statistics for numerical models. This theme was undertaken with the goals of matching cutting-edge statistical methods to the needs of geophysical model development and making statistical scientists aware of the particular scientific issues and research in the geophysical modeling community.

In collaboration with the Statistical and Applied Mathematical Sciences Institute (SAMSI) and the Mathematical Sciences Research Institute (MSRI), four modeling groups at NCAR were engaged to present their models and highlight potential statistical connections at the first workshop. Several collaborations between NCAR scientists and statisticians at SAMSI and the broader statistical community were begun, and the results of these efforts were presented at the second workshop. The next workshop focused on random matrices. Finally, a summer school program on the Carbon Cycle was hosted at NCAR. These kinds of coordinated activities have the potential to significantly increase the multidisciplinary training of young scientists. They also bring new mathematical approaches to challenging geophysical problems.

Frameworks to standardize large-scale modeling efforts:
Again in FY2007, the Earth System Modeling Framework (ESMF) made significant strides in helping researchers manage the growing complexity of developing Earth system models. Disparate model components representing physical domains and processes (for example, atmosphere, ocean, and sea ice) are coupled into integrated systems to create realistic simulations. These models are computationally intensive and must run on a variety of parallel-processor supercomputing platforms. ESMF defines a set of standard software interfaces and a set of high-performance tools for common functions.

Now in its fifth year, ESMF has transitioned from NASA funding to multi-agency support and is the technical basis for the DoD Battlespace Environments Institute, the NASA Modeling Analysis and Prediction Program, and a host of smaller projects. The number of ESMF science components in the community is an important metric, since more standard components mean more options for researchers creating coupled systems. The adoption of ESMF grew steadily this year, with the number of available science components growing from 36 at the end of FY2006 to 58 at the end of FY2007.

Initiating the next generation of computational scientists:
A formalization of CISL's summer internship efforts, the Summer Internships in Parallel Computational Science (SIParCS) program is a prototype partnership between CISL and selected universities. SIParCS is sponsored and administered by CISL to provide opportunities for exceptional students with backgrounds in computational science, applied mathematics, computer science, or the computational geosciences.

This new program offers a significant opportunity to make a positive impact on the quality and diversity of the workforce needed to use and operate 21st-century supercomputers. Ultimately, SIParCS aspires to help address shortages of trained scientists and engineers capable of using and maintaining high-end computer and data systems, people desperately needed to achieve the goals of future computational geoscience research.

In return, SIParCS provides a framework for interns to gain practical experience with a wide variety of parallel computational science problems by working under the guidance of CISL mentors on HPC systems and applications relevant to NCAR's Earth system science mission.