NCAR FY2003 ASR

Executive summary

SCD provides a world-class supercomputing facility and a complete set of integrated support services for researchers in the atmospheric, oceanic, and related sciences. SCD's facilities support researchers around the world as well as at consortium member universities and at NCAR and UCAR. SCD supports the development and execution of large, long-running numerical simulations and the archiving, manipulation, and analysis of extremely large datasets.

SCD's mission specifies how this work is structured, and it defines the management units of SCD's internal organization. This Annual Scientific Report describes the accomplishments of each SCD management unit.

This executive summary reports SCD's FY2003 progress in each area defined by SCD's FY2003 program plan:

  • Computing resources
  • Mass Storage System
  • Operations and infrastructure
  • High-speed networking and data communications
  • Computing user support services
  • Visualization and enabling technologies
  • Research data support and services
  • Research in computation, networking, and visualization
  • Climate Simulation Laboratory (CSL) for large, long-running simulations of climate

Computing resources

The production supercomputer environment managed by SCD for NCAR evolved from serial codes running on single processors to codes that harness the power of multiple CPUs in cluster systems. In FY2003, SCD more than quadrupled the capacity of these cluster systems by completing Phase II of the Advanced Research Computing System (ARCS) and a late-FY2003 augmentation to that system.

A complete new IBM Cluster 1600 system, named bluesky, was added to the SCD computational environment in early FY2003. This system had a total of 1,216 POWER4 processors and a peak computational rate of 6.323 teraflops. Late in FY2003, bluesky was augmented with 12 32-way p690 SMP frames, adding 384 POWER4 processors and raising the system's total peak computational rate to 8.23 teraflops. The augmentation also increased bluesky's total disk capacity to 31 TB. Bluesky now has 50 POWER4 p690 SMP frames, making it the single largest system of this type in the world. The addition of bluesky in FY2003 doubled both Community Computing and Climate Simulation Laboratory resources. Initially, the 12 additional frames will be used to increase the capacity available to the CCSM IPCC activity.

Other accomplishments in FY2003:

  • Acquired two 32-way P690 SMP frames in late FY2003 for the evaluation of Phase III of the ARCS plan
  • Monitored and maintained hardware and software at high levels of utilization for seven computational servers in the CSL computing environment
  • Monitored and maintained hardware and software at high levels of utilization for five computational servers in the Community Computing production environment

See the full FY2003 report at High performance computing.

Mass Storage System

The NCAR Mass Storage System (MSS) is a large-scale data archive that stores data used and generated by climate models and other programs executed on NCAR's supercomputers and compute servers. At the end of FY2003, the NCAR MSS managed more than 20.3 million files containing more than 1.5 petabytes (1,502 terabytes) of data, over 880 TB of it unique, and the archive was growing by approximately 27 TB per month. SCD faces an increasing demand to archive data from ever-faster supercomputers.

The new Storage Manager (STMGR) disk cache component (internal disk cache) was placed into production as a replacement for the aging IBM 3390 diskfarm. In FY2003, STMGR provided a 2.7x increase in storage capacity and a 10x increase in aggregate data transfer rate. STMGR is most notable for its potential to provide more than a 220x increase in storage capacity, a 33x increase in aggregate data transfer rate, and the ability to buffer files of all sizes. Further, it will permit newly written files to reside in the cache longer, reducing tape mounts and tape I/O.

To aid capacity planning and performance tuning of the MSS, a simulator that includes all the major hardware and software components of the MSS was developed in 2003. The simulator enables the MSS group to consider different design alternatives for new software and hardware components and estimate how the different designs will perform before the components are added to the actual system. Simulation studies were conducted in 2003 using an earlier version of this simulator (that only simulated the disk cache component of the MSS) to aid in configuring and sizing the STMGR disk cache system.
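The MSS simulator itself is not described in detail here, but the core idea behind sizing a disk cache with a trace-driven simulation can be sketched as follows. This is a generic illustration under assumed names and an assumed LRU policy, not SCD's actual simulator:

```python
from collections import OrderedDict

def simulate_lru_cache(trace, cache_bytes):
    """Trace-driven simulation of an LRU disk cache.

    trace: iterable of (file_id, size_bytes) read requests.
    Returns (hits, misses): reads served from cache vs. from tape.
    """
    cache = OrderedDict()  # file_id -> size, kept in LRU order
    used = 0
    hits = misses = 0
    for file_id, size in trace:
        if file_id in cache:
            hits += 1
            cache.move_to_end(file_id)  # mark as most recently used
        else:
            misses += 1  # would trigger a tape mount and read
            while used + size > cache_bytes and cache:
                _, evicted_size = cache.popitem(last=False)  # evict LRU
                used -= evicted_size
            if size <= cache_bytes:
                cache[file_id] = size
                used += size
    return hits, misses

# A synthetic trace with one popular "hot" file re-read several times
# amid a stream of larger, cold files:
trace = [("hot", 100)] + [(f"cold{i}", 400) for i in range(5)] + [("hot", 100)] * 4
hits, misses = simulate_lru_cache(trace, cache_bytes=1000)
```

Running the same trace against different cache sizes and replacement policies is how a simulation like this supports configuration decisions before hardware is purchased.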

Simulator output was also combined with MSS warehouse information to help measure the effectiveness of the external data caches that were deployed to avoid re-reading data from the MSS, and thus to avoid using a data archive as a file server. The external cache deployment reduced such re-reads by as much as 60%.

Other accomplishments in FY2003:

  • Deployed beta-test versions of web-based tools to help users manage their MSS holdings
  • Completed decommissioning the IBM 3490 drives and media
  • Completed decommissioning the Redwood drives and media
  • Continued research and development of external disk cache systems

See the full FY2003 report at Data archiving and management system: The MSS.

Operations and infrastructure

The Operations and Infrastructure Section (OIS) provides the infrastructure necessary to support and operate the computers, networks, and services that are integral to SCD's mission. In addition to physical infrastructure such as electrical distribution, cooling, and 7x24 operational oversight of the facility, OIS also contributes to the software infrastructure. Examples include the development and maintenance of the division's problem-tracking system, the room reservation system, and more recently the SCD Portal.

Planning for the second-phase ARCS system, bluesky, culminated in FY2003 with its smooth commissioning. Plans began to add 14 more compute nodes to bluesky. This addition placed more heat stress on the computing infrastructure, as each node produces up to 42,000 BTU/hr. For the first time, SCD began quantitatively measuring and recording changes in electrical consumption based on the workload of the supercomputers. The SCD Portal, a web-based entry point to SCD computing resources, was released to all users. By the end of FY2003, more than 200 fixed assets were being tracked, tagged, recorded in a database, and delivered for configuration and deployment. This process has greatly increased the accuracy and detail of SCD's asset information.

Other accomplishments in FY2003:

  • Beyond the large IBM installations, OIS managed the installation and removal of numerous other servers and equipment housed in the computing center.
  • In support of the IBM installations, OIS completed the installation of additional cooling equipment and addressed single points of failure to provide a more robust infrastructure.
  • Provided equipment and infrastructure to support SCD staff, including small and large computers, printers, environmental control equipment, test equipment, and other services
  • Transitioned operations staff schedule to a format that allows them to work on projects, attend training, and interact with other SCD staff
  • Began construction of a facility for housing two 1.2-Megawatt backup power generators
  • Completed conversion of the 3490E tape media; this has reduced manual tape mounts by 80% and the need for temporary staff
  • Managed SCD's business continuity plan for recovering critical functions after a catastrophic event

See the full FY2003 report at Computing center operations and infrastructure.

High-speed networking and data communications

Networking is an essential technology that is vital to UCAR's ability to function and prosper in a rapidly evolving technological environment. Networking capabilities fundamentally support UCAR's goals for the advancement of science, technology, and education.

Primary accomplishments in FY2003 included new telecommunications and networking systems for all UCAR staff, providing network support for the expansion to the Center Green campus, and participating in multiple national network research projects.

Other accomplishments in FY2003:

  • Deployed Voice over IP (VoIP) telecommunications throughout UCAR
  • Rebuilt the LAN infrastructure throughout the buildings in the new Center Green campus to bring it into compliance with UCAR standards
  • Replaced the OC-3 connection between NCAR and the FRGP with Gigabit Ethernet on dark fiber
  • Produced multiple proposals, including one to add UCAR to the TeraGrid network
  • Participated in networking research projects described in the Research section of this executive summary
  • Provided engineering and support for the FRGP consortium
  • Led participation in LambdaRail, a consortium of leading U.S. research universities and private-sector technology companies seeking to develop a new networking infrastructure for all forms of education and research
  • Contributed to SCD's business continuity plan for recovering critical functions after a catastrophic event

See the full FY2003 report at Network engineering and telecommunications.

Computing user support services

SCD's User Support Section (USS) provides leading-edge software expertise for the climate, atmospheric, and oceanic research communities to facilitate their high-performance computing endeavors at NCAR. The section provides a variety of services to users, both local and remote, that enable them to pursue their research within SCD's "end-to-end" high-performance computing environment. USS provides users with focused support services tailored to their specific needs. USS also oversees the process for allocating supercomputing resources for the university and NCAR communities, as well as for the Climate Simulation Laboratory (CSL). FY2003 achievements were made in five areas: CSL, database management, user consulting, infrastructure support, and digital information.

In FY2003, USS helped users become productive on bluesky, a new 1,312-processor POWER4 IBM cluster system, through code testing, documentation, code-scaling assistance, and problem resolution. USS supported two User Forums featuring 20 invited speakers and responded to the needs voiced by participants by creating new products and services and revising existing ones. Under the direction of the NCAR/UCAR Computer Security Advisory Committee, computer security was improved on an ongoing basis, and users were guided in mitigating the impact of security issues. Also in FY2003, USS staff continued to increase the number of high-availability clusters that provide critical infrastructure services for users and UCAR staff.

Other accomplishments in FY2003 include:

  • Implemented accounting charges based on the wall-clock time a job runs rather than CPU time. This more accurately reflects the computational resources used by individual computer jobs
  • Streamlined the process for users to manage their MSS files
  • Vetted more than 30 software changes to the supercomputers by testing functionality and results before and after each modification
  • Installed a Lightweight Directory Access Protocol (LDAP) database in the UCAR mail relay system
  • Developed a formal proposal for evaluating user software requests in order to respond to them more quickly
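The wall-clock accounting change in the first item above can be illustrated with a toy charging function. The formula and machine factor here are illustrative assumptions for exposition, not SCD's actual GAU formula:

```python
def charge_units(wallclock_hours, processors, machine_factor=1.0):
    """Toy charge: wall-clock hours x processors held x machine weight.

    Charging on wall-clock time bills a job for every processor it
    holds, for as long as it holds them, whether or not the CPUs are
    kept busy -- which better reflects the resources the job ties up.
    """
    return wallclock_hours * processors * machine_factor

# A job that holds 32 processors for 2 wall-clock hours but keeps the
# CPUs only 50% busy is charged for the full reservation:
wallclock_charge = charge_units(2.0, 32)       # 64.0 units
cpu_time_charge = charge_units(2.0 * 0.5, 32)  # 32.0 units under a CPU-time model
```

The gap between the two numbers is exactly the idle capacity that CPU-time charging fails to account for.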

See the full FY2003 report at Assistance and support for NCAR's research community.

Visualization and enabling technologies

SCD's Visualization and Enabling Technologies Section (VETS) has a primary focus on advancing the knowledge development process. Activities span the development and delivery of software tools for analysis and visualization, advancing visualization and collaboration environments, web engineering for all of UCAR, R&D endeavors in collaboratories, the development of a new generation of data management and access, Grid R&D, novel visualization capabilities, and a sizable outreach effort.

In the Cyberinfrastructure (CI) Strategic Initiative, VETS released a newly designed Community Data Portal (CDP) site with powerful search and browse capabilities and published metadata for hundreds of datasets from across our organization. We also made substantial inroads on the other major component of the CI initiative, the Web Outreach, Redesign, and Development (WORD) effort, and we achieved a strong, UCAR-wide consensus on a new strategy and design for our institutional web presence. In addition to this internal initiative, we were also awarded new R&D contracts in the areas of Grid computing and modeling (NASA) and advanced visualization (NSF/ITR), and we continued to play a strong role in the Unidata-led NSF/NSDL THREDDS project.

VETS made substantial progress across all of its areas of endeavor. Some of the highlights include:

  • Internal and external project funding enabled us to expand our staff to 22 people this year, and along the way, we improved our project management and tracking processes and launched a major effort to improve our metrics gathering and reporting.
  • Continued an aggressive outreach effort that included the launching of the VizKids program, an extensive list of Vislab tours, and a fine presence at the SC2002 conference -- while reducing overall external events and the associated costs.
  • Added a new full-time staff member to support collaboration services, participated in the NCSA/Alliance Scientific Workspaces of the Future (SWOF) effort, and partnered with Howard University to place an AccessGrid node at their site.
  • Executed a successful mid-term review of the Earth System Grid (ESG) project with DOE and demonstrated an early multi-site Grid for climate research.
  • Advanced enterprise web services substantially, including a large storage and server increment, a formal streaming video service, a new visualization portal, new directory services, and improved statistics and metric capabilities.
  • Continued to observe our NCAR Command Language growing in popularity and usage across the community and made additional advances in the areas of remote visualization services, Python-based data analysis and visualization, and new multi-resolution data capabilities. We also engaged in a number of planning meetings with various DOE groups in the area of building larger community efforts in visualization and data analysis.

See the full FY2003 report at Visualization and enabling technologies.

Research data support and services

The Data Support Section (DSS) maintains a large, organized archive of computer-accessible research data that is made available to scientists around the world. This 22.5-terabyte archive of 550 datasets is an irreplaceable store of observed data and analyses that are used for major national and international atmospheric and oceanic research projects.

We carried out a number of data development projects to add more data to the archives. We added three older sets of surface observations taken during the 1928 - 1973 period for the U.S. and Canada. Progress is being made on adding data to the surface ocean dataset (I-COADS). We also prepared to receive the ERA-40 reanalysis for 1957 - 2002 from ECMWF and the NCEP regional reanalysis for 1979 - 2002.

Much progress was made on the document project during FY2003. The goal of this project is to assemble hardcopy information about datasets and projects into documents and scan these for online use. We also wrote many more data reports for inclusion. The production of scanned documents started in March 2000. By October 2003, we had prepared 305 documents totaling 18,233 pages, an increase of over 3,000 pages during FY2003.

Major progress was made on the data backup project. The goal is to place backups of about 10% of our archives (the most important portion) in distant archives. In May 1999, we started with 40-GB-per-tape technology, and we moved to 110 GB per tape in September 2002. We have completed backups for 1,733 GB of data, leaving some of our most important observations still to be backed up.

Other accomplishments in FY2003:

  • Added new datasets, concentrating on older observations, updates of recent data, and oceanography data
  • Increased the amount of data available via the web, especially the reanalysis data
  • Continued involvement in large projects such as reanalyses and comprehensive international data-compilation efforts
  • Continued supporting users by providing consulting services from scientifically knowledgeable staff
  • Carried out major improvements on the I-COADS marine surface data collection, and continued similar work on other important atmospheric datasets
  • Continued saving legacy data still remaining on 7- and 9-track reel tapes
  • Acquired new tape storage systems to handle transfers that can only be done using tape technology
  • Updated reanalysis observations and numerous other research data products on an ongoing basis
  • Continued assembling seven important sets of global observations for reanalysis and climate research

See the full FY2003 report at Research data support and services.

Research in computation, networking, and visualization

Advanced research and development activities take place across the division and sections and often include collaboration with other NCAR/UCAR divisions and programs.

Computational science research

SCD's Computational Science Section (CSS) helps realize the end-to-end scientific simulation environment envisioned by the NCAR Strategic Plan. The mission of CSS is to develop much of the critical software and intellectual infrastructure needed to achieve the plan's ambitious goals. CSS tracks computer technology, learns to extract performance from it, pioneers new and efficient numerical methods, creates software frameworks to facilitate scientific advancement -- particularly interdisciplinary geoscience collaborations -- and shares the resulting software and findings with the community through open-source software, publications, talks, and websites.

The past year has been productive for CSS in many areas. Research and development activities funded in FY2003 are up and running. The section had nine papers published or accepted in peer-reviewed journals in the last year, and has four more currently submitted for publication. CSS has generated five posters or papers published in conference proceedings. One patent application was initiated. CSS staff initiated work under two new grants in FY2003. A critical DOE cooperative agreement research grant (CCPP) was renewed for another year at full funding. In terms of institutional impact, the three most significant accomplishments by CSS in FY2003 are:

  1. The ESMF software development team in CSS released the first prototype version of the Earth System Modeling Framework (ESMF 1.0) software in May 2003. This early version of the software included a coupled fluid flow demonstration, showing how the ESMF infrastructure could be used to build a compressible fluid flow solver and how the ESMF superstructure could be used to assemble multiple components into an application. To date there have been 300+ downloads of the framework software. The on-time delivery of proof-of-concept framework software helped build community confidence in the ESMF effort. It also demonstrated the testing, release, and support capabilities of the ESMF team.
  2. CSS staff participated in an NCAR-DOE-IBM "tiger team" to tune the performance of the Community Atmosphere Model (CAM) for the upcoming IPCC runs. This collaboration resulted in 30% performance improvements, which greatly enhanced the scientific throughput capability of the NCAR POWER4 IBM cluster for this important series of climate change experiments.
  3. CSS scientists successfully implemented a non-conforming spectral element mesh refinement scheme and applied it to the shallow water equations on the sphere. They demonstrated the utility of a scalable Operator Integration Factor (OIF) scheme by using it to take large time steps on such a refined mesh.
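For reference, the rotating shallow water system that these schemes solve can be written in standard vector form (momentum and continuity equations):

```latex
\frac{\partial \mathbf{u}}{\partial t}
  + (\mathbf{u}\cdot\nabla)\mathbf{u}
  + f\,\hat{\mathbf{k}}\times\mathbf{u}
  = -g\,\nabla h,
\qquad
\frac{\partial h}{\partial t} + \nabla\cdot(h\,\mathbf{u}) = 0,
```

where u is the horizontal velocity, h the fluid depth, f the Coriolis parameter, g the gravitational acceleration, and k the local vertical unit vector. These equations are a standard test bed for atmospheric dynamical cores because they capture the horizontal dynamics of the full primitive equations at much lower cost.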

In addition to these important accomplishments, CSS has:

  • Made significant advances in the long-term process of building a computer science program within SCD by developing an increasingly fruitful partnership between CSS and the University of Colorado department of computer science.
  • Made significant progress integrating the High-Order Multiscale Modeling Environment (HOMME) dynamical core with CAM2 physics.
  • Done important work developing space-filling curves for parallel decompositions of spectral element atmospheric models.
  • Collaborated in a successful effort with CU computer scientists to improve the numerical and computational performance of the GMRES algorithm.
  • Developed and implemented (for the shallow water equations) a new conservative advection scheme based on Discontinuous Galerkin methods.
  • Applied new mathematical methods to solve part of the highly nonlinear elliptic problem of the solar Coronal Mass Ejections (CMEs).
  • Expanded the Spectral Toolkit to include SPHEREPACK functionality needed to create basic, high performance infrastructure for spectral analysis and synthesis.

Networking research

Over the last decade, we have witnessed a tremendous increase in raw network capacity. Today, we are seeing the ubiquitous deployment of cross-country 10 Gb/s optical networks and early standards efforts in support of 40 Gb/s and 100 Gb/s networking technology. As a result, "long fat pipes" (elephants) [RFC1323] are no longer a rarity -- in fact, they are becoming the norm. For example, the NSF funded the 40 Gb/s Distributed Terascale Facility (DTF) [RB02], UCAID has announced plans for the 10 Gb/s Abilene Network of the Future [Int02], and contractual negotiations are near completion on a dark-fiber lambda network called the National LambdaRail. However, it is not clear whether these network capacity increases will actually result in comparable increases in application performance, due to a number of specific underlying technical issues and limitations.

For UCAR scientists to benefit from these faster networks, it is critical that SCD stay active in network research to influence and make available the latest technology.

NETS has successfully continued its work on the Web100 and Net100 research projects. Additionally, NETS secured a Cisco grant to continue related work and an NSF STI award for ongoing network performance research.

Ongoing networking research initiatives include:

  • Continue work on the Web100 collaboration
  • Continue work on the Net100 collaboration
  • Collaborate with PSC on Cisco University Research Program Funding -- Investigating Large Maximum Transmission Units (MTU)
  • Collaborate with PSC NSF Strategic Technologies for the Internet Proposal -- Effective Diagnostic Strategies for Wide Area Networks

Visualization and enabling technologies research

SCD engages in a number of research efforts aimed at advancing our ability to manage, access, analyze, and visualize scientific data. Our efforts span novel visualization techniques and algorithms, Grid technologies, access methods and geoscientific metadata, and applications of advanced visualization in the educational arena. SCD recruited a new postdoctoral researcher who will be studying the applications of advanced visualization techniques in undergraduate geoscience curricula.

  • Continued participation in the Unidata-led NSF National Science Digital Library (NSDL) THREDDS (Thematic Real-time Environmental Distributed Data Services) project.
  • Continued to advance the Earth System Grid (ESG) project, a DOE-funded collaboration of several DOE centers and NCAR. We executed a successful mid-term review of the Earth System Grid (ESG) project with DOE and demonstrated an early multi-site Grid for climate research. We also published a paper on ontologies and semantic approaches to metadata related to ESG.
  • Secured a three-year NASA grant to collaborate with CGD and the University of Colorado to develop a distributed, Grid-based environment for biogeochemical modeling, data management, and analysis.
  • Secured a five-year NSF Information Technology (ITR) research grant to collaborate with U.C. Davis to explore new approaches to visualizing time-varying scientific data.
  • Explored new wavelet-based multiresolution approaches to dealing with very large rectilinear gridded data and developed a prototype toolkit for this endeavor called mtk.
  • Developed a proposal to join the NSF Extensible Terascale Facility (ETF) with partnerships among NCAR, the University of Utah, the University of Colorado, and Colorado State University. This proposal was favorably reviewed but not funded. This was a cross-sectional effort across VETS, NETS, and CSS.
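The wavelet-based multiresolution idea mentioned above (coarsening gridded data while retaining the detail needed to reconstruct it exactly) can be sketched with a one-level 1-D Haar transform. This is a generic illustration of the technique, not the mtk toolkit's API:

```python
def haar_forward(data):
    """One level of the orthonormal 1-D Haar transform.

    Splits a sequence of even length into a half-resolution
    approximation plus the detail needed for exact reconstruction.
    """
    s = 2 ** 0.5
    pairs = list(zip(data[0::2], data[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact reconstruction from approximation + detail."""
    s = 2 ** 0.5
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

data = [4.0, 2.0, 5.0, 7.0]
approx, detail = haar_forward(data)
restored = haar_inverse(approx, detail)  # recovers [4.0, 2.0, 5.0, 7.0]
```

For visualization of very large rectilinear grids, applying such a transform recursively yields a pyramid of progressively coarser versions of the data, so an analysis tool can read only as much resolution as the task at hand requires.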

See the full FY2003 report at Computational science research and development.

Education and outreach

SCD has a vibrant education and outreach program. We collaborate with UCAR's Education and Outreach program such that we complement and support their endeavors. We also represent our program of scientific supercomputing at conferences and other events, and we support an aggressive and busy schedule of Visualization Lab demonstrations and presentations for a wide variety of visiting audiences, ranging from scientific through educational. Lastly, SCD provides training and promotes other activities that support the use of our services and the proliferation of advanced technologies to the university community.

SCD's Visualization and Enabling Technologies Section continued a very strong outreach program, providing dozens of presentations in our Visualization Lab. We also spun up a new program, informally called VizKids, where UCAR's Public Visitors Program (PVP) prepares and delivers highly visual presentations to visiting educational groups, and the results thus far have been very positive. Through teamwork, we are able to accommodate a much greater number of visitors with only a modest impact on SCD technical staff. We also engaged in an outreach activity to provide Howard University in Washington, D.C. with an AccessGrid node for their atmospheric sciences program. We had a strong presence at the SC2002 conference and showed off a new design scheme in our exhibit, one that emphasized our computing, visualization, and research efforts as well as our sponsorship by NSF (and other agencies, to a lesser degree). SC2002 was our only formal exhibit in FY2003, which reflected the implementation of our strategic plan to reduce our exhibit participation in conferences in favor of more technical R&D and a growing, stronger presence presenting and publishing papers. This new direction recouped a substantial amount of high-level staff time.

User Support organized three multi-day and four single-day seminars to help supercomputer users understand the background and learn optimal techniques for making productive use of the IBM SP-cluster systems. One staff member has a teaching appointment at the Colorado School of Mines, Golden, Colorado, as a member of the Colorado Higher Education Computing Organization.

See the full FY2003 report at Education and outreach.

Climate Simulation Laboratory (CSL) facilities for large, long-running simulations

NCAR has established a dedicated climate model computing facility in support of the multiagency U.S. Global Change Research Program (USGCRP). The Climate Simulation Laboratory (CSL) is administered by the National Science Foundation, and the CSL computational facilities are housed, operated, and maintained by SCD.

The purpose of the CSL is to provide high performance computing and data storage systems to support large-scale, long-running simulations of the earth's climate system (defined as the coupled atmosphere, oceans, land and cryosphere, and associated biogeochemistry and ecology, on time scales of seasons to centuries), including appropriate model components, that need to be completed in a short calendar period. A large simulation is one that typically requires thousands of processor hours for its completion and usually produces many gigabytes of model output that must be archived for analysis and intercomparison with other simulations and with observations.

During FY2003, CSL projects used a total of 905,567 General Accounting Units (GAUs) on CSL-allocated machines. The largest project was the Community Climate System Model (CCSM), which used over 600,000 GAUs. Computer resources were provided to 328 universities and U.S. non-profit research organizations. The largest single request granted was 60,000 GAUs. A total of more than 535,000 GAUs were allocated to universities during FY2003. SCD's User Support section provided personalized contact for researchers, which involved assistance with user accounts, computing, data management, consulting help, computing infrastructure services, and web-based user documentation.

Other accomplishments in FY2003:

  • More than doubled the total computing capacity available to CSL users
  • Prepared for new computer-room cooling capacity to be installed in October 2003
  • Provided CSL user support in the areas of consulting, visualization, research data, and model development assistance