by Lynda Lester
If you haven't visited the NCAR Computer Room for a while, one thing will make your eyes pop out as soon as you see the current machine lineup: blackforest, the IBM SP, appears to be replicating.
Where there used to be eleven six-foot towers, glimmering with subcutaneous green and orange lights, there are now twenty-nine. Tall, black, almost daunting, they march across the floor, two hedges of one-and-a-half-ton units with an aggregate peak speed of 2.0 teraflops -- NCAR's latest bid for the processing power it needs to answer the burning scientific questions at the core of its research mission.
The blackforest expansion is part of the new Advanced Research Computing System (ARCS), a collection of IBM equipment that will be significantly upgraded over the next several years. (See "NCAR powers up climate and weather research with enhanced IBM System.") But while the first installation alone doubles computational power at NCAR, many users will find that for them, ARCS means business as usual -- with a shorter wait in the job queues.
"When we designed the RFP [Request for Proposal], we wanted a production-level, high-performance computing system that offered both capability and capacity computing," says SCD Director Al Kellie. "But more than that, we wanted to provide an upwardly compatible system architecture and user environment, and a stable environment for software engineering and development."
Thus, users who have been computing on blackforest will find the same operating system, the same node configuration, the same batch system (LoadLeveler), and the same industry-standard debugger (TotalView). Usage of the new system will be split 50/50 between Community Computing and the Climate Simulation Laboratory (CSL), as it was before the upgrade. But whereas blackforest has been saturated for some time, there will be plenty of room on the larger ARCS system, which is expected to be available for production computing sometime in December 2001.
To make certain users' scientific needs were adequately reflected in ARCS specifications, SCD invited representatives from each NCAR division to serve as full partners on the RFP technical committee. The committee evaluated the needs of the NCAR community and the CSL, forged them into explicit requirements, and assembled a benchmark suite that included tests of representative models run at NCAR (e.g., CCM3, MM5, POP, MHD3D, MOZART2, PCM, and the WRF Prototype).
Three members of the NCAR community (Jim Hack of NCAR's Climate and Global Dynamics Division, Jim Kinter of the Center for Ocean-Land-Atmosphere Studies, and Bert Semtner of the Naval Postgraduate School) constituted the external review team.
The collaborative effort continued during the evaluation and selection process, ensuring that the ARCS machine would serve a wide range of users.
The land of micro-opportunity
One look at the system and it's apparent: this is not your parents' parallel vector processor (PVP). The labyrinth of proliferating, interconnected boxes gives it away: ARCS is a clustered, distributed-shared-memory (DSM) system, built not of expensive proprietary ingredients but from commodity off-the-shelf (COTS) parts.
This may cause nostalgic sighs among some users, but it's a sign of the times -- indicative of a trend that began more than a decade ago when "killer micros" began to fall in price and increase in speed. Today, fast microprocessors can handle most computing needs, while customers whose codes have traditionally run best on PVPs constitute a tiny niche market.
This makes the old-fashioned PVPs, reliable standbys at computing centers in the 1980s and early 1990s, prohibitively expensive in the new millennium. And even though NCAR codes tend to get only 5-10% efficiency out of microprocessors, COTS systems are so much cheaper than vector supercomputers that in terms of price/performance, they win.
"When it comes right down to it, I see the low efficiency on these microprocessor systems as being a land of opportunity -- there's so much potential," says Tom Engel, a high-performance computing specialist in the SCD Director's Office. "By paying attention to cache optimization and so forth, you can double the efficiency of certain applications."
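Loop blocking (tiling) is one common form of the cache optimization Engel describes. The illustrative C sketch below -- not NCAR code, just a generic example -- restructures a matrix multiply so that small tiles of each array stay cache-resident while they are reused, which is precisely the kind of restructuring that can multiply the delivered fraction of peak on a microprocessor; the matrix and block sizes here are arbitrary choices for illustration.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

#define N  64   /* matrix dimension (kept small for illustration) */
#define BS 16   /* tile size, chosen so working tiles fit in L1 cache */

/* Naive triple loop: the inner loop strides through b column-wise,
   so each iteration touches a new cache line and reuse is poor. */
static void matmul_naive(const double *a, const double *b, double *c) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i * N + k] * b[k * N + j];
            c[i * N + j] = sum;
        }
}

/* Blocked (tiled) version: works on BS x BS tiles so that the tiles
   of a, b, and c being combined remain in cache while they are reused. */
static void matmul_blocked(const double *a, const double *b, double *c) {
    memset(c, 0, (size_t)N * N * sizeof(double));
    for (int ii = 0; ii < N; ii += BS)
        for (int kk = 0; kk < N; kk += BS)
            for (int jj = 0; jj < N; jj += BS)
                for (int i = ii; i < ii + BS; i++)
                    for (int k = kk; k < kk + BS; k++) {
                        double aik = a[i * N + k];
                        for (int j = jj; j < jj + BS; j++)
                            c[i * N + j] += aik * b[k * N + j];
                    }
}
```

Both versions perform the same arithmetic; the blocked one simply reorders it for locality, which is why such rewrites can improve efficiency without changing results (beyond floating-point summation order).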
In fact, Rich Loft, Stephen Thomas, and John Dennis, software engineers in SCD's Computational Science Section, recently developed a scalable, 3D climate-model dynamical core based on spectral elements that achieves a higher percentage of peak on microprocessors. Their code achieved 15.7% (127 gigaflops) at NCAR using 134 four-processor IBM SP nodes, and 16.1% (370 gigaflops) at the National Energy Research Scientific Computing Center using 128 sixteen-processor IBM SP nodes.
Although the learning curve in the transition from PVP to DSM machines is similar to what programmers dealt with twenty-five years ago when vector hardware was introduced, today's users face an added layer of complexity. Codes must not only be rewritten for cache optimization but also distributed across nodes tied together by high-performance crossbar switches.
Certainly it's a challenge -- but real work is getting done. "We can be nostalgic, but we'll go where the opportunities are," says Vince Wayland, a software engineer with the Climate Change Research Section in NCAR's Climate and Global Dynamics Division. "We've done good science on these microprocessors. In the past two years, we've completed over 3,700 years of climate simulations on blackforest."
Help on the way
For those who have not yet made the leap into the brave new world of clustered microprocessing, the good news is that SCD is ready and willing to help. Consultants are working closely with programmers and researchers to convert and optimize codes for the ARCS system (many large codes have already been ported), as well as assisting with data conversion from Cray to IEEE format. And while SCD will continue to offer expertise in debugging and problem solving, it plans to take a more proactive role in collaborating with users to strategize, design, and implement new algorithms and codes.
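The Cray-to-IEEE data conversion mentioned above stems from the fact that Cray vector machines used their own floating-point format rather than IEEE 754. As a hedged sketch of what such a conversion involves -- assuming the classic Cray 64-bit single-precision word (1 sign bit, a 15-bit exponent biased by 16384, and a 48-bit mantissa with an explicit leading bit), and not representing any actual SCD conversion tool -- the following C function decodes one Cray word into an IEEE double; since the 48-bit mantissa fits within IEEE double's 52-bit significand, normal values convert exactly:

```c
#include <math.h>
#include <stdint.h>

/* Decode a classic Cray single-precision 64-bit word into an IEEE double.
   Layout assumed: bit 63 = sign, bits 62-48 = exponent (bias 16384),
   bits 47-0 = mantissa, normalized with its top bit explicit (no hidden bit).
   Value = +/- mantissa * 2^(exponent - 16384 - 48). */
static double cray_to_ieee(uint64_t w) {
    int      sign = (int)((w >> 63) & 1);
    int      exp  = (int)((w >> 48) & 0x7FFF);
    uint64_t mant = w & 0xFFFFFFFFFFFFULL;

    if (mant == 0)          /* Cray zero: zero mantissa */
        return 0.0;
    /* ldexp scales the integer mantissa by the unbiased power of two */
    double v = ldexp((double)mant, exp - 16384 - 48);
    return sign ? -v : v;
}
```

For example, the word with exponent 0x4001 and mantissa 0x800000000000 (the normalized leading bit alone) decodes to 1.0 under these assumptions.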
Meanwhile, as part of the ARCS contract, IBM and SCD have agreed to work together to improve the user environment and support services for the NCAR community and the CSL. IBM will offer training in advanced programming, performance analysis, and tuning techniques as well as specialized training tailored to user needs. IBM will also provide two onsite applications specialists and is committed to a more efficient process for reporting, escalating, and resolving compiler and tools problems.
New scientific horizons
Once the new equipment is online, NCAR's IBM SP system will be ranked at number 11 on the top 500 Supercomputer Sites list (http://www.top500.org/list/2001/11) -- and installation of the second phase of ARCS could place NCAR within the top five.
"ARCS is a world-class system that exceeds the articulated purpose and goals of the RFP," says Al Kellie. As such, its extended capability (speed) and capacity (size) will allow NCAR users to be leading-edge participants in the discovery and exploration of new environmental science.