The following reports provide more detail on the issues covered in this article.
- UCAR Information Technology Strategic Plan, September 1998
- Toward a Robust, Agile, and Comprehensive Information Infrastructure for the
Geosciences: A Strategic Plan for High Performance Simulation (PDF), May 2000
An overall NCAR strategic plan is now under development and expected to be completed
this winter. The contact is Bob Harriss, NCAR associate director for planning,
ext. 8106, harriss@ucar.edu.
by Bob Henson
This article is reprinted by permission from the October 2000 Staff Notes Monthly.
Major cultural shifts don't come along very often. One is under way right now at UCAR
and NCAR. The focus is on our large-scale computer models and how they're designed,
developed, and maintained. Driven by changes in computer architecture and by the
increasing complexity of the models, people throughout UCAR -- scientists, software
engineers, and managers -- are
rethinking how we ought to proceed. Like any paradigm shift, this one is producing
both angst and excitement among those at the heart of our modeling activities.
The guidebooks for this process are several strategic plans released in the
past few months and another soon to be completed (see the list of reports above). An internal
workshop on high-performance (HP) simulation took place on 31 August at the Mesa
Lab; a follow-up is being planned for the research community at large.
Even as our approach to modeling is being examined, the modeling continues.
A wholly new mesoscale package will be released in October (the Weather Research
and Forecasting Model, or WRF; see the sidebar, "WRF: A
model to follow?"), and an upgrade to the Community Climate System Model
is expected early next year (see below).
What's driving the action?
In the summer of 1999, an eight-member panel of computer-science specialists was
asked by NSF to review NCAR's key
models from a software standpoint. Their report, issued in August 1999, stressed
the need for change. As early as 1997, NCAR had begun a shift from vector machines
toward distributed shared-memory multiprocessor (SMP) architectures with the introduction
of a 64-processor Hewlett-Packard cluster, and the Climate System Laboratory began
using a 128-processor SGI Origin 2000 in June 1998.
A much larger IBM SP was installed for production use in August 1999, just
after the NSF review was completed. While acknowledging the promise of the new
IBM, the review panel asserted that NCAR would need to modify its model development
strategies in order to remain a leader in the field.
A new direction began taking shape last October, when Bob Serafin (then the
director of NCAR) asked Steve Hammond, manager of SCD's Computational Science
Section, to chair a committee that would prepare a strategic plan for HP simulation.
Tim Killeen joined the process shortly after becoming NCAR's director-designate,
while he was still at the University of Michigan. Instead of looking at each point
in the modeling process separately, the committee studied the entire computing
environment "end to end," says Steve. "There are some fundamental changes [proposed]
in the plan."
In the old days, a scientist could whip up the code for a model virtually solo
and call on a software engineer when problems arose. There's still a time and
a place for this approach, according to senior scientist Joe Klemp (MMM),
another member of the committee. "Lots of times we build models, run them for
a few weeks, and then throw them away," says Joe. However, it's clear that the
major community models are now far beyond the scope of a single researcher. The
NSF panel proposed -- and the NCAR committee agreed to -- a newly collaborative
approach that involves teams of scientists and software engineers in each model's
creation from beginning to end.
The NCAR plan, released in May, was a first step in establishing this
new paradigm. The subject headings -- "Computing Resources," "Software Tools,
Frameworks, and Algorithms," and the like -- imply as much concern with the software
itself as with the science behind it. "To some extent the computational aspects
of our models have been an afterthought," says Steve. "The emphasis has been more
on the phenomenological."
Steve says the committee called for a greater "level of formalism in our modeling
activities, consistent with [how we develop] field programs or observational programs.
There haven't typically been design reviews for our software. A lot of things
that are part of the systematic process of software development in the commercial
sector would be very beneficial to software projects conducted here."
In fact, the relevance of commercial software design was one of the hot topics at
the staff workshop on 31 August. SCD software engineer Cecelia DeLuca pointed out
that a popular framework used to describe the maturity
of an organization's software procedures employs a five-level system, ranging
from totally freewheeling (level 1) to exhaustively prescribed and documented
(level 5). The midpoint, level 3, "means that the entire organization has a standard
set of software engineering practices."
Given the range of tasks carried out here, Cecelia argued that full standardization
may not be feasible or appropriate ("for large-scale projects, level 2 would be
nice"). She also emphasized that we're not the first institution to have our coding
scrutinized closely. A major 1994 review of software practices at the U.S. Department of Defense called for improvements
in such areas as software quality assurance and definition of requirements. Looking
at the list, she pointed out, "These are some of the same issues we're facing
now."
What's on the table?
A range of topics was covered in the strategic plan and at the workshop, including:
- Layered simulation software. One way to divide labor among scientists
and computer specialists is to confine their respective purviews to distinct layers
in the simulation code. Scientists, concerned primarily with algorithms for dynamics
and physics, are able to work within one layer of the software hierarchy to code
these in a standardized, platform-independent form. This leaves parallelism and
other computational concerns to an implementation layer tailored to the machine
at hand -- primarily the domain of the computational specialists. The WRF model is
being built in this way, with a mediation layer in between.
- Easier access to data. "It should be very easy for people to compare
observations and model data, but it's not," says software engineer Lawrence
Buja (CGD). More metadata -- "data about data" -- is what's needed to help
researchers comb through the vast archives at NCAR and elsewhere. A number of
groups at NCAR and UCAR are exploring ways to create better metadata, especially
for large data sets accessible through the World Wide Web.
- Continued progress in visualization. The NSF review stressed NCAR's
leadership role in visualizing large data sets. Considering the unique demands
of earth system modeling, the HP strategic plan states that "visualization problems
such as these are outside the scope of the commercial marketplace, and an extensive
research and development program is required."
- A higher profile for computer science at NCAR. The strategic plan calls
for a "refocused view of computing professionals" and recommends their placement
in the scientist job category in order to ensure compensation on a par with their
modeling colleagues.
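The layered approach described in the first item above can be illustrated with a
minimal sketch. The Python below is a toy, not WRF code: every function name, the
one-line "physics," and the tile size are assumptions made up for illustration. The
science layer knows nothing about how it is executed, the implementation layer knows
nothing about the science, and a mediation layer connects the two.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Tile = List[float]
TileWork = Callable[[Tile], Tile]
Executor = Callable[[TileWork, List[Tile]], List[Tile]]


# Science layer (hypothetical): platform-independent "physics".
def relax_toward(temps: Sequence[float], target: float, rate: float) -> Tile:
    """Nudge each temperature toward a target value."""
    return [t + rate * (target - t) for t in temps]


# Implementation layer (hypothetical): decides how tiles are executed.
def run_serial(work: TileWork, tiles: List[Tile]) -> List[Tile]:
    return [work(tile) for tile in tiles]


def run_threaded(work: TileWork, tiles: List[Tile]) -> List[Tile]:
    # pool.map returns results in the same order as the input tiles.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(work, tiles))


# Mediation layer (hypothetical): splits the domain and hands tiles over.
def split_into_tiles(values: Sequence[float], tile_size: int) -> List[Tile]:
    return [list(values[i:i + tile_size]) for i in range(0, len(values), tile_size)]


def step(values: Sequence[float], executor: Executor, tile_size: int = 2) -> Tile:
    tiles = split_into_tiles(values, tile_size)
    results = executor(lambda tile: relax_toward(tile, target=0.0, rate=0.5), tiles)
    return [v for tile in results for v in tile]


if __name__ == "__main__":
    column = [4.0, 2.0, 8.0, 6.0, 10.0]
    print(step(column, run_serial))
    print(step(column, run_threaded))
```

Swapping run_serial for run_threaded changes how the tiles are executed without
touching relax_toward, which is the division of labor the layering is meant to allow.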
What's next?
SCD director Al Kellie, who moderated the August workshop, says he is quite pleased
with the interest it generated.
"We had over 110 people to start the day off. The group discussions were highly
productive." The next step, he says, is a continuing series of workshops and discussions
to help make the new paradigm a reality. Already, UCAR and NCAR are teaming with
several other institutions for HP simulation projects.
- The next generation of CGD's Community Climate System Model (CCSM-2)
will be designed with support from the U.S. Department of Energy (DOE) over the
next 18 months. "We're starting to bring the pieces together," says CGD senior
scientist Byron Boville. "We are pushing to have a model at the end of
the year which will be running on the IBM." Early next year CGD hopes to carry
out a 1,000-year control run on the CCSM-2. It would be based on preindustrial
conditions in order to determine the model's internal variability.
To create the CCSM-2, about 15 people from NCAR and five DOE labs are collaborating
on the Avant Garde Project, part of DOE's Accelerated Climate Prediction
Initiative. The project is merging the CCSM with the DOE-supported Parallel Climate
Model, originally developed by CGD's Warren Washington and Jerry Meehl.
NCAR has worked with Argonne and Oak Ridge National Laboratories to create software
engineering guidelines for the entire model, including, for example, requirements
documents, unit testers, and validation code.
- SCD's new visualization laboratory, to be completed late this year,
will house a node on the AccessGrid.
Based at Argonne, this fast-growing network comprises several dozen institutions.
The network includes large-format displays integrated with virtual meeting rooms
that allow group-to-group communication. NSF has already conducted one of its
program reviews using the AccessGrid. Closer to home, SCD's Don Middleton
suggests that the high-speed connection between the Mesa and Foothills Labs be
used as a testbed for new collaborative technologies that increased bandwidth
on the 'Net might allow in coming years.
- Frameworks -- reusable collections of code -- are being explored as a way
to simplify and speed up the creation of model implementation layers. NCAR is
about to submit a three-year proposal to lead the development of an earth system
model framework that could be used for multiple models. The project's large group
of collaborators includes NASA, Argonne National Laboratory, the National Centers
for Environmental Prediction, Los Alamos National Laboratory, the University of
Michigan, the Massachusetts Institute of Technology, and NOAA's Geophysical Fluid
Dynamics Laboratory.
Where's the funding?
How will the enthusiasm and brainstorming of recent months fare in a climate of
relatively fixed budgets? Nobody has an
easy answer. It's already a challenge to carry out the community-service aspects
of our big models without cutting into research, says Joe. "Somehow we need a
mechanism to support these models as facility resources." As for the retooling
outlined in the HP strategic plan, one estimate is that the software engineering
components could require several million dollars a year for five years.
Many NCAR and UCAR groups are seeking grants to help in the transition.
Proposals are due by December for NSF's Information Technology Research (ITR)
program, which is offering grants ranging from single-investigator projects (total
budgets below $500,000 each) to group projects (up to $5 million) to large institutional
proposals (up to $15 million over 5 years). The topics include complex geophysical
coding, data assimilation, collaboratories, and accessible visualization tools.
Cliff Jacobs, the NCAR program officer at NSF, encouraged attendees at the
August workshop to apply for the ITR grants.
Tim Killeen has asked SCD to coordinate development of a large-institution
proposal for NCAR. Al says, "It'll probably build upon the themes laid out
in the HP strategic plan as well as other initiatives underway at NCAR. Some early
thoughts are that NCAR could really serve the geosciences community if we could
achieve much better efficiencies for our applications on highly parallel, microprocessor-based
systems. We need to crack some of the barriers that have been in the way of using
these machines. We're going to discuss this at the director's level and then go
out, and one of the keys will be to seek strong partnerships with universities
and potentially other centers."
According to Tim, NCAR will need more flexibility within its core funding as
well. "There definitely has to be a lot of permeability at the boundaries among
the divisions, and there already is. We need structures that facilitate cross-divisional
and cross-disciplinary interactions," such as those in place in ESIG and ASP.
Last month Tim's office announced a new position that will bring in
a distinguished visiting computer scientist for an initial term of roughly one
year. Tim hopes that the inherent appeal of modeling the earth system will spur
interest from top people in applying for this position and others that may follow.
"If you think of it from the computer science point of view -- someone doing research
-- what would bring them to the party? It would be a computer science challenge,
not a geophysics challenge: How do you integrate the different functionalities
[within an earth system model] in a way that maximally uses the available computing
architecture?"
"Most current code runs at well below the ten percent efficiency level on today's
highly parallel computers," says Tim, "so we're underutilizing the computational
resources. That's what the strategic plan for scientific simulation is all about.
"Computer science has grown out of its infancy to be a mature science.
Now's the time to make an attempt to do this. It's going to be hard, because a
community climate model, for example, has players working at different paces.
Now we're giving them the extra challenge of a more rigorous approach to the computer
science, a more documented approach, and it's going to be more work. But I think
challenging things are often more work, and it'll pay off in the long run."