by Bob Henson
In late August, the U.S. Department of Commerce and the International Trade
Commission ruled within days of each other that the offer from Federal
Computing Corporation (abbreviated as NEC in Japan) to sell an SX-4
supercomputer to NCAR constituted dumping. (In trade lingo, dumping means
offering merchandise at an unrealistically low price in order to stimulate
enough follow-on business to recoup costs.)
The rulings were the culmination of a year-long challenge initiated by Cray
Research after NCAR announced the NEC procurement in the spring of 1996. On
28 August, in response to the Commerce and ITC rulings, the National Science
Foundation officially terminated the SX-4 procurement process.
The events of August ended efforts by NCAR's Scientific Computing
Division to bring one of the most powerful computers in the world
to NCAR and the modeling community that it serves.
BH: Obviously this must be a disappointment.
BB: Yes, it is. The SX-4 is the fastest machine that we have ever
evaluated -- 15 to 25 gigaflops sustained. Had we been able to bring the SX-4 to NCAR, it would have
enabled U.S. atmospheric scientists to address problems that are currently
intractable and will remain intractable until comparable computing power is
available.
Our best option for matching the performance and cost-to-performance ratio of
the SX-4 is highly parallel microprocessor systems, such as the
Hewlett-Packard SPP2000 that we acquired this spring. This machine has
been undergoing upgrades, and we expect to make it available to all users
by October 1.
We're running a number of experiments to evaluate its overall capability, and
so far it looks promising. We can routinely get 2 gigaflops out of it, and we
have surpassed 10 gigaflops on one occasion. It offers very good performance
per unit of cost.
BH: How would you characterize our relationship with Cray at this point?
BB: I think it's professional. It has been throughout, for the most part, and as
we move on now to this era of highly parallel nonvector computing, they are a
potential supplier. We'll give them the same objective consideration that we
have in the past.
BH: Let's talk about the highly parallel nonvector era. What does that imply
as far as challenges and potential benefits?
BB: The most significant thing it implies is the potential to have as much
computing capability as we would have had with the SX-4. It'll be a year or
so later, but nevertheless it gives us the potential to stay in league with
our peer organizations around the world, who, as I've noted on various
occasions, by the end of this year will have systems that sustain 20 to 80
gigaflops. It's important that we have comparable capability. This
technology is our best hope for achieving that.
BH: Can you explain in a nutshell the difference between vector and nonvector
machines?
BB: The vector machines operate on strings of numbers, and as a consequence, the
CPU, memory, and various other components can be coordinated in such a fashion
as to achieve very high performance.
The microprocessors [i.e., nonvector processors] today have, in theory, peak
performances comparable to the vector processors, but they use another
strategy -- caching -- in order to enhance their performance.
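As a rough illustration of the two strategies described above, the following
Python sketch computes the same matrix product two ways. The first version
updates a whole row at a time, the kind of long "string of numbers" operation
a vector processor is built for. The second works through small tiles, the
access pattern that tends to suit a cache-based microprocessor because each
tile is reused before the loop moves on. The function names and the block
size are hypothetical, chosen only for this example, and pure Python does not
reproduce hardware behavior: only the difference in access pattern is shown,
not any speed difference.

```python
def matmul_vector_style(A, B):
    """Build each output row from whole-row updates (long vector operations)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            row_b = B[k]
            # One "vector operation": update the entire row C[i] at once.
            C[i] = [c + a * b for c, b in zip(C[i], row_b)]
    return C


def matmul_blocked(A, B, block=2):
    """Compute the same product tile by tile.

    `block` is a hypothetical tile size; on real hardware it would be tuned
    so that a tile of B fits in cache and is reused across all rows of A.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for kk in range(0, m, block):          # tile over the shared dimension
        for jj in range(0, p, block):      # tile over output columns
            for i in range(n):             # every row of A reuses this tile of B
                for k in range(kk, min(kk + block, m)):
                    a = A[i][k]
                    for j in range(jj, min(jj + block, p)):
                        C[i][j] += a * B[k][j]
    return C


# Small made-up matrices, used only to show that both versions agree.
A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_vector_style(A, B))  # [[58.0, 64.0], [139.0, 154.0]]
print(matmul_blocked(A, B) == matmul_vector_style(A, B))  # True
```

Both functions perform the same arithmetic in the same order for each output
element; what changes is how the data is traversed, which is the part of the
problem that has to be rethought when moving code from vector machines to
cache-based ones.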
BH: How has SCD's planning unfolded during the drawn-out procurement
process?
BB: As soon as the antidumping investigation was launched, we realized that the
SX-4 might never be available, so we put in place a number of interim steps.
We brought the C90 into the Climate System Laboratory, and that made it
possible for the Climate System Model (CSM) project to make a lot of progress.
We brought in a new J90se computer from Cray Research to replace shavano, the old Y-MP,
which was beginning to have reliability problems due to its age.
We've had highly parallel systems on the floor for experimentation and
first-hand evaluation throughout the '90s, and we realized that if the
SX-4 was not going to be available, then our best option would be highly
parallel nonvector technology. Last fall we set in motion a process to acquire
the latest technology in this area, and that culminated in the installation of
the HP SPP last spring.
BH: At the same time, we're using some of HP's computers in Dallas, correct?
BB: We only have a 64-processor system here, so we do have access to bigger
systems at other sites.
BH: How do you think climate modeling at NCAR will adapt to the transition
toward nonvector technology? Could we be using supercomputers in other
nations, as we did in Japan in a collaboration with NEC earlier this year?
BB: Warren Washington and coworkers have a parallel coupled model that runs
on highly parallel systems, including the HP SPP. This is one of the models
that we will use to evaluate the SPP. In fact, we expect that the SPP will
soon be used to support research with this model. On the other hand, the CSM
will need some significant modifications to use the SPP and similar systems.
Part of NCAR's strategic plan includes broadening our national and
international collaborations. When scientifically appropriate,
such collaborations can include access to supercomputers offsite, including
computers located in other countries.
Meanwhile, we have the C90 downstairs. It's still a solid 5-gigaflop
machine running the CSM quite well. It'll be here for at least another year,
and probably at least two. So I don't think there's any particular
crisis with the model as it presently exists.
BH: It does sound like SCD will survive this ordeal and continue to be a
community resource.
BB: The data-handling capabilities in SCD are almost unmatched, and we have a
very respectable computing capability today with the C90. We anticipate
bringing in another supercomputer for the community. I think we're as good as
most U.S. supercomputing centers, and if we're successful with the highly
parallel nonvector technology, we will be able to match the computing
capability of our international peer organizations. It'll take about a year
to a year and a half to get there. Five years from now, people may look back
on all this as actually a fortuitous development.
BH: Clearly the past year has been a tough period for SCD staff.
BB: It's created a certain amount of apprehension. [I'm hearing] similar
apprehensions among some of the scientists. They're very concerned, as was
evident at the director's retreat back in June, that we maintain a good
computing capability.
BH: However, you do have a sense of where SCD is headed now that you haven't
really had for a year.
BB: We're out of limbo now, and we know what we have to do next. The SX-4 is
behind us.
BH: It sounds like we'll be going in a different direction from almost any
other major atmospheric science computing center.
BB: Not really. The U.K. and German weather services are both using highly
parallel nonvector systems. The European Centre for Medium-Range Weather
Forecasts and Meteo-France are using highly parallel systems with vector
processors. The only way to get the kind of performance that this community
needs is through parallelism. No matter what we do, we're going to go
parallel.
Note: A hardcopy version of this article is available in the September UCAR Staff Notes Monthly.