Supercomputing system advances
The Supercomputer Systems Group (SSG) of the High Performance Systems
section of SCD is responsible for systems engineering, support, and
administration of all production computational systems managed by SCD. While
the group continued performing routine system maintenance on existing
supercomputer systems, the introduction and integration of new Distributed
Shared Memory (DSM) and Parallel Vector Processor (PVP) systems into the
Climate Simulation Laboratory and Community Computing environments were the
primary focus of activities in FY1998.
The most significant project accomplishments for SSG during FY1998 were:
- Installed and integrated the Silicon Graphics Cray Origin2000 (ute)
into the Climate Simulation Laboratory environment
- Installed and integrated the Cray J90se/24-1024 (chipeta) into the
Community Computing environment
- Integrated the Hewlett Packard SPP-2000 (sioux) into the Community
Computing environment
- Enhanced the Batch Priority Scheduler and developed a subset of it, the
Batch Dedicated Scheduler, for DSM architectures
- Enhanced support and utilization of the Network Queuing Environment and
File Transfer Agent
- Enhanced supercomputer system monitoring and reporting
- Maintained and enhanced, where appropriate, support for current
supercomputers
- Conducted Year 2000 planning, testing, and remediation activities
In addition, SSG continued to monitor all compute servers closely and made
changes as needed to keep the systems highly tuned and to obtain the maximum
utilization and throughput that they can provide.
Installation and integration of the Origin2000 (ute) into the CSL
SCD installed a new 128-processor Silicon Graphics Cray Origin2000 system
during the spring of 1998, and SSG played a major role in the
installation, system tuning, HiPPI testing, porting of local codes,
system debugging, and testing of Silicon Graphics products such as
compilers, batch systems, and system performance tools.
In addition, SCD acquired a small four-processor Origin2000 system (mouache)
for system testing purposes prior to the installation of the large system.
This system has been used extensively by SSG staff and the Mass Storage
System Group (MSSG) to conduct Irix Operating System tests and NCAR Mass
Storage System (MSS) testing and to evaluate a number of other system
capabilities prior to the introduction of those capabilities on the
larger, production Origin2000.
SSG installed, tested, and implemented the following Irix system components:
- Network Queuing Environment
- Checkpoint and restart reliability, both batch and interactive
- Irix accounting capabilities
- Capabilities embodied in basic UNIX system administration and
monitoring utilities like sar, nfs, ntp, nfsstat, traceroute, ftp, etc.
- Network tuning
- Filesystem performance tuning, striping, data rates, etc.
- Kernel tuning and system-service performance evaluation
- Irix support of resource management capabilities and user limits
(interactive and batch)
- Porting of the Batch Priority Scheduler, User Master File, NCAR
scrubber, and system monitoring (user "sysmon") software
- Shutdown and startup
- System access restriction
SSG facilitated the system installation and made the Origin2000 a viable
production compute server for the Climate Simulation Laboratory (CSL).
Installation and integration of the Cray J90se/24-1024 (chipeta) into the Community
A second Cray J90se with 24 CPUs and a billion words of memory was added to
the Community supercomputing resources during FY1998. The installation and
acceptance of the system were flawless, largely because the system was
configured identically to its predecessor J90se (ouray). On 24 March 1998,
this second J90se system was put into production; it became fully utilized
within four hours and has effectively remained so since. In
keeping with the convention of recognizing American Indian tribes and
leaders established for the J90 series systems at NCAR, this system was
named "chipeta"; Chipeta was Chief Ouray's wife.
Integrating the SPP-2000 (sioux) into the Community
Hewlett Packard's (HP) SPP-2000 Exemplar has been at NCAR since late April
1997. SCD spent the first few months of FY1998 evaluating the system to
determine its readiness for introduction into the Community supercomputer
resource "pool." Though there were numerous operating system feature
deficiencies, the system was placed into a "friendly" user state, then into a
"limited production" state by spring 1998. Because the operating system
provides no accounting capabilities, SCD has left the system unallocated but
available to any Community users interested in obtaining an account and
using the system.
During the testing and evaluation of sioux, SSG identified and addressed the
following system issues:
- System performance tuning in the areas of disk I/O and memory
allocation (both local and global memory). The disk I/O tuning was
completed; reads and writes to local filesystems routinely demonstrated
40-MB/second rates.
- Checkpoint/restart problems. We continue to work with HP on these, but
the prospects for a fix do not look promising.
- A subset of BPS that controls batch queues and enforces CPU wall-clock
limits was developed and installed.
SSG continued to closely monitor sioux and added, changed, or tuned
resources as necessary to ensure the best possible system performance and
job throughput. To date, this system has been relatively well utilized, and
utilization has been increasing with time, but the use comes from only a
small number of Community users; SCD had hoped to encourage more Community
users to jump onto the DSM "bandwagon" by not charging for sioux's use.
Batch Priority Scheduler (BPS)
BPS was initially developed in 1996 to meet SCD's job scheduling needs on
the Parallel Vector Processor (PVP) systems that could not be met with the
vendor-supplied batch queuing software. Its features include
round-robin-by-proposal scheduling and priority-based pre-emptive scheduling. Also,
near-dedicated jobs -- jobs that need almost all of a computer's resources to
run -- can be automatically scheduled, as can jobs with other special needs
(such as very high memory requirements). Calendar-based scheduling options
were added to BPS in 1997 to start and stop batch queues automatically on
certain days or at certain times of the day. This feature also gives us the
ability, for example, to give higher priority to low-memory jobs during the
day and high-memory jobs at night.
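The day/night priority rule described above can be illustrated with a minimal sketch. This is not the BPS implementation; the daytime hours, the memory threshold, and the names `Job`, `is_daytime`, and `order_queue` are all hypothetical, chosen only to show how a calendar-based rule might reorder a queue.

```python
from dataclasses import dataclass
from datetime import time
from typing import List

# Hypothetical values: the report does not state BPS's actual hours or limits.
DAY_START = time(8, 0)
DAY_END = time(18, 0)
LOW_MEMORY_LIMIT_MW = 16  # megawords; jobs at or below this count as low-memory


@dataclass
class Job:
    name: str
    memory_mw: int


def is_daytime(now: time) -> bool:
    """True when 'now' falls inside the assumed daytime window."""
    return DAY_START <= now < DAY_END


def order_queue(jobs: List[Job], now: time) -> List[Job]:
    """Return jobs in dispatch order: low-memory jobs first during the day,
    high-memory jobs first at night."""
    def favored(job: Job) -> bool:
        low = job.memory_mw <= LOW_MEMORY_LIMIT_MW
        return low if is_daytime(now) else not low

    # sorted() is stable, so jobs within each class keep their submission order.
    return sorted(jobs, key=lambda job: not favored(job))
```

For example, calling `order_queue(jobs, time(12, 0))` moves the low-memory jobs to the front of the list, while `order_queue(jobs, time(23, 0))` moves the high-memory jobs to the front.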
During FY1998, BPS managed production workloads on all the Cray supercomputers
and was ported in early FY1998 to the Silicon Graphics Power Challenge
(winterpark). This port has provided SSG with more control over the queues
and batch jobs on winterpark and allows us to implement queue and job
policies as needed.
Several enhancements were made during FY1998, including:
- A subset of BPS was developed to schedule jobs within the DSM
environment. This new DSM-based version of BPS was renamed BDS (Batch
Dedicated Scheduler). This allowed SSG to schedule in such a way that
the machine is utilized as much as possible while attempting to
minimize any degradation (over truly dedicated) in the performance of
individual jobs, and to prevent processor oversubscription.
- BDS was ported to and placed into production on the Origin2000 (ute).
This provided us with scheduling capabilities similar to what we have
recently provided on the Crays and what was implemented on the HP
SPP-2000 (sioux).
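The goal of keeping a DSM machine busy without oversubscribing its processors can be sketched as a simple admission check. This is only an illustration under stated assumptions, not the BDS algorithm: the names `Job` and `admit_jobs` are hypothetical, and the first-fit policy shown here (a later, smaller job may start ahead of a larger one that does not fit) is an assumption rather than something the report describes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Job:
    name: str
    cpus: int  # number of processors the job requests


def admit_jobs(queue: List[Job], total_cpus: int,
               busy_cpus: int = 0) -> Tuple[List[Job], List[Job]]:
    """Start queued jobs, in order, as long as each fits in the free processors.

    Returns (started, waiting). The processors held by started jobs plus
    busy_cpus never exceed total_cpus, so processors are never oversubscribed.
    """
    free = total_cpus - busy_cpus
    started: List[Job] = []
    waiting: List[Job] = []
    for job in queue:
        if job.cpus <= free:
            started.append(job)
            free -= job.cpus
        else:
            # Not enough free processors; the job waits for a later pass.
            waiting.append(job)
    return started, waiting
```

On a 128-processor system, a queue requesting 64, 96, 32, 32, and 8 processors would start the 64-, 32-, and 32-processor jobs and hold the other two, leaving every processor assigned to exactly one job.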
Network Queuing Environment (NQE) and File Transfer Agent (FTA)
The primary purpose of the NQE Client/FTA project was to replace MASnet
remote batch job submission. In addition, the NQE Client/FTA project
transfers reliance from a locally developed and supported product to an
off-the-shelf vendor-supported product. MASnet remote batch job submission
relies on obsolete USCP (UNICOS Station Call Processor) hooks in the NQS
code. Continued vendor support of USCP hooks in NQS is uncertain.
NQE clients and FTA enable a user to submit, delete, and check the status of
an NQE batch job from a remote system, such as a desktop workstation or
departmental server. In addition, the NQE batch job's standard output and
standard error files are returned to the system from which the job was
submitted.
FTA is the underlying transport mechanism used for
moving files between NQE clients and NQE execution servers. FTA has been
configured to provide reliable transport service, meaning that FTA transfers
that have failed due to network problems are retried until the transfer is
successful. FTA has also been configured to use peer-to-peer authorization,
which allows FTA file transfers to take place without sending passwords
across the network.
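The retry behavior described above follows a common pattern, sketched below. This is not FTA's code or configuration: `transfer_with_retry` and its parameters are hypothetical, the exponential back-off is an assumption, and the sketch caps the number of attempts, whereas the text says FTA retries until the transfer succeeds.

```python
import time
from typing import Callable


def transfer_with_retry(send: Callable[[], None],
                        max_attempts: int = 5,
                        base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it succeeds and return the number of attempts used.

    Only network failures (ConnectionError) are retried; the last failure is
    re-raised once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Assumed policy: wait twice as long after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; a caller would normally leave it at its default.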
NQE clients and FTA were installed on the MIGS, meeker, and niwot systems in
SCD and on several MMM, CGD, HAO, and ACD hosts. FTA was configured on the
Cray C90 (antero), Cray J90/20 (aztec), Cray J90se/24 (ouray and chipeta),
Cray J90/16 (paiute), and Silicon Graphics Power Challenge (winterpark),
and the primary NQE transfer agent on these hosts was set to FTA.
System monitoring and reporting
SSG maintains a suite of system monitoring utilities (known collectively as
"sysmon") on all compute servers; these utilities monitor the servers and
log critical system information. Currently the sysmon software routinely
sends SSG brief reports on system utilization, error and warning conditions,
and system daemon status. This software also keeps track of MSS activities
on the supercomputers and alerts SSG and the SCD Computer Production Group
(CPG) staff when anomalous conditions occur.
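The kind of check that produces such alerts can be sketched as follows. This is a hypothetical illustration, not sysmon itself: the function name `find_anomalies`, the utilization threshold, and the daemon names in the usage note are invented for the example.

```python
from typing import Dict, List


def find_anomalies(cpu_utilization: float,
                   daemons: Dict[str, bool],
                   min_utilization: float = 0.5) -> List[str]:
    """Return alert messages for one monitoring pass.

    cpu_utilization is a fraction between 0 and 1; daemons maps a daemon
    name to True if it is running. The threshold is an assumed value.
    """
    alerts: List[str] = []
    if cpu_utilization < min_utilization:
        alerts.append(f"low utilization: {cpu_utilization:.0%}")
    # Sort names so the report order is stable from pass to pass.
    for name in sorted(daemons):
        if not daemons[name]:
            alerts.append(f"daemon down: {name}")
    return alerts
```

A pass such as `find_anomalies(0.2, {"batchd": False, "transferd": True})` yields one alert for low utilization and one for the stopped daemon; an empty list means nothing needs to be reported.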
Sysmon has been a very useful tool for SSG and CPG. SSG made some
enhancements to it to further automate the operation and monitoring of
supercomputer systems.
In addition, in late FY1998, the High Performance Systems section of SCD, in
cooperation with CPG, developed and began the operational deployment of
additional system monitoring capabilities that are integrated with
commercial paging services. Experience to date indicates that this
additional notification capability may free CPG staff from some of the more
mundane system-operation tasks while providing a more timely alert
mechanism for potential problems with the production supercomputers, Mass
Storage System, and server systems.
System support for current supercomputers
SSG continued to provide, as its primary responsibility, system support for
the current production supercomputers, and delivered the same level of
support as in the past. However, our resources were divided between
this support and learning how to manage the new Distributed Shared Memory
(DSM) systems, such as the Origin2000 and the SPP-2000, in the NCAR
computational environment.
SSG tracked vendor system releases and upgraded all the supercomputers to
the latest levels of software. For instance, UNICOS 10.0, which adds
Year-2000 compliance and enhanced reliability, was installed on all
Crays during the spring of 1998. More information appears in the
Maintenance of the existing production supercomputer environment report.
Year 2000 planning and testing
SCD conducted some assessments and evaluations during FY1998 and
concentrated on upgrading all the supercomputer systems to be Year-2000
compliant. More testing will be done in FY1999. More detail appears in
SCD's Year 2000 planning and testing report and in SCD's Y2K Overview and
Plans.
In brief, SSG's Y2K activities during FY1998 included:
- Upgraded all Cray UNICOS systems to UNICOS 10.0 (and subsequently to
later versions) and to Y2K-compliant I/O subsystem software
- Upgraded all Silicon Graphics Irix systems to Irix 6.5 (and subsequently
to later versions)
- Upgraded the HP SPP-2000 from SPP-UX 5.2.2.134 to SPP-UX 5.3
In addition, now that these systems have been upgraded, SSG intends to
conduct a sequence of careful, isolated tests of these systems for Year 2000
problems that may exist in operational, administrative, and/or usage
procedures.