blueice cluster

Overview

Blueice is an IBM clustered symmetric multiprocessing (SMP) system based on the POWER5+ processor. It provides a high-performance, scalable, parallel production platform on which NCAR Community Computing and Climate Simulation Lab (CSL) scientists and programmers run their numerically intensive jobs.

Hardware

  • Processors: 1,744 POWER5+ processors with a 1.9-GHz clock rate; each can perform four floating-point operations per cycle. Blueice uses the dual-core POWER5+ chip. Of these processors, 1,600 are dedicated to batch job processing, providing a peak of 12.16 TFLOPS for batch computing. The remaining 144 are reserved for interactive sessions, for the share and debug queues, and for handling I/O and communications with the NCAR network and Mass Storage System (MSS).
  • Nodes: The processors are grouped 16 per node (a "16-way node"), except on the I/O nodes, which are 8-way. The 112 nodes are allocated as follows:
    • 100 nodes reserved for batch workload
    • 2 nodes reserved for user logins
    • 2 nodes reserved for shared interactive workload
    • 2 nodes reserved for debugging and system services
    • 6 nodes reserved for I/O and MSS connectivity (8-way nodes)

  • Memory:
    • 2 interactive nodes, each with 64 GB of memory
    • 25 batch nodes, each with 64 GB of memory
    • 75 batch nodes, each with 32 GB of memory
  • Memory caches:
    • Level 1 cache: 32 KB data; 64 KB instructions; two-way associative per processor
    • Level 2 cache: 1.9 MB L2 cache per processor pair
    • Level 3 cache: 36 MB L3 cache per processor pair

    Note: The POWER5+ microprocessor chip contains two processors (also called "cores"). On blueice, L2 and L3 cache are shared by the two processors on the chip. This is different from bluevista, where the POWER5 microprocessor chip has only one processor, and L2 and L3 caches are not shared with other processors.

  • RAID disk storage capacity: 150 TB. By default, each blueice user has 5 GB of space in their home directory and 250 GB of space in /ptmp. Scrubbing of old data in /ptmp begins when /ptmp usage reaches 85% and continues until usage drops to 70%. This /ptmp scrubbing policy is identical to that of bluevista.
  • High Performance Switch (also known as "interconnect fabric"): The IBM pSeries High Performance Switch (HPS), previously known as the Federation switch, provides a single-link, unidirectional, point-to-point communication peak bandwidth of 2,000 MB per second and latency of 5 microseconds. Each node of the system has a bidirectional, two-link interface to the HPS.
  • Connectivity to the Mass Storage System: Gigabit Ethernet connects blueice to the MSS Storage Manager. The MSS Storage Manager writes files to the MSS disk cache over Fibre Channel and to MSS cartridges via Fibre Channel tape drives.
  • Login connectivity: A Gigabit Ethernet network provides login connectivity. Two of the 16-way nodes are reserved for login and command line interface work only.
  • Security considerations: blueice resides within the SCD security perimeter and can only be accessed via a CRYPTOCard.
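The processor counts and the batch peak quoted in the hardware list can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not an official tool: the variable and function names are illustrative, and the only inputs are the node counts, node sizes, clock rate, and operations per cycle stated above.

```python
# Node groups from the hardware list: name -> (number of nodes, processors per node).
NODE_GROUPS = {
    "batch": (100, 16),
    "login": (2, 16),
    "shared interactive": (2, 16),
    "debug and system services": (2, 16),
    "I/O and MSS": (6, 8),  # the I/O nodes are 8-way
}

CLOCK_GHZ = 1.9        # POWER5+ clock rate
FLOPS_PER_CYCLE = 4    # floating-point operations per processor per cycle


def processors(group):
    """Number of processors in one node group."""
    nodes, per_node = NODE_GROUPS[group]
    return nodes * per_node


def peak_tflops(n_processors):
    """Theoretical peak: processors x GHz x flops/cycle gives GFLOPS; /1000 gives TFLOPS."""
    return n_processors * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0


total_nodes = sum(nodes for nodes, _ in NODE_GROUPS.values())
total_processors = sum(processors(group) for group in NODE_GROUPS)
batch_processors = processors("batch")
reserved_processors = total_processors - batch_processors

# Aggregate batch-node memory: 25 nodes with 64 GB plus 75 nodes with 32 GB.
batch_memory_gb = 25 * 64 + 75 * 32

if __name__ == "__main__":
    print(f"nodes: {total_nodes}")
    print(f"processors: {total_processors} "
          f"({batch_processors} batch, {reserved_processors} reserved)")
    print(f"batch peak: {peak_tflops(batch_processors):.2f} TFLOPS")
    print(f"batch memory: {batch_memory_gb} GB")
```

The batch peak is a theoretical maximum (every processor issuing four floating-point operations on every cycle); sustained application performance is normally lower.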

Software

  • Operating System: AIX (IBM-proprietary UNIX)
  • Batch system: Load Sharing Facility (LSF)
  • Compilers: Fortran (95/90/77), C, C++
    (Note: The compilers produce 64-bit objects by default. To produce 32-bit objects, set the environment variable OBJECT_MODE to 32.)
  • Utilities: These include pmrinfo, spinfo, batchview, and mssview. Please refer to /bin and /usr/local/bin on blueice for a more complete list of user utilities.
  • Software libraries: These include IBM's parallel libraries for OpenMP and MPI usage. Users may also request single-threaded libraries maintained at NCAR, including Spherepack and Mudpack. SCD prefers that users download the source code for these libraries and install them for their own use.
  • Debugger: TotalView.
  • File System: General Parallel File System (GPFS), a UNIX-style file system that allows applications on multiple nodes to share file data. GPFS supports very large file systems and stripes data across multiple disks for higher performance.
  • System information commands: spinfo for general information; lslpp for information about libraries; batchview for batch jobs; bjall for more detailed information on batch jobs.

Who can use this system / Job scheduling

If you are a current user of CISL/SCD supercomputer resources and you want a blueice account, please request one via the web form at CISL Customer Support.

Batch job scheduling is done via the Load Sharing Facility (LSF) batch system. Please see the documentation section below for pointers to LSF documentation.

How to use this system

Parallel programming on blueice is done with OpenMP, MPI, or a mixture of both (hybrid).

  • To use more than one processor on a node, use OpenMP threading directives on the node, or use MPI processes on the node, or use a mixture of both.
  • To pass information between nodes, you must use MPI.
  • To take full advantage of parallelism, use OpenMP threads, MPI processes, or a mixture of both within each node, and use MPI between nodes.
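To make the choices above concrete, the sketch below enumerates the ways one 16-way node can be filled exactly with MPI processes and OpenMP threads, and computes the overall shape of a multi-node job. It is an illustration only: the function names are hypothetical, it assumes one process or thread per processor (Simultaneous Multi-Threading is ignored), and it does not launch anything or interact with LSF.

```python
PROCESSORS_PER_NODE = 16  # batch nodes on blueice are 16-way


def node_layouts(processors_per_node=PROCESSORS_PER_NODE):
    """Return (MPI tasks per node, OpenMP threads per task) pairs that fill a node exactly."""
    return [
        (tasks, processors_per_node // tasks)
        for tasks in range(1, processors_per_node + 1)
        if processors_per_node % tasks == 0
    ]


def describe(tasks, threads):
    """Name the on-node programming model for a given layout."""
    if threads == 1:
        return "MPI only on the node"
    if tasks == 1:
        return "OpenMP only on the node"
    return "hybrid MPI + OpenMP"


def job_shape(nodes, tasks_per_node, threads_per_task):
    """Return (total MPI tasks, total processors used) for a job spanning several nodes.

    Every node runs at least one MPI task, because MPI is the only way
    to pass information between nodes.
    """
    if tasks_per_node < 1 or threads_per_task < 1:
        raise ValueError("each node needs at least one MPI task with one thread")
    if tasks_per_node * threads_per_task > PROCESSORS_PER_NODE:
        raise ValueError("layout needs more processors than a node provides")
    total_tasks = nodes * tasks_per_node
    return total_tasks, total_tasks * threads_per_task


if __name__ == "__main__":
    for tasks, threads in node_layouts():
        print(f"{tasks:2d} MPI tasks x {threads:2d} threads: {describe(tasks, threads)}")
    print(job_shape(nodes=4, tasks_per_node=4, threads_per_task=4))
```

Which layout performs best depends on the application; the enumeration only shows which combinations use all 16 processors of a node.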

How to get an account

All users will receive a blueice login if they have a bluevista login and have logged in to bluevista in the six months preceding December 31, 2006. This applies to CSL and Community Computing users.

Community Computing users who have General Accounting Unit (GAU) allocations are eligible to apply for an account on blueice. Community users may request a blueice login by contacting CISL Customer Support at https://cislcustomersupport.ucar.edu/evj/ExtraView. Please include the following information with your login request:

  • Your login name
  • Your project number

Queues and charging

The class (queue) structure for blueice is described in the document Queues and charging for resource usage on blueice.

Examples

Please see the directory /usr/local/examples on blueice for examples of commonly used batch and interactive jobs.

Documentation

Getting started on blueice

LSF for Bluevista Users, an introductory PowerPoint presentation for new users.

Platform Computing provides High Performance Computing (HPC) documentation for its Load Sharing Facility (LSF) batch job subsystem. To access this LSF HPC documentation, you need the access instructions, which are available only inside the UCAR security perimeter.

The following PDF file contains a presentation about Simultaneous Multi-Threading: AIX 5.3 and XL Fortran 10.1 upgrade on bluevista.

POWER5 Processor and System Evolution by Charles Grassl, IBM.

Last updated: 02/13/2007