Data center expansion advanced conceptual design

High-end computing has always been a critical facility at NCAR. Since NCAR's inception, the role of computational science in the atmospheric and related sciences has increased massively. The computing needs of the science community should therefore drive the requirement for computing facilities at NCAR.

Two trends drive the move toward large increases in computing power: model resolution and model complexity. Climate simulations that will directly resolve the mesoscale structures of the oceans and atmosphere will require petascale computing: at least 1 quadrillion floating-point operations per second (PFLOPS) of sustained processing power. As climate models become more robust, scientists are incorporating additional components, chemical species, and physical processes to improve fidelity. These trends toward increased model resolution and complexity combine to produce an explosive increase in demand for climate computing cycles.

By FY2008, science demand for NCAR computing facilities is estimated to reach a total of 27.5 sustained TFLOPS, which is more than double the computing capacity of the Earth Simulator.

The supercomputers are only one component of a functional computing facility. Networking, servers, and mass storage systems must also be provisioned. Further, the data center itself must provide sufficient uninterruptible electrical power, efficiently remove the waste heat produced, and be large enough to house all the systems and support equipment. Above all, the entire facility must be reliable: outages are detrimental to scientific productivity and may even shorten the life span of the equipment.

As CMOS technology advances, power consumption trends upward. Power density inside the cabinet is also of great concern: the higher the power density, the more difficult it is to cool the cabinet and the more quickly the system will be damaged if cooling is lost. Two factors produce the power density problem. First, individual processors are consuming increasing amounts of power. Second, processors communicate better when they are placed closer together, so performance improves when as many processors as possible are put on a board and as many boards as possible in the cabinet of a parallel system.

For the cluster systems that NCAR uses, the resultant power density has been increasing dramatically. Only five years ago, 4 kilowatts (kW) in a single processing cabinet was considered high; two years later it reached 9 kW, and today 20 kW is common. Because of the current trend toward less expensive, more powerful, hotter-running computers, electrical and cooling infrastructure are now the dominant factors in planning and running a computing facility. A second factor to consider is the reliability required of the facility.
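The cabinet power figures above can be turned into an implied annual growth rate. The following is a minimal sketch, assuming smooth compound growth between the three quoted data points (an assumption made only for illustration; the report does not state a growth model). The function and variable names are hypothetical.

```python
# Cabinet power figures quoted in the text, as (years before the report, kW).
# Treating them as points on a compound-growth curve is an assumption.
cabinet_kw = [(5, 4.0), (3, 9.0), (0, 20.0)]


def annual_growth(kw_start, kw_end, years):
    """Compound annual growth factor that takes kw_start to kw_end in `years` years."""
    return (kw_end / kw_start) ** (1.0 / years)


def implied_growth_rates(points):
    """Growth factor for each consecutive pair of (years_ago, kW) points."""
    rates = []
    for (ago_a, kw_a), (ago_b, kw_b) in zip(points, points[1:]):
        rates.append(annual_growth(kw_a, kw_b, ago_a - ago_b))
    return rates


# Growth factor over the whole five-year span (4 kW to 20 kW).
overall = annual_growth(cabinet_kw[0][1], cabinet_kw[-1][1], cabinet_kw[0][0])

if __name__ == "__main__":
    for factor in implied_growth_rates(cabinet_kw):
        print(f"interval growth: {100 * (factor - 1):.0f}% per year")
    print(f"overall growth:  {100 * (overall - 1):.0f}% per year")
```

Under this assumption, a fivefold increase over five years corresponds to roughly 38 percent growth per year in per-cabinet power, which illustrates why electrical and cooling infrastructure dominate facility planning.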

The Mesa Lab computing facility cannot practically be refitted to handle the power requirements, the heat load, or the floor space demanded by the computing equipment in NCAR's future. Given this analysis of the Mesa Lab facility, due diligence required planners to investigate other options for providing a computing facility with a 15- to 30-year future:

  • Convert one of UCAR's buildings on the Center Green Campus
  • Lease or acquire an existing data center
  • Utilize collocation to outsource the data center function
  • Expand in another location with ground-up construction

Of these, only expansion in a new location was a viable solution. A next-generation facility design intended to handle high-heat-density equipment with the flexibility to adapt to changing technologies provides the capacity and reliability demanded by the science. The following advantages and disadvantages were identified.

Advantages:

  • This is a ground-up design intended to meet scientific computing needs for the long term.
  • The facility will be designed for handling modern high-heat-density equipment.
  • Flexibility can be built in from the beginning, allowing the building to adapt as demand dictates.
  • Construction can be phased to provide floor space only as it becomes necessary.
  • Phased construction allows still more flexibility as computing and cooling technologies evolve.
  • Phased construction allows financing to be arranged at intervals.
  • The facility can meet long-term needs.

Disadvantages:

  • The lead time on new construction is significant: two to four years from design to commissioning is normal.

The building design is modular. The initial building shell will be constructed anticipating needs over the next 10 years, with mechanical and electrical infrastructure outfitted for the five-year requirements. NCAR's criteria for site selection will seek to minimize objective risk, provide space for complete build-out over 20 years, and provide access to utilities, water, and wide-area network connectivity.

Floor plan of proposed data center
 

 

FY2005 Annual Report