This tutorial presents techniques for achieving optimal system and application performance on IBM clusters. Although the material may be interleaved throughout the day, the topics fall into two categories: optimal use of IBM compilers, and tuning IBM clusters for system performance.
The compiler portion covers the IBM XL Compilers (C/C++/Fortran) and techniques for improving cache utilization in HPC applications.
It includes four areas:
1) Overview and intelligent use of IBM XL Compiler options
2) Techniques (including examples) for tuning applications for better cache utilization
3) Power5-specific performance issues
4) Blue Gene-specific performance issues
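As an illustration of the kind of cache tuning named in item 2, here is a minimal sketch of loop blocking (tiling) applied to a matrix transpose. It is not taken from the tutorial material: the function names and the BLOCK tile size are hypothetical, and a suitable tile size depends on the cache sizes of the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge length; an assumption to be tuned per machine. */
#define BLOCK 32

/* Naive transpose of an n x n row-major matrix.
 * Writes to dst jump n elements at a time, so for large n
 * nearly every write touches a different cache line. */
void transpose_naive(size_t n, const double *src, double *dst)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            dst[j * n + i] = src[i * n + j];
}

/* Blocked transpose: process BLOCK x BLOCK tiles so that the
 * source and destination tiles can both stay resident in cache
 * while they are being worked on. Produces the same result as
 * transpose_naive; src and dst must not alias. */
void transpose_blocked(size_t n, const double *src, double *dst)
{
    assert(src != dst);
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            /* Clamp tile bounds when n is not a multiple of BLOCK. */
            size_t i_end = (ii + BLOCK < n) ? ii + BLOCK : n;
            size_t j_end = (jj + BLOCK < n) ? jj + BLOCK : n;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

Both functions compute the same result; only the order of memory accesses differs. Whether the blocked version is faster, and by how much, should be measured on the target system rather than assumed.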
After the discussion of compiler capabilities, the focus shifts to techniques for tuning system performance.
The topics covered are:
1) Profiling and measuring performance bottlenecks
2) IBM HPS switch and other networks
3) Large, medium, and small pages
4) Memory affinity, processor binding
5) SMT (simultaneous multithreading)
6) I/O tuning
7) AIX versus Linux issues