PAPI Tutorials
Introduction
PAPI is a portable hardware performance counter library developed by the Innovative Computing Laboratory at the University of Tennessee. The goal of the PAPI project is to provide a consistent interface to the hardware performance counters found on most modern microprocessors. PAPI can be used to measure a variety of performance characteristics across a wide range of computer architectures, and it can be a very effective tool for understanding the performance of your code.
This page contains examples demonstrating how to instrument applications with PAPI. The PAPI library can be called from C/C++ and Fortran. I chose to write the examples on this page in Fortran90 because the documentation on the PAPI home page is more C-oriented and has fewer Fortran examples.
There are seven basic examples on this page:
Basics - How to set up counters and compile/link with PAPI.
Events - How to list available events on your system.
Hardware - How to get some hardware info from PAPI.
Timers - How to use PAPI timers to time code sections.
Cache - How to count cache misses with PAPI.
FLOPS - How to measure FLOPS with PAPI.
Threads - How to use PAPI with threaded code.
The examples are quite general, and it should be simple to adapt any of them to count different sets of events. The examples have been built and tested on IBM Power3 and Power4 systems at NCAR. The techniques they use should work on any platform on which PAPI is available. However, not all platforms support every event type, so it is important to check which events are available on your system before you start using PAPI.
For additional resources, see the related links page.