Using Hardware Performance Monitor (HPM) Toolkit: A primer
last update: 01/05/2009

Simple serial example

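This brief example wraps an exponential-sum loop in calls to f_hpmstart and f_hpmstop. The run script below writes the Fortran source to it.F, compiles it with xlf95_r against the HPM libraries (libhpm_r and libpmapi), runs the executable, and removes the generated files. The HPM Toolkit output from the run follows the script.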

Run script:

#! /bin/csh
cat << 'EOF' > it.F
      program main
      implicit none
#include "/usr/include/f_hpm.h"
      integer i
      real sum
! Initialize hpmtoolkit:
      call f_hpminit(100,"main")
      sum=0.0
      call f_hpmstart(99, "Sum EXP")
      do i=1,10000000
         sum=sum+exp(.00000001*i)
      end do
! Stop instrumentation after compute loop
! (the section ID must match the f_hpmstart call):
      call f_hpmstop(99)
! Generate hardware analysis output file:
      call f_hpmterminate(100)
      stop
      end
'EOF'
xlf95_r -I/usr/include -qarch=auto -O3 -qstrict -oit it.F -L/usr/lib -lhpm_r -lpmapi
./it
rm it*

Output on bluefire POWER6:


Total execution time of instrumented code (wall time): 0.012953 seconds

 ########  Resource Usage Statistics  ########  

 Total amount of time in user mode            : 0.022298 seconds
 Total amount of time in system mode          : 0.003244 seconds
 Maximum resident set size                    : 2204 Kbytes
 Average shared memory use in text segment    : 0 Kbytes*sec
 Average unshared memory use in data segment  : 35 Kbytes*sec
 Number of page faults without I/O activity   : 513
 Number of page faults with I/O activity      : 1
 Number of times process was swapped out      : 0
 Number of times file system performed INPUT  : 0
 Number of times file system performed OUTPUT : 0
 Number of IPC messages sent                  : 0
 Number of IPC messages received              : 0
 Number of signals delivered                  : 0
 Number of voluntary context switches         : 16
 Number of involuntary context switches       : 0

 #######  End of Resource Statistics  ########
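
The fifteen entries in the resource usage block above line up, field for field, with the BSD-style getrusage structure (user and system time, maximum resident set size, page faults, context switches, and so on). The sketch below is only an illustration of that correspondence: it prints the same quantities for the current Python process using the standard resource module. The mapping from HPM labels to ru_* fields is an assumption based on the matching descriptions; the source does not say how HPM Toolkit collects these numbers, and units such as those of ru_maxrss differ between operating systems.

```python
import resource

# HPM Toolkit report labels paired with getrusage fields.
# The pairing is an assumption based on matching descriptions.
FIELDS = [
    ("Total amount of time in user mode", "ru_utime"),
    ("Total amount of time in system mode", "ru_stime"),
    ("Maximum resident set size", "ru_maxrss"),
    ("Average shared memory use in text segment", "ru_ixrss"),
    ("Average unshared memory use in data segment", "ru_idrss"),
    ("Number of page faults without I/O activity", "ru_minflt"),
    ("Number of page faults with I/O activity", "ru_majflt"),
    ("Number of times process was swapped out", "ru_nswap"),
    ("Number of times file system performed INPUT", "ru_inblock"),
    ("Number of times file system performed OUTPUT", "ru_oublock"),
    ("Number of IPC messages sent", "ru_msgsnd"),
    ("Number of IPC messages received", "ru_msgrcv"),
    ("Number of signals delivered", "ru_nsignals"),
    ("Number of voluntary context switches", "ru_nvcsw"),
    ("Number of involuntary context switches", "ru_nivcsw"),
]


def resource_report():
    """Return (label, value) pairs for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return [(label, getattr(usage, field)) for label, field in FIELDS]


# Print the pairs in a layout similar to the HPM report.
for label, value in resource_report():
    print(f" {label:<45}: {value}")
```

Some systems report zero for fields they do not maintain, so a zero here does not necessarily mean the event never happened.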

 Instrumented section: 99 - Label: Sum EXP - process: 100
 file: it.F, lines: 9 <--> 15
  Count: 1
  Wall Clock Time: 0.01282 seconds
  Total time in user mode: 0.0127582178996599 seconds

 Set: 1
 Counting duration: 0.012809500 seconds
  PM_FPU_1FLOP (FPU executed one flop instruction )          :        10000000
  PM_FPU_FMA (FPU executed multiply-add instruction)         :               0
  PM_FPU_FSQRT_FDIV (FPU executed FSQRT or FDIV instruction) :               0
  PM_CYC (Processor cycles)                                  :        60014657
  PM_RUN_INST_CMPL (Run instructions completed)              :        15861666
  PM_RUN_CYC (Run cycles)                                    :        60244801


  Utilization rate                                 :          99.518 %
  Flop                                             :          10.000 Mflop
  Flop rate (flops / WCT)                          :         780.031 Mflop/s
  Flops / user time                                :         783.809 Mflop/s
  FMA percentage                                   :           0.000 %
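
The derived metrics at the end of the report can be reproduced from the raw values printed above them. The following sketch does that arithmetic offline in Python. The formulas are inferred from the fact that they reproduce the printed numbers; they are not taken from HPM Toolkit documentation. The flop total is taken directly from PM_FPU_1FLOP because the FMA and FSQRT/FDIV counts are zero in this run. The last three entries (cycles per instruction, cycles per loop iteration, and implied clock rate) are not part of the HPM output and are added here only for illustration.

```python
# Raw values copied from the instrumented-section report above.
wall_clock_s = 0.01282
user_time_s = 0.0127582178996599
counting_duration_s = 0.012809500
pm_fpu_1flop = 10_000_000
pm_cyc = 60_014_657
pm_run_inst_cmpl = 15_861_666
pm_run_cyc = 60_244_801
loop_iterations = 10_000_000  # trip count of the do loop in it.F


def derived_metrics():
    """Recompute the report's derived metrics (formulas are inferred)."""
    # FMA and FSQRT/FDIV counts are zero here, so only PM_FPU_1FLOP contributes.
    mflop = pm_fpu_1flop / 1e6
    return {
        "utilization_pct": 100.0 * user_time_s / wall_clock_s,
        "mflop": mflop,
        "mflops_wct": mflop / wall_clock_s,
        "mflops_user": mflop / user_time_s,
        # The next three values are not printed by HPM; shown for illustration.
        "cycles_per_instruction": pm_run_cyc / pm_run_inst_cmpl,
        "cycles_per_iteration": pm_cyc / loop_iterations,
        "clock_ghz": pm_cyc / counting_duration_s / 1e9,
    }


for name, value in derived_metrics().items():
    print(f"{name:<24}: {value:10.3f}")
```

The first four values agree with the report to the three decimals it prints. The counted flops equal the loop trip count, that is, one counted flop per iteration, at roughly six processor cycles per iteration.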



© Copyright 2003-2009. University Corporation for Atmospheric Research (UCAR). All Rights Reserved.

Address of this page: http://www.cisl.ucar.edu/docs/ibm/hpm.toolkit/ex.serial.html