Using Hardware Performance Monitor (HPM) Toolkit: A primer
Last update: 01/05/2009
This brief example places HPMSTART and HPMSTOP calls (f_hpmstart and f_hpmstop in Fortran) around a loop that sums exponentials. It shows the run script and the resulting HPM Toolkit output.
Run script:
#! /bin/csh
cat << 'EOF' > it.F
      program main
      implicit none
#include "/usr/include/f_hpm.h"
      integer i
      real sum
      ! Initialize hpmtoolkit:
      call f_hpminit(100,"main")
      sum=0.0
      call f_hpmstart(99, "Sum EXP")
      do i=1,10000000
         sum=sum+exp(.00000001*i)
      end do
      ! Stop instrumentation after compute loop:
      ! Generate hardware analysis output file:
      call f_hpmstop(99)
      call f_hpmterminate(100)
      stop
      end
'EOF'
xlf95_r -I/usr/include -qarch=auto -O3 -qstrict -oit it.F -L/usr/lib -lhpm_r -lpmapi
./it
rm it*
Output on bluefire POWER6:
Total execution time of instrumented code (wall time): 0.012953 seconds

########  Resource Usage Statistics  ########

Total amount of time in user mode            : 0.022298 seconds
Total amount of time in system mode          : 0.003244 seconds
Maximum resident set size                    : 2204 Kbytes
Average shared memory use in text segment    : 0 Kbytes*sec
Average unshared memory use in data segment  : 35 Kbytes*sec
Number of page faults without I/O activity   : 513
Number of page faults with I/O activity      : 1
Number of times process was swapped out      : 0
Number of times file system performed INPUT  : 0
Number of times file system performed OUTPUT : 0
Number of IPC messages sent                  : 0
Number of IPC messages received              : 0
Number of signals delivered                  : 0
Number of voluntary context switches         : 16
Number of involuntary context switches       : 0

#######  End of Resource Statistics  ########

Instrumented section: 99 - Label: Sum EXP - process: 100
file: it.F, lines: 9 <--> 15
Count: 1
Wall Clock Time: 0.01282 seconds
Total time in user mode: 0.0127582178996599 seconds

Set: 1
Counting duration: 0.012809500 seconds
PM_FPU_1FLOP (FPU executed one flop instruction )          : 10000000
PM_FPU_FMA (FPU executed multiply-add instruction)         : 0
PM_FPU_FSQRT_FDIV (FPU executed FSQRT or FDIV instruction) : 0
PM_CYC (Processor cycles)                                  : 60014657
PM_RUN_INST_CMPL (Run instructions completed)              : 15861666
PM_RUN_CYC (Run cycles)                                    : 60244801

Utilization rate                                           : 99.518 %
Flop                                                       : 10.000 Mflop
Flop rate (flops / WCT)                                    : 780.031 Mflop/s
Flops / user time                                          : 783.809 Mflop/s
FMA percentage                                             : 0.000 %
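The derived lines at the end of the report (utilization rate, flop rate, flops per user time) appear to follow directly from the raw times and counters printed above them. The Python sketch below is not part of the HPM Toolkit; it simply redoes that arithmetic with the values copied from this report. It assumes that the flop total equals PM_FPU_1FLOP in this run (the FMA and FSQRT/FDIV counters are zero) and that the utilization rate is user time divided by wall clock time; the toolkit's general formulas are not given on this page. The last three ratios are illustrative extras that the report does not print, and all names in the code are hypothetical.

```python
# Values copied from the report for instrumented section 99 ("Sum EXP").
WALL_CLOCK_S = 0.01282              # Wall Clock Time
USER_TIME_S = 0.0127582178996599    # Total time in user mode
COUNT_DURATION_S = 0.012809500      # Counting duration
PM_FPU_1FLOP = 10_000_000
PM_CYC = 60_014_657
PM_RUN_INST_CMPL = 15_861_666
PM_RUN_CYC = 60_244_801
LOOP_ITERATIONS = 10_000_000        # trip count of the do loop in it.F


def derived_metrics():
    """Recompute the derived lines of the report from the raw values."""
    # Assumption: total flops equal PM_FPU_1FLOP in this run, because the
    # FMA and FSQRT/FDIV counters are both zero.
    flops = PM_FPU_1FLOP
    return {
        # Assumption: utilization is user time over wall clock time.
        "utilization_pct": 100.0 * USER_TIME_S / WALL_CLOCK_S,
        "mflop": flops / 1e6,
        "flop_rate_mflops": flops / WALL_CLOCK_S / 1e6,
        "flops_per_user_time_mflops": flops / USER_TIME_S / 1e6,
        # Extra ratios that the report does not print (illustrative only).
        "cycles_per_iteration": PM_CYC / LOOP_ITERATIONS,
        "instructions_per_run_cycle": PM_RUN_INST_CMPL / PM_RUN_CYC,
        "implied_clock_ghz": PM_CYC / COUNT_DURATION_S / 1e9,
    }


metrics = derived_metrics()
for name, value in metrics.items():
    print(f"{name:28s} {value:10.3f}")
```

Under these assumptions the first four values agree with the report to its printed precision (99.518 %, 10.000 Mflop, 780.031 Mflop/s, 783.809 Mflop/s), and the loop costs about six processor cycles per iteration.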
© Copyright 2003-2009. University Corporation for Atmospheric Research (UCAR). All Rights Reserved.
Address of this page: http://www.cisl.ucar.edu/docs/ibm/hpm.toolkit/ex.serial.html