Getting started on bluevista
last update: 4/10/2008

Program development tools

This section describes tools you can use to streamline and enhance your program development on bluevista, including source code management, code repositories, debuggers, and performance analyzers.

Building and maintaining large codes

If you maintain only one file containing source code for a small program, it is simple to move the code among several computers, then compile, build, and run. In this easy case, you can keep the details of the process for each computer in your head. But when your source code is large, it will likely have many subroutines and functions split across multiple files, which makes maintenance and recompilation more complex.

A number of tools are available for compiling and maintaining large source codes. A popular way to build large executables is to use the make utility to compile multiple files and link them together. A file called a makefile sets up rules that specify how to compile each file and what each file depends on. From these rules make determines the build order and skips compilation of any file whose object file is already up to date. Makefiles can be used to compile files written in more than one language, such as Fortran and C.
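The test make applies when deciding whether to recompile a file can be sketched in a few lines of shell. This only illustrates the timestamp comparison and is not how make itself is implemented. The function name needs_rebuild is hypothetical, and the -nt file test is assumed to be available (ksh and bash provide it).

```shell
#!/bin/sh
# Sketch of make's decision rule: rebuild a target when it is missing
# or when its source file has been modified more recently.
# needs_rebuild is a hypothetical helper name, not part of make.
needs_rebuild() {
    src=$1
    obj=$2
    # No object file yet: it must be built.
    [ -e "$obj" ] || return 0
    # Object file exists: rebuild only if the source is newer (-nt).
    [ "$src" -nt "$obj" ]
}
```

For example, "if needs_rebuild fox.F90 fox.o; then echo recompile; fi" prints "recompile" only when fox.o is missing or older than fox.F90.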

If source code is under development, it can be useful to keep files in a repository that tracks the evolution of different versions of the source and provides the ability to revert to previous versions if necessary. Two popular repository tools are CVS and Subversion.

Below we will use CVS and make to maintain our example source code. This example includes Fortran and C code for a matrix-multiplication program, a makefile, and a run script.

Maintaining codes with a CVS repository

  1. Create and initialize repository:
    1. Create a CVS directory:

      mkdir /homebv/consult1/REPOSITORY 
      
    2. Set the CVSROOT environment variable:

      setenv CVSROOT /homebv/consult1/REPOSITORY       (for C shell users)
      export CVSROOT=/homebv/consult1/REPOSITORY       (for Korn shell users)
      
    3. Initialize this repository:

      cvs init 
      
  2. Put codes into repository:
    1. Create MPI-parallelized matrix-multiplication directory:

      cd /ptmp/consult1 
      mkdir mpi_matrix_multiplication 
      
    2. Write the program in the mpi_matrix_multiplication directory, using Fox's algorithm to multiply two square matrices. The first source file is named fox.F90:

      cd mpi_matrix_multiplication 
      vi fox.F90 
      
    3. Import the contents of this directory into the CVS repository:

      cvs import -m "create mpi_matrix_multiplication repository" mpi_matrix_multiplication version_0 release_0
      
    4. Remove the original mpi_matrix_multiplication directory, or rename it if you want to keep a copy. It is no longer needed, because all further work is done in a directory checked out from the repository.

  3. Maintain the code with cvs:
    1. Access the code with the "cvs checkout" command:

      cvs checkout mpi_matrix_multiplication 
      

      This creates the new directory mpi_matrix_multiplication with the file fox.F90 in it. This is our work directory.

    2. Create the new file Makefile in the work directory. Before adding it to the repository, run the command "cvs update" to check its status. The output includes:

      ? Makefile 
      

      Here "?" means Makefile is not in the repository.

    3. Schedule Makefile to be added:

      cvs add Makefile 
      
    4. Commit Makefile to the repository:

      cvs commit -m "add" Makefile 
      

    See the CVS website for further instructions for how to modify a file, remove a file, make a tag, branch, and more.
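The follow-up operations mentioned above (modifying a file, removing a file, tagging, and branching) can be sketched as small shell functions to run inside the checked-out work directory. This is a minimal sketch and not a replacement for the CVS documentation. The function names and the DRY_RUN switch are hypothetical conveniences; the cvs subcommands themselves (commit, remove -f, tag, tag -b, update -r) are standard.

```shell
#!/bin/sh
# Sketch of common follow-up CVS operations, run inside the checked-out
# work directory. Set DRY_RUN=1 to print the commands without running them.
# All function names here are hypothetical helpers, not CVS commands.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Commit local modifications to a file that is already in the repository.
cvs_save_changes() {   # usage: cvs_save_changes "log message" file
    run cvs commit -m "$1" "$2"
}

# Delete a file from the work directory and schedule its removal,
# then commit the removal.
cvs_delete_file() {    # usage: cvs_delete_file file
    run cvs remove -f "$1"
    run cvs commit -m "remove $1" "$1"
}

# Attach a symbolic tag to the current revisions of all files.
cvs_make_tag() {       # usage: cvs_make_tag tag_name
    run cvs tag "$1"
}

# Create a branch, then switch the work directory onto it.
cvs_make_branch() {    # usage: cvs_make_branch branch_name
    run cvs tag -b "$1"
    run cvs update -r "$1"
}
```

Running the functions first with DRY_RUN=1 shows exactly which cvs commands would be issued before anything in the repository changes.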

    Creating a Makefile for use with make

    This sample Makefile is provided to help you write a makefile for your own program.

    We named our file Makefile because that is one of the default names make looks for when no file name is given on the command line. POSIX make looks for "makefile" first and then for "Makefile"; GNU make tries "GNUmakefile" before either of them. Many people prefer the name Makefile because it appears near the top of a directory listing.

    The components of a makefile include comments, macros, targets, dependencies, and rules.
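    The components listed above can be seen together in a small example. The shell function below prints an illustrative makefile for the fox.F90 program; it is a sketch, not the makefile used on bluevista. The compiler names mpxlf90_r and mpcc_r, the -g flags, and the C source file timer.c are assumptions made for illustration, and the function name write_sample_makefile is hypothetical.

```shell
#!/bin/sh
# Print a small sample makefile on standard output.
# Assumptions (not from the text above): the compiler names mpxlf90_r and
# mpcc_r, the flags, and the C source file timer.c are illustrative only.
write_sample_makefile() {
cat <<'EOF'
# Comment: lines starting with '#' are ignored by make.

# Macros: names that are substituted wherever $(NAME) appears.
FC     = mpxlf90_r
CC     = mpcc_r
FFLAGS = -g
CFLAGS = -g
OBJS   = fox.o timer.o

# Target "fox" depends on the object files; the rule after ';' links them.
# (A rule is more often written on the next line, indented with a tab.)
fox: $(OBJS) ; $(FC) -o fox $(OBJS)

# Dependencies: each object file is rebuilt when its source changes.
fox.o: fox.F90 ; $(FC) $(FFLAGS) -c fox.F90
timer.o: timer.c ; $(CC) $(CFLAGS) -c timer.c

# A target with no dependencies, run on request with "make clean".
clean: ; rm -f fox $(OBJS)
EOF
}
```

    Redirecting the output, as in "write_sample_makefile > Makefile", creates a file that make can read. The "target: dependencies ; command" form is used here only so that the example does not rely on tab characters surviving copy and paste.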

TotalView debugger

TotalView is available on bluevista under /usr/local/bin. Parallel programs require the use of a host list; an example script for an MPI parallel program is given below. (Note also that parallel jobs must be run in batch via LSF.) Before using TotalView, be sure that you have X Window forwarding turned on (for example, by logging on using ssh -Y).

Submit the script with bsub < script, then wait for the TotalView window to pop up. To begin debugging a parallel program, press "go" in the main window. You will see a dialog box saying that poe is a parallel program and asking if you want to stop the program. Press "yes." The source code will then be displayed so that you can set breakpoints.

#!/usr/bin/csh
#
# LSF batch script to debug an MPI code (cpi)
# under totalview
#
#BSUB -n 2                            # number of total tasks
#BSUB -o mpilsf.out.%J                # output filename (%J to add job id)
#BSUB -e mpilsf.err.%J                # error filename
#BSUB -J mpilsf.test                  # job name
#BSUB -q debug                        # queue
#BSUB -W 0:15                         # wallclock time limit
#BSUB -P 12345678                     # project number

#need to specify ip rather than us
setenv MP_EUILIB ip
setenv MP_PROCS 2

#create a list of hosts this job uses
#(remove any list left over from an earlier run first)
rm -f ./myhosts
touch ./myhosts
foreach host ($LSB_HOSTS)
   echo $host >> ./myhosts
end

setenv MP_HOSTFILE myhosts

/usr/local/bin/totalview poe -a ./cpi 

Memory debugging example

To debug memory leaks and other memory problems, you must link in a TotalView library that replaces the system malloc. The following example shows how to link and run a program with memory leaks under the TotalView debugger on bluevista.

#!/usr/bin/csh
#debug.leak - compiles and runs memory debug example

setenv LIBPATH /usr/local/toolworks/tvheap_mr

cat << 'EOF1' > leak.f
      program testit
      implicit none
      integer i, ierror

      do i=1,10
         call loknlod
         print *, 'made it through loknlod ', i
      end do
      stop
      end

      subroutine loknlod
c simulates memory leak by failing to deallocate arrays
      real, allocatable:: foo(:,:)
      real, allocatable:: foo2(:,:)
      integer i,j,ierror

      print *, 'stepped into loknlod'
      allocate(foo(50,100),stat=ierror)
      if(ierror /=0) then
         write(*,*)"Error trying to allocate foo"
         stop
      endif

      allocate(foo2(100,100),stat=ierror)
      if(ierror /=0) then
         write(*,*)"Error trying to allocate foo2"
         stop
      endif
      
c loop bounds match the shape of foo(50,100)
      do j=1,100
         do i=1,50
            foo(i,j) = 6 + i
            foo2(i,j) = 100 + i
         end do
      end do
      return
      end subroutine loknlod
'EOF1'

#compile and link only after leak.f has been written
xlf90 -o leak -g -q64 -qfixed leak.f -L/usr/local/toolworks/tvheap_mr \
  -L/usr/local/toolworks/totalview/lib \
    /usr/local/toolworks/totalview/lib/aix_malloctype64_5.o

cat << 'EOF2' > run.leak
#!/usr/bin/csh
#
# LSF batch script to do memory debugging
# under totalview
#
#BSUB -n 2                          # number of total tasks
#BSUB -o leak.lsf.out.%J            # output filename (%J to add job id)
#BSUB -e leak.lsf.err.%J            # error filename
#BSUB -J leak.lsf.test              # job name
#BSUB -q debug                      # queue
#BSUB -W 0:15                       # wallclock time limit
#BSUB -P 12345678                   # account number

#need to specify ip rather than us
setenv MP_EUILIB ip

/usr/local/bin/totalview ./leak
'EOF2'

bsub < run.leak

#When totalview window pops up, select Tools->Memory Debugging. Under
#Configuration tab, select the process called "leak". Toggle off
#button that halts when memory error occurs. Return to program window
#and set breakpoint near end (line 9). Run (Go) to breakpoint. In
#Memory Debugging window, select Leak Detection tab and follow
#instructions for generating a leak detection view of all memory
#leaks.

Program timers

Here are four easy ways to time your program.

Use the Unix command "date" in a simple script

Use a simple script such as this:

   echo "start date:"
   date
   run-your-executable
   echo "end date:"
   date

The output of this script provides the start time and the end time of your program; the difference is the wall-clock time your program used.
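To avoid subtracting the two dates by hand, the same idea can be wrapped in a small shell function that prints the difference directly. This is a sketch under one assumption: that the date command accepts the +%s format (seconds since 1970), which GNU date does; check the date man page on other systems. The function name time_command is hypothetical.

```shell
#!/bin/sh
# Run a command and report its wall-clock time in whole seconds.
# Assumption: "date +%s" (seconds since 1970) is available; GNU date
# provides it, but check the date man page on other systems.
# time_command is a hypothetical helper name.
time_command() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start)) s"
    # Pass the command's own exit status back to the caller.
    return $status
}
```

It is used by putting it in front of the command to be timed, for example "time_command ./my_program". The resolution is one second, so this is suitable only for runs that last much longer than that.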

Use the Unix command "timex" from the command line or a one-line script

The output of the command

timex my_program

yields three numbers identified as real, user, and sys:
"real" is the elapsed wall-clock time.
"user" is the CPU time the program spent executing in user mode.
"sys" is the CPU time the system spent in the kernel on behalf of the program (for example, servicing I/O and other system calls).

Use built-in functions specific to C, C++ or Fortran

For C/C++, call functions such as clock() for processor time, or time(...) and difftime(...) for wall-clock time, to get different types of timing.

For Fortran, call the intrinsic subroutine date_and_time(...).

Use MPI functions specific to C, C++ or Fortran

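If you are writing an MPI program, use the MPI timing functions rather than the built-in functions above. MPI_Wtime returns the elapsed wall-clock time in seconds, and MPI_Wtick returns the resolution of that timer.

For C/C++, call the functions MPI_Wtime() and MPI_Wtick().

For Fortran, call the functions MPI_WTIME() and MPI_WTICK().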

Memory analyzer

There are two easy ways to analyze the memory footprint of your program.

Use the standard AIX real time tools

Type ps or top (with the appropriate arguments) on the command line to get a snapshot of the current memory usage of all running programs. You may have to search the output for your job.
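Searching the ps output is easier when it is sorted by memory size. The shell function below is a minimal sketch of one way to do that; the function name largest_by_vsz is hypothetical. It assumes its input comes from a command such as "ps -e -o pid,vsz,comm", where vsz is the standard ps field for virtual memory size in kilobytes.

```shell
#!/bin/sh
# Read "ps -o pid,vsz,comm" style output on standard input and print the
# processes with the largest virtual memory size (VSZ, second column).
# largest_by_vsz is a hypothetical helper name.
largest_by_vsz() {
    count=${1:-5}
    # Drop the header line, sort numerically on column 2 in descending
    # order, and keep the first $count lines.
    sed 1d | sort -k2,2nr | head -n "$count"
}
```

For example, "ps -e -o pid,vsz,comm | largest_by_vsz 10" lists the ten largest processes on the node. To restrict the list to your own processes, replace -e with -u followed by your user name.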

Use the CSG "Job Memory Usage" tool

If you would like to know the total (peak) memory usage of just your job, without continuously monitoring memory in real time, you can use the job_memusage.exe tool. When your program terminates, the tool prints the program's memory usage on stdout. It is located under /contrib/bin and is used like this:

   /contrib/bin/job_memusage.exe [--details] your-program [your-arguments]

The option "--details" enables a detailed view, which usually is not required. If present, "--details" must be placed before everything else. Any arguments to your program are passed through, and input redirection with "<" also works.

The tool works for serial programs run interactively (that is, from the command line) and for OpenMP, MPI, and hybrid programs.

Command line or OpenMP example:

   /contrib/bin/job_memusage.exe ./hello_world.exe

MPI and hybrid example (usually in an LSF script), if you use ksh:

   export MP_LABELIO=yes
   mpirun.lsf /contrib/bin/job_memusage.exe ./cam < namelist

or, if you use csh:

   setenv MP_LABELIO yes
   mpirun.lsf /contrib/bin/job_memusage.exe ./cam < namelist

When your job finishes, the tool prints its output. For a command-line run there is a single line with the total memory usage of your job. For MPI and hybrid runs there is one line for every node on which your program ran. Setting the MP_LABELIO environment variable is not strictly required, but it is useful because it labels each output line with the task that produced it, so that you can tell the nodes apart.

Performance analyzers

Use the AIX command "hpmcount" to analyze your code performance.

The simplest way to use hpmcount is to invoke it directly on your executable:

   hpmcount my_executable

When invoked in this way, hpmcount starts your application, and when execution ends it produces a report summarizing:

For more information on how to use hpmcount, see Using HPM Toolkit: A primer.



If you have questions about this document, please contact CISL Customer Support. You can also reach us by telephone 24 hours a day, seven days a week at 303-497-1278. Additional contact methods: consult1@ucar.edu and during business hours in NCAR Mesa Lab Suite 39.

© Copyright 2006. University Corporation for Atmospheric Research (UCAR). All Rights Reserved.

Address of this page: http://www.cisl.ucar.edu/docs/bluevista/tools.html