Getting started on bluevista
Last update: 4/10/2008
This section describes tools you can use to streamline and enhance your program development on bluevista, including source code management, code repositories, debuggers, and performance analyzers.
If you maintain only one file containing source code for a small program, it is simple to move the code among several computers, then compile, build, and run. In this easy case, you may keep the details of the process in your mind for each computer. But when your source code is large, it likely will have many subroutines and functions, split into multiple files, making maintenance and recompilation more complex.
A number of tools are available for compiling and maintaining large source codes. A popular way to build large executables is to use the make utility to compile multiple files and link them together. A file called a makefile sets up rules that tell make how to compile each file and in which order, and lets make skip compiling a file whose object file is already up to date. Makefiles can be used to compile files written in more than one language, such as Fortran and C.
If source code is under development, it can be useful to keep files in a repository that tracks the evolution of different versions of the source and provides the ability to revert to previous versions if necessary. Two popular repository tools are CVS and Subversion.
Below we will use CVS and make to maintain our example source code. This example includes Fortran and C code for a matrix-multiplication program, a makefile, and a run script.
Maintaining codes with a CVS repository
- Create and initialize repository:
Create a CVS directory:
mkdir /homebv/consult1/REPOSITORY
Set CVS root environment variable:
setenv CVSROOT /homebv/consult1/REPOSITORY (for C-shell users)
export CVSROOT=/homebv/consult1/REPOSITORY (for Korn shell users)
Initialize this repository:
cvs init
- Put codes into repository:
Create MPI-parallelized matrix-multiplication directory:
cd /ptmp/consult1
mkdir mpi_matrix_multiplication
Write program in mpi_matrix_multiplication directory, using Fox algorithm to multiply two square matrices. The first program is named fox.F90:
cd mpi_matrix_multiplication
vi fox.F90
Put this into the CVS repository:
cvs import -m "create mpi_matrix_multiplication repository" mpi_matrix_multiplication version_0 release_0
Remove the directory named mpi_matrix_multiplication. (This directory is not needed any more, so we can remove it. Or we can move it to another name.)
- Maintain the code with cvs:
Access the code with the "cvs checkout" command:
cvs checkout mpi_matrix_multiplication
This creates the new directory mpi_matrix_multiplication with file fox.F90 in it. This is our work directory.
Before adding the new file "Makefile" to the repository, run "cvs update" in the work directory to check its status. The output includes:
? Makefile
Here "?" means Makefile is not in the repository.
Schedule Makefile to be added:
cvs add Makefile
Commit Makefile to the repository:
cvs commit -m "add" Makefile
See the CVS website for further instructions on how to modify a file, remove a file, make a tag, branch, and more.
Creating a Makefile for use with make
This sample Makefile is provided to help you write a makefile for your own program.
Make is one of the original Unix tools for software engineering. By default it looks for a file named "makefile" and, if that is not found, for "Makefile" (some versions of make try other names as well). We named our file Makefile, a common convention that makes the file stand out near the top of a directory listing.
The components of a Makefile include comments, macros, targets, rules, and dependencies.
Comments
Any line that starts with "#" is a comment line.
Macros
Macros are defined as "name = value" pairs in Makefile. Macros are used to define compilers, commands, flags, file lists, etc. Usually the macros are defined near the top of the Makefile for easy editing.
In our sample Makefile these are the Macros:
LN = ln -sf
MAKE = make -i -r
RM = /bin/rm -f
MV = /bin/mv -f
CP = /bin/cp -f
AR = ar ru
####################################################
CC = mpcc_r
CFLAGS = -I/usr/local/hpmtoolkit/dist_3.1.0/pwr5_aix5/include -I. -pg -DDM_PARALLEL -DIBM
FC = mpxlf90_r
FFLAGS = -I. -w -qsmp=omp -O4 -qfree=f90 -qipa -pg
Targets
Targets are the things you want to produce, or goals you want to achieve. A target can be to produce some object files, an executable, or remove some (object) files. Here is a target example:
fox.exe: $(OBJS) $(F_OBJS)
	$(FC) -o $@ $(FFLAGS) $(OBJS) $(F_OBJS) $(LOC_LIBS)
The fox.exe executable will be built from object files. They will be linked using the Fortran compiler (FC) and the options defined above in the macros.
Rules
Rules define how the target will be achieved. Here is an example of how to build an object (.o) file from a .F90 file:
.F90.o:
	$(RM) $*.o $*.f
	$(CPP) $(CPPFLAGS) $*.F90 > $*.f
	$(FC) -c $(FFLAGS) $*.f
Dependency
Dependency means that a file depends on something contained in another file.
The generated target must be newer than the dependency; if the dependency is newer, make rebuilds the target. Here is an example, where fox.o must be newer than index.o:
fox.o: index.o
Our sample Makefile has a target named "deflt". If you type just "make", make uses it as the default target and generates the executable "fox.exe". If you want to remove all object files and the executable, type "make clean". Make will then execute the "clean" target.
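The timestamp comparison behind a dependency can be sketched in a few lines of Bourne/Korn-style shell. This is only an illustration of the rule, not how make itself is implemented. The function name needs_rebuild and the scratch directory are hypothetical, and the -nt ("newer than") test operator is assumed to be available (ksh, bash, and dash provide it, but POSIX does not require it).

```shell
#!/bin/sh
# needs_rebuild TARGET DEP...
# Returns 0 (true) when TARGET is missing or older than any DEP,
# and 1 (false) when TARGET is up to date.
needs_rebuild() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        if [ "$dep" -nt "$target" ]; then
            return 0
        fi
    done
    return 1
}

# Demonstration with empty stand-in files in a scratch directory.
demo_dir=${TMPDIR:-/tmp}/make_rule_demo.$$
mkdir -p "$demo_dir"
touch -t 200801010000 "$demo_dir/index.o"   # older dependency
touch -t 200801020000 "$demo_dir/fox.o"     # newer target
if needs_rebuild "$demo_dir/fox.o" "$demo_dir/index.o"; then
    echo "fox.o is out of date"
else
    echo "fox.o is up to date"
fi
rm -rf "$demo_dir"
```

Running touch on the stand-in index.o again would make it newer than fox.o, and the function would then report that fox.o needs to be rebuilt.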
Totalview is available on bluevista under /usr/local/bin. Parallel programs require the use of a host list. An example script is given below for an MPI parallel program. (Note also that parallel jobs must be run in batch via LSF.) Before using Totalview, be sure that you have X Window forwarding turned on (for example, by logging on using ssh -Y).
Run this script by submitting it using bsub < script. Wait for the Totalview window to pop up. To begin debugging a parallel program, press "go" in the main window. You will see a dialog box saying that poe is a parallel program and asking if you want to stop the program. Press "yes." The source code will then be displayed so that you can set breakpoints.
#!/usr/bin/csh
#
# LSF batch script to debug an MPI code (cpi)
# under totalview
#
#BSUB -n 2 # number of total tasks
#BSUB -o mpilsf.out.%J # output filename (%J to add job id)
#BSUB -e mpilsf.err.%J # error filename
#BSUB -J mpilsf.test # job name
#BSUB -q debug # queue
#BSUB -W 0:15 # wallclock time limit
#BSUB -P 12345678 # project number
#need to specify ip rather than us
setenv MP_EUILIB ip
setenv MP_PROCS 2
#create a list of hosts this job uses
touch ./myhosts
foreach host ($LSB_HOSTS)
echo $host >> ./myhosts
end
setenv MP_HOSTFILE myhosts
/usr/local/bin/totalview poe -a ./cpi
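For Korn or Bourne shell users, the host-list step of the script above might look like the following sketch. The function name write_hostfile and the fallback host names node01 and node02 are hypothetical; under LSF, the LSB_HOSTS variable is assumed to hold a space-separated list of the hosts assigned to the job.

```shell
#!/bin/sh
# write_hostfile FILE
# Writes one host name per line, taken from the space-separated LSB_HOSTS list.
write_hostfile() {
    hostfile=$1
    : > "$hostfile"              # start empty so entries from an earlier run are dropped
    for host in $LSB_HOSTS; do   # unquoted on purpose: split the list on whitespace
        echo "$host" >> "$hostfile"
    done
}

# Outside LSF the variable is unset, so a hypothetical two-host list is used.
LSB_HOSTS=${LSB_HOSTS:-"node01 node02"}
write_hostfile ./myhosts
MP_HOSTFILE=myhosts
export MP_HOSTFILE
cat ./myhosts
```

Unlike the touch-and-append approach, this sketch truncates the file first, so rerunning the job in the same directory does not accumulate stale host entries.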
To debug for memory leaks and other memory problems, it is necessary to link in a TotalView library that replaces the malloc on the system. The following example shows how to link and run a program with memory leaks under the TotalView debugger on bluevista.
#!/usr/bin/csh
#debug.leak - compiles and runs memory debug example
setenv LIBPATH /usr/local/toolworks/tvheap_mr
#write the Fortran source first so that it exists when the compiler runs
cat << 'EOF1' > leak.f
      program testit
      implicit none
      integer i, ierror
      do i=1,10
         call loknlod
         print *, 'made it through loknlod ', i
      end do
      stop
      end
      subroutine loknlod
c simulates memory leak by failing to deallocate arrays
      real, allocatable:: foo(:,:)
      real, allocatable:: foo2(:,:)
      integer i,j,ierror
      print *, 'stepped into loknload'
      allocate(foo(50,100),stat=ierror)
      if(ierror /=0) then
         write(*,*)"Error trying to allocate foo"
         stop
      endif
      allocate(foo2(100,100),stat=ierror)
      if(ierror /=0) then
         write(*,*)"Error trying to allocate foo2"
         stop
      endif
      do j=1,50
         do i=1,100
            foo(i,j) = 6 + i
            foo2(i,j) = 100 + i
         end do
      end do
      return
      end subroutine loknlod
'EOF1'
xlf90 -o leak -g -q64 -qfixed leak.f -L/usr/local/toolworks/tvheap_mr \
-L/usr/local/toolworks/totalview/lib \
/usr/local/toolworks/totalview/lib/aix_malloctype64_5.o
cat << 'EOF2' > run.leak
#!/usr/bin/csh
#
# LSF batch script to do memory debugging
# under totalview
#
#BSUB -n 2 # number of total tasks
#BSUB -o leak.lsf.out.%J # output filename (%J to add job id)
#BSUB -e leak.lsf.err.%J # error filename
#BSUB -J leak.lsf.test # job name
#BSUB -q debug # queue
#BSUB -W 0:15 # wallclock time limit
#BSUB -P 12345678 # account number
#need to specify ip rather than us
setenv MP_EUILIB ip
/usr/local/bin/totalview ./leak
'EOF2'
bsub < run.leak
#When totalview window pops up, select Tools->Memory Debugging. Under
#Configuration tab, select the process called "leak". Toggle off
#button that halts when memory error occurs. Return to program window
#and set breakpoint near end (line 9). Run (Go) to breakpoint. In
#Memory Debugging window, select Leak Detection tab and follow
#instructions for generating a leak detection view of all memory
#leaks.
Here are four easy ways to time your program.
Use the Unix command "date" in a simple script
With a simple script such as this:
echo "start date:"
date
run-your-executable
echo "end date:"
date
The output of this script provides the start time and the end time of your program; the difference is the wall-clock time your program used.
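If you would rather have the script compute the difference for you, one possible sketch follows. It assumes that your date command accepts the +%s format (seconds since the epoch), which is common but not guaranteed on every system. The function name elapsed_seconds is hypothetical, and "sleep 1" stands in for run-your-executable.

```shell
#!/bin/sh
# elapsed_seconds COMMAND [ARGS...]
# Runs the command and prints the elapsed wall-clock time in whole seconds.
# The command's own output is discarded so that only the number is printed.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# "sleep 1" is a placeholder for your executable.
echo "wall-clock seconds: $(elapsed_seconds sleep 1)"
```

The resolution is one second, so this approach is only useful for programs that run much longer than that.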
Use the Unix command "timex" from the command line or a one-line script
The output of the command
timex my_program
yields three numbers identified as real, user, and sys:
"real" is wall-clock time
"user" is the CPU time used by your program's own code
"sys" is the CPU time the system spent working on behalf of your program (for example, in system calls).
Use built-in functions specific to C, C++ or Fortran
For C/C++, call the functions clock(), time(...), difftime(...), etc. to get different types of timing.
For Fortran, call the intrinsic subroutine date_and_time(...)
Use MPI functions specific to C, C++ or Fortran
If you are writing an MPI program, you should use the MPI functions rather than the built-in functions described above.
For C/C++, call the functions MPI_Wtime() and MPI_Wtick()
For Fortran, call the functions MPI_WTIME() and MPI_WTICK()
There are two easy ways to analyze the memory footprint of your program.
Use the standard AIX real time tools
Type ps or top (with the appropriate arguments) from the command line to get a snapshot of the current memory usage of all running programs. You may have to search for your job.
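As a minimal sketch of the ps approach, the following reports the virtual memory size of a single process rather than of every running program. It assumes that your ps accepts the POSIX -o and -p options and the vsz field; the function name vsz_kb is hypothetical.

```shell
#!/bin/sh
# vsz_kb PID
# Prints the virtual memory size of the given process, in the
# kilobyte units reported by ps.
vsz_kb() {
    # "vsz=" selects the column and suppresses its header line
    ps -o vsz= -p "$1" | tr -d ' '
}

# $$ is the PID of this shell; substitute the PID of your own job.
echo "process $$ uses $(vsz_kb $$) KB of virtual memory"
```

This is a snapshot at one instant; to see how usage changes, you would have to call it repeatedly while the job runs.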
Use the CSG "Job Memory Usage" tool
If you would like to know the total (peak) memory usage of just your job, without continuously monitoring the memory in real time, you can use the job_memusage.exe tool. It will print on stdout the memory usage of your program, when the latter terminates. It is located under /contrib/bin and can be used like:
/contrib/bin/job_memusage.exe [--details] your-program [your-arguments]
The option "--details" enables a detailed view, which usually is not required (if present, "--details" must be placed BEFORE everything else).
You can pass arguments to your program as usual, and redirection such as "<" also works.
The tool works interactively (i.e., on the command line) and for OpenMP, MPI, and hybrid programs.
Command line or OpenMP example:
/contrib/bin/job_memusage.exe ./hello_world.exe
MPI and hybrid (usually in an LSF script) example:
export MP_LABELIO=yes # if you use ksh
mpirun.lsf /contrib/bin/job_memusage.exe ./cam < namelist
or:
setenv MP_LABELIO yes # if you use csh
mpirun.lsf /contrib/bin/job_memusage.exe ./cam < namelist
When your job returns, there will be some output. For a command-line run there will be a single line with the total memory usage of your job. For MPI and hybrid runs there will be a line for every node on which your program ran. Setting the MP_LABELIO environment variable is not strictly required, but it is useful because it lets you tell the nodes apart.
Use the AIX command "hpmcount" to analyze your code performance.
The simplest way to use hpmcount is to invoke it directly:
hpmcount my_executable
When invoked in this way, hpmcount starts your application, and when execution ends it produces a report summarizing:
- Wall-clock time
- Statistics on resource utilization
- Information from hardware performance counters
- Derived hardware metrics
For more information on how to use hpmcount, see Using HPM Toolkit: A primer.
If you have questions about this document, please contact CISL Customer Support. You can also reach us by telephone 24 hours a day, seven days a week at 303-497-1278. Additional contact methods: consult1@ucar.edu and during business hours in NCAR Mesa Lab Suite 39.
© Copyright 2006. University Corporation for Atmospheric Research (UCAR). All Rights Reserved.
Address of this page: http://www.cisl.ucar.edu/docs/bluevista/tools.html