| Project |
Short Description |
| 90 |
I completed the implementation, documentation and testing of the memory usage tool for blueice, bluevista and bluefire. It lets users analyze the memory footprint of a given interactive, OpenMP, MPI or hybrid program. Because the tool and the user-provided program are tightly coupled, a bad crash in the latter can also crash the former. I'm investigating whether the tool can be made more robust against crashes of the user-provided program. |
| 63 |
Some users (groups) need to archive large amounts of data to MSS. A single MSS file can be as big as 12G, while user files are usually small, so archiving these files can be problematic for users. |
| 71 |
CSG-RAL Collaboration on Cluster Parallel Application |
| 72 |
Support modules utility on coral |
| 61 |
WRF compiles OK at a high optimization level, but generates unexpected results. |
| 89 |
This tool measures the bandwidth and latency of a two-node MPI network. It produces raw text data which is then plotted/analyzed by a Python script. The tool is able to spawn any even number of processes, in pairs of send/receive tasks. At present all the sending processes are on one node, whereas all the receiving processes are on the other. All of them concurrently ask for MPI resources. The core of the code is an MPI send/receive, which repeatedly sends/receives packets from 16 bytes to 8 Mbytes long. Each packet is sent/received N times (N ranges from 1 to 100k) per process. The tool includes a shell/LSF script which schedules the jobs in the premium queue with exclusive use of the nodes, to avoid possible interference from other jobs. Results for blueice and bluevista are available. Results are ready for bluefire (re-run after bluefire was released to the users), but further investigation is needed to understand the results (which have been the same as in the preliminary runs). |
| 48 |
Mentoring a SOARS student in numerically solving the linear advection equation on the surface of a sphere, and possibly solving the shallow water equations on the same geometry if time permits. The strategy is to use the Runge-Kutta discontinuous Galerkin approach, as done by Nair et al. on the cubed sphere; we will extend it to the squared sphere map. |
| 87 |
A brief introduction to the SVN version control system was given during the HSS Monthly Staff Meeting (Wednesday 13th). |
| 65 |
Emmons has a special request to run a chemical forecast for ARCTAS. CISL originally planned to give them dedicated nodes, but this conflicts with the Convection project, and the run needs 2 nodes for 5 hours and then 8 nodes for 3 hours, which makes it harder to configure. |
| 50 |
Support compilers such as PathScale and Intel on Linux platforms |
| 64 |
Help Dr. Piri and student Saeed Ovaysi from the University of Wyoming port their code to our supers. |
| 66 |
WRF is now NCAR's flagship model. CISL wants to keep track of what is happening with WRF by having a WRF liaison. |
| 62 |
Because the bluefire ATP shares disk with blueice, the disk space available to the ATP suite is limited. |
| 67 |
Vapor wants to extend its domain to WRF users. CSG will help. |
| 70 |
User default dotfiles |
| 68 |
CSG is documenting Frost for NCAR and TeraGrid users. |
| 88 |
Because of the different platform, the blueice/bluevista tool cannot be used on lightning. Possible alternatives to analyze the memory footprint of a given program are under development. A preliminary version has been completed which works with MPI programs. |
| 69 |
At the Scicomp13 meeting, CSG learned that there is software for monitoring LSF jobs on Power machines. CSG wants to test it on blueice. |
| 49 |
CSG members are working to make FFTPACK friendlier to F90 compilers. |
| 47 |
After the upgrade of bluevista to level 5300-07-01-0748, some users reported a slowdown of about 10-12%. A CSG member verified and confirmed this claim using FV CAM. |
| 55 |
NCAR TeraGrid Team and User Support |
| 84 |
Following some complaints from users (ExtraView tickets), I'm investigating the issue. The "standard" distribution/namefile works fine; the namefile provided by users is under test. |
| 74 |
Compiled and installed the MPI/IO enabled versions of HDF5 and netCDF4 in /contrib. Both required some hacking in their configure scripts, which I also sent upstream. I'm working with upstream on a "clean" and portable solution, and may make the libraries available to users (after some tests). |
| 58 |
DTC has asked CSG for suggestions on transferring data to NCAR from several sites outside the NCAR firewall. |
| 60 |
When bluefire became operational, CSG installed LLView on it. With this tool, CSG can identify users' jobs using processor numbers on each node. |
| 73 |
Contacted the users and advised them. E-mail works better than ExtraView for this purpose. |
| 45 |
Early this year Nancy Collins and Jeff Anderson described their computational workflow under DART. It was clear that, without being able to schedule multiple parallel jobs from a single submission, they will be severely limited in their computational ability. |
| 86 |
We are participating in the WAG CMS working group (WG) biweekly meetings. The evaluation phases have been completed: requirements have been prioritized and use cases have been identified. The working group selected Drupal 6.3, and WEG set up a test installation, to which they gave me access. A pre-production implementation should be handed to us at the beginning of August. In the meantime I re-installed it on our own machine and configured it for the hands-on demo. More information is available here: https://wiki.ucar.edu/display/wag/Drupal+6.2 |
| 76 |
The MPI implementation of the bzip2 compression algorithm has been compiled on lightning, bluevista and bluefire. Unfortunately, once in a while it seems to fail (subsequent calls with exactly the same arguments usually succeed). I contacted the author and discussed the possible cause with him. |
| 83 |
mpiP (http://mpip.sourceforge.net/) is a nice MPI profiler. We are evaluating it for our supers. mpiP requires a compatible GNU binutils installation (or libelf or libdwarf) for its source lookup and demangling features. At present we do not have any of those, so the tool is crippled. Jeph installed binutils, but the tool still does not provide source code demangling. |
| 75 |
Introduced the user to the NCAR facilities (lightning, ExtraView, email, cryptocard, mailing lists, etc). Explained the syntax and meaning of the LSF scripts and arguments (queue, threading, processor number). Helped with data transfer, including a workaround during a DNS outage, so that her data could be transferred without waiting for the DNS to be repaired. Suggested the use of mpibzip2 to quickly compress the data before the transfer (which takes 1.5h when uncompressed). Solved her bug with cron, and helped with unattended job submissions. |
| 40 |
LSF hybrid code setup |
| 59 |
CSG is working on ASD projects: one for the Nested Regional Climate Model (NRCM), and one for Mechanisms of Convective-Wave Interactions. CSG has contacted the users and is willing to help them with code porting, compiling issues, tuning, performance, etc. |
| 79 |
Start of the project. Contacted Michael Shay's team. Helping with data transfer, compiling and linking libraries. I helped them with a problem regarding fftw and introduced them to the TotalView debugger. |
| 46 |
A CSG member started participating in the Green Density project. |
| 85 |
Timing, customizability and reporting will be the enhancements. To implement them, I installed, configured and evaluated the textest (Python) framework on blueice. I also investigated a Java solution, as well as the current script-based one. I chose the Java-based solution and I'm currently developing it. |
| 57 |
WRF developers reported that WRF produces incorrect results at the higher optimization level "-O3 -qhot" on bluefire. |
| 21 |
Next steps in providing training by SCD |
| 7 |
Tar on the supers |
| 18 |
Support system information scripts on all platforms |
| 16 |
Liaison between CISL and CGD CCSM Software Engineering Group |
| 20 |
Computer training for SOARS, RESESS, and SIParCS college students |
| 10 |
Postprocessing tools |
| 11 |
CISL Resource Accounting update |
| 12 |
Prepare documentation for users. |
| 15 |
Assistance with questions sent to Consulting Office |
| 52 |
CAM runtime errors on bluevista |
| 19 |
Test and support TotalView usage on all supers |
| 54 |
User Documentation |
| 2 |
Porting assistance for bluefire |
| 17 |
Implement ExtraView trouble ticket system |
| 1 |
Real time and other special computing projects |
| 14 |
Facilitate transition to LSF. |
| 22 |
Creation of new CSG web site and collaboration tool |
| 0 |
Ranger port of CCSM4 |
| 23 |
Miscellaneous duties |
| 13 |
Maintenance and documentation of software products |
| 6 |
Run benchmark tests, assist with local software |
| 3 |
Software library installation policy |
| 9 |
Provide user documentation for LSF batch scheduling system. |
| 8 |
Test benchmark suites for regression and ATPs. |
| 4 |
CTSS Testing and HelpDesk |
| 5 |
Professional development |
| 33 |
upgrade nco to 3.9.2 |
| 32 |
Experimenting with LSF resource requirements for large memory jobs |
| 26 |
Need to find a way to change a batch model run from SPMD to MPMD |
| 27 |
SIParCS preparation |
| 35 |
Coordinating trip to CCR, Madison for end of April/early May '07. |
| 29 |
The DART/WRF code is experiencing a high level of memory allocation when running on blueice. This has caused node crashes (TT #27519). The user does not agree with the recommendation to run on 12 PEs per node on blueice. This code does not exhibit the same behavior on bluevista (but has encountered MPI_LAPI errors, TT #27685). |
| 39 |
Developed an LSF reporting tool that uses the LSF API to provide access to detailed job information, and provided it to SSG. This tool can access more detailed information on LSF job parameters than bjobs or bhist. 1/26/07 - Complete. |
| 30 |
User encountering MPI_LAPI errors on Bluevista. First reported errors of this type for any user since Jan '07. |
| 25 |
Supporting this ASD project on Bluefire. |
| 28 |
Blueice code running with very low efficiency |
| 24 |
Supporting this ASD project on Bluefire. |
| 36 |
Working towards conversion of mudpack from OpenMP parallelism to MPI parallelism for an HAO user. mudpack has been a bottleneck in scaling TIEGCM since it is restricted to running on a single node under the OpenMP paradigm. Converting to MPI, if it can be done, will allow the application to continue to scale beyond 48 PEs. |
| 38 |
Converting emoslib for DSS from the 32-bit bluesky environment to the 64-bit bluevista and blueice environment |
| 34 |
Held initial meeting w/CSS head to discuss CSG role in petascale initiatives. |
| 37 |
Converting echam4 from 32-bit bluesky environment to 64-bit bluevista and blueice environment for user |
| 44 |
We want Platform Computing to script the integration layer for us, so that LoadLeveler can be used as the backend of LSF. |
| 43 |
User asphilli was flagged as an inefficient user by Tom Engel's monitoring programs. |
| 56 |
Mathematica guide |
| 53 |
Software licensing |
| 51 |
Splitting large files for MSS |
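
As an illustration of the post-processing step described for project 89 (the MPI bandwidth and latency tool), the following is a minimal sketch, not the project's actual Python script. It assumes a hypothetical input of (message size in bytes, one-way time in seconds) pairs and a simple linear cost model, time = latency + size / bandwidth, whose two parameters are fitted by ordinary least squares. The function name, data layout and model are assumptions made for this example.

```python
def fit_latency_bandwidth(samples):
    """Fit time = latency + size / bandwidth by ordinary least squares.

    samples: list of (message_size_bytes, one_way_time_seconds) pairs
             (a hypothetical layout; the real tool's raw format may differ).
    Returns (latency_seconds, bandwidth_bytes_per_second).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a line")

    mean_size = sum(size for size, _ in samples) / n
    mean_time = sum(time for _, time in samples) / n

    # Slope of the fitted line is seconds per byte, i.e. 1 / bandwidth.
    sxx = sum((size - mean_size) ** 2 for size, _ in samples)
    sxy = sum((size - mean_size) * (time - mean_time) for size, time in samples)
    if sxx == 0:
        raise ValueError("all samples have the same message size")
    seconds_per_byte = sxy / sxx

    # Intercept of the fitted line is the zero-byte latency.
    latency = mean_time - seconds_per_byte * mean_size
    return latency, 1.0 / seconds_per_byte


if __name__ == "__main__":
    # Synthetic timings for message sizes from 16 bytes to 8 Mbytes,
    # generated from made-up values (not measured results).
    sizes = [16 * 2 ** k for k in range(20)]
    data = [(s, 5e-6 + s / 1e9) for s in sizes]
    latency, bandwidth = fit_latency_bandwidth(data)
    print(f"latency = {latency:.3e} s, bandwidth = {bandwidth:.3e} bytes/s")
```

A single straight-line fit is the simplest choice; real interconnects often show different regimes for small and large messages, so fitting each range separately may describe the measurements better.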