The Scientific Computing Division conducted three surveys in the spring of 2001:
· A general survey of SCD computer users on the services offered by SCD
· A survey of NCAR Command Language (NCL) users
· A survey of users of SCD-provided research data
Brief descriptions of the survey population and the demographics of the respondents are given below, followed by the survey findings. A copy of each survey, detailed analyses and charts, and responses from users to open-ended questions are contained in Sections 2, 3, and 4 of this booklet.
We requested and received candid responses to our questions. Respondents were asked to elaborate on their responses via the open-ended questions, especially if they were dissatisfied with a service. The input we received will have a large impact on our future planning.
SCD surveyed 287 users of SCD computers. This included 91 principal investigators, 193 lead users from projects using 100 General Accounting Units (GAUs) or more during the calendar year 2000, plus 111 researchers who had used 500 GAUs or more during calendar year 2000. After accounting for people in more than one category, we were left with 287 unique individuals.
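The three category counts above sum to 395 (91 + 193 + 111), so 108 list entries belonged to people who appeared in more than one category. A minimal sketch of that deduplication follows. The user IDs are hypothetical placeholders, since the actual survey lists are not part of this report.

```python
# Hypothetical user IDs; the real survey lists are not reproduced here.
principal_investigators = {"u01", "u02", "u03"}
lead_users = {"u02", "u03", "u04", "u05"}
heavy_users = {"u03", "u05", "u06"}

# Summing the category sizes counts a person once per category.
total_listed = (
    len(principal_investigators) + len(lead_users) + len(heavy_users)
)

# A set union keeps each person exactly once.
unique_users = principal_investigators | lead_users | heavy_users

print(total_listed)       # 10
print(len(unique_users))  # 6
```

The same union applied to the real lists would reduce the 395 entries to the 287 unique individuals reported above.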
One hundred thirty-eight users responded, yielding a 48% response rate after one follow-up email to those who had not responded by the deadline. Forty percent of the respondents work for or attend a university (35% are located at a university), 36% are UCAR/NCAR employees or visitors, 7% are affiliated with another type of organization, and 17% did not answer any of the demographic questions.
Eighty-two percent of the respondents personally use SCD computing facilities. The other respondents supervise users of the SCD computing facilities. Sixty-two percent of the respondents said they had used 500 GAUs or more during a 12-month period since 1998. Ninety-seven percent of the respondents had used SCD computing facilities during 2000, the last full year of the review period.
For the NCL survey, SCD surveyed 256 potential NCL users. These included people who had subscribed to the ncl-talk email list, people who had downloaded NCL from the web, people who attended an NCL workshop, and people who had contacted us with NCL questions. After accounting for people in more than one category, we were left with 256 unique individuals.
Forty-nine NCL users responded, yielding a 19% response rate after one follow-up email to those who had not responded by the deadline. Of these 49 respondents, 8 replied that they had never used NCL, so only 41 people filled out the rest of the survey. Of these 41 respondents, 46% work for or attend a university, 32% are UCAR/NCAR employees or visitors, and 22% are affiliated with another type of organization.
Sixty-six percent of the respondents personally use SCD computing facilities, 29% do not use the facilities, and 5% did not respond.
For the survey on SCD’s research datasets, we surveyed 191 people. This list included researchers who ordered datasets from SCD’s Data Support Section (DSS) or downloaded data from the DSS data server during calendar year 2000. The survey was sent to U.S. users only.
Forty-eight data users responded, yielding a 25% response rate. Sixty-one percent of the respondents work for or attend a university, and 39% are affiliated with another type of organization. Forty-nine percent are located at a university.
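The response rates quoted for the three surveys can be reproduced from the counts given above. The sketch below assumes the rates were rounded to the nearest whole percent. The dictionary keys and the helper name are illustrative only.

```python
# (responded, surveyed) counts taken from the text above.
surveys = {
    "general": (138, 287),
    "NCL": (49, 256),
    "data": (48, 191),
}

def response_rate(responded, surveyed):
    """Return the response rate rounded to a whole-number percentage."""
    return round(100 * responded / surveyed)

rates = {name: response_rate(r, s) for name, (r, s) in surveys.items()}
print(rates)  # {'general': 48, 'NCL': 19, 'data': 25}
```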
A number of services used by SCD’s computer users were highly rated. Services where the percentage of satisfied plus very satisfied exceeded 80% were:
· Ease of use of Cray J90s (91%)
· Ease of use of SGI Origin (ute) (87%)
· Capability of dataproc for data processing (95%)
· Ease of use of dataproc (96%)
· Ease of porting codes to dataproc (92%)
· Data analysis and visualization software on dataproc (85%)
· Capability of Mass Storage System (MSS) interfaces (83%)
· Content of SCD data archives (96%)
· SCD’s DSS web/data server (87%)
· Data from the DSS data server (83%)
· Research archive accessed from the MSS (87%)
· SCD DSS staff assistance (94%)
· Network performance to/from NCAR (86%)
· Adding users to existing accounts (94%)
· Obtaining a new computing allocation (91%)
· Accuracy and depth of consultant’s response (89%)
· Timeliness of consultant’s response (94%)
· Communication skills of consultants (94%)
Services where 10% or more of users checked either dissatisfied or very dissatisfied were:
· Ease of use of the IBM SP (18%)
· Ease of porting codes to IBM SP (21%)
· Turnaround time on the IBM SP for batch programs (15%)
· Interactive performance on the IBM SP (11%)
· Debugging on the IBM SP (38%)
· Queuing system and scheduler on the IBM SP (19%)
· Parallelization options for the IBM SP (14%)
· Capability of the Cray J90s (15%)
· Turnaround time on the Cray J90s for batch programs (32%)
· Interactive performance on the Cray J90s (11%)
· Capability of the SGI Origin (14%)
· Turnaround time on the SGI Origin for batch programs (42%)
· Queuing system and scheduler on the SGI Origin (14%)
· Capability of the Mainframe and Server Network/Internet Gateway System (MIGS) (10%)
· Ease of use of the MIGS system (11%)
· Turnaround time of the MIGS system (10%)
· Ability to locate information/documentation via the web (13%)
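The two lists above apply different cutoffs: a service is highly rated when satisfied plus very satisfied exceeds 80%, and it is flagged when dissatisfied plus very dissatisfied is 10% or more. The sketch below applies both rules to a handful of percentages quoted in this summary. The variable names are illustrative, and the selection of services is only a sample, not the full survey data.

```python
HIGH_SATISFACTION = 80     # strictly greater than, per "exceeds 80%"
DISSATISFACTION_FLAG = 10  # greater than or equal, per "10% or more"

# Percent satisfied plus very satisfied (sample of values from this summary).
satisfied = {
    "Ease of use of dataproc": 96,
    "Capability of MSS interfaces": 83,
    "Ease of computing at SCD": 79,
}

# Percent dissatisfied plus very dissatisfied (sample of values).
dissatisfied = {
    "Turnaround time on the SGI Origin for batch programs": 42,
    "Debugging on the IBM SP": 38,
    "Turnaround time of the MIGS system": 10,
    "MSS response time": 6,
}

highly_rated = [s for s, pct in satisfied.items() if pct > HIGH_SATISFACTION]
flagged = [s for s, pct in dissatisfied.items() if pct >= DISSATISFACTION_FLAG]

print(highly_rated)
print(flagged)
```

Note that a service at exactly 10% dissatisfaction is flagged, while one at exactly 80% satisfaction would not count as highly rated.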
SCD is pleased with the large number of highly rated services, especially considering the many changes in SCD’s computational environment over the past two years. Respondents rated ease of learning the SCD computing system (70% satisfactory) at close to the same level as the 1995 level of satisfaction (74%). Seventy-nine percent of respondents rated ease of computing at SCD as satisfied or very satisfied, compared with 81% in 1995.
Chart 1, “Comparison of FY87/FY91/FY95/FY01 NSF Survey Results” (see Section 2.2), compares responses to questions from surveys taken before the last four NSF reviews. In 1995 the computational environment was stable, with one type of high-performance computing platform. The operating system and compilers had had many years to mature and to align with the needs of the atmospheric and related sciences community. It is a credit to the SCD staff that these overall measures have remained high in the face of the tremendous changes the user community has dealt with.
A major concern highlighted by the surveys is job turnaround for batch programs (i.e., model runs) on all three of SCD’s production platforms. Unfortunately, the community’s need for computing resources has outpaced the computational resources available at NCAR. We have seen a recent surge in university requests for computer time with the March 2001 allocation requests. University researchers have requested two-and-a-half times the available resources for their National Science Foundation–supported science. SCD is in the midst of a Request for Proposals (RFP) to acquire additional computational resources, which should reduce this problem; however, additional funds for computational resources are needed to satisfy current needs.
SCD continues to tune the scheduler on the IBM to improve the throughput of the system and prioritize jobs based on allocation priorities. This will include permitting SCD to easily specify users who can run long jobs of up to 200 CPU hours. We are also investigating the feasibility of providing users with a save/restart mechanism via a real-time signal.
SCD has offered assistance with code conversion and training classes on the IBM, but further steps can be taken. In April 2001, a class on the TotalView debugger was offered at NCAR. (TotalView is the debugger that IBM recommends, and it is also the recommended debugger for Compaq, SGI, and Cray.) Based on the survey, plans are being made to repeat this class at university sites with large groups of users. We will investigate options for improving the vendor-provided documentation on TotalView. We will also investigate training options for Vampir, a visual utility for interactive debugging and optimization of message-passing codes.
There were a number of comments on the response time of the MSS. This concern was not reflected in the ratings, which showed only 6% dissatisfaction with MSS response time. However, based on survey comments, we believe response time is affecting users who perform many data retrievals and writes. To identify the nature of this problem, SCD will evaluate the effect that increases in our computational capacity and data analysis systems have had on the MSS. Solutions may include increasing the number of data paths available, the number of MSS tape drives, the amount of local disk space on the supercomputers, or other actions needed to balance the MSS with SCD’s other computational resources.
SCD appreciates the time researchers spent giving us feedback and suggestions for improvement. These concerns and needs will weigh heavily in our plans for the future.
In the general survey of SCD computer users, respondents were asked to rate the overall ease of use and capabilities of NCL using the categories very satisfied, satisfied, neutral, dissatisfied, and very dissatisfied. In the detailed NCL survey, respondents were asked to rate their satisfaction with specific NCL topics using the same categories. Respondents for whom the question did not pertain were encouraged to use the rating not applicable (N/A), and these respondents were excluded when calculating the rating percentages. Respondents were also asked to rate the importance of certain enhancements to NCL as being very useful, moderately useful, not useful, or not applicable. In this case, the not applicable answers were included with the not useful category in the calculation of rating percentages.
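The two treatments of not applicable answers described above lead to different denominators. The sketch below illustrates the difference with hypothetical responses. The function names are illustrative, and treating "very useful" plus "moderately useful" as the reported usefulness percentage is an assumption, since the text does not state which categories were combined.

```python
def satisfaction_pct(responses):
    """Percent satisfied or very satisfied, with N/A answers excluded."""
    rated = [r for r in responses if r != "N/A"]
    if not rated:
        return None  # nobody gave a rating
    hits = sum(r in ("satisfied", "very satisfied") for r in rated)
    return round(100 * hits / len(rated))

def usefulness_pct(responses):
    """Percent very or moderately useful, with N/A counted as not useful.

    Combining the two "useful" categories is an assumption of this sketch.
    """
    if not responses:
        return None
    hits = sum(r in ("very useful", "moderately useful") for r in responses)
    return round(100 * hits / len(responses))  # N/A stays in the denominator

# Hypothetical answers, for illustration only.
ratings = ["very satisfied", "satisfied", "satisfied", "neutral", "N/A"]
enhancements = ["very useful", "moderately useful", "not useful", "N/A"]

print(satisfaction_pct(ratings))     # 75 (3 of the 4 rated answers)
print(usefulness_pct(enhancements))  # 50 (2 of all 4 answers)
```

Under the first rule the N/A answer drops out of the calculation, while under the second it lowers the percentage exactly as a not useful answer would.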
In the NCL survey, a number of topics were highly rated. Those where the percentage of satisfied plus very satisfied exceeded 80% include:
· Capabilities of data formats (98%)
· Capabilities of file I/O (90%)
· Capabilities of data handling (89%)
· Capabilities of data processing (89%)
· Quality of graphical output (95%)
· Ease of use of file I/O (85%)
· Ease of use of data handling (85%)
· Ease of use of data analysis (88%)
· Ease of use of graphics (85%)
· Timeliness of consulting response (100%)
· Effectiveness of consulting response (97%)
· Communication skills of consultant (100%)
· Effectiveness of Getting Started Using NCL document (87%)
· Effectiveness of the Community Climate System Model (CCSM) Graphics Tutorial (94%)
· Effectiveness of the NCL Reference Manual (84%)
In the general survey of SCD computer users, the overall ease of use of NCL was given a 46% satisfactory rating, and the overall capability of NCL was given a 70% satisfactory rating.
In the NCL survey, there were no questions that showed 10% or more respondents as either dissatisfied or very dissatisfied. In the general survey, however, the ease of use of NCL was given a 28% dissatisfied rating.
In the NCL survey, respondents were asked to rate the importance of enhancements to NCL in several specific categories. From highest to lowest, the ratings were 76% for enhanced 3D graphics, 74% for handling satellite data, 67% for new data formats, 60% for a Python or Java interface to NCL, and 53% for new map projections.
Respondents were also asked to elaborate on their responses via open-ended questions, especially if they rated a topic as dissatisfied. Discussion of these responses is included in the “Discussion of Results,” below.
The NCL survey results lead us to believe that NCL is delivering very well on the most important functional capabilities. As noted above, the question on ease of use of NCL in the general survey of SCD computer users received only a 46% satisfactory rating, while on the NCL survey all of the ease-of-use questions scored an approval rating of 85% or higher. This disparity may arise because some of the general survey respondents are not aware of the significant amount of functionality and the improvements to the graphics interface that have gone into NCL in the last couple of years. Some of the comments in the general survey indicate only a passing familiarity with NCL, which would support this argument.
The responses to the open-ended questions suggest that some of the difficulties with NCL stem from its steep learning curve and from the nontrivial effort of integrating C and Fortran routines. Our goal with NCL is to drastically enhance scientific productivity, and ease of use is fundamental to this effort. Since survey responses indicate a high level of satisfaction with the NCL workshops (94%), we are looking into holding more workshops, both locally and at atmospheric science conferences. To simplify the integration of C and Fortran routines, work is already in progress to create a portable script that will automate the process.
Comments regarding NCL documentation were mixed, but with a bias towards positive responses. The most common complaint is that the documentation is too spread out and thus it is hard to find information quickly. Results from specific questions in the NCL survey, however, indicate a high level of satisfaction with specific NCL documents. Efforts have already been made to create an NCL home page that provides one stop for all the documentation. Search engines have also been added to the web pages to make finding NCL-related topics easier.
There was a high rating for adding new map projections, but when given the opportunity to list the new map projections desired, only one person volunteered information. Without more data, it is hard to determine the type of enhancement desired in this area.
The ability to do formatted printing in NCL was an enhancement request brought up several times in the comments. There have been internal discussions about creating a repository to allow users to contribute C and Fortran routines that can interface with NCL to do the kind of printing they desire. A repository would also open the door for user contributions to other areas of NCL.
Future Directions. The high rating for the recognition of additional data formats seems to be tied in with the need for enhancements in visualization of satellite data. Thus far, most of our requests in this area are for Hierarchical Data Format for the Earth Observing System (HDF-EOS). Work is already well underway to support HDF-EOS, and we expect to have a robust capability in this area by early summer 2001. In addition, enhancements to our data handling software will address both HDF-4 and HDF-5.
From comments and answers to specific questions on the need for enhancements in NCL, we can surmise that there is general interest in (1) having NCL analysis and capabilities available through other languages, especially Python and, to a lesser extent, Perl, and (2) enhancing NCL’s 3D capabilities.
The lack of a Graphical User Interface (GUI) to NCL was a comment brought up multiple times in both the general and the NCL surveys. In some cases, it was mentioned that other packages were chosen over NCL because of the availability of a GUI. We spent some time developing a GUI for NCL that was extremely general. Its generality made it somewhat difficult to use and extend. We have lowered the priority of that work in favor of developing our Community Data Portal, which provides domain-specific, web-based interfaces to data and simple analysis functions. We expect to return to GUI-based standalone applications in the future, with emphasis on specific problem domains.
There was one suggestion that we explore a PHP (hypertext preprocessor) interface for NCL, which would enhance its ability to serve as a back-end web engine. This is an interesting suggestion. NCL and NCAR Graphics are already being used for this purpose, both internally and by others in the community, including Gregg Walters of SCD’s Data Support Section, who used NCL to build an award-winning real-time aviation weather site.
A review of the general survey revealed a number of comments that relate to data analysis and visualization. Specifically, six people were interested in web-based data access and analysis. Three others indicated a need for remote visualization.
Overall, we believe that NCL is growing in popularity and filling an important niche in the scientific community. Projecting towards mid-decade, we expect to see a massive data explosion that will seriously challenge the community’s ability to understand a wide variety of scientific data. Frontier research will require significant advances in the tools available for understanding the underlying processes. In our strategic roadmaps, one plan element is to develop frameworks and applications for next-generation analysis and visualization. In our internal discussions, one possible incarnation of such a framework would coalesce NCL’s functionality and 3D visualization tools with Python and GUIs as user interfaces. Our work with the Community Data Portal already leverages NCL, and we are pleased that people are finding it useful for driving web environments.
For a more detailed discussion of our future directions, please see “Analysis, Visualization, and Knowledge Development for the Earth Systems Science” in Section III(B) of the SCD NSF Review document.
Overall, we believe the survey shows that NCL is filling an important niche in the scientific community, and the commentary on future directions is consistent with our existing plans.
A number of data services were highly rated in both the general survey (GS) of SCD computer users and the separate data survey (DS). Services where the percentage of satisfied plus very satisfied exceeded 80% were:
· Content of SCD data archives (GS: 96%, DS: 93%)
· Data Support web/data server (GS: 87%, DS: 88%)
· Accessing data from the data server using FTP or a web browser (GS: 83%, DS: 91%)
· Accessing data directly from the MSS (GS: 87%, DS: 85%)
· Data prepared specifically for researcher (GS: 87%, DS: 91%)
· CDs of data for specific projects (GS: 73%, DS: 88%)
· Data support staff assistance (GS: 94%)
· Assistance in identifying needs (DS: 91%)
· Assistance in using data requested (DS: 90%)
None of the data service questions had a dissatisfied rating of 10% or greater on either the general survey of SCD computer users or the separate data survey.
We are pleased that the majority of users seem happy with the data and the data assistance they are receiving from SCD’s Data Support Section.
Respondents suggested approximately 15 datasets that we might add to the archive. We already maintain and distribute several of the requested additions. Of the remainder, we will evaluate the feasibility of acquiring and maintaining them for distribution. We currently add 20 to 25 new datasets each year. We realize that the addition of new datasets will often lead to more research in these areas.
Some respondents wanted more frequent updates of the European Centre for Medium-Range Weather Forecasts (ECMWF) operational analyses. We update these analyses once a year and will probably increase this to twice a year in the near future. This will increase the cost by 5% and will also require more staff time. There were also two requests for higher-resolution ECMWF data. To provide this, the cost of the ECMWF analyses, now $17K per year (which includes a recent 40% increase), would increase to $41K per year. Therefore, we do not plan to obtain the higher resolution. (ECMWF’s position is that $17K is a bargain because NCAR can pay this once and then provide the data to many in the U.S. research community.)
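The cost figures above can be put in proportion with a short calculation. The sketch below assumes that the 40% increase was applied to the previous annual price, which the text does not state explicitly. The variable names are illustrative.

```python
current_cost_k = 17.0     # current annual cost, in thousands of dollars
higher_res_cost_k = 41.0  # quoted annual cost for higher-resolution data
recent_increase = 0.40    # recent increase already included in the $17K

# Assumption: the 40% increase was applied to the prior annual price.
prior_cost_k = current_cost_k / (1 + recent_increase)

extra_cost_k = higher_res_cost_k - current_cost_k
cost_ratio = higher_res_cost_k / current_cost_k

print(f"{prior_cost_k:.1f}")  # 12.1
print(f"{extra_cost_k:.1f}")  # 24.0
print(f"{cost_ratio:.2f}")    # 2.41
```

Under that assumption, the higher-resolution data would cost an additional $24K per year, about 2.4 times the current price.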
Respondents also requested that SCD make more data available at no cost. We do plan to increase the number of datasets available on the Data Support web server at no charge. One of the requested datasets will be the World Monthly Surface. Some datasets (especially large ones) will still carry a charge, but usually a much lower one than at archives at other sites. We will continue to look for ways to reduce costs.