NSF NCAR co-sponsors Open Hackathon to build coding skills and inspire Open Science practices
by Shira Feldman
For over a week—from February 21 to 29, 2024—nine teams of computational scientists and Earth system science researchers came together to hack their code, build their software development skills, and improve their projects. In the process, participants experienced the real-life benefits of practicing Open Science.
The annual Open Hackathon took place this year at the David Skaggs Research Center (DSRC) in Boulder, Colorado, and also remotely. The event was co-sponsored by the U.S. National Science Foundation National Center for Atmospheric Research (NSF NCAR); the National Renewable Energy Laboratory (NREL); the National Oceanic and Atmospheric Administration (NOAA); the OpenACC organization; and NVIDIA.
The intensive, multi-day, hands-on event was designed to help participants port, accelerate, and optimize a chosen application. The hackathon drew 57 participants—14 on-site and 43 remote. Participants formed nine teams, each with three to seven members, and one to three mentors from national labs and universities.
According to Daniel Howard, the event’s primary site organizer for NSF NCAR, the name Open Hackathon “reflects the community’s growing desire to participate in a way that embodies the principles of Open Science.” These principles include accessibility, shareability, author recognition, usability, and reproducibility (see more detail on Open Science below).
Compared with its previous name, GPU Hackathon, the event’s current title also reflects a new emphasis on any kind of acceleration, not just on graphics processing units (GPUs). That scope includes optimization for central processing units (CPUs), along with speedups and other performance gains more generally.
The hackathon was open to the public, with organizers welcoming all who sought to take their projects to the next level. While many of this year’s participants came from sponsoring organizations and local areas, notable exceptions included a group from New York’s Clarkson University, and a mentor from Switzerland who navigated the six-hour time difference to join internationally.
Organizers hope for even broader participation next year. “We're all within our labs, universities, and institutions, so it’s hard for us to learn from each other,” said Howard. “Open hackathons are opportunities to learn from each other and cross boundaries between institutions.”
Hackathon projects achieve impressive technical gains
By any measure, the hackathon was a success. Each of the nine teams achieved a specific outcome from the hackathon, and many of these outcomes will find their way into broader use.
One team, called Overset Quartet, worked with TIOGA software, which performs overset grid connectivity for meshes, as part of a model for predicting the wind farm wake turbulence that reduces power output. Using the Frontier supercomputer to test and run its code, the group attained an impressive 20x speedup in TIOGA for its pathological case, and a 2x speedup for smaller-scale problems. Team members are now using their findings to improve the TIOGA software for larger-scale problems.
Another group, called UXarray, achieved gains that “would excite a lot of technical folks,” said Howard. For a user experience (UX) project on unstructured grid analysis in Python, the team’s Numerical Python (NumPy) implementation achieved up to a 90x improvement in some cases. For one 7.5km grid test case, the code previously took 89.3 seconds to process the grid, and now it takes only 1.3 seconds!
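The two timings quoted for the 7.5km test case can be turned into a speedup factor by dividing the old runtime by the new one. The sketch below does only that arithmetic. The function name speedup is a hypothetical helper for illustration, not part of UXarray or NumPy.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the new runtime is than the old one."""
    if after_seconds <= 0:
        raise ValueError("after_seconds must be positive")
    return before_seconds / after_seconds


# Timings quoted in the article for the 7.5km grid test case.
factor = speedup(89.3, 1.3)
print(f"{factor:.1f}x")  # prints "68.7x"
```

By this measure the quoted test case is roughly a 69x speedup, so the 90x figure most likely describes a different case.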
Each team mentioned they found the experience positive and valuable and would do it again. In their final presentations, each team affirmatively answered the question “Was it worth it?” The Overset Quartet wrote: “Yes, we made more gains in our performance than we expected.” The GT4PY team simply responded: “Oh yeah!!!!”
The codes will likely see wider distribution beyond the specific problems they were written to solve. Many teams will refine and document their changes, then submit them as pull requests to their projects' main GitHub repositories, for use elsewhere by other people.
A tool called a performance profiler was key to the hackathon’s striking results. The profiler provides a visual interpretation of what is going on within the software. With it, users can see a timeline view of their software, as well as different subsections of code and how they run. For those interested, NSF NCAR offers more resources on performance profiling, including a hands-on session.
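As a small illustration of what a profiler reports, here is a minimal sketch using Python's built-in cProfile and pstats modules. It is not the tool used at the hackathon: the article does not name one, and the profilers described above offer a graphical timeline rather than a text table. The functions slow_sum, fast_sum, and run_workload are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Sum the squares of 0..n-1 with an explicit loop (the hotspot)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum(n):
    """Same result from the closed-form formula for a sum of squares."""
    return (n - 1) * n * (2 * n - 1) // 6


def run_workload():
    """Hypothetical workload that calls both versions once."""
    return slow_sum(200_000), fast_sum(200_000)


# Record timing data while the workload runs.
profiler = cProfile.Profile()
results = profiler.runcall(run_workload)

# Format the five most expensive entries, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed table, slow_sum should account for nearly all of the cumulative time while fast_sum barely registers. A profiler surfaces this kind of contrast before any optimization work begins.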
Another, more subtle, factor featured in the hackathon's success: the element of time. For busy scientific and computational professionals who cannot always focus narrowly on one specific aspect of their work—especially when collaboration is involved—the hackathon helped carve out this crucial time.
Howard said: “Part of the goal of the hackathon concept, which a lot of the teams find very valuable, is just to have the people coming to the events. The last week was the most intensive set of events, with three days, nine-to-five each day. Having the events and logistics planned meant the teams could dedicate 100% of time to working on code. They have an excuse: ‘I don’t need to do meetings that day, I’m just doing this thing.’ People can make significant progress on their technical projects when they’re not bogged down by work or other obligations. So that's one of the main things, just to get that dedicated time for the teams.”
Hackathon supported Open Science for broader impact
The annual hackathon has long embraced the goal of Open Science. In fact, last year’s hackathon event was featured by the White House in its Open Science mission statement, furthering its 2023 Year of Open Science initiative.
This year, on February 28, the hackathon hosted a thought-leadership seminar on Open Science. The hybrid session was led by Kristina Vrouwenvelder, Program Manager for Open Science at the American Geophysical Union (AGU).
In her talk, Vrouwenvelder highlighted Open Science’s “transformative” benefits, such as “boosting your work, making your research more efficient, and advancing science.”
For software development in particular, Open Science means better documentation practices, enhanced accessibility for software and data, and improved publishing practices that more effectively recognize people’s work.
Said Howard: “It's one thing to write good software to get the science done and improve its performance. You also need to make sure that software can be shared effectively so it can be used by people beyond your immediate circle of influence.”
One hypothetical example: a participant working on a flood model could document the model well enough that an urban planner in Miami, Florida, could figure out how to run the code to showcase Miami floods—despite the planner's limited technical and scientific background.
In real life, this scenario would require very effective application of Open Science principles, both to explain the software and to make access to it easy and understandable. The objective of Open Science in the hackathon, and in software development as a whole, is to make the gains portable and reproducible. It strives to make good results usable not just within the scientific community, or for one specific use case on one specific system, but by anyone.
Ultimately, the hackathon aimed to improve collaboration and connection within the Earth system science field. Said Howard: “NOAA, NSF NCAR, and NREL all focus on different things within Earth system science. But we're all very much connected in terms of applications, such as addressing climate change.”
For more information on the hackathon and upcoming events, contact info@openhackathons.org.