The Geoscience Data Exchange: Unifying Earth System Science Data at NSF NCAR
by Shira Feldman
CISL is marking a transformation for Earth system science data management and sharing at NSF NCAR, with the official rollout of the Geoscience Data Exchange (GDEX). This strategic initiative unifies three NSF NCAR data repositories—the Research Data Archive, the Climate Data Gateway, and the legacy Geoscience Data Exchange repository—into a single, modernized, analysis-ready research data commons.
NSF NCAR’s Geoscience Data Exchange – Integrated Research Data Commons (GDEX).
The vision for GDEX is to democratize data by creating a trusted data ecosystem that is accessible to all and built for community leadership. Its overarching goal is to make Earth system science data more Findable, Accessible, Interoperable, and Reusable (FAIR), supporting AI/ML readiness, collaboration, and new scientific discovery.
“The Geoscience Data Exchange empowers users to access NSF NCAR’s data from a consistent location, tap into our high-end computing capabilities, and focus on their science.”
— Thomas Hauser, CISL Director
Offering a single interface to datasets produced by multiple research domains, GDEX signifies a shift toward a more collaborative approach to NSF NCAR data stewardship. This central access point streamlines the scientific process—eliminating the need to gather datasets and variables from multiple locations—and enhances the repository’s user-friendliness for everyone from novices to experts. “The big-picture goal for GDEX is to speed up the iterative process, so we arrive at research findings and innovations more quickly than we usually do,” said Thomas Hauser, CISL Director and NSF NCAR Associate Director.
Doug Schuster, the section manager of CISL’s Information Science and Services (ISS) division, emphasized the project’s dedication to community and accessibility. “We wanted a resource that we can really build a community around, which didn’t previously exist,” he said. “Through cyberinfrastructure, we are unifying our data services, and at the same time nurturing research communities, such as NSF NCAR’s Earth System Data Science Community of Practice, that work together more effectively on NSF NCAR’s data assets, enabling more collaboration across disciplines.”
GDEX lays the groundwork for artificial intelligence-driven research, aiming to make the development of AI-based science applications faster and more efficient. “As we develop AI models, they’ll be fed by the resources that are available in GDEX,” Schuster noted.
“GDEX in many ways makes AI possible, because you can’t do AI without data to feed the AI.”
— Doug Schuster, ISS Section Manager
Another AI component is on the user side. CISL is exploring AI-based web interfaces, which will act almost like an “AI librarian.” Users will be able to have a discussion with the AI interface to search for a dataset that suits their research use case as opposed to doing a traditional keyword search. “We want to streamline and simplify the user experience as much as possible. Hopefully that's what AI can help do for us,” said Schuster. “It might be AI in a loop rather than a human in a loop in some cases.”
“We hope this new facility and capability really helps researchers get their work done in ways they couldn’t before,” said Schuster. “We intend to reduce the barrier of entry for researchers who interact with NSF NCAR datasets, whether an undergraduate student taking a class, or a professional researcher who has been using and analyzing data for thirty years.” The initiative expands the data repository and makes it more accessible, especially for U.S.-based researchers who can benefit from the capabilities of NSF NCAR’s Casper data analysis compute service, which is co-located with GDEX.
A unified data infrastructure enables seamless access to scientific data—accelerating insight through both traditional and next-generation workflows.
The ultimate goal of GDEX is to make the scientific process more efficient and effective. “GDEX improves efficiency in terms of pathways to discovery, and facilitates more impactful discoveries over time, because researchers aren’t bogged down dealing with a legacy structure of silos,” said Schuster. The likely results are “more novel and impactful research findings, fostering an environment of collaboration and building upon one another’s work and sharing results more easily.”
The project hosted a community workshop in May 2024 that gathered different viewpoints to shape GDEX’s vision. “We brought in participants from a wide variety of institutions, and got a range of perspectives,” said Schuster. Several participants were later involved in the 2025 Project Pythia Cook-off, a hackathon aimed at developing well-documented examples for using GDEX tools and capabilities.
A key element of GDEX is its integration with the Open Science Data Federation. In collaboration with the University of Wisconsin and other partners, the project examines various data-driven workflows to inform tool development and to provide faster access for users who retrieve data via web protocols.
GDEX is also one of the first use cases for CIRRUS (Cloud Infrastructure for Remote Research Universities and Scientists), CISL’s new on-premises cloud. CIRRUS allows GDEX to deliver faster and more reliable services and provides a platform for future AI features.
GDEX is a flagship project for NSF NCAR, supported by the facilities and large projects office under the Directorate. In the next year, there are plans to integrate selected datasets from the Earth Observing Laboratory (EOL) and the High Altitude Observatory (HAO) services into GDEX. This will allow these labs to access built-in capabilities like visualization services, common data access tools, and streamlined search and discovery.
“GDEX is a data watering hole, but it’s also a community watering hole where we all get together,” said Schuster. Asked about his favorite aspect of the project, Schuster pointed to community building and staff growth: “I think the community that we’ve been building, breaking down silos within CISL and across the labs, both technically and across communities, is exciting. Also, getting our staff to push their technical boundaries a bit and try new things, and just learning and growing in general.”
To learn more, visit the Data Commons project page. For inquiries or opportunities to get involved, please contact Doug Schuster.