Collaborative Research: SI2-SSI: Cyberinfrastructure for Advancing Hydrologic Knowledge through Collaborative Integration of Data Science, Modeling and Analysis
Researchers across the country and around the world expend tremendous resources to gather and analyze vast stores of hydrologic data and populate a myriad of models to better understand hydrologic phenomena and find solutions to vexing water problems. Each of those researchers has limited money, time, computational capacity, data storage, and ability to put that data to productive use. What if they could combine their efforts to make collaboration easier? What if those collected data sets and processed model outputs could be used collaboratively to help advance hydrologic understanding beyond their original purpose? HydroShare is a system to advance hydrologic science by enabling the scientific community to more easily and freely share products resulting from their research: not just the scientific publication summarizing a study, but also the data and models used to create the scientific publication. HydroShare supports the sharing and publication of hydrologic data and models. This capability is necessary for community model development, execution, and evaluation and to improve reproducibility and community trust in scientific findings through transparency. As a platform for collaboration and running models on advanced computational infrastructure, HydroShare enhances the capability for data-intensive research in hydrology and other aligned sciences. HydroShare is designed to help researchers easily meet the sharing requirements of data management plans while at the same time providing value-added functionality that makes metadata capture more effective and helps researchers improve their work productivity. This project will extend the capabilities of the HydroShare cyberinfrastructure to enhance support for scientific methods, advance the social capabilities of HydroShare to enable improved collaborative research, integrate with third-party consumer data storage systems to provide more flexible and sustainable data storage, and establish an application testing environment to empower researchers to develop their own computer programs to act on and work with data in HydroShare. Empowering HydroShare users with the ability to rapidly develop web application programs opens the door to unforeseen, innovative combinations of data and models. WRF-Hydro, the framework for the NOAA National Water Model, will be used as a use case for collaboration on model development. Since WRF-Hydro is used by NOAA as part of the National Water Model (NWM), this collaboration opens possibilities for transfer of research to operations. Collectively, this functionality will provide a computing framework for transforming the practice of broad science communities to leverage advances in data science and computation and accelerate discovery.
HydroShare is a system for sharing hydrologic data and models aimed at giving hydrologists the cyberinfrastructure needed to manage data, innovate and collaborate in research to solve water problems. It addresses the challenges of sharing data and hydrologic models to support collaboration and reproducible hydrologic science through the publication of hydrologic data and models. With HydroShare, users can: (1) share data and models with colleagues; (2) manage who has access to shared content; (3) share, access, visualize and manipulate a broad set of hydrologic data types and models; (4) use the web services interface to program automated and client access; (5) publish data and models to meet the requirements of research project data management plans; (6) discover and access data and models published by others; and (7) use web apps to visualize, analyze, and run models on data. This project will extend the capabilities of HydroShare to: (1) enhance support for scientific methods enabling systematic data and model analysis and hypothesis testing; (2) advance the social capabilities of HydroShare to enable improved collaborative research; (3) integrate with third-party consumer data storage systems to provide more flexible and sustainable data storage; and (4) establish an application testing environment to empower researchers to develop their own computer programs to act on and work with data in HydroShare. Under development since 2012 and first released in 2014, HydroShare supports the sharing and publication of hydrologic data and models. This capability is necessary for community model development, execution, and evaluation. As a platform for collaboration and cloud-based computation on network servers remote from the user, HydroShare enhances the capability for data-intensive research in hydrology and other aligned sciences.
HydroShare is innovative from a computer science and CI perspective in the way computation and data sharing are framed as a network computing platform that integrates data storage, organization, discovery, and programmable actions through web applications (web apps). Support for these three key elements of computation allows researchers to easily employ services beyond the desktop to make data storage and manipulation more reliable and scalable, while improving the ability to collaborate and reproduce results. The generation of new understanding, through integration of information from multiple sources and reuse and collaborative enrichment of research data and models, will be enhanced. Structured and systematic model process intercomparisons and alternative hypothesis testing will be enabled, bringing, through user-friendly CI, the latest thinking in advancing hydrologic modeling to a broad community of earth science researchers, thereby transforming research practices and the knowledge generated from this research. Interoperability with consumer cloud storage will greatly ease entry of content into HydroShare and support its sustainability. This meshing of the rigorous metadata model of HydroShare with consumer file sharing will enhance reproducibility as well as provide an innovative mechanism for sharing and collaboration. Empowering HydroShare users with the ability to rapidly develop web apps opens the door to unforeseen, innovative combinations of data and models. WRF-Hydro will be used as a use case for collaboration on model development. WRF-Hydro provides a reach-based, high-resolution representation of hydrologic processes. It offers the potential to bring together scientists working at the scale of research catchments, on the order of 1 to 100s of square kilometers, with those working at regional to continental scales, and to cut across disciplines from environmental engineering to aquatic ecology.
Since WRF-Hydro is used by NOAA as part of the National Water Model (NWM), this collaboration opens possibilities for transfer of research to operations. This project will adapt current best practices in CI for interoperability and extensibility to serve this multidisciplinary community of scientists. HydroShare has already had a broader impact, with documented rapid growth in use and uptake by other projects, including in EarthCube. It will become sustainable community CI through operation as part of the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Water Data Center (WDC) facility. The use of WRF-Hydro/NWM as a driving use case will advance CI for community-based model improvement. Through the Summer Young Innovators Program at the National Water Center (NWC), supported by the National Weather Service (NWS) and operated by CUAHSI, a pathway already exists to translate research findings to the operational needs of federal agencies participating in the NWC. HydroShare already touches a broad and diverse community, with a user base including Native American tribes, hydrologic science students, and faculty researchers across the U.S. This proposal builds on the success of HydroShare to extend its capabilities and broaden model hypothesis testing, collaborative data sharing, and open app development across earth science research and education.