Hydraulics/Hydrology Seminar Series
Estimating Health of Ungauged Watershed using Machine Learning
Tuesday, April 25, 2017
Observed water quality data are usually sparse in both time and space. Several studies have demonstrated that water quality time series can be reconstructed using surrogate variables such as streamflow, and the reconstructed series can then be used to evaluate risk metrics such as reliability, resilience, vulnerability, and watershed health. In this study, we propose using machine learning regression models, including decision trees, AdaBoost, gradient boosting machines, and random forest regression, to predict watershed health and other risk metrics at ungauged HUC-10 basins, using watershed attributes, long-term climate data, soil data, land use and land cover data, and geographic information as predictor variables. We have tested our models over the Upper Mississippi River Basin and the Ohio River Basin for water quality constituents such as suspended sediment concentration, nitrogen, and phosphorus. Results suggest that the proposed approach provides robust estimates when sufficient training data are available. It can be used as a quick screening tool by decision makers and water quality monitoring agencies to identify critical source areas, or hotspots, with respect to different water quality constituents.
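As a rough illustration of the risk metrics named in the abstract, the sketch below computes reliability, resilience, and vulnerability from a concentration series and combines them into a watershed health score. It is not the speakers' implementation. The definitions are assumptions chosen for illustration: reliability is the fraction of time steps in compliance with a threshold, resilience is the fraction of failure steps followed by a compliant step, vulnerability is the mean relative exceedance, and watershed health is the geometric mean of reliability, resilience, and one minus vulnerability. The function names, the concentration values, and the threshold are hypothetical.

```python
from statistics import mean


def risk_metrics(conc, threshold):
    """Return (reliability, resilience, vulnerability) for a concentration series.

    Assumed definitions (illustrative only):
      reliability   = fraction of time steps at or below the threshold
      resilience    = fraction of failure steps followed by a compliant step
      vulnerability = mean of (c - threshold) / c over failure steps, in [0, 1)
    """
    fail = [c > threshold for c in conc]
    reliability = 1 - sum(fail) / len(conc)

    # Look at consecutive pairs whose first step is a failure.
    after_failure = [nxt for cur, nxt in zip(fail, fail[1:]) if cur]
    if after_failure:
        resilience = sum(1 for nxt in after_failure if not nxt) / len(after_failure)
    else:
        resilience = 1.0  # no failures to recover from

    exceedance = [(c - threshold) / c for c in conc if c > threshold]
    vulnerability = mean(exceedance) if exceedance else 0.0
    return reliability, resilience, vulnerability


def watershed_health(rel, res, vul):
    """Geometric mean of reliability, resilience, and (1 - vulnerability)."""
    return (rel * res * (1 - vul)) ** (1 / 3)


# Hypothetical reconstructed concentrations and a hypothetical standard.
series = [10, 30, 40, 15, 12, 50, 20, 18]
rel, res, vul = risk_metrics(series, threshold=25)
wh = watershed_health(rel, res, vul)
print(f"reliability={rel:.3f} resilience={res:.3f} "
      f"vulnerability={vul:.3f} health={wh:.3f}")
```

In a workflow like the one described, scores of this kind computed at gauged basins would serve as the regression targets, with the watershed attributes as predictors.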