Jasper Vrugt | University of California, Irvine

Papers by Jasper Vrugt

Toward improved prediction of the bedrock depth underneath hillslopes: Bayesian inference of the bottom-up control hypothesis using high-resolution topographic data

Water Resources Research, Apr 1, 2016

The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extend the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.

Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling

Water Resources Research, Jul 1, 2017

What is the "best" model? The answer to this question lies in part in the eyes of the beholder; nevertheless, a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of $K$ candidate models, $M_k$; $k = (1, \ldots, K)$, and help identify which model is most supported by the observed data, $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_n)$. Here, we introduce a new and robust estimator of the model evidence, $p(\tilde{Y} \mid M_k)$, which acts as normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, $p(\tilde{Y} \mid M_k)$ is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of $p(\tilde{Y} \mid M_k)$ and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost, outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.

Plain Language Summary: Science is an iterative process for learning and discovery in which competing ideas about how nature works are evaluated against observations. The translation of each hypothesis to a computational model requires specification of system boundaries, inputs and outputs, state variables, physical/behavioral laws, and material properties; this is difficult and subjective, particularly in the face of incomplete knowledge of the governing spatiotemporal processes and insufficient observed data. To guard against the use of an inadequate model, statisticians advise selecting the "best" model among a set of candidate ones where each might be equally plausible and justifiable a priori. Bayesian model selection uses probability theory to select among competing hypotheses; the key variable is the Bayesian model evidence, which provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. Bayesian model selection has not entered into mainstream use in Earth systems modeling due to the lack of general-purpose methods to reliably estimate the evidence. Here, we introduce a new method, called GAussian Mixture importancE (GAME) sampling. We demonstrate the power and usefulness of GAME sampling for hypothesis testing using benchmark experiments with known target and numerical modeling of the rainfall-runoff transformation of the Leaf River watershed (Mississippi, USA).
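To make the idea of estimating a normalizing constant from posterior samples concrete, the sketch below fits a single Gaussian to stand-in posterior draws and uses it as an importance density. This is plain importance sampling with a one-component "mixture", not the bridge-sampling estimator of the paper and not the GAME implementation in the DREAM package. The toy target, the function names, and the sample sizes are all assumptions chosen so that the true answer, sqrt(2*pi), is known.

```python
import math
import random
import statistics


def unnormalized_posterior(theta):
    """Toy target exp(-theta^2 / 2); its normalizing constant is sqrt(2*pi)."""
    return math.exp(-0.5 * theta * theta)


def normal_pdf(x, mu, sigma):
    """Density of a univariate normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def evidence_by_importance_sampling(n_draws=20000, seed=0):
    """Estimate the normalizing constant of the toy target."""
    rng = random.Random(seed)
    # Stand-in for MCMC output (assumption): exact draws from the toy posterior.
    posterior_samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    # Fit the importance density to the posterior samples.
    mu = statistics.fmean(posterior_samples)
    sigma = statistics.stdev(posterior_samples)
    # Average the importance weights target / importance density.
    total = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(mu, sigma)
        total += unnormalized_posterior(theta) / normal_pdf(theta, mu, sigma)
    return total / n_draws


if __name__ == "__main__":
    print("estimate:", evidence_by_importance_sampling())
    print("exact:   ", math.sqrt(2.0 * math.pi))
```

Because the fitted Gaussian nearly matches the target here, the weights are almost constant and the estimate is very stable; real posteriors are rarely this well behaved, which is the motivation for mixture fits and bridge sampling.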

FDCFIT: A MATLAB Toolbox of Closed-form Parametric Expressions of the Flow Duration Curve

The flow duration curve (FDC) is a signature catchment characteristic that depicts graphically the relationship between the exceedance probability of streamflow and its magnitude. This curve is relatively easy to create and interpret, and is used widely for hydrologic analysis, water quality management, and the design of hydroelectric power plants (among others). Several mathematical formulations have been proposed to mimic the FDC. Yet, these efforts have not been particularly successful, in large part because classical functions are not flexible enough to portray accurately the functional shape of the FDC for a large range of catchments and contrasting hydrologic behaviors. In a recent paper, Sadegh et al. (2015) introduced several commonly used models of the soil water characteristic as a new class of closed-form parametric expressions for the flow duration curve. These soil water retention functions are relatively simple to use, contain between two and five parameters, and mimic closely the empirical FDCs of watersheds. Here, we present a simple MATLAB toolbox for the fitting of FDCs. This toolbox, called FDC-FIT, implements the different expressions introduced by Sadegh et al. (2015) and returns the optimized values of the coefficients of each model, along with graphical output of the fit. This toolbox is particularly useful for diagnostic model evaluation (Vrugt and Sadegh, 2013), as the optimized coefficients can be used as summary metrics. Two different case studies are used to illustrate
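The empirical FDC that such parametric expressions are fitted to can be built in a few lines. The sketch below is in Python rather than MATLAB and is not part of the FDC-FIT toolbox; the function name, the made-up flow values, and the choice of the Weibull plotting position rank / (n + 1) are assumptions.

```python
def empirical_fdc(flows):
    """Return (exceedance_probability, flow) pairs, largest flow first."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    # Weibull plotting position (assumed convention): rank / (n + 1).
    return [(rank / (n + 1), q) for rank, q in enumerate(ordered, start=1)]


if __name__ == "__main__":
    daily_flows = [3.0, 12.5, 0.8, 7.1, 1.9]  # made-up streamflow values
    for p, q in empirical_fdc(daily_flows):
        print(f"{p:.3f}  {q:.1f}")
```

The largest flow receives the smallest exceedance probability, so the curve decreases monotonically from high flows to low flows.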

The stationarity paradigm revisited: Hypothesis testing using diagnostics, summary metrics, and DREAM(ABC)

Water Resources Research, Nov 1, 2015

Many watershed models used within the hydrologic research community assume (by default) stationary conditions, that is, the key watershed properties that control water flow are considered to be time invariant. This assumption is rather convenient and pragmatic and opens up the wide arsenal of (multivariate) statistical and nonlinear optimization methods for inference of the (temporally fixed) model parameters. Several contributions to the hydrologic literature have brought into question the continued usefulness of this stationary paradigm for hydrologic modeling. This paper builds on the likelihood-free diagnostics approach of Vrugt and Sadegh (2013) and uses a diverse set of hydrologic summary metrics to test the stationary hypothesis and detect changes in the watershed's response to hydroclimatic forcing. Models with fixed parameter values cannot simulate adequately temporal variations in the summary statistics of the observed catchment data, and consequently, the DREAM(ABC) algorithm cannot find solutions that sufficiently honor the observed metrics. We demonstrate that the presented methodology is able to differentiate successfully between watersheds that are classified as stationary and those that have undergone significant changes in land use, urbanization, and/or hydroclimatic conditions, and thus are deemed nonstationary.

Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling

Advances in Water Resources, Apr 1, 2008

Interactive comment on “A novel approach to parameter uncertainty analysis of hydrological models using neural networks” by D

Furthermore, the paper lacks a profound discussion: in the current version, the results and discussion part is extremely short, and I believe that a more in-depth discussion on when the model is making the largest or smallest errors could be given (this is not summarized in one sentence). In the whole discussion part (and basically the setup of the paper) different ANNs could have been compared, where each ANN is based on different input data. This is currently only restricted to Rt−9a, Qt−1 and ∆Qt−1. Maybe other input variables may have been a better choice (this may be learned from an in-depth study of the results of the proposed ANN: where did it really get off, and what variables may have reduced this error?).

Dual-domain mixing cell modelling and uncertainty analysis for unsaturated bromide and chloride transport

Chan, F., Marinova, D. and Anderssen, R.S. (eds) MODSIM2011, 19th International Congress on Modelling and Simulation, 2011

Bayesian analysis of the impact of rainfall data product on simulated slope failure for North Carolina locations

Computational Geosciences, 2019

In the past decades, many different approaches have been developed in the literature to quantify the load-carrying capacity and geotechnical stability (or the factor of safety, $F_s$) of variably saturated hillslopes. Much of this work has focused on a deterministic characterization of hillslope stability. Yet, simulated $F_s$ values are subject to considerable uncertainty due to our inability to characterize accurately the soil mantle's properties (hydraulic, geotechnical, and geomorphologic) and spatiotemporal variability of the moisture content of the hillslope interior. This is particularly true at larger spatial scales. Thus, uncertainty-incorporating analyses of physically based models of rain-induced landslides are rare in the literature. Such landslide modeling is typically conducted at the hillslope scale using gauge-based rainfall forcing data with rather poor spatiotemporal coverage. For regional landslide modeling, the specific advantages and/or disadvantages of gauge-only, radar-merged and satellite-based rainfall products are not clearly established. Here, we compare and evaluate the performance of the Transient Rainfall Infiltration and Grid-based Regional Slope-stability analysis (TRIGRS) model for three different rainfall products using 112 observed landslides in the period between 2004 and 2011 from the North Carolina Geological Survey database. Our study includes the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis Version 7 (TMPA V7), the North American Land Data Assimilation System Phase 2 (NLDAS-2) analysis, and the reference "truth" Stage IV precipitation. TRIGRS model performance was rather inferior with the use of literature values of the geotechnical parameters and soil hydraulic properties from ROSETTA using soil textural and bulk density data from SSURGO (Soil Survey Geographic database). The performance of TRIGRS improved considerably after Bayesian estimation of the parameters with the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm using Stage IV precipitation data. Hereto, we use a likelihood function that combines binary slope failure information from landslide event and "null" periods using multivariate frequency distribution-based metrics such as the false discovery and false omission rates. Our results demonstrate that the Stage IV-inferred TRIGRS parameter distributions generalize well to TMPA and NLDAS-2 precipitation data, particularly at sites with considerably larger TMPA and NLDAS-2 rainfall amounts during landslide events than null periods. TRIGRS model performance is then rather similar for all three rainfall products. At higher elevations, however, the TMPA and NLDAS-2 precipitation volumes are insufficient and their performance with the Stage IV-derived parameter distributions indicates their inability to accurately characterize hillslope stability.
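The two binary metrics named in the abstract have standard definitions: the false discovery rate is FP / (FP + TP) and the false omission rate is FN / (FN + TN). The sketch below computes both from observed and simulated failure flags. How the paper combines them into a likelihood function is not described in the abstract and is not attempted here; the function name and the example data are assumptions.

```python
def discovery_and_omission_rates(observed, predicted):
    """Return (false discovery rate, false omission rate) for binary outcomes.

    observed, predicted: sequences of booleans (True = slope failure).
    A rate is None when its denominator is zero.
    """
    tp = sum(1 for o, p in zip(observed, predicted) if o and p)
    fp = sum(1 for o, p in zip(observed, predicted) if not o and p)
    fn = sum(1 for o, p in zip(observed, predicted) if o and not p)
    tn = sum(1 for o, p in zip(observed, predicted) if not o and not p)
    fdr = fp / (fp + tp) if (fp + tp) > 0 else None
    f_or = fn / (fn + tn) if (fn + tn) > 0 else None
    return fdr, f_or


if __name__ == "__main__":
    # Made-up flags: landslide events (True) and "null" periods (False).
    observed = [True, True, False, False, False, True]
    predicted = [True, False, True, False, False, True]
    print(discovery_and_omission_rates(observed, predicted))
```

In this example there are two true positives, one false positive, one false negative, and two true negatives, so both rates equal 1/3.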

The role of uncertainty in bedrock depth and hydraulic properties on the stability of a variably-saturated slope

Computers and Geotechnics, 2017

We investigate the effect of uncertainty in bedrock depth and soil hydraulic parameters on the stability of a variably-saturated slope in Rio de Janeiro, Brazil. We couple Monte Carlo simulation of a three-dimensional flow model with numerical limit analysis to calculate confidence intervals of the safety factor using a 22-day rainfall record. We evaluate the marginal and joint impact of bedrock depth and soil hydraulic uncertainty. The mean safety factor and its 95% confidence interval evolve rapidly in response to the storm events. Explicit recognition of uncertainty in the hydraulic properties and depth to bedrock increases significantly the probability of failure.

Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling

International Journal of Nonlinear Sciences and Numerical Simulation, 2009

Markov chain Monte Carlo (MCMC) methods have found widespread use in many fields of study to estimate the average properties of complex systems, and for posterior inference in a Bayesian framework. Existing theory and experiments prove convergence of well-constructed MCMC schemes to the appropriate limiting distribution under a variety of different conditions. In practice, however, this convergence is often observed to be disturbingly slow. This is frequently caused by an inappropriate selection of the proposal distribution used to generate trial moves in the Markov Chain. Here we show that significant improvements to the efficiency of MCMC simulation can be made by using a self-adaptive Differential Evolution learning strategy within a population-based evolutionary framework. This scheme, entitled Differential Evolution Adaptive Metropolis or DREAM, runs multiple different chains simultaneously for global exploration, and automatically tunes the scale and orientation of the proposal distribution in randomized subspaces during the search. Ergodicity of the algorithm is proved, and various examples involving nonlinearity, high-dimensionality, and multimodality show that DREAM is generally superior to other adaptive MCMC sampling approaches. The DREAM scheme significantly enhances the applicability of MCMC simulation to complex, multi-modal search problems.
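The core move that DREAM builds on, a proposal formed from the difference of two other chains, can be sketched with the simpler Differential Evolution Markov Chain (DE-MC) scheme. The code below is not DREAM: it omits randomized subspace sampling, adaptive crossover, and outlier-chain handling. The target density, the number of chains, the jump rate 2.38 / sqrt(2d), and the noise scale are assumptions chosen for illustration.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a standard bivariate normal."""
    return -0.5 * sum(v * v for v in x)


def de_mc(log_p, n_chains=8, dim=2, n_gen=2000, seed=1):
    """Simplified DE-MC sampler; returns the visited states of all chains."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2 * dim)  # commonly used default jump rate
    chains = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
              for _ in range(n_chains)]
    logp = [log_p(x) for x in chains]
    samples = []
    for _ in range(n_gen):
        for i in range(n_chains):
            # Two distinct other chains define the jump direction and scale.
            others = [j for j in range(n_chains) if j != i]
            r1, r2 = rng.sample(others, 2)
            proposal = [chains[i][k]
                        + gamma * (chains[r1][k] - chains[r2][k])
                        + rng.gauss(0.0, 1e-6)
                        for k in range(dim)]
            lp = log_p(proposal)
            # Metropolis acceptance step.
            if rng.random() < math.exp(min(0.0, lp - logp[i])):
                chains[i], logp[i] = proposal, lp
            samples.append(list(chains[i]))
    return samples


if __name__ == "__main__":
    draws = de_mc(log_target)
    kept = draws[len(draws) // 2:]  # discard the first half as burn-in
    print("mean of first coordinate:", sum(s[0] for s in kept) / len(kept))
```

Because the jump is built from the current spread of the population, the proposal scale and orientation adapt to the target without a hand-tuned covariance matrix.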

On the value of soil moisture measurements in vadose zone hydrology: A review

Water Resources Research, 2008

We explore and review the value of soil moisture measurements in vadose zone hydrology with a focus on the field and catchment scales. This review is motivated by the increasing ability to measure soil moisture with unprecedented spatial and temporal resolution across scales. We highlight and review the state of the art in using soil moisture measurements for (1) estimation of soil hydraulic properties, (2) quantification of water and energy fluxes, and (3) retrieval of spatial and temporal dynamics of soil moisture profiles. We argue for the urgent need to have access to field monitoring sites and databases that include detailed information about variability of hydrological fluxes and parameters, including their upscaled values. In addition, improved data assimilation methods are needed that fully exploit the information contained in soil moisture data. The development of novel upscaling methods for predicting effective moisture fluxes and disaggregation schemes toward integrating ...

Bayesian Inference of Tree Water Relations Using a Soil-Tree-Atmosphere Continuum Model

Procedia Environmental Sciences, 2013

To better understand root-soil water interactions, a mature white fir (Abies concolor) and the surrounding root zone were continuously monitored (sap flow, canopy stem water potential, soil moisture, and temperature), to characterize tree hydrodynamics. We present a hydrodynamic flow model, simulating unsaturated flow in the soil and tree with stress functions controlling spatially distributed root water uptake and canopy transpiration. Using the van Genuchten functions, we parameterize the effective retention and unsaturated hydraulic conductivity functions of the tree sapwood and soil, soil and canopy stress functions, and radial root zone distribution. To parameterize the in-situ tree water relationships, we combine a numerical model with observational data in an optimization framework, minimizing residuals between simulated and measured observational data of soil and tree canopy. Using the MCMC method, the HYDRUS model is run in an iterative process that adjusts parameters until residuals are minimized. Using these optimized parameters, the HYDRUS model simulates diurnal tree water potential and sap flow as a function of tree height, in addition to spatially distributed changes in soil water storage and soil water potential.

Accuracy of frequency domain analysis scenarios for the determination of complex dielectric permittivity

Water Resources Research, 2004

Frequency domain analysis of time domain reflectometry waveforms has been shown to be useful for more accurate water content determination, water content determination in saline soils, and determination of such difficult to measure soil properties as specific surface area and soil solution conductivity. Earlier frequency domain analysis approaches to determine frequency‐dependent dielectric properties of soils have used a variety of methods. In this paper, these methods for the determination of dielectric permittivity were compared using the Shuffled Complex Evolution Metropolis algorithm (SCEM‐UA). SCEM‐UA is a global optimization method that allows the simultaneous determination of optimal Debye parameters, which describe the dielectric permittivity as a function of frequency, and their confidence intervals. The analysis of numerically generated measurements with added instrumental noise showed that analysis of network analyzer measurements in the frequency domain potentially has ...
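For readers unfamiliar with Debye parameters, the single-relaxation Debye model gives the complex permittivity as eps_inf + (eps_static - eps_inf) / (1 + i f / f_relax). The sketch below evaluates that expression. It leaves out the electrical conductivity term that is often added for soils, and the exact parameterization used in the paper is not given in the abstract, so the function name and the example values (loosely resembling water) are assumptions.

```python
def debye_permittivity(freq_hz, eps_static, eps_inf, f_relax_hz):
    """Complex relative permittivity from a single-relaxation Debye model.

    The imaginary part is negative, following the eps' - i*eps'' convention.
    """
    return eps_inf + (eps_static - eps_inf) / (1 + 1j * freq_hz / f_relax_hz)


if __name__ == "__main__":
    # Illustrative values only (assumed), loosely resembling liquid water.
    for f in (1e8, 1e9, 1.7e10):
        eps = debye_permittivity(f, eps_static=80.0, eps_inf=5.0,
                                 f_relax_hz=1.7e10)
        print(f"{f:.1e} Hz  real={eps.real:.2f}  imag={eps.imag:.2f}")
```

At frequencies far below the relaxation frequency the permittivity approaches its static value, and at the relaxation frequency the real part lies halfway between the static and high-frequency limits.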

Framework for Understanding Structural Errors (FUSE): A modular framework to diagnose differences between hydrological models

Water Resources Research, 2008

The problems of identifying the most appropriate model structure for a given problem and quantifying the uncertainty in model structure remain outstanding research challenges for the discipline of hydrology. Progress on these problems requires understanding of the nature of differences between models. This paper presents a methodology to diagnose differences in hydrological model structures: the Framework for Understanding Structural Errors (FUSE). FUSE was used to construct 79 unique model structures by combining components of 4 existing hydrological models. These new models were used to simulate streamflow in two of the basins used in the Model Parameter Estimation Experiment (MOPEX): the Guadalupe River (Texas) and the French Broad River (North Carolina). Results show that the new models produced simulations of streamflow that were at least as good as the simulations produced by the models that participated in the MOPEX experiment. Our initial application of the FUSE method for the Guadalupe River exposed relationships between model structure and model performance, suggesting that the choice of model structure is just as important as the choice of model parameters. However, further work is needed to evaluate model simulations using multiple criteria to diagnose the relative importance of model structural differences in various climate regimes and to assess the amount of independent information in each of the models. This work will be crucial to both identifying the most appropriate model structure for a given problem and quantifying the uncertainty in model structure. To facilitate research on these problems, the FORTRAN-90 source code for FUSE is available upon request from the lead author.

Vadose Zone Model–Data Fusion: State of the Art and Future Challenges

Vadose Zone Journal, 2012

Models are quantitative formulations of assumptions regarding key physical processes, their mathematical representations, and site‐specific relevant properties at a particular scale of analysis. Models are fused with data in a two‐way process that uses information contained in observational data to refine models and the context provided by models to improve information extraction from observational data. This process of model–data fusion leads to improved understanding of hydrological processes by providing improved estimates of parameters, fluxes, and states of the vadose zone system of interest, as well as of the associated uncertainties of these values. Notwithstanding recent progress, there are still numerous challenges associated with model–data fusion, including: (i) dealing with the increasing complexity of models, (ii) considering new and typically indirect measurements, and (iii) quantifying uncertainty. This special section presents nine contributions that address the stat...

Introduction to the Special Section in Vadose Zone Journal: Parameter Identification and Uncertainty Assessment in the Unsaturated Zone

Vadose Zone Journal, 2006

Soil Hydraulic Functions Determined from Measurements of Air Permeability, Capillary Modeling, and High‐Dimensional Parameter Estimation

Vadose Zone Journal, 2011

Prediction of flow and transport through unsaturated porous media requires knowledge of the water retention and unsaturated hydraulic conductivity functions. In the past few decades many different laboratory procedures have been developed to estimate these hydraulic properties. Most of these procedures are time consuming and require significant human commitment. Furthermore, multiple measurement techniques are typically required to yield an accurate characterization of the retention and hydraulic conductivity function between full and residual saturation. We present a more efficient and robust approach to estimating the hydraulic properties of porous media. Our method derives an optimized pore‐size distribution from measurements of air permeability and using recent advances in capillary modeling and high‐dimensional parameter estimation. The section diameters of different parallel capillaries representing the pore structure of a porous medium are optimized with a multi‐algorithm opt...

Obtaining the Spatial Distribution of Water Content along a TDR Probe Using the SCEM‐UA Bayesian Inverse Modeling Scheme

Vadose Zone Journal, 2004

Time domain reflectometry (TDR) has become one of the standard methods for the measurement of the temporal and spatial distribution of water saturation in soils. Current waveform analysis methodology gives a measurement of the average water content along the length of the TDR probe. Close inspection of TDR waveforms shows that heterogeneity in water content along the probe can be seen in the TDR waveform. We present a comprehensive approach to TDR waveform analysis that gives a quantitative estimate of the dielectric permittivity profile along the length of the probe and, therefore, the distribution of water content. The approach is based on the combination of a multisection scatter function model for the TDR measurement system with the shuffled complex evolution Metropolis algorithm (SCEM‐UA). This combined approach allows for the estimation of the 40 parameters in the transmission line model using a series of simple calibration measurements. The proof of concept is given with meas...

Research paper thumbnail of Toward improved prediction of the bedrock depth underneath hillslopes: Bayesian inference of the bottom-up control hypothesis using high-resolution topographic data

Water Resources Research, Apr 1, 2016

The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion... more The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on freshbedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extent the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.

Research paper thumbnail of Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling

Water Resources Research, Jul 1, 2017

What is the "best" model? The answer to this question lies in part in the eyes of the beholder; nevertheless, a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of $K$ candidate models, $M_k$, $k = (1, \ldots, K)$, and help identify which model is most supported by the observed data, $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_n)$. Here, we introduce a new and robust estimator of the model evidence, $p(\tilde{Y} \mid M_k)$, which acts as a normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, $p(\tilde{Y} \mid M_k)$ is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of $p(\tilde{Y} \mid M_k)$ and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost, outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.

Plain Language Summary: Science is an iterative process for learning and discovery in which competing ideas about how nature works are evaluated against observations. The translation of each hypothesis to a computational model requires specification of system boundaries, inputs and outputs, state variables, physical/behavioral laws, and material properties; this is difficult and subjective, particularly in the face of incomplete knowledge of the governing spatiotemporal processes and insufficient observed data. To guard against the use of an inadequate model, statisticians advise selecting the "best" model among a set of candidate ones where each might be equally plausible and justifiable a priori. Bayesian model selection uses probability theory to select among competing hypotheses; the key variable is the Bayesian model evidence, which provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. Bayesian model selection has not entered into mainstream use in Earth systems modeling due to the lack of general-purpose methods to reliably estimate the evidence. Here, we introduce a new method, called GAussian Mixture importancE (GAME) sampling. We demonstrate the power and usefulness of GAME for hypothesis testing using benchmark experiments with known target and numerical modeling of the rainfall-runoff transformation of the Leaf River watershed (Mississippi, USA).
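
To make the idea of estimating the evidence from posterior samples concrete, here is a minimal sketch. It is not the GAME sampler: it swaps in plain importance sampling with a single Gaussian (rather than bridge sampling with a Gaussian mixture), and the toy target, the function names, and the covariance inflation factor are all assumptions chosen for illustration. The "posterior samples" are drawn directly from the toy target as a stand-in for MCMC output.

```python
import numpy as np

def log_unnormalized_target(x):
    """Toy unnormalized posterior exp(-0.5 * x.x); its evidence is (2*pi)^(d/2)."""
    return -0.5 * np.sum(x**2, axis=1)

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, evaluated row-wise on x."""
    d = mean.size
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (x - mean).T)          # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z**2, axis=0))

def estimate_log_evidence(posterior_samples, log_target, n_draws=20000,
                          inflate=1.5, seed=0):
    """Importance-sampling estimate of the log evidence (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Fit the importance density to the posterior samples; inflating the
    # covariance gives heavier tails than the target (an assumption, not GAME).
    mean = posterior_samples.mean(axis=0)
    cov = inflate * np.cov(posterior_samples, rowvar=False)
    draws = rng.multivariate_normal(mean, cov, size=n_draws)
    log_w = log_target(draws) - gaussian_logpdf(draws, mean, cov)
    # Average the weights in log space for numerical stability.
    shift = log_w.max()
    return shift + np.log(np.mean(np.exp(log_w - shift)))

rng = np.random.default_rng(1)
samples = rng.standard_normal((5000, 2))             # stand-in for MCMC output
log_z_hat = estimate_log_evidence(samples, log_unnormalized_target)
log_z_true = np.log(2.0 * np.pi)                     # (2*pi)^(d/2) with d = 2
print(log_z_hat, log_z_true)
```

A single Gaussian only works when the posterior is close to unimodal and elliptical; the mixture fit described in the abstract is what would extend the approach to more complex targets.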

Research paper thumbnail of FDCFIT: A MATLAB Toolbox of Closed-form Parametric Expressions of the Flow Duration Curve

The flow duration curve (FDC) is a signature catchment characteristic that depicts graphically the relationship between the exceedance probability of streamflow and its magnitude. This curve is relatively easy to create and interpret, and is used widely for hydrologic analysis, water quality management, and the design of hydroelectric power plants (among others). Several mathematical formulations have been proposed to mimic the FDC. Yet, these efforts have not been particularly successful, in large part because classical functions are not flexible enough to portray accurately the functional shape of the FDC for a large range of catchments and contrasting hydrologic behaviors. In a recent paper, Sadegh et al. (2015) introduced several commonly used models of the soil water characteristic as a new class of closed-form parametric expressions for the flow duration curve. These soil water retention functions are relatively simple to use, contain between two and five parameters, and mimic closely the empirical FDCs of watersheds. Here, we present a simple MATLAB toolbox for the fitting of FDCs. This toolbox, called FDC-FIT, implements the different expressions introduced by Sadegh et al. (2015) and returns the optimized values of the coefficients of each model, along with graphical output of the fit. This toolbox is particularly useful for diagnostic model evaluation (Vrugt and Sadegh, 2013), as the optimized coefficients can be used as summary metrics. Two different case studies are used to illustrate
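
As a small illustration of the object being fitted, the sketch below computes an empirical FDC from a streamflow record. It is not part of the FDCFIT toolbox; the function name, the Weibull plotting position i/(n+1), and the sample flows are assumptions made for this example.

```python
import numpy as np

def empirical_fdc(flows):
    """Return (exceedance probability, flow) pairs of an empirical FDC.

    Flows are ranked from largest to smallest; the i-th ranked flow is
    assigned the exceedance probability i / (n + 1) (Weibull plotting
    position, one of several common choices).
    """
    q = np.sort(np.asarray(flows, dtype=float))[::-1]
    n = q.size
    p = np.arange(1, n + 1) / (n + 1.0)
    return p, q

# Hypothetical daily streamflow values (arbitrary units).
p_exc, q_sorted = empirical_fdc([3.0, 1.0, 2.0])
print(p_exc, q_sorted)
```

A closed-form parametric expression would then be fitted to these (probability, flow) pairs, and its optimized coefficients used as summary metrics.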

Research paper thumbnail of The stationarity paradigm revisited: Hypothesis testing using diagnostics, summary metrics, and DREAM<sub>(ABC)</sub>

Water Resources Research, Nov 1, 2015

Many watershed models used within the hydrologic research community assume (by default) stationary conditions, that is, the key watershed properties that control water flow are considered to be time invariant. This assumption is rather convenient and pragmatic and opens up the wide arsenal of (multivariate) statistical and nonlinear optimization methods for inference of the (temporally fixed) model parameters. Several contributions to the hydrologic literature have brought into question the continued usefulness of this stationary paradigm for hydrologic modeling. This paper builds on the likelihood-free diagnostics approach of Vrugt and Sadegh (2013) and uses a diverse set of hydrologic summary metrics to test the stationary hypothesis and detect changes in the watershed's response to hydroclimatic forcing. Models with fixed parameter values cannot simulate adequately temporal variations in the summary statistics of the observed catchment data, and consequently, the DREAM<sub>(ABC)</sub> algorithm cannot find solutions that sufficiently honor the observed metrics. We demonstrate that the presented methodology is able to differentiate successfully between watersheds that are classified as stationary and those that have undergone significant changes in land use, urbanization, and/or hydroclimatic conditions, and thus are deemed nonstationary.

Research paper thumbnail of Toward improved prediction of the bedrock depth underneath hillslopes: Bayesian inference of the bottom‐up control hypothesis using high‐resolution topographic data

Water Resources Research, 2016

The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high‐resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom‐up control on fresh‐bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock‐valley morphology term to extend the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very ...

Research paper thumbnail of Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling

Advances in Water Resources, Apr 1, 2008

Research paper thumbnail of Interactive comment on “A novel approach to parameter uncertainty analysis of hydrological models using neural networks” by D

Furthermore, the paper lacks a profound discussion: in the current version, the results and discussion part is extremely short, and I believe that a more in-depth discussion of when the model makes its largest or smallest errors could be given (this is not summarized in one sentence). In the whole discussion part (and basically the setup of the paper), different ANNs could have been compared, where each ANN is based on different input data. This is currently restricted to Rt−9a, Qt−1 and ∆Qt−1. Other input variables may have been a better choice (this may be learned from an in-depth study of the results of the proposed ANN: where did it really go astray, and what variables might have reduced this error?).

Research paper thumbnail of Dual-domain mixing cell modelling and uncertainty analysis for unsaturated bromide and chloride transport

Chan, F., Marinova, D. and Anderssen, R.S. (eds) MODSIM2011, 19th International Congress on Modelling and Simulation., 2011

Research paper thumbnail of Bayesian analysis of the impact of rainfall data product on simulated slope failure for North Carolina locations

Computational Geosciences, 2019

In the past decades, many different approaches have been developed in the literature to quantify the load-carrying capacity and geotechnical stability (or the factor of safety, $F_s$) of variably saturated hillslopes. Much of this work has focused on a deterministic characterization of hillslope stability. Yet, simulated $F_s$ values are subject to considerable uncertainty due to our inability to characterize accurately the soil mantle's properties (hydraulic, geotechnical, and geomorphologic) and spatiotemporal variability of the moisture content of the hillslope interior. This is particularly true at larger spatial scales. Thus, uncertainty-incorporating analyses of physically based models of rain-induced landslides are rare in the literature. Such landslide modeling is typically conducted at the hillslope scale using gauge-based rainfall forcing data with rather poor spatiotemporal coverage. For regional landslide modeling, the specific advantages and/or disadvantages of gauge-only, radar-merged and satellite-based rainfall products are not clearly established. Here, we compare and evaluate the performance of the Transient Rainfall Infiltration and Grid-based Regional Slope-stability analysis (TRIGRS) model for three different rainfall products using 112 observed landslides in the period between 2004 and 2011 from the North Carolina Geological Survey database. Our study includes the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis Version 7 (TMPA V7), the North American Land Data Assimilation System Phase 2 (NLDAS-2) analysis, and the reference "truth" Stage IV precipitation. TRIGRS model performance was rather inferior with the use of literature values of the geotechnical parameters and soil hydraulic properties from ROSETTA using soil textural and bulk density data from SSURGO (Soil Survey Geographic database). The performance of TRIGRS improved considerably after Bayesian estimation of the parameters with the DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm using Stage IV precipitation data. To this end, we use a likelihood function that combines binary slope failure information from landslide event and "null" periods using multivariate frequency distribution-based metrics such as the false discovery and false omission rates. Our results demonstrate that the Stage IV-inferred TRIGRS parameter distributions generalize well to TMPA and NLDAS-2 precipitation data, particularly at sites with considerably larger TMPA and NLDAS-2 rainfall amounts during landslide events than null periods. TRIGRS model performance is then rather similar for all three rainfall products. At higher elevations, however, the TMPA and NLDAS-2 precipitation volumes are insufficient and their performance with the Stage IV-derived parameter distributions indicates their inability to accurately characterize hillslope stability.
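
The two metrics named above have standard definitions for binary outcomes, which the following sketch computes. It illustrates only the metrics, not the likelihood function of the paper; the function name and the example arrays (1 = slope failure, 0 = stable) are hypothetical.

```python
import numpy as np

def false_discovery_and_omission(observed, predicted):
    """Return (FDR, FOR) for binary failure observations and predictions.

    FDR = FP / (FP + TP): share of predicted failures that did not occur.
    FOR = FN / (FN + TN): share of predicted stable cases that did fail.
    """
    obs = np.asarray(observed, dtype=bool)
    pred = np.asarray(predicted, dtype=bool)
    tp = np.sum(pred & obs)
    fp = np.sum(pred & ~obs)
    fn = np.sum(~pred & obs)
    tn = np.sum(~pred & ~obs)
    fdr = fp / (fp + tp) if (fp + tp) > 0 else float("nan")
    fomr = fn / (fn + tn) if (fn + tn) > 0 else float("nan")
    return fdr, fomr

# Hypothetical data: three landslide events and three "null" periods.
observed = [1, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1]
fdr, fomr = false_discovery_and_omission(observed, predicted)
print(fdr, fomr)
```

Both rates are conditioned on the model's prediction, so they remain informative when failures are rare compared with null periods.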

Research paper thumbnail of The role of uncertainty in bedrock depth and hydraulic properties on the stability of a variably-saturated slope

Computers and Geotechnics, 2017

We investigate the effect of uncertainty in bedrock depth and soil hydraulic parameters on the stability of a variably-saturated slope in Rio de Janeiro, Brazil. We couple Monte Carlo simulation of a three-dimensional flow model with numerical limit analysis to calculate confidence intervals of the safety factor using a 22-day rainfall record. We evaluate the marginal and joint impact of bedrock depth and soil hydraulic uncertainty. The mean safety factor and its 95% confidence interval evolve rapidly in response to the storm events. Explicit recognition of uncertainty in the hydraulic properties and depth to bedrock increases significantly the probability of failure.

Research paper thumbnail of Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling

International Journal of Nonlinear Sciences and Numerical Simulation, 2009

Markov chain Monte Carlo (MCMC) methods have found widespread use in many fields of study to estimate the average properties of complex systems, and for posterior inference in a Bayesian framework. Existing theory and experiments prove convergence of well-constructed MCMC schemes to the appropriate limiting distribution under a variety of different conditions. In practice, however, this convergence is often observed to be disturbingly slow. This is frequently caused by an inappropriate selection of the proposal distribution used to generate trial moves in the Markov chain. Here we show that significant improvements to the efficiency of MCMC simulation can be made by using a self-adaptive Differential Evolution learning strategy within a population-based evolutionary framework. This scheme, entitled Differential Evolution Adaptive Metropolis or DREAM, runs multiple different chains simultaneously for global exploration, and automatically tunes the scale and orientation of the proposal distribution in randomized subspaces during the search. Ergodicity of the algorithm is proved, and various examples involving nonlinearity, high-dimensionality, and multimodality show that DREAM is generally superior to other adaptive MCMC sampling approaches. The DREAM scheme significantly enhances the applicability of MCMC simulation to complex, multi-modal search problems.
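
The core idea of proposing moves from differences between parallel chains can be sketched in a few lines. The code below is a deliberately simplified differential evolution Markov chain sampler, not DREAM itself: it omits the randomized subspace sampling, crossover adaptation, and outlier handling described above. The toy target, the function names, and all settings (number of chains, jump rate 2.38/sqrt(2d), jitter scale) are assumptions for illustration.

```python
import numpy as np

def log_target(x):
    """Log-density (up to a constant) of a standard 2-D Gaussian toy target."""
    return -0.5 * float(np.sum(x**2))

def de_mc(log_target, n_chains=8, n_dim=2, n_steps=2000, seed=0):
    """Simplified differential evolution MCMC (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    gamma = 2.38 / np.sqrt(2.0 * n_dim)                 # common default jump rate
    x = rng.normal(scale=3.0, size=(n_chains, n_dim))   # overdispersed start
    logp = np.array([log_target(xi) for xi in x])
    history = np.empty((n_steps, n_chains, n_dim))
    for t in range(n_steps):
        for i in range(n_chains):
            # Pick two other chains; their difference sets the scale and
            # orientation of the jump for chain i.
            others = [j for j in range(n_chains) if j != i]
            a, b = rng.choice(others, size=2, replace=False)
            jitter = rng.normal(scale=1e-3, size=n_dim)
            proposal = x[i] + gamma * (x[a] - x[b]) + jitter
            logp_prop = log_target(proposal)
            # Metropolis acceptance rule (the proposal is symmetric).
            if np.log(rng.uniform()) < logp_prop - logp[i]:
                x[i] = proposal
                logp[i] = logp_prop
        history[t] = x                                  # store current population
    return history

history = de_mc(log_target)
samples = history[500:].reshape(-1, 2)                  # discard burn-in
print(samples.mean(axis=0), samples.var(axis=0))
```

Because the jump is built from the current spread of the population, the proposal adapts to the target's scale and correlation without a hand-tuned covariance matrix.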

Research paper thumbnail of On the value of soil moisture measurements in vadose zone hydrology: A review

Water Resources Research, 2008

We explore and review the value of soil moisture measurements in vadose zone hydrology with a focus on the field and catchment scales. This review is motivated by the increasing ability to measure soil moisture with unprecedented spatial and temporal resolution across scales. We highlight and review the state of the art in using soil moisture measurements for (1) estimation of soil hydraulic properties, (2) quantification of water and energy fluxes, and (3) retrieval of spatial and temporal dynamics of soil moisture profiles. We argue for the urgent need to have access to field monitoring sites and databases that include detailed information about variability of hydrological fluxes and parameters, including their upscaled values. In addition, improved data assimilation methods are needed that fully exploit the information contained in soil moisture data. The development of novel upscaling methods for predicting effective moisture fluxes and disaggregation schemes toward integrating ...

Research paper thumbnail of Bayesian Inference of Tree Water Relations Using a Soil-Tree-Atmosphere Continuum Model

Procedia Environmental Sciences, 2013

To better understand root-soil water interactions, a mature white fir (Abies concolor) and the surrounding root zone were continuously monitored (sap flow, canopy stem water potential, soil moisture, and temperature), to characterize tree hydrodynamics. We present a hydrodynamic flow model, simulating unsaturated flow in the soil and tree with stress functions controlling spatially distributed root water uptake and canopy transpiration. Using the van Genuchten functions, we parameterize the effective retention and unsaturated hydraulic conductivity functions of the tree sapwood and soil, soil and canopy stress functions, and radial root zone distribution. To parameterize the in-situ tree water relationships, we combine a numerical model with observational data in an optimization framework, minimizing residuals between simulated and measured observational data of soil and tree canopy. Using the MCMC method, the HYDRUS model is run in an iterative process that adjusts parameters until residuals are minimized. Using these optimized parameters, the HYDRUS model simulates diurnal tree water potential and sap flow as a function of tree height, in addition to spatially distributed changes in soil water storage and soil water potential.
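
For readers unfamiliar with the van Genuchten functions mentioned above, the sketch below evaluates the standard van Genuchten (1980) retention curve, with the common restriction m = 1 - 1/n. The function name and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Volumetric water content at pressure head h (van Genuchten, 1980).

    theta(h) = theta_r + (theta_s - theta_r) / [1 + (alpha*|h|)^n]^m,
    with m = 1 - 1/n. Units of alpha must be the inverse of those of h.
    """
    h = np.abs(np.asarray(h, dtype=float))
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m

# Illustrative parameter values for a loam-like soil (assumed, h in cm).
heads_cm = np.array([0.0, -10.0, -100.0, -1000.0, -15000.0])
theta = van_genuchten_theta(heads_cm, theta_r=0.078, theta_s=0.43,
                            alpha=0.036, n=1.56)
print(theta)
```

In an inverse-modeling setting, parameters such as theta_r, theta_s, alpha, and n are the quantities that the optimization or MCMC procedure adjusts until simulated and measured data agree.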

Research paper thumbnail of Accuracy of frequency domain analysis scenarios for the determination of complex dielectric permittivity

Water Resources Research, 2004

Frequency domain analysis of time domain reflectometry waveforms has been shown to be useful for more accurate water content determination, water content determination in saline soils, and determination of such difficult-to-measure soil properties as specific surface area and soil solution conductivity. Earlier frequency domain analysis approaches to determine frequency‐dependent dielectric properties of soils have used a variety of methods. In this paper, these methods for the determination of dielectric permittivity were compared using the Shuffled Complex Evolution Metropolis algorithm (SCEM‐UA). SCEM‐UA is a global optimization method that allows the simultaneous determination of optimal Debye parameters, which describe the dielectric permittivity as a function of frequency, and their confidence intervals. The analysis of numerically generated measurements with added instrumental noise showed that analysis of network analyzer measurements in the frequency domain potentially has ...

Research paper thumbnail of Framework for Understanding Structural Errors (FUSE): A modular framework to diagnose differences between hydrological models

Water Resources Research, 2008

The problems of identifying the most appropriate model structure for a given problem and quantifying the uncertainty in model structure remain outstanding research challenges for the discipline of hydrology. Progress on these problems requires understanding of the nature of differences between models. This paper presents a methodology to diagnose differences in hydrological model structures: the Framework for Understanding Structural Errors (FUSE). FUSE was used to construct 79 unique model structures by combining components of 4 existing hydrological models. These new models were used to simulate streamflow in two of the basins used in the Model Parameter Estimation Experiment (MOPEX): the Guadalupe River (Texas) and the French Broad River (North Carolina). Results show that the new models produced simulations of streamflow that were at least as good as the simulations produced by the models that participated in the MOPEX experiment. Our initial application of the FUSE method for the Guadalupe River exposed relationships between model structure and model performance, suggesting that the choice of model structure is just as important as the choice of model parameters. However, further work is needed to evaluate model simulations using multiple criteria to diagnose the relative importance of model structural differences in various climate regimes and to assess the amount of independent information in each of the models. This work will be crucial to both identifying the most appropriate model structure for a given problem and quantifying the uncertainty in model structure. To facilitate research on these problems, the FORTRAN-90 source code for FUSE is available upon request from the lead author.

Research paper thumbnail of Vadose Zone Model–Data Fusion: State of the Art and Future Challenges

Vadose Zone Journal, 2012

Models are quantitative formulations of assumptions regarding key physical processes, their mathematical representations, and site‐specific relevant properties at a particular scale of analysis. Models are fused with data in a two‐way process that uses information contained in observational data to refine models and the context provided by models to improve information extraction from observational data. This process of model–data fusion leads to improved understanding of hydrological processes by providing improved estimates of parameters, fluxes, and states of the vadose zone system of interest, as well as of the associated uncertainties of these values. Notwithstanding recent progress, there are still numerous challenges associated with model–data fusion, including: (i) dealing with the increasing complexity of models, (ii) considering new and typically indirect measurements, and (iii) quantifying uncertainty. This special section presents nine contributions that address the stat...

Research paper thumbnail of Introduction to the Special Section in Vadose Zone Journal : Parameter Identification and Uncertainty Assessment in the Unsaturated Zone

Vadose Zone Journal, 2006

Research paper thumbnail of Soil Hydraulic Functions Determined from Measurements of Air Permeability, Capillary Modeling, and High‐Dimensional Parameter Estimation

Vadose Zone Journal, 2011

Prediction of flow and transport through unsaturated porous media requires knowledge of the water retention and unsaturated hydraulic conductivity functions. In the past few decades many different laboratory procedures have been developed to estimate these hydraulic properties. Most of these procedures are time consuming and require significant human commitment. Furthermore, multiple measurement techniques are typically required to yield an accurate characterization of the retention and hydraulic conductivity function between full and residual saturation. We present a more efficient and robust approach to estimating the hydraulic properties of porous media. Our method derives an optimized pore‐size distribution from measurements of air permeability, using recent advances in capillary modeling and high‐dimensional parameter estimation. The section diameters of different parallel capillaries representing the pore structure of a porous medium are optimized with a multi‐algorithm opt...

Research paper thumbnail of Obtaining the Spatial Distribution of Water Content along a TDR Probe Using the SCEM‐UA Bayesian Inverse Modeling Scheme

Vadose Zone Journal, 2004

Time domain reflectometry (TDR) has become one of the standard methods for the measurement of the temporal and spatial distribution of water saturation in soils. Current waveform analysis methodology gives a measurement of the average water content along the length of the TDR probe. Close inspection of TDR waveforms shows that heterogeneity in water content along the probe can be seen in the TDR waveform. We present a comprehensive approach to TDR waveform analysis that gives a quantitative estimate of the dielectric permittivity profile along the length of the probe and, therefore, the distribution of water content. The approach is based on the combination of a multisection scatter function model for the TDR measurement system with the shuffled complex evolution Metropolis algorithm (SCEM‐UA). This combined approach allows for the estimation of the 40 parameters in the transmission line model using a series of simple calibration measurements. The proof of concept is given with meas...