Assessment of uncertainty in computer experiments from Universal to Bayesian Kriging

Influence of parameter estimation uncertainty in Kriging: Part 2 - Test and case study applications

Hydrology and Earth System Sciences, 2001

The theoretical approach introduced in Part 1 is applied to a numerical example and to the case of yearly average precipitation estimation over the Veneto Region in Italy. The proposed methodology was used to assess the effects of parameter estimation uncertainty on Kriging estimates and on their estimated error variance. The Maximum Likelihood (ML) estimator proposed in Part 1 was applied to the zero-mean deviations from yearly average precipitation over the Veneto Region, obtained after the elimination of a non-linear drift with elevation. Three different semi-variogram models were used, namely the exponential, the Gaussian and the modified spherical, and the relevant biases as well as the increases in variance were assessed. A numerical example was also conducted to demonstrate how the procedure leads to unbiased estimates of the random functions. One hundred sets of 82 observations were generated by means of the exponential model on the basis of the parameter values identified for the Veneto Region rainfall problem and taken as characterising the true underlying process. The parameter values and the resulting cross-validation errors were estimated from each sample. The cross-validation errors were first computed in the classical way and then corrected with the procedure derived in Part 1. Both sets, original and corrected, were then tested, by means of the likelihood ratio test, against the null hypothesis of deriving from a zero-mean process with unknown covariance. The results of the experiment clearly show the effectiveness of the proposed approach.

Empirical Bayesian Kriging Implemented in ArcGIS Geostatistical Analyst

2012

Obtaining reliable environmental measurements can be costly and laborious, and in many cases, environmental contaminant samples are not collected where people live or work. The ability to predict values where observations are not available is, therefore, very important. Interpolation is the process of obtaining a value for a variable of interest at a location where data has not been observed, using data from locations where data has been collected. There are many methods for interpolating spatial data. They fall into two broad classes: deterministic and probabilistic. Deterministic methods use predefined functions of the distance between observation locations and the location for which interpolation is required (for example, inverse distance interpolation). Probabilistic methods have a foundation in statistical theory. These predictors quantify the uncertainty associated with the interpolated values. The requirement of providing information on prediction uncertainty limits the...
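As an illustration of the deterministic class mentioned in this abstract, the following is a minimal sketch of inverse distance interpolation. The function name `idw_predict` and the `power` parameter are assumptions made for this example; they are not taken from the paper or from ArcGIS Geostatistical Analyst.

```python
import math


def idw_predict(points, values, target, power=2.0):
    """Inverse distance weighted prediction at `target`.

    points : list of coordinate tuples where data were observed
    values : observed values at those points
    power  : exponent applied to the distance (hypothetical default of 2)
    """
    weights = []
    for p, v in zip(points, values):
        d = math.dist(p, target)
        if d == 0.0:
            # The target coincides with an observation: return it exactly.
            return v
        weights.append(1.0 / d ** power)
    total = sum(weights)
    # Weighted average of the observations, weights normalised to sum to 1.
    return sum(w * v for w, v in zip(weights, values)) / total


if __name__ == "__main__":
    obs_xy = [(0.0, 0.0), (2.0, 0.0)]
    obs_z = [1.0, 3.0]
    print(idw_predict(obs_xy, obs_z, (1.0, 0.0)))
```

Note that this predictor returns a value only; unlike the probabilistic methods described above, it provides no measure of prediction uncertainty.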

Confronting uncertainty in model-based geostatistics using Markov Chain Monte Carlo simulation

This paper demonstrates the use of Markov Chain Monte Carlo (MCMC) simulation for parameter inference in model-based soil geostatistics. We implemented the recently developed DiffeRential Evolution Adaptive Metropolis (DREAM) algorithm to jointly summarize the posterior distribution of variogram parameters and the coefficients of a linear spatial model, and derive estimates of predictive uncertainty. The DREAM method runs multiple different Markov chains in parallel, and jumps in each chain are generated from a discrete proposal distribution containing a fixed multiple of the difference of the states of randomly chosen pairs of other chains. This approach automatically tunes the scale and orientation of the proposal distribution, and is especially designed to maintain detailed balance and ergodicity, thereby generating an exact approximation of the posterior probability density function (pdf) of the parameters of the linear model and variogram. This approach is tested using three different data sets from Australia involving variogram estimation of soil thickness, kriging of soil pH, and spatial prediction of soil organic carbon content. The results showed some advantages of MCMC over the conventional method of moments and residual maximum likelihood (REML) estimation. The posterior pdf derived with MCMC conveys important information about parameter uncertainty, multi-dimensional parameter correlation, and thus how many significant parameters are warranted by the calibration data. Parameter uncertainty constitutes only a small part of total prediction uncertainty for the case studies considered here. The prediction accuracies using MCMC and REML are similar. The variogram estimated using conventional approaches (method of moments, and without simulation) lies within the 95% prediction uncertainty interval of the posterior distribution derived with DREAM. Altogether our results show that conventional kriging and regression-kriging still remain a viable option for production mapping.
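The jump described in this abstract, a fixed multiple of the difference between the states of two randomly chosen other chains, can be sketched as follows. This is a simplified differential-evolution proposal only, not the full DREAM algorithm (it omits the acceptance step and the algorithm's other refinements). The function name `de_proposal` and the parameters `gamma` and `noise_scale` are assumptions made for this example.

```python
import random


def de_proposal(chains, i, gamma, rng, noise_scale=0.0):
    """Propose a new state for chain `i` from the other chains.

    chains      : list of current chain states, each a list of floats
    gamma       : fixed multiple applied to the difference vector (assumed name)
    rng         : a random.Random instance
    noise_scale : optional std. dev. of a small Gaussian perturbation (assumed)
    """
    others = [k for k in range(len(chains)) if k != i]
    # Pick two distinct chains other than chain i.
    r1, r2 = rng.sample(others, 2)
    proposal = []
    for x, a, b in zip(chains[i], chains[r1], chains[r2]):
        step = gamma * (a - b)
        if noise_scale > 0.0:
            step += rng.gauss(0.0, noise_scale)
        proposal.append(x + step)
    return proposal


if __name__ == "__main__":
    states = [[0.0, 0.0], [1.0, 2.0], [3.0, 5.0]]
    print(de_proposal(states, 0, 0.5, random.Random(0)))
```

Because the difference vector is taken from the current population of chains, the size and direction of the jumps follow the spread of the chains themselves, which is the self-scaling behaviour the abstract refers to.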

A note on the choice and the estimation of Kriging models for the analysis of deterministic computer experiments

Applied Stochastic Models in Business and Industry, 2009

Our goal in the present work is to give insight into some important questions to be asked when choosing a Kriging model for the analysis of numerical experiments. We are especially concerned with the cases where the size of the design of experiments is small relative to the algebraic dimension of the inputs. We first fix the notation and recall some basic properties of Kriging. Then we present two experimental studies on subjects that are often skipped in the field of computer simulation analysis: the lack of reliability of likelihood maximization with few data, and the consequences of a trend misspecification. We finally propose an example from a porous media application, with the introduction of an original Kriging method in which a non-linear additive model is used as external trend.

An Alternative Measure of the Reliability of Ordinary Kriging Estimates

2000

This paper presents an interpolation variance as an alternative measure of the reliability of ordinary kriging estimates. Contrary to the traditional kriging variance, the interpolation variance is data-values dependent, variogram dependent, and a measure of local accuracy. Natural phenomena are not homogeneous; therefore, local variability as expressed through data values must be recognized for a correct assessment of uncertainty. The interpolation variance is simply the weighted average of the squared differences between data values and the retained estimate. Ordinary kriging or simple kriging variances are the expected values of interpolation variances; therefore, these traditional homoscedastic estimation variances cannot properly measure local data dispersion. More precisely, the interpolation variance is an estimate of the local conditional variance, when the ordinary kriging weights are interpreted as conditional probabilities associated with the n neighboring data. This interpretation is valid if, and only if, all ordinary kriging weights are positive or constrained to be such. Extensive tests illustrate that the interpolation variance is a useful alternative to the traditional kriging variance.
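The definition given in this abstract, a weighted average of the squared differences between the data values and the retained estimate, can be sketched as below. The function name `interpolation_variance` is an assumption made for this example, and the weights are supplied by hand here; in practice they would come from solving the ordinary kriging system.

```python
def interpolation_variance(weights, data):
    """Return (estimate, interpolation variance) for given kriging weights.

    weights : kriging weights, assumed non-negative and summing to 1
    data    : the n neighbouring data values
    """
    if any(w < 0 for w in weights):
        # The probabilistic interpretation requires non-negative weights.
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Retained estimate: weighted average of the data.
    estimate = sum(w * z for w, z in zip(weights, data))
    # Weighted average of squared deviations from that estimate.
    variance = sum(w * (z - estimate) ** 2 for w, z in zip(weights, data))
    return estimate, variance


if __name__ == "__main__":
    print(interpolation_variance([0.25, 0.75], [0.0, 4.0]))
```

Unlike the kriging variance, this quantity changes when the data values change: identical neighbouring values give a variance of zero, whatever the weights.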

Use of Kriging Models to Approximate Deterministic Computer Models

AIAA Journal, 2005

The use of kriging models for approximation and global optimization has been steadily on the rise in the past decade. The standard approach used in the Design and Analysis of Computer Experiments (DACE) is to use an Ordinary kriging model to approximate a deterministic computer model. Universal and Detrended kriging are two alternative types of kriging models. In this paper, a description of the basics of kriging is given, highlighting the similarities and differences between these three types of kriging models and the underlying assumptions behind each. A comparative study on the use of the three types of kriging models is then presented using six test problems. The methods of Maximum Likelihood Estimation (MLE) and Cross-Validation (CV) for model parameter estimation are compared for the three kriging model types. A one-dimensional problem is first used to visualize the differences between the models. In order to show applications in higher dimensions, four two-dimensional problems and one five-dimensional problem are also given.

AUTO-IK: A 2D indicator kriging program for the automated non-parametric modeling of local uncertainty in earth sciences

Computers & Geosciences, 2009

Indicator kriging (IK) provides a flexible interpolation approach that is well suited for datasets where: (1) many observations are below the detection limit, (2) the histogram is strongly skewed, or (3) specific classes of attribute values are better connected in space than others (e.g. low pollutant concentrations). Applying indicator kriging to its full potential requires, however, the tedious inference and modeling of multiple indicator semivariograms, as well as the post-processing of the results to retrieve attribute estimates and associated measures of uncertainty. This paper presents a computer code that automatically performs the following tasks: selection of thresholds for binary coding of continuous data, computation and modeling of indicator semivariograms, modeling of probability distributions at unmonitored locations (regular or irregular grids), and estimation of the mean and variance of these distributions. The program also offers tools for quantifying the goodness of the model of uncertainty within cross-validation and jack-knife frameworks. The different functionalities are illustrated using heavy metal concentrations from the well-known Jura soil dataset. A sensitivity analysis demonstrates the benefit of using more thresholds when indicator kriging is implemented with a linear interpolation model, in particular for variables with positively skewed histograms.
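The first task listed in this abstract, binary coding of continuous data at a set of thresholds, can be sketched as follows. This is not code from AUTO-IK; the function name `indicator_code` is an assumption made for this example, and the thresholds are given by hand rather than selected automatically.

```python
def indicator_code(values, thresholds):
    """Binary-code continuous data at each threshold.

    Returns one list per threshold, holding 1 where the observation does
    not exceed the threshold and 0 otherwise.
    """
    return [[1 if z <= t else 0 for z in values] for t in thresholds]


if __name__ == "__main__":
    concentrations = [0.5, 2.0, 3.5]
    for t, row in zip([1.0, 3.0], indicator_code(concentrations, [1.0, 3.0])):
        print(t, row)
```

Each row of indicators would then get its own semivariogram, which is why the inference effort grows with the number of thresholds.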

Interpolation with uncertain spatial covariances: A Bayesian alternative to Kriging

Journal of Multivariate Analysis, 1992

In this paper a Bayesian alternative to Kriging is developed. The latter is an important tool in geostatistics. But aspects of environmetrics make it less suitable as a tool for interpolating spatial random fields which are observed successively over time. The theory presented here permits temporal (and spatial) modeling to be done in a convenient and flexible way. At the same time model misspecifications, if any, can be corrected by additional data if and when it becomes available, and past data may be used in a systematic way to fit model parameters. Finally, uncertainty about model parameters is represented in the (posterior) distributions, so unrealistically small credible regions for the interpolants are avoided. The theory is based on the multivariate normal and related distributions, but because of the hierarchical prior models adopted, the results would seem somewhat robust with respect to the choice of these distributions and associated hyperparameters. © 1992 Academic Press, Inc.