Sequential Design with Mutual Information for Computer Experiments (MICE): Emulation of a Tsunami Model
Related papers
Ch. 7. A review of design and modeling in computer experiments
Handbook of Statistics, 2003
In this paper, we provide a review of statistical methods that are useful in conducting computer experiments. Our focus is primarily on the task of metamodeling, which is driven by the goal of optimizing a complex system via a deterministic simulation model. However, we also mention the case of a stochastic simulation, and examples of both cases are discussed. The organization of our review separates the two primary tasks for metamodeling: (1) select an experimental design; (2) fit a statistical model. We provide an overview of the general strategy and discuss applications in electrical engineering, chemical engineering, mechanical engineering, and dynamic programming. Then, we dedicate a section to statistical modeling methods followed by a section on experimental designs. Designs are discussed in two paradigms, model-dependent and model-independent, to emphasize their different objectives. Both classical and modern methods are discussed.
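The two-step strategy (select a design, then fit a model) can be made concrete with a small sketch. This is not from the paper: it pairs a random Latin hypercube with a basic zero-mean Gaussian-process fit on a toy one-dimensional simulator; all functions and settings below are illustrative assumptions.

```python
import numpy as np

# Step 1: select an experimental design (a simple random Latin hypercube).
def latin_hypercube(n, d, rng):
    # One stratum per point in each dimension, randomly permuted.
    cut = (np.arange(n) + rng.random(n)) / n
    return np.column_stack([rng.permutation(cut) for _ in range(d)])

# Step 2: fit a statistical model (here, zero-mean GP regression
# with a squared-exponential kernel).
def sq_exp_kernel(A, B, length=0.2):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

rng = np.random.default_rng(0)
X = latin_hypercube(10, 1, rng)                  # design
y = np.sin(6 * X[:, 0])                          # toy deterministic simulator
K = sq_exp_kernel(X, X) + 1e-8 * np.eye(len(X))  # jitter for stability
alpha = np.linalg.solve(K, y)

Xnew = np.linspace(0, 1, 5)[:, None]
pred = sq_exp_kernel(Xnew, X) @ alpha            # GP posterior mean
print(pred)
```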
A Novel Hybrid Sequential Design Strategy for Global Surrogate Modeling of Computer Experiments
SIAM Journal on Scientific Computing, 2011
Many complex real-world systems can be accurately modeled by simulations. However, high-fidelity simulations may take hours or even days to compute. Because this can be impractical, a surrogate model is often used to approximate the dynamic behavior of the original simulator. This model can then be used as a cheap, drop-in replacement for the simulator. Because simulations can be very expensive, the data points required to build the model must be chosen as carefully as possible. Sequential design strategies offer a huge advantage over one-shot experimental designs because they can use information gathered from previous data points to determine the location of new data points. Each sequential design strategy must perform a trade-off between exploration and exploitation, where the former involves selecting data points in unexplored regions of the design space, while the latter suggests adding data points in regions previously identified as interesting (for example, highly nonlinear regions). In this paper, a novel hybrid sequential design strategy is proposed which uses a Monte Carlo-based approximation of a Voronoi tessellation for exploration and local linear approximations of the simulator for exploitation. The advantage of this method over other sequential design methods is that it is independent of the model type, and can therefore be used in heterogeneous modeling environments, where multiple model types are used at the same time. The new method is demonstrated on a number of test problems, showing that it is a robust, competitive, and efficient sequential design strategy.
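As a rough sketch of the exploration/exploitation idea (the paper's actual estimators differ), one can score each design point by a Monte Carlo estimate of its Voronoi cell volume (exploration) times the residual of a local linear fit through its neighbours (exploitation). Everything below is an illustrative assumption, not the authors' code.

```python
import numpy as np

def voronoi_volumes_mc(X, rng, n_mc=4000):
    # Monte Carlo estimate of each design point's Voronoi cell volume:
    # the fraction of uniform test points whose nearest neighbour it is.
    T = rng.random((n_mc, X.shape[1]))
    nearest = np.argmin(((T[:, None, :] - X[None, :, :]) ** 2).sum(-1), axis=1)
    return np.bincount(nearest, minlength=len(X)) / n_mc

def nonlinearity_scores(X, y, k=3):
    # Crude exploitation score: residual of a local linear fit through
    # each point's k nearest neighbours (large residual ~ nonlinear region).
    scores = np.empty(len(X))
    for i in range(len(X)):
        d = ((X - X[i]) ** 2).sum(-1)
        nb = np.argsort(d)[1:k + 1]
        A = np.column_stack([np.ones(k), X[nb] - X[i]])
        coef, *_ = np.linalg.lstsq(A, y[nb], rcond=None)
        scores[i] = abs(y[i] - coef[0])   # deviation from the local plane
    return scores

rng = np.random.default_rng(1)
X = rng.random((12, 2))
y = np.sin(5 * X[:, 0]) * X[:, 1]
rank = voronoi_volumes_mc(X, rng) * (1e-9 + nonlinearity_scores(X, y))
print("refine near design point", np.argmax(rank))
```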
Comparing designs for computer simulation experiments
2008 Winter Simulation Conference, 2008
The use of simulation as a modeling and analysis tool is widespread. Simulation is an enabling tool for experimenting virtually in a validated computer environment. Often the underlying function for the results of a computer simulation experiment has too much curvature to be adequately modeled by a low-order polynomial. In such cases, finding an appropriate experimental design is not easy. This research uses prediction variance over the volume of the design region to evaluate computer simulation experiments, assuming the modeler is interested in fitting a second-order polynomial or a Gaussian process model to the response data. Both space-filling and optimal designs are considered.
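A hedged sketch of the evaluation quantity: for a second-order polynomial fit, the scaled prediction variance n·x'(F'F)⁻¹x can be computed over a grid covering the design region and summarized (e.g., by its maximum). The face-centred design and grid below are illustrative choices, not those of the paper.

```python
import numpy as np

def quad_model_matrix(X):
    # Full second-order model in two factors: 1, x1, x2, x1*x2, x1^2, x2^2
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

def scaled_prediction_variance(design, grid):
    # n * x'(F'F)^{-1} x over the region: the usual design-comparison
    # quantity for a second-order polynomial fit.
    F = quad_model_matrix(design)
    M_inv = np.linalg.inv(F.T @ F)
    G = quad_model_matrix(grid)
    return len(design) * np.einsum('ij,jk,ik->i', G, M_inv, G)

# Face-centred central composite design on [-1, 1]^2.
pts = [-1.0, 0.0, 1.0]
design = np.array([(a, b) for a in pts for b in pts])
g = np.linspace(-1, 1, 21)
grid = np.array([(a, b) for a in g for b in g])
spv = scaled_prediction_variance(design, grid)
print(f"max SPV over region: {spv.max():.2f}")
```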
Designing combined physical and computer experiments to maximize prediction accuracy
Computational Statistics & Data Analysis, 2017
Combined designs for experiments involving a physical system and a simulator of the physical system are evaluated in terms of their accuracy of predicting the mean of the physical system. Comparisons are made among designs that are (1) locally optimal under the minimum integrated mean squared prediction error criterion for the combined physical system and simulator experiments, (2) locally optimal for the physical or simulator experiments, with a fixed design for the component not being optimized, (3) maximin augmented nested Latin hypercube, and (4) I-optimal for the physical system experiment and maximin Latin hypercube for the simulator experiment. Computational methods are proposed for constructing the designs of interest. For a large test bed of examples, the empirical mean squared prediction errors are compared at a grid of inputs for each test surface using a statistically calibrated Bayesian predictor based on the data from each design. The prediction errors are also studied for a test bed that varies only the calibration parameter of the test surface. Design recommendations are given.
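As a sketch of one ingredient only: the integrated mean squared prediction error for a simulator-only, zero-mean GP can be approximated by averaging the posterior variance over a grid, and candidate designs compared on that score. The paper's criterion covers the combined physical-plus-simulator setting with a calibrated Bayesian predictor, which this toy version omits; all settings below are assumptions.

```python
import numpy as np

def sq_exp(A, B, length=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def imspe(design, grid, noise=1e-6):
    # IMSPE for a zero-mean GP, approximated by averaging the
    # posterior predictive variance over a grid of inputs.
    K = sq_exp(design, design) + noise * np.eye(len(design))
    Ks = sq_exp(grid, design)
    var = 1.0 - np.einsum('ij,jk,ik->i', Ks, np.linalg.inv(K), Ks)
    return var.mean()

rng = np.random.default_rng(2)
grid = rng.random((400, 2))
candidates = [rng.random((8, 2)) for _ in range(20)]  # random 8-run designs
best = min(candidates, key=lambda D: imspe(D, grid))
print(f"best-of-20 IMSPE: {imspe(best, grid):.4f}")
```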
Optimal designs for the propagation of uncertainty in computer experiments
Response surfaces, or meta-models, and design of experiments are widely used in experimental work. But numerous physical phenomena are studied through complex and costly numerical simulators. In such cases, the response is influenced by factors, but the link between these variables is deterministic. Nevertheless, the factors are often known only with uncertainty, and the influence of this uncertainty is important to the practitioner. Due to the computing time, it is not possible to obtain the uncertainty of the response through a standard Monte Carlo method, and an approximation of the simulator, a meta-model, is needed. We present an optimality criterion, the MC-V, in order to evaluate the probability distribution of the response with minimal error. We chose to apply the criterion to parts of 2 real cases derived from the petroleum industry. The simulator's 2nd-order polynomial meta-model and the three distributions of input factors (uniform, Gaussian, triangular) are among those used...
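The reason a meta-model is needed is that Monte Carlo on the simulator itself is unaffordable, while Monte Carlo on a 2nd-order polynomial is essentially free. A minimal sketch of that propagation step, with illustrative coefficients (not from the paper) and the three input distributions mentioned above:

```python
import numpy as np

# Illustrative quadratic meta-model: 1, x1, x2, x1*x2, x1^2, x2^2.
coef = np.array([1.0, 0.8, -0.5, 0.3, 0.2, -0.1])

def metamodel(X):
    x1, x2 = X[:, 0], X[:, 1]
    F = np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])
    return F @ coef

rng = np.random.default_rng(3)
n = 100_000
inputs = {
    "uniform":    rng.uniform(-1, 1, (n, 2)),
    "gaussian":   rng.normal(0, 0.5, (n, 2)),
    "triangular": rng.triangular(-1, 0, 1, (n, 2)),
}
for name, X in inputs.items():
    y = metamodel(X)  # cheap surrogate makes 100k evaluations feasible
    print(f"{name:10s} mean={y.mean():+.3f}  var={y.var():.3f}")
```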
European Journal of Operational Research, 2011
Simulated computer experiments have become a viable cost-effective alternative for controlled real-life experiments. However, the simulation of complex systems with multiple input and output parameters can be a very time-consuming process. Many of these high-fidelity simulators need minutes, hours, or even days to perform one simulation. The goal of global surrogate modeling is to create an approximation model that mimics the original simulator, based on a limited number of expensive simulations, but can be evaluated much faster. The set of simulations performed to create this model is called the experimental design. Traditionally, one-shot designs such as the Latin hypercube and factorial design are used, and all simulations are performed before the first model is built. In order to reduce the number of simulations needed to achieve the desired accuracy, sequential design methods can be employed. These methods generate the samples for the experimental design one by one, without knowing the total number of samples in advance. In this paper, the authors perform an extensive study of new and state-of-the-art space-filling sequential design methods. It is shown that the new sequential methods proposed in this paper produce results comparable to the best one-shot experimental designs currently available.
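A minimal illustration of generating samples one by one: a greedy maximin rule, used here only as a stand-in for the space-filling sequential methods studied in the paper. All settings are illustrative assumptions.

```python
import numpy as np

def sequential_maximin(n_total, d, rng, n_cand=2000):
    # Grow a space-filling design one point at a time: each new point
    # is the random candidate farthest from the current design.
    X = rng.random((1, d))
    while len(X) < n_total:
        cand = rng.random((n_cand, d))
        dmin = np.min(((cand[:, None, :] - X[None, :, :]) ** 2).sum(-1), axis=1)
        X = np.vstack([X, cand[np.argmax(dmin)]])
    return X

rng = np.random.default_rng(4)
X = sequential_maximin(15, 2, rng)  # total size never fixed in advance
print(X.round(3))
```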
Recent Advances in Computer Experiment Modeling
2014
Abstract of the dissertation Recent Advances in Computer Experiment Modeling by Yufan Liu (Dissertation Director: Ying Hung). This dissertation develops methodologies for the analysis of computer experiments and related theory. Computer experiments are becoming increasingly important in science, and Gaussian process (GP) models are widely used in their analysis. This dissertation focuses on two settings: where massive data are observed on irregular grids, and where quantiles of correlated data are of interest. We first develop a Latin hypercube design-based block bootstrap method. Then, we investigate quantiles of computer experiments in which correlated data are observed and propose penalized quantile regression with an asymmetric Laplace process. The computational issue that hinders GP models from broader application is recognized, especially for massive data observed on irregular grids. To overcome this issue, we introduce an efficient framework based on...
Adaptive exploration of computer experiment parameter spaces
2004
Computer experiments often require dense sweeps over input parameters to obtain a qualitative understanding of their response. Such sweeps can be prohibitively expensive, and are unnecessary in regions where the response is easily predicted; well-chosen designs could allow a mapping of the response with far fewer simulation runs. Thus, there is a need for computationally inexpensive surrogate models and an accompanying method for selecting small designs. We explore a non-stationary modeling methodology for addressing this need that couples stationary Gaussian processes with treed partitioning. A Bayesian perspective yields an explicit measure of (non-stationary) predictive uncertainty that can be used to guide sampling. As typical experiments are high-dimensional and require large designs, a careful but thrifty implementation is essential; thus, the statistical computing details that make our methodology efficient are outlined. Classic non-stationary data analyzed in the recent literature are used to validate our model, and the benefit of adaptive sampling is illustrated through our motivating example, which involves the computational fluid dynamics simulation of a NASA reentry vehicle.
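The paper guides sampling with the predictive uncertainty of a treed GP; as a simpler stand-in, the sketch below uses a single stationary GP's posterior variance to pick each new run. The simulator, kernel, and settings are illustrative assumptions, not the authors' model.

```python
import numpy as np

def sq_exp(A, B, length=0.15):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def simulator(x):
    # Toy response with a jump, the kind of feature adaptive
    # sampling should concentrate runs around.
    return np.sin(8 * x[:, 0]) + np.where(x[:, 0] > 0.5, 1.0, 0.0)

rng = np.random.default_rng(5)
X = rng.random((4, 1))
y = simulator(X)
for step in range(10):
    # Posterior variance of a stationary GP at candidate inputs;
    # sample next where the emulator is least certain.
    K = sq_exp(X, X) + 1e-8 * np.eye(len(X))
    cand = rng.random((500, 1))
    Ks = sq_exp(cand, X)
    var = 1.0 - np.einsum('ij,jk,ik->i', Ks, np.linalg.inv(K), Ks)
    x_next = cand[[np.argmax(var)]]
    X = np.vstack([X, x_next])
    y = np.append(y, simulator(x_next))
print(f"final design size: {len(X)}")
```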
Sequential Design of Computer Experiments to Minimize Integrated Response Functions
2000
Abstract: In the last ten to fifteen years, many phenomena that could only be studied using physical experiments can now be studied by computer experiments. Advances in the mathematical modeling of many physical processes, in algorithms for solving mathematical systems, and in computer speeds have combined to make it possible to replace some physical experiments with computer experiments. In a computer experiment, a...