Variogram-Based Proper Scoring Rules for Probabilistic Forecasts of Multivariate Quantities

Comments on: Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds

TEST, 2008

Rejoinder on: Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds

TEST, 2008

Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds

TEST, 2008

We discuss methods for the evaluation of probabilistic predictions of vector-valued quantities that can take the form of a discrete forecast ensemble or a density forecast. In particular, we propose a multivariate version of the univariate verification rank histogram or Talagrand diagram that can be used to check the calibration of ensemble forecasts. In the case of density forecasts, Box's density ordinate transform provides an attractive alternative. The multivariate energy score generalizes the continuous ranked probability score. It addresses both calibration and sharpness, and can be used to compare deterministic forecasts, ensemble forecasts and density forecasts, using a single loss function that is proper. An application to the University of Washington mesoscale ensemble points at strengths and deficiencies of probabilistic short-range forecasts of surface wind vectors over the North American Pacific Northwest.
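As a rough illustration of the kind of computation involved (not taken from the paper; the function name and toy data are ours), the sketch below evaluates the standard ensemble estimator of the multivariate energy score, ES = (1/m) Σ_i ||x_i − y|| − (1/(2m²)) Σ_{i,j} ||x_i − x_j||, for an m-member ensemble and a single vector observation:

```python
import numpy as np

def energy_score(ensemble, obs):
    """Energy score of a discrete m-member ensemble for one
    d-dimensional observation (lower is better).

    ensemble : array of shape (m, d), the ensemble members
    obs      : array of shape (d,), the verifying observation
    """
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean distance between ensemble members and the observation.
    term1 = np.mean(np.linalg.norm(ensemble - obs, axis=1))
    # Mean pairwise distance among ensemble members.
    diffs = ensemble[:, None, :] - ensemble[None, :, :]
    term2 = np.mean(np.linalg.norm(diffs, axis=-1))
    return term1 - 0.5 * term2

# Toy example: 8-member ensemble of 2-d surface wind vectors (u, v).
rng = np.random.default_rng(0)
ens = rng.normal(size=(8, 2))
y = np.array([0.3, -0.1])
print(energy_score(ens, y))
```

For a one-dimensional quantity this estimator reduces to the usual ensemble form of the continuous ranked probability score.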

Combining Spatial Statistical and Ensemble Information in Probabilistic Weather Forecasts

Forecast ensembles typically show a spread-skill relationship, but they are also often underdispersive, and therefore uncalibrated. Bayesian model averaging (BMA) is a statistical postprocessing method for forecast ensembles that generates calibrated probabilistic forecast products for weather quantities at individual sites. This paper introduces the spatial BMA technique, which combines BMA and the geostatistical output perturbation (GOP) method, and extends BMA to generate calibrated probabilistic forecasts of whole weather fields simultaneously, rather than just weather events at individual locations. At any site individually, spatial BMA reduces to the original BMA technique. The spatial BMA method provides statistical ensembles of weather field forecasts that take the spatial structure of observed fields into account and honor the flow-dependent information contained in the dynamical ensemble. The members of the spatial BMA ensemble are obtained by dressing the weather field forecasts from the dynamical ensemble with simulated spatially correlated error fields, in proportions that correspond to the BMA weights for the member models in the dynamical ensemble. Statistical ensembles of any size can be generated at minimal computational cost. The spatial BMA technique was applied to 48-h forecasts of surface temperature over the Pacific Northwest in 2004, using the University of Washington mesoscale ensemble. The spatial BMA ensemble generally outperformed the BMA and GOP ensembles and showed much better verification results than the raw ensemble: at individual sites, for weather field forecasts, and for forecasts of composite quantities such as average temperature in National Weather Service forecast zones and minimum temperature along the Interstate 90 Mountains to Sound Greenway.
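A minimal sketch of the dressing step described above (not the authors' code; the exponential covariance model, parameter values, and names are illustrative assumptions): each statistical-ensemble draw picks a dynamical member according to the BMA weights and adds a simulated spatially correlated Gaussian error field.

```python
import numpy as np

def spatial_bma_ensemble(member_forecasts, bma_weights, coords,
                         error_sd=1.5, corr_range=100.0, n_draws=50,
                         rng=None):
    """Dress dynamical-ensemble weather field forecasts with simulated
    spatially correlated error fields, in proportions given by the BMA
    weights (illustrative stand-in for the spatial BMA construction).

    member_forecasts : (k, s) array, k member forecasts at s sites
    bma_weights      : (k,) array of nonnegative weights summing to 1
    coords           : (s, 2) array of site coordinates (e.g. in km)
    """
    rng = np.random.default_rng() if rng is None else rng
    member_forecasts = np.asarray(member_forecasts, float)
    coords = np.asarray(coords, float)
    # Exponential spatial covariance for the simulated error fields.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = error_sd**2 * np.exp(-dists / corr_range)
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
    # Choose which member each statistical-ensemble draw is dressed around.
    picks = rng.choice(len(bma_weights), size=n_draws, p=bma_weights)
    noise = (chol @ rng.standard_normal((len(coords), n_draws))).T
    return member_forecasts[picks] + noise
```

Each row of the returned array is one statistical-ensemble member on the s sites, so ensembles of any size can indeed be generated cheaply once the error covariance is fixed.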

Investigation of ensemble variance as a measure of true forecast variance

Monthly weather review, 2011

The uncertainty in meteorological predictions is of interest for applications ranging from economic to recreational to public safety. One common method of estimating uncertainty is to use meteorological ensembles. These ensembles provide an easily quantifiable measure of the uncertainty in the forecast in the form of the ensemble variance. However, ensemble variance may not accurately reflect the actual uncertainty, so any measure of uncertainty derived from the ensemble should be calibrated to provide a more reliable ...
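One simple way to check whether ensemble variance reflects actual forecast uncertainty, in the spirit of the spread-skill analysis alluded to above (a generic sketch, not the paper's method; names are illustrative), is to bin cases by ensemble variance and compare the mean variance in each bin with the mean squared error of the ensemble mean:

```python
import numpy as np

def spread_skill_table(ens_forecasts, obs, n_bins=5):
    """Compare ensemble variance (spread) with the squared error of the
    ensemble mean (skill) across variance bins; for a well-calibrated
    ensemble the two columns should roughly agree.

    ens_forecasts : (n, m) array, n forecast cases with m members each
    obs           : (n,) array of verifying observations
    """
    ens_forecasts = np.asarray(ens_forecasts, float)
    obs = np.asarray(obs, float)
    var = ens_forecasts.var(axis=1, ddof=1)          # ensemble variance
    err2 = (ens_forecasts.mean(axis=1) - obs) ** 2   # squared error of mean
    edges = np.quantile(var, np.linspace(0, 1, n_bins + 1))
    bin_idx = np.digitize(var, edges[1:-1])          # bin labels 0..n_bins-1
    rows = []
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            rows.append((var[in_bin].mean(), err2[in_bin].mean()))
    return rows  # list of (mean ensemble variance, mean squared error)
```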

Multivariate Probabilistic Analysis and Predictability of Medium-Range Ensemble Weather Forecasts

Monthly Weather Review, 2014

Ensemble weather forecasting has been operational for two decades now. However, the related uncertainty analysis in terms of probabilistic postprocessing still focuses on single variables, grid points, or stations. Inevitable dependencies in space and time and between variables are often ignored. To address this problem, two probabilistic postprocessing methods are presented: multivariate versions of Gaussian fit and kernel dressing. The multivariate case requires the estimation of a full-rank, invertible covariance matrix. For this purpose, a Graphical Least Absolute Shrinkage and Selection Operator (GLASSO) estimator has been employed that is based on sparse undirected graphical models regularized by an L1 penalty term in order to parameterize the full-rank inverse covariance. In all cases, the result is a multidimensional probability density. The forecasts used to test the approach are station forecasts of 2-m temperature and surface pressure from four ma...
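As a hedged sketch of the Gaussian-fit variant with an L1-regularized covariance (not the authors' implementation; it assumes scikit-learn's GraphicalLasso as one available GLASSO estimator, and the function name and arguments are illustrative):

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.covariance import GraphicalLasso

def fit_multivariate_gaussian(past_errors, current_ensemble_mean, alpha=0.1):
    """Fit an L1-regularized (GLASSO) covariance to past forecast errors
    and return a multivariate Gaussian predictive density centred on the
    current ensemble mean.

    past_errors           : (n, d) array of observation-minus-forecast
                            residuals from a training period
    current_ensemble_mean : (d,) array, today's ensemble-mean forecast
                            (assumed bias-corrected for this sketch)
    """
    glasso = GraphicalLasso(alpha=alpha).fit(np.asarray(past_errors, float))
    # glasso.covariance_ is the regularized full-rank covariance estimate;
    # glasso.precision_ holds the sparse inverse covariance.
    return multivariate_normal(mean=np.asarray(current_ensemble_mean, float),
                               cov=glasso.covariance_)
```

The returned object is a full multidimensional predictive density, from which samples or density values at observed vectors can be drawn for verification.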

Focusing on regions of interest in forecast evaluation

The Annals of Applied Statistics, 2017

Often, interest in forecast evaluation focuses on certain regions of the whole potential range of the outcome, and forecasts should mainly be ranked according to their performance within these regions. A prime example is risk management, which relies on forecasts of risk measures such as the value-at-risk or the expected shortfall, and hence requires appropriate loss distribution forecasts in the tails. Further examples include weather forecasts with a focus on extreme conditions, or forecasts of environmental variables such as ozone with a focus on concentration levels with adverse health effects. In this paper, we show how weighted scoring rules can be used to this end, and in particular that they make it possible to rank several potentially misspecified forecasts objectively with the region of interest in mind. This is demonstrated in various simulation scenarios. We introduce desirable properties of weighted scoring rules and present general construction principles based on conditional densities or distributions and on scoring rules for probability forecasts. In our empirical application to log-return time series, all forecasts seem to be slightly misspecified, as is often unavoidable in practice, and no method performs best overall. However, using weighted scoring rules, the best method for predicting losses can be identified, which is hence the method of choice for the purpose of risk management.
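One widely used member of this class is the threshold-weighted CRPS, twCRPS(F, y) = ∫ w(z) (F(z) − 1{y ≤ z})² dz, with a weight function that singles out the region of interest. The sketch below (our illustration, not code from the paper; the thresholds, grid, and names are arbitrary assumptions) approximates it on a grid for an ensemble forecast, with an indicator weight on the lower tail of a log-return distribution:

```python
import numpy as np

def threshold_weighted_crps(ensemble, obs, weight_fn, grid):
    """Threshold-weighted CRPS of a univariate ensemble forecast,
    approximated on an evaluation grid:
        twCRPS(F, y) = integral of w(z) * (F(z) - 1{y <= z})^2 dz.

    ensemble  : (m,) array of ensemble members
    obs       : scalar verifying observation
    weight_fn : function z -> weight in [0, 1] selecting the region of interest
    grid      : (g,) increasing array of evaluation points z
    """
    ensemble = np.sort(np.asarray(ensemble, float))
    grid = np.asarray(grid, float)
    # Empirical CDF of the ensemble evaluated on the grid.
    ecdf = np.searchsorted(ensemble, grid, side="right") / ensemble.size
    indicator = (grid >= obs).astype(float)
    integrand = weight_fn(grid) * (ecdf - indicator) ** 2
    # Trapezoidal approximation of the weighted integral.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))

# Example: focus on the lower tail (large losses) of a log-return forecast.
rng = np.random.default_rng(1)
ens = rng.normal(0.0, 0.02, size=200)
z = np.linspace(-0.15, 0.15, 1001)
score = threshold_weighted_crps(ens, obs=-0.03,
                                weight_fn=lambda z: (z <= -0.02).astype(float),
                                grid=z)
print(score)
```

With the constant weight w(z) = 1 this reduces to the ordinary CRPS, so the weight function is exactly what encodes the region of interest.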

Proposed strategy for ensemble prediction in the Bureau of Meteorology

2005

Ensembles provide information on forecast uncertainty, enabling better decision making where weather poses a risk or opportunity, and allowing forecasts to be usefully extended beyond the deterministic range. This paper outlines a strategy for how the Bureau can develop and enhance its ensemble prediction systems to provide accurate deterministic and probabilistic forecasts of surface and upper air fields of interest to forecasters and the public, covering Australia and surrounding waters, on time scales relevant to nowcasting, weather, and seasonal climate prediction.