Comment on "Assessing CN earthquake predictions in Italy" by M. Taroni, W. Marzocchi, P. Roselli (original) (raw)

Basic principles for evaluating an earthquake prediction method

Geophysical Research Letters, 1996

A three year continuous sample of earthquake predictions based on the observation of Seismic Electric Signals in Greece was published by Varotsos and Lazaridou [1991]. Four independent studies analyzed this sample and concluded that the success rate of the predictions is far beyond chance. On the other hand, Mulargia and Gasperini [1992] (hereafter cited as MG) claim that these predictions can be ascribed to chance. In the present paper we examine the origin of this disagreement. Several serious problems in the study of MG are pointed out, such as: 1. The probability of a prediction being successful by chance should be approximately considered as the product of three probabilities, PT, PE and PM, i.e., the probabilities with respect to time, epicenter and magnitude. In spite of their major importance, PE and PM were ignored by MG. The incorporation of PE decreases the probability of chance success by more than a factor of 10 (when PE is taken into account, it can be shown that the VAN predictions cannot be ascribed to chance). 2. MG grossly overestimated the number of earthquakes that should have been predicted, by taking different thresholds for earthquakes and predictions. With such an overestimation, MG's procedure can "reject" even an ideally perfect earthquake prediction method. 3. MG's procedure did not take into account that the predictions were based on three different types of electrical precursors with different lead times. 4. MG applied a Poisson distribution to the time series of earthquakes but included a large number of aftershocks. 5. The backward time correlation between predictions and earthquakes claimed by MG is due to misinterpretation of the text of some predictions and an incorrect use of aftershocks. Although the discussion of the first problem alone is enough to invalidate the claims of MG, we also discuss the other four problems because MG violated some basic principles even in the time domain alone. The results derived in this paper are of general use when examining whether a correlation between earthquakes and various geophysical phenomena is beyond chance or not.
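For illustration, a minimal numerical sketch of this product-of-probabilities argument, using hypothetical values for the time, epicenter, and magnitude factors (not the actual VAN statistics), together with a one-sided binomial check of the chance-success rate:

```python
# Minimal sketch (illustrative numbers, not the VAN data): the chance that a
# single prediction succeeds by luck is approximated as the product of the
# probabilities of matching the event in time, epicenter, and magnitude.
from scipy.stats import binom

p_time = 0.15       # hypothetical: fraction of time covered by alarm windows
p_epicenter = 0.05  # hypothetical: fraction of the seismic area inside the predicted region
p_magnitude = 0.5   # hypothetical: chance of matching the magnitude window

p_chance = p_time * p_epicenter * p_magnitude   # joint chance-success probability

n_predictions = 20  # hypothetical number of issued predictions
n_successes = 6     # hypothetical number of successes

# One-sided binomial P-value: probability of >= n_successes successes by chance.
p_value = binom.sf(n_successes - 1, n_predictions, p_chance)
print(f"chance probability per prediction: {p_chance:.4f}, P-value: {p_value:.2e}")
```

Dropping the epicenter and magnitude factors (setting them to 1) inflates p_chance and makes a given success count look far less significant, which is the abstract's point about ignoring PE and PM.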

Comment on 'Testing earthquake prediction methods: "The West Pacific short-term forecast of earthquakes with magnitude MwHRV ≥5.8" by V.G. Kossobokov'

Tectonophysics, 2006

In his paper Kossobokov investigates the efficiency of our short-term forecast for two western Pacific regions. Although we agree with the basic result of his evaluation, namely that the forecast statistics are much better than a random guess, we have reservations about his definition of earthquake prediction, some of his tests, and his interpretation of the test results. We distinguish between deterministic earthquake predictions and statistical forecasts. We argue that some techniques used by Kossobokov may not be appropriate for testing our forecasts and discuss other testing methods based on the likelihood function. We demonstrate that Kossobokov's null hypothesis may be biased, and that this bias can influence some of his conclusions. We show that, contrary to Kossobokov's statement, our algorithm predicts mainshocks when they are preceded by foreshocks.
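As a rough illustration of the likelihood-based testing mentioned above, the following sketch compares the Poisson log-likelihood of observed counts under a gridded forecast and under a uniform reference model; the rates and counts are hypothetical placeholders, not the West Pacific forecast:

```python
# A minimal sketch of a likelihood-based comparison between two gridded rate
# forecasts.  The rates and observed counts below are hypothetical.
import numpy as np
from scipy.stats import poisson

forecast_rates = np.array([0.2, 0.05, 0.5, 0.1])   # expected event counts per bin (hypothetical)
null_rates     = np.array([0.2, 0.2, 0.2, 0.2])    # reference model, e.g. a uniform rate
observed       = np.array([0, 0, 1, 0])            # observed event counts per bin (hypothetical)

# Poisson log-likelihood of the observations under each model.
ll_forecast = poisson.logpmf(observed, forecast_rates).sum()
ll_null     = poisson.logpmf(observed, null_rates).sum()

print(f"log-likelihood ratio (forecast vs. null): {ll_forecast - ll_null:.3f}")
# A positive ratio favors the forecast; significance would be assessed by
# simulating catalogs from the null model and recomputing the ratio.
```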

Comparison of Natural and Predicted Earthquake Occurrence in Seismologically Active Areas for Determination of Statistical Significance

2008

Advisor: Robert K. Vincent

One hundred earthquakes in the Western Pacific Rim, including China, Japan, Taiwan, and the Philippines, were successfully predicted using a temperature anomaly method. The model is based on a predicted increase of ground temperatures in the lower atmosphere from 2 to 8 days before an earthquake with a Richter scale magnitude of 5 or greater. Mixed gases, such as CO2 and CH4, in different ratios under the action of a transient electric field cause the temperature of the lower atmosphere to increase by up to 6 °C, whereas solar radiation increases it by only 3 °C. The authors detected the thermal anomalies using ground-based evidence and thermal infrared anomalies in METEOSAT thermal infrared image data. Despite their apparent success at predicting the earthquakes, they did not compare their predictions with the natural rate of occurrence in the area, which experiences an earthquake of Richter magnitude greater than 4 every week.
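A back-of-the-envelope version of the missing comparison, using only the base rate quoted above (roughly one M > 4 event per week) and a Poisson assumption:

```python
# Chance that a prediction window contains at least one qualifying event, given
# the area's background rate of roughly one M > 4 earthquake per week.
import math

rate_per_day = 1.0 / 7.0          # events per day (from "one per week")
for window_days in (2, 8):        # the 2-8 day prediction window
    lam = rate_per_day * window_days
    p_at_least_one = 1.0 - math.exp(-lam)   # Poisson: P(N >= 1)
    print(f"{window_days}-day window: P(at least one M > 4 event) = {p_at_least_one:.2f}")
# With roughly a 0.25-0.68 chance of a qualifying event in any single window, an
# apparent "success" carries little weight unless judged against this base rate.
```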

Stability of intermediate-term earthquake predictions with respect to random errors in magnitude: the case of central Italy

2002

The influence of random magnitude errors on the results of intermediate-term earthquake predictions is analyzed in this study. The particular case of predictions performed using the algorithm CN in central Italy is considered. The magnitudes of all events reported in the original catalog (OC) are randomly perturbed within the range of the expected errors, thus generating a set of randomized catalogs. The results of predictions for the original and the randomized catalogs, performed following the standard CN rules, are then compared. The average prediction quality of the algorithm CN appears stable with respect to magnitude errors of up to ±0.3 units. Such stability is assured if the threshold setting period corresponds to a time interval sufficiently long and representative of the seismic activity within the region; if the threshold setting period is too short, the average quality of CN decreases linearly with increasing maximum error in magnitude.
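A minimal sketch of the randomization step described above, with a placeholder catalog and the ±0.3 error bound; in the actual study each randomized catalog is then processed with the full, unchanged CN rules:

```python
# Perturb catalog magnitudes with uniform noise within the expected error to
# generate randomized catalogs.  The catalog values here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
magnitudes = np.array([4.1, 4.6, 5.2, 4.9, 5.6])   # hypothetical original catalog (OC)
max_error = 0.3                                     # +/- 0.3 magnitude units

randomized_catalogs = [
    magnitudes + rng.uniform(-max_error, max_error, size=magnitudes.size)
    for _ in range(100)
]

# Each randomized catalog would then be run through the CN rules and the
# resulting alarm/failure statistics compared with those of the original catalog.
print(randomized_catalogs[0].round(2))
```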

Earthquake prediction: the null hypothesis

Geophysical Journal International, 1997

The null hypothesis in assessing earthquake predictions is often, loosely speaking, that the successful predictions are chance coincidences. To make this more precise requires specifying a chance model for the predictions and/or the seismicity. The null hypothesis tends to be rejected not only when the predictions have merit, but also when the chance model is inappropriate. In one standard approach, the seismicity is taken to be random and the predictions are held fixed. 'Conditioning' on the predictions this way tends to reject the null hypothesis even when it is true, if the predictions depend on the seismicity history. An approach that seems less likely to yield erroneous conclusions is to compare the predictions with the predictions of a 'sensible' random prediction algorithm that uses seismicity up to time t to predict what will happen after time t. The null hypothesis is then that the predictions are no better than those of the random algorithm. Significance levels can be assigned to this test in a more satisfactory way, because the distribution of the success rate of the random predictions is under our control. Failure to reject the null hypothesis indicates that there is no evidence that any extra-seismic information the predictor uses (electrical signals for example) helps to predict earthquakes.
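To make the suggested baseline concrete, here is a minimal sketch of a simple prediction algorithm that uses only seismicity up to time t (an alarm window following each event at or above a threshold); the catalog, threshold, and window are hypothetical choices, not those of any particular published test:

```python
# Baseline "sensible" predictor: declare an alarm for `window` days after any
# event at or above `threshold`, and score how many later qualifying events
# fall inside an alarm.  Times and magnitudes below are hypothetical.
import numpy as np

event_times = np.array([3.0, 10.0, 11.5, 40.0, 41.0, 90.0])   # days (hypothetical catalog)
event_mags  = np.array([5.6, 5.9, 5.7, 6.1, 5.8, 5.9])
threshold, window = 5.8, 21.0                                   # alarm window in days

def baseline_success_rate(times, mags):
    """Fraction of qualifying events preceded by an alarm triggered by an earlier event."""
    hits = total = 0
    for i, (t, m) in enumerate(zip(times, mags)):
        if m < threshold:
            continue
        total += 1
        # alarm is on if some earlier event >= threshold occurred within `window` days
        if np.any((mags[:i] >= threshold) & (times[:i] >= t - window)):
            hits += 1
    return hits / total if total else 0.0

print(f"baseline success rate: {baseline_success_rate(event_times, event_mags):.2f}")
```

Candidate predictions would be judged against the success rate (and alarm volume) of such a baseline rather than against a fixed stochastic model of seismicity.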

On the Use of Receiver Operating Characteristic Tests for Evaluating Spatial Earthquake Forecasts

Spatial forecasts of triggered earthquake distributions have been ranked using receiver operating characteristic (ROC) tests. The test is a binary comparison of positive and negative forecast regions against the presence or absence of earthquakes. Forecasts predicting only positive changes score higher than Coulomb methods, which predict positive and negative changes. I hypothesize that removing the possibility of failures in negative forecast realms yields better ROC scores. I create a "perfect" Coulomb forecast where all earthquakes fall only into positive stress change areas and compare it with an informationless all-positive forecast. The "perfect" Coulomb forecast barely beats the informationless forecast, and adding as few as four earthquakes occurring in the negative stress regions causes the Coulomb forecast to be no better than an informationless forecast under an ROC test. ROC tests also suffer from data imbalance when applied to earthquake forecasts because there are many more negative cases than positive ones.

Plain Language Summary: Recent studies have evaluated the Coulomb stress change method, a popular technique for calculating where future earthquakes will occur, against alternative stress change representations. Spatial forecasts were compared with receiver operating characteristic tests, which rank methods based on the number of true and false positive and negative forecast cases. Coulomb stress changes, which predict areas of positive and negative stress change, fare poorly against methods that only produce positive forecast areas. Methods that forecast negative cases (earthquake suppression) have to be nearly perfect to score well in a receiver operating characteristic test against even an informationless all-positive forecast, because an all-positive forecast admits no false negatives. There is also a general data imbalance problem with using ROC tests for earthquake forecasts, because there are almost always many more negative cases (places with no earthquakes).
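The contingency-table bookkeeping behind such ROC comparisons can be sketched as follows; the tiny grids are hypothetical stand-ins for the gridded Coulomb and all-positive forecasts discussed above:

```python
# Hit rate and false-alarm rate for binary spatial forecasts on a toy grid.
import numpy as np

# 1 = positive forecast cell / cell containing an earthquake, 0 = negative
forecast_coulomb      = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])  # hypothetical +/- forecast
forecast_all_positive = np.ones(10, dtype=int)                     # informationless all-positive
observed_events       = np.array([1, 0, 1, 0, 0, 0, 0, 0, 0, 0])  # two cells with earthquakes

def rates(forecast, observed):
    tp = np.sum((forecast == 1) & (observed == 1))
    fp = np.sum((forecast == 1) & (observed == 0))
    fn = np.sum((forecast == 0) & (observed == 1))
    tn = np.sum((forecast == 0) & (observed == 0))
    return tp / (tp + fn), fp / (fp + tn)   # hit rate, false-alarm rate

for name, f in [("Coulomb-style", forecast_coulomb), ("all-positive", forecast_all_positive)]:
    hit, false_alarm = rates(f, observed_events)
    print(f"{name:>14}: hit rate = {hit:.2f}, false alarm rate = {false_alarm:.2f}")
# With far more empty cells than occupied ones, the false-alarm rate is dominated
# by true negatives, which is the data-imbalance issue raised above.
```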

Gambling scores in earthquake prediction analysis

The number of successes 'n' and the normalized measure of space-time alarm 'tau' are commonly used to characterize the strength of an earthquake prediction method and the significance of prediction results. To better evaluate the forecaster's skill, the use of a new characteristic has recently been suggested: the gambling score R, which incorporates the difficulty of guessing each target event by assigning different weights to different alarms. We expand the class of R-characteristics and apply them to the analysis of results of the M8 prediction algorithm. We show that the significance level 'alpha' strongly depends (1) on the choice of alarm weighting parameters, (2) on the partitioning of the entire alarm volume into component parts, and (3) on the accuracy of the spatial rate of target events, m(dg). These tools are at the disposal of the researcher and can affect the significance estimate in either direction. All the R-statistics discussed here corroborate…
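One commonly used form of gambling score, sketched below with hypothetical alarms; the choice of the reference probabilities p0 is exactly the kind of weighting freedom the abstract points to:

```python
# One common gambling-score formulation: each alarm is a bet of one reputation
# point against a reference chance probability p0; a hit returns (1 - p0)/p0
# points, a miss loses the point wagered.  The alarm list is hypothetical.
def gambling_score(alarms):
    """alarms: list of (hit: bool, p0: float) pairs, p0 = reference chance probability."""
    score = 0.0
    for hit, p0 in alarms:
        score += (1.0 - p0) / p0 if hit else -1.0
    return score

alarms = [(True, 0.05), (False, 0.05), (True, 0.20), (False, 0.10)]  # hypothetical
print(f"R = {gambling_score(alarms):.1f}")
```

Shrinking or enlarging p0 for individual alarms, or splitting one alarm volume into several bets, changes R and its null distribution, which is how the significance estimate can be pushed in either direction.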

Testing earthquake predictions

Institute of Mathematical Statistics Collections, 2008

Statistical tests of earthquake predictions require a null hypothesis to model occasional chance successes. To define and quantify 'chance success' is knotty. Some null hypotheses ascribe chance to the Earth: seismicity is modeled as random. The null distribution of the number of successful predictions (or any other test statistic) is taken to be its distribution when the fixed set of predictions is applied to random seismicity. Such tests tacitly assume that the predictions do not depend on the observed seismicity. Conditioning on the predictions in this way sets a low hurdle for statistical significance. Consider this scheme: when an earthquake of magnitude 5.5 or greater occurs anywhere in the world, predict that an earthquake at least as large will occur within 21 days and within an epicentral distance of 50 km. We apply this rule to the Harvard centroid-moment-tensor (CMT) catalog for 2000-2004 to generate a set of predictions. The null hypothesis is that earthquake times are exchangeable conditional on their magnitudes and locations and on the predictions, a common "nonparametric" assumption in the literature. We generate random seismicity by permuting the times of events in the CMT catalog. We consider an event successfully predicted only if (i) it is predicted and (ii) there is no larger event within 50 km in the previous 21 days. The P-value for the observed success rate is < 0.001: the method successfully predicts about 5% of earthquakes, far better than 'chance', because the predictor exploits the clustering of earthquakes (occasional foreshocks), which the null hypothesis lacks. Rather than condition on the predictions and use a stochastic model for seismicity, it is preferable to treat the observed seismicity as fixed, and to compare the success rate of the predictions to the success rate of simple-minded predictions like those just described. If the proffered predictions do no better than a simple scheme, they have little value.
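A simplified sketch of this automatic scheme and its permutation null, using a tiny hypothetical catalog, planar distances instead of great-circle distances, and omitting condition (ii) for brevity:

```python
# Toy version of the "predict a same-or-larger event within 21 days and 50 km
# of any M >= 5.5 event" rule, with a permutation null that shuffles event
# times while holding locations and magnitudes fixed.  The catalog is hypothetical.
import numpy as np

rng = np.random.default_rng(1)
# columns: time (days), x (km), y (km), magnitude -- hypothetical catalog
catalog = np.array([
    [  5.0,   0.0,   0.0, 5.6],
    [ 15.0,  10.0,   5.0, 5.8],   # within 21 d and 50 km of the first event
    [ 60.0, 300.0, 100.0, 6.0],
    [ 70.0, 305.0, 102.0, 5.7],
])

def success_rate(times, xy, mags, window=21.0, radius=50.0, m_min=5.5):
    """Fraction of events with m >= m_min 'predicted' by an earlier nearby event no larger than them."""
    hits = total = 0
    for j in range(len(times)):
        if mags[j] < m_min:
            continue
        total += 1
        earlier = (times < times[j]) & (times >= times[j] - window) & (mags >= m_min)
        close = np.hypot(*(xy - xy[j]).T) <= radius
        if np.any(earlier & close & (mags <= mags[j])):
            hits += 1
    return hits / total if total else 0.0

times, xy, mags = catalog[:, 0], catalog[:, 1:3], catalog[:, 3]
observed = success_rate(times, xy, mags)

# Null distribution: reassign the event times at random, keeping locations and magnitudes fixed.
null = [success_rate(rng.permutation(times), xy, mags) for _ in range(999)]
p_value = (1 + sum(s >= observed for s in null)) / (1 + len(null))
print(f"observed success rate {observed:.2f}, permutation P-value {p_value:.3f}")
```

Because the real predictor feeds on clustering that time-permuted catalogs destroy, this kind of test can produce a tiny P-value even for a rule with no predictive insight, which is the abstract's warning.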

Testing earthquake prediction algorithms: statistically significant advance prediction of the largest earthquakes in the Circum-Pacific, 1992–1997

Physics of the Earth and Planetary Interiors, 1999

Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc, at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8, and MSc correctly identified the locations of four of them. The space-time volume of the alarms is 36% and 18%, respectively, when estimated with a normalized product measure of the empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events doubled and all of them became exclusively normal or reverse faults. The predictions are fully reproducible; the algorithms M8 and MSc, in complete formal definitions, were published before we started our experiment (Keilis-Borok, V.I., Premonitory activation of seismic flow: Algorithm M8).
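The quoted significance levels can be roughly checked with a simple binomial model in which each target earthquake falls inside the alarm volume by chance with probability equal to the normalized alarm measure; the paper's own estimate uses the empirical epicenter distribution, so small differences are expected:

```python
# Binomial tail check of the quoted hit counts against the normalized alarm volumes.
from scipy.stats import binom

cases = [
    ("M8,     M8.0+", 5, 5, 0.36),
    ("M8-MSc, M8.0+", 4, 5, 0.18),
    ("M8,     M7.5+", 10, 19, 0.40),
    ("M8-MSc, M7.5+", 5, 19, 0.13),
]
for name, hits, targets, p in cases:
    tail = binom.sf(hits - 1, targets, p)          # P(at least `hits` chance successes)
    print(f"{name}: confidence ~ {100 * (1 - tail):.1f}%")
```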

Earthquake forecasting and its verification

2005

No proven method is currently available for the reliable short-term prediction of earthquakes (minutes to months). However, it is possible to make probabilistic hazard assessments for earthquake risk. These are primarily based on the association of small earthquakes with future large earthquakes. In this paper we discuss a new approach to earthquake forecasting. This approach is based on a pattern informatics (PI) method which quantifies temporal variations in seismicity. The output is a map of areas in a seismogenic region ("hotspots") where earthquakes are forecast to occur in a future 10-year time span. This approach has been successfully applied to California, to Japan, and on a worldwide basis. These forecasts are binary: an earthquake is forecast either to occur or not to occur. The standard approach to the evaluation of a binary forecast is the use of the relative operating characteristic (ROC) diagram, which is a more restrictive test and less subject to bias than maximum likelihood tests. To test our PI method, we made two types of retrospective forecasts for California. The first is the PI method and the second is a relative intensity (RI) forecast based on the hypothesis that future earthquakes will occur where earthquakes have occurred in the recent past. While both retrospective forecasts are for the ten-year period 1 January 2000 to 31 December 2009, we performed an interim analysis 5 years into the forecast. The PI method outperforms the RI method under most circumstances.
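A minimal sketch of a relative-intensity style hotspot map, with a hypothetical catalog and grid and none of the smoothing or magnitude binning used in the actual PI/RI studies:

```python
# Relative-intensity style baseline: grid the region, count past earthquakes per
# cell, and flag the most active cells as "hotspots" for the forecast interval.
# The epicenters and grid below are hypothetical.
import numpy as np

rng = np.random.default_rng(2)
lon, lat = rng.uniform(0, 1, 200), rng.uniform(0, 1, 200)   # hypothetical past epicenters

counts, _, _ = np.histogram2d(lon, lat, bins=10, range=[[0, 1], [0, 1]])

# Forecast hotspots = cells at or above the 90th percentile of past activity.
hotspots = counts >= np.percentile(counts, 90)
print(f"{hotspots.sum()} of {hotspots.size} cells flagged as hotspots")
```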