Statistical tests Research Papers - Academia.edu

In this paper we consider the question of uncertainty of detected patterns in data mining. In particular, we develop statistical tests for patterns found in continuous data, indicating the significance of these patterns in terms of the probability that they have occurred by chance. We examine the performance of these tests on patterns detected in several large data sets, including

The integration of technology into content area learning requires teachers to constantly balance mastery of the technology with mastery of the content area: the greater a classroom teacher's mastery of a given technology, the less effort is needed for classroom use and for student mastery during the teaching and learning of mathematics. GeoGebra is dynamic mathematics software for schools that joins geometry, algebra, and calculus; it is an interactive geometry system and a new technology for teaching and learning mathematics. This book confirms that the new method of using GeoGebra in teaching and learning mathematics increases the level of mathematical knowledge and skills compared with the traditional teaching and learning method. Experiments carried out in several secondary schools provide evidence that the new GeoGebra-based method of teaching and learning mathematics raises the level of knowledge and skills in mathematics. In addition to these results, the book contains useful information for teachers and students on statistical concepts, tools, inference, and survey evaluation methodology based on qualitative variables.

A t-test is a statistic that checks whether two means (averages) are reliably different from each other.
Looking at the means may suggest there is a difference, but comparing the means alone does not tell us whether that difference is reliable.
For example, suppose person A and person B each flip a coin 100 times. Person A gets heads 52 times and person B gets heads 49 times. This does not tell us that person A reliably gets more heads than B, or that A would get more heads than B if both flipped their coins another 100 times. There is no real difference; it is just chance.
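As an illustration (not part of the original text), the coin-flip example can be checked with a two-sample t-test in Python by coding each flip as 0 or 1; the large p-value shows that the 52-vs-49 difference is not reliable:

```python
# Hedged sketch: each flip coded as 1 (heads) or 0 (tails); the example data
# (52 vs. 49 heads out of 100 flips) come from the paragraph above.
import numpy as np
from scipy import stats

person_a = np.array([1] * 52 + [0] * 48)
person_b = np.array([1] * 49 + [0] * 51)

t_stat, p_value = stats.ttest_ind(person_a, person_b)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")  # p is far above 0.05: no reliable difference
```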

The performance of a tracking filter can be evaluated in terms of the filter's optimality conditions. Testing for optimality is necessary because the estimation error covariance provided by the filter is not a reliable indicator of performance; it is known to be "optimistic" (inconsistent), particularly when there are model mismatches and target maneuvers. The conventional root-mean-square (RMS) error and its variants are widely used for performance evaluation in simulation and testing, but they are not feasible for real-time operation, where the ground truth is rarely available. One approach to real-time reliability assessment is optimality self online monitoring (OSOM), investigated in this paper. Statistical tests for the optimality conditions are formulated, and simulation examples are presented to illustrate their possible use in evaluation and adaptation.
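The abstract does not spell out the OSOM test statistics; as a hedged sketch of the kind of optimality check involved, the following Python snippet tests whether the time-averaged normalized innovation squared (NIS) of a filter stays within its chi-square bounds. The innovation sequence and its covariances are assumed to come from a running Kalman-type filter; this is a standard consistency test, not necessarily the paper's procedure.

```python
# Standard NIS consistency check (a sketch, not the paper's OSOM procedure).
import numpy as np
from scipy.stats import chi2

def nis_consistency_test(innovations, S_matrices, alpha=0.05):
    """innovations: list of innovation vectors nu_k; S_matrices: their covariances S_k."""
    m = innovations[0].shape[0]                     # measurement dimension
    nis = [float(v @ np.linalg.solve(S, v))         # nu_k^T S_k^{-1} nu_k
           for v, S in zip(innovations, S_matrices)]
    n = len(nis)
    avg_nis = float(np.mean(nis))
    # Under optimality, sum(NIS) is chi-square distributed with n*m degrees of freedom.
    lower = chi2.ppf(alpha / 2, n * m) / n
    upper = chi2.ppf(1 - alpha / 2, n * m) / n
    return lower <= avg_nis <= upper, avg_nis, (lower, upper)
```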

How to select appropriate statistical test.pdf

The aim of this study is to analyze the trend and variability of precipitation and streamflow in the Kunduz River Basin, located in the northeastern part of Afghanistan. The Mann-Kendall and Sen's slope statistical tests were applied to examine precipitation variability for 1961-2010 and about one decade of recorded streamflow, respectively. Monthly precipitation showed a significant downward trend in the spring months and an upward trend in summer, while annual precipitation showed a decreasing trend across the river basin. The statistical analysis of monthly and annual river flow likewise revealed declining stream discharge, indicating that the two variables are correlated. Since the time series of both hydro-climatic elements are decreasing and the basin is drying, decision-makers should consider proper water resource management projects to reduce the negative implications of the change and to strengthen water resource governance.
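For reference, a minimal Python sketch (assumed, not the authors' code) of the Mann-Kendall trend test and Sen's slope estimator on a one-dimensional series of annual values:

```python
import numpy as np
from scipy.stats import norm

def mann_kendall_sen(x):
    """Return the Mann-Kendall Z statistic, its two-sided p-value, and Sen's slope."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    pairs = [(i, j) for i in range(n - 1) for j in range(i + 1, n)]
    s = sum(np.sign(x[j] - x[i]) for i, j in pairs)           # MK S statistic
    var_s = n * (n - 1) * (2 * n + 5) / 18.0                  # variance (no tie correction)
    z = 0.0 if s == 0 else (s - np.sign(s)) / np.sqrt(var_s)  # continuity-corrected Z
    p = 2 * (1 - norm.cdf(abs(z)))                            # two-sided p-value
    sen = np.median([(x[j] - x[i]) / (j - i) for i, j in pairs])
    return z, p, sen
```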

This paper focuses on analyzing decadal and long-term changes in temperature and rainfall in Mali over the last 80 years of the 20th century. In this study, a database of these elements covering 33 weather stations across the country was designed. The first step was to subject the data sets to quality control via statistical tests. The data were then restored and used to estimate climatic normals with the sliding-average method. The comparison of rainfall distribution maps shows a decrease of 115 mm (or -18.3%) over the period from 1921 to 2000, resulting in a southward displacement of the isohyets by 135 km. This allowed us to understand the evolution of temperature and rainfall over several decades of the last century in Mali.
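As a hedged sketch of the sliding-average method mentioned above (assumed, not the authors' code), rolling 30-year climatic normals of an annual series can be computed with pandas:

```python
import pandas as pd

def sliding_normals(annual_values: pd.Series, window: int = 30) -> pd.Series:
    """Rolling `window`-year means of an annual series (classical climatic normals)."""
    return annual_values.rolling(window=window, min_periods=window).mean()

# e.g. annual_rain = pd.Series(rain_mm, index=range(1921, 2001)); sliding_normals(annual_rain)
```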

The most popular hypothesis testing procedure, the likelihood ratio test, is known to be highly non-robust in many real situations. Basu et al. (2013a) provided an alternative robust hypothesis testing procedure based on the density power divergence; however, although the authors argued intuitively for the robustness of this test and substantiated it with extensive empirical evidence, no theoretical robustness properties were presented in that work. In the present paper we consider a more general class of tests which forms a superfamily of the procedures described by Basu et al. (2013a). This superfamily derives from the class of S-divergences recently proposed by Basu et al. (2013a). In this context we theoretically prove several robustness results for the new class of tests and illustrate them in the normal model. All the theoretical robustness properties of the Basu et al. (2013a) proposal follow as special cases of our results.
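For reference (a standard definition, not restated in the abstract), the density power divergence between a data density $g$ and a model density $f_\theta$ is

$$ d_\alpha(g, f_\theta) = \int \left\{ f_\theta^{1+\alpha}(x) - \left(1 + \tfrac{1}{\alpha}\right) g(x)\, f_\theta^{\alpha}(x) + \tfrac{1}{\alpha}\, g^{1+\alpha}(x) \right\} dx, \qquad \alpha > 0, $$

which tends to the likelihood disparity (a version of the Kullback-Leibler divergence) as $\alpha \to 0$; the tests considered here are built from divergences of this general type.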

The reliability of an estimation problem is associated with the initial assumptions about the measurements, the mathematical relationships that link the measurements to the selected unknown parameters, and the process of computing those parameters. If these initial assumptions are not valid, the whole process leads to incorrect results. The correctness of the initial assumptions, and of any statistical hypothesis concerning the results of the analysis of the observations and the estimation of the unknown parameters, is checked by applying the statistical test of the general hypothesis. In ordinary applications (with average accuracy requirements), the most common cause of unreliable results is errors in the measurements, so the concept of reliability is mainly linked to the existence of gross errors in the measurements. In special applications that demand high-accuracy results, it is equally important to assess the reliability of every statistical test applied to the results of the estimation and the related computations. This paper first addresses the reliability of data snooping, and then extends the concept of reliability to the test of the general hypothesis, as applied both to the analysis of the measurements and to the estimation of the unknown parameters. Keywords: statistical testing, data snooping, internal reliability, external reliability.
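As a reference point (the standard formulation of data snooping, not quoted from the paper), each standardized least-squares residual is tested against a normal critical value,

$$ w_i = \frac{|v_i|}{\sigma_0 \sqrt{(Q_{vv})_{ii}}} \;\gtrless\; z_{1-\alpha_0/2}, $$

where $v_i$ is the $i$-th residual, $Q_{vv}$ its cofactor matrix and $\sigma_0$ the a priori standard deviation of unit weight; an observation is flagged as a suspected gross error when $w_i$ exceeds the critical value.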

The R package micompr implements a procedure for assessing whether two or more multivariate samples are drawn from the same distribution. The procedure uses principal component analysis to convert multivariate observations into a set of linearly uncorrelated statistical measures, which are then compared using a number of statistical methods. This technique is independent of the distributional properties of the samples and automatically selects the features that best explain their differences. The procedure is appropriate for comparing samples of time series, images, spectrometric measures, or similar high-dimensional multivariate observations.
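micompr itself is an R package; as a hedged Python sketch of the underlying idea (not the package's API), one can project both samples onto common principal components and compare the groups on the dominant scores:

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import mannwhitneyu

def compare_multivariate_samples(X1, X2, n_components=1):
    """X1, X2: (n_observations, n_dimensions) arrays; compare groups on the first PC."""
    scores = PCA(n_components=n_components).fit_transform(np.vstack([X1, X2]))
    s1, s2 = scores[:len(X1), 0], scores[len(X1):, 0]
    return mannwhitneyu(s1, s2)        # non-parametric test on the first PC scores
```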

A multiple-sensor system is considered in a binary hypothesis environment. All sensors are assumed independent, and the observed data are also independent. Received data are quantized and then sent to the fusion center to determine whether a target is present. The Sequential Probability Ratio Test is employed at the fusion center. The objective is to find an optimal system by minimizing the expected number of observations. Both two-level and four-level quantizers are used in the search for the optimal system. Numerical evaluations are made to find the quantizer that minimizes the expected number of observations required to decide whether the target is present. System simulations are also performed to confirm the results.
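As a minimal sketch of the fusion-center decision rule (Wald's SPRT in its textbook form; the log-likelihood-ratio function for the quantized sensor reports is an assumed input, not taken from the paper):

```python
import math

def sprt(observations, llr, alpha=0.01, beta=0.01):
    """Return 'H1' (target present), 'H0' (absent) or 'undecided', and the sample count."""
    upper = math.log((1 - beta) / alpha)    # crossing this threshold accepts H1
    lower = math.log(beta / (1 - alpha))    # crossing this threshold accepts H0
    s = 0.0
    for n, obs in enumerate(observations, start=1):
        s += llr(obs)                       # log[P(obs | H1) / P(obs | H0)]
        if s >= upper:
            return "H1", n
        if s <= lower:
            return "H0", n
    return "undecided", len(observations)
```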

There has been recent agitation amongst estate surveyors and valuers that auctioneering ought to be an aspect of their practice. This claim to supremacy appears contestable, as they have not had an exclusive preserve in this aspect of practice, given the participation of other professionals. This study therefore aimed at assessing the prospects of estate surveyors and valuers in auctioneering amongst various stakeholders. Questionnaires were distributed to one hundred and eighty-three (183) estate surveyors and valuers in Lagos State, thirty-nine (39) auction houses in Lagos State, and eleven (11) government agencies that require the services of auctioneers. Descriptive and inferential statistics such as the Relative Importance Index (RII), the chi-square test, and the Kruskal-Wallis test of significance were used in the analysis of the data. It was revealed that although the proportion of estate surveyors and valuers engaged in auctioneering is quite small, it is still substantial compared to other

The last assessment report on climate change (AR5), published by the IPCC in September 2013, affirmed that warming of the climate system is unequivocal and unprecedented since the 1950s. Statistical techniques and climate models evaluate the main changes and project possible scenarios for the twenty-first century. The objective is to present time series reconstruction techniques, variability analysis, and parameters for identifying climate change, as well as the use of climate model data for the projection of possible future scenarios.

Background. The identification of a location-, scale- and shape-sensitive test to detect differentially expressed features between two comparison groups is a key issue in high-dimensional studies. The most commonly used tests refer to differences in location, but more general distributional discrepancies can be important for revealing differential biological processes. Methods. A simulation study was conducted to compare the performance of a set of two-sample tests, i.e. Student's t, Welch's t, Wilcoxon-Mann-Whitney (WMW), Podgor-Gastwirth PG2, Cucconi, Kolmogorov-Smirnov (KS), Cramer-von Mises (CvM), Anderson-Darling (AD) and Zhang tests (Z_K, Z_C and Z_A), investigated under different distributional patterns. We applied the same tests to a real data example. Results. The AD, CvM, Z_A and Z_C tests proved to be the most sensitive in mixture distribution patterns, while still maintaining high power in normal distribution patterns. The AD test showed a loss in power of only about 2% in the comparison of two normal distributions, but a gain of about 32% with mixture distributions, relative to the parametric tests. Accordingly, the AD test detected the greatest number of differentially expressed features in the real data application. Conclusion. The tests for the general two-sample problem introduce a more general concept of 'differential expression', thus overcoming the limitations of the other tests, which are restricted to specific moments of the feature distributions. In particular, the AD test should be considered a powerful alternative to the parametric tests for feature screening, in order to keep as many discriminative features as possible for class prediction analysis.
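Several of the listed tests are available in SciPy; the following hedged sketch (not the paper's simulation code) applies them to one normal and one mixture-distributed sample (the Podgor-Gastwirth, Cucconi and Zhang tests have no SciPy implementation and are omitted):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200)                         # normal group
y = np.concatenate([rng.normal(-1.0, 0.5, 100),       # two-component mixture group
                    rng.normal(1.0, 0.5, 100)])

print("Welch t         ", stats.ttest_ind(x, y, equal_var=False).pvalue)
print("WMW             ", stats.mannwhitneyu(x, y).pvalue)
print("KS              ", stats.ks_2samp(x, y).pvalue)
print("Cramer-von Mises", stats.cramervonmises_2samp(x, y).pvalue)
print("Anderson-Darling", stats.anderson_ksamp([x, y]).significance_level)
```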

This work tackles the problem of whether the dissociation between two performances in a single-case study should be computed as the difference between the raw or between the standardized (e.g. z) scores. A wrong choice can lead to serious inflation of the probability of finding false dissociations and missing true dissociations. Two common misconceptions are that (i) standardized scores are a universally valid choice, or (ii) raw scores can be subtracted when the two performances concern the same “task/test”, otherwise standardized scores are better.
These and other rules are shown to fail in specific cases and a solution is proposed in terms of in-depth analysis of the meaning of each score. The scores that should be subtracted are those that better reflect “deficit severities” – the latent, unobservable degrees of damage to the cognitive systems that are being compared. Thus explicit theoretical modelling of the investigated cognitive function(s) – the “scenario” – is required. A flowchart is provided that guides such analysis, and shows how a given neuropsychological scenario leads to the selection of an appropriate statistical method for detecting dissociations, introducing the critical concept of “deficit equivalence criterion” – the definition of what exactly a non-dissociation should look like. One further, overlooked problem concerning standardized scores in general (as measures of effect size, of which neuropsychological dissociations are just one example) is that they cannot be meaningfully compared if they have different reliabilities.
In conclusion, when studying dissociations, increases in false-positive and false-negative risks are likely to occur when no explicit neuropsychological theory is offered that justifies the definition of what are to be considered as equivalent deficit severities in both performances, and which would lead to appropriate selection of raw, standardized, or any other type of score. More generally, the choice of any measure in any research context needs explicit theoretical modelling, without which statistical risks cannot be controlled.
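For reference, the standardized scores in question are the usual control-referenced z-scores,

$$ z_A = \frac{x_A - \bar{x}^{\,c}_A}{s^{\,c}_A}, \qquad z_B = \frac{x_B - \bar{x}^{\,c}_B}{s^{\,c}_B}, $$

where $\bar{x}^{\,c}$ and $s^{\,c}$ are the control-sample mean and standard deviation for each task; the question addressed here is whether the dissociation should be assessed on $x_A - x_B$ or on $z_A - z_B$, which the scenario-based analysis is meant to decide.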

In this paper we introduce a general framework for the automatic construction of empirical tests of randomness. Our new framework generalises and improves a previous approach (Švenda et al., 2013) and also provides a clear statistical interpretation of its results. The new approach was tested on selected stream ciphers from the eSTREAM competition. The results show that our approach can lay foundations for randomness testing and is comparable to the Statistical Test Suite developed by NIST. Additionally, the proposed approach is able to perform randomness analysis even when presented with sequences several orders of magnitude shorter than those required by the NIST suite. Although the Dieharder battery still provides a slightly better randomness analysis, our framework is able to detect non-randomness for stream ciphers with a limited number of rounds (Hermes, Fubuki) where both of the above-mentioned batteries fail.
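As a hedged illustration of the kind of elementary test found in the NIST suite (not the framework proposed in the paper), the monobit frequency test checks whether the balance of ones and zeros in a bit string is consistent with randomness:

```python
import math

def monobit_test(bits):
    """bits: iterable of 0/1. Returns the two-sided p-value; small values indicate non-randomness."""
    bits = list(bits)
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)      # +1 for each 1, -1 for each 0
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```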

While body movement patterns recorded by a smartphone accelerometer are now well understood to be discriminative enough to separate users, little work has addressed the question of whether, or how, the position in which the phone is held affects user authentication. In this work, we show, through a combination of supervised learning methods and statistical tests, that there are certain users for whom exploiting information about how the phone is held drastically improves classification performance. We propose a two-stage authentication framework that identifies the location of the phone before performing authentication, and show its benefits on a dataset of 55 users. Our work represents a major step towards bridging the gap between accelerometer-based authentication systems analyzed in a laboratory context and real accelerometer-based authentication systems in the wild, where phone positioning cannot be assumed.
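A minimal sketch (assumed, not the authors' implementation) of the two-stage idea using scikit-learn: a first classifier identifies the phone position from accelerometer features, and a per-position classifier then authenticates the user:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

class TwoStageAuthenticator:
    """Stage 1: classify phone position; stage 2: per-position user classification."""

    def __init__(self, positions):
        self.position_clf = RandomForestClassifier(random_state=0)
        self.auth_clfs = {p: RandomForestClassifier(random_state=0) for p in positions}

    def fit(self, X, positions, users):
        X, positions, users = map(np.asarray, (X, positions, users))
        self.position_clf.fit(X, positions)
        for p, clf in self.auth_clfs.items():
            mask = positions == p
            clf.fit(X[mask], users[mask])
        return self

    def predict(self, X):
        X = np.asarray(X)
        predicted_positions = self.position_clf.predict(X)
        return np.array([self.auth_clfs[p].predict(row[None, :])[0]
                         for p, row in zip(predicted_positions, X)])
```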

The paper presents a simple true random number generator (TRNG) which can be embedded in digital Application Specific Integrated Circuits (ASICs) and Field Programmable Logic Devices (FPLDs). As a source of randomness, it uses on-chip noise generated in the internal analog phase-locked loop (PLL) circuitry. In contrast with the traditionally used free-running oscillators, it uses a novel method of randomness extraction based on two rationally related synthesized clock signals. The generator has been developed for embedded cryptographic applications, where it significantly increases system security, but it can be used in a wide range of applications. The quality of the TRNG output is confirmed by applying special statistical tests, which pass even at high output bit rates of several hundred kilobits per second.

This paper investigates the profitability of technical trading rules in the Athens Stock Exchange (ASE), utilizing the FTSE Large Capitalization index over the period 2005-2012, before and during the Greek crisis. The technical rules explored are the simple moving average, the envelope (parallel bands), and the slope (regression). We compare technical trading strategies in the spirit of Brock, Lakonishok, and LeBaron (1992), employing the traditional t-test and bootstrap methodology under the random walk with drift, AR(1), and GARCH(1,1) models. We enrich our analysis with Fourier analysis (FFT) and additional statistical tests. The results provide strong evidence of the profitability of the examined technical trading rules, even during the recession period (2009-2012), and contradict the Efficient Market Hypothesis.
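For concreteness, a hedged sketch (not the paper's code) of the simple moving-average rule it examines, generating +1/-1 position signals from a short/long crossover with pandas:

```python
import pandas as pd

def sma_signals(prices: pd.Series, short: int = 1, long: int = 50) -> pd.Series:
    """+1 (long) when the short SMA is above the long SMA, otherwise -1 (out/short).
    Early values, before the long window fills, default to -1."""
    sma_short = prices.rolling(short).mean()
    sma_long = prices.rolling(long).mean()
    return (sma_short > sma_long).astype(int).replace(0, -1)
```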

In this paper we describe a general framework for decomposing three-way association measures for contingency tables. In particular, symmetric and non-symmetric measures are discussed, such as Pearson's index and Marcotorchino's index (of which the Gray–Williams index is a special case), and a new non-symmetric measure, called the Delta index, is proposed. After presenting the orthogonal decomposition of these indices, practical examples illustrating the different decompositions are given.
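As a reference formulation (standard form, not taken verbatim from the paper), Pearson's index for a three-way table with joint proportions $p_{ijk}$, assessed against complete independence of the three margins, can be written

$$ \Phi^2 = \sum_{i}\sum_{j}\sum_{k} \frac{\left(p_{ijk} - p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}\right)^{2}}{p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}}, $$

and it is orthogonal decompositions of indices of this general type that the paper develops.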

This is the third part of a series of reports dealing with the optimal significance level and optimal sample size in statistical hypothesis testing. Here, the optimal testing procedure introduced in the previous reports is used to analyze large samples (consisting of more than 100 elements). A correction to the determination of critical D-values for large samples is proposed, which accounts for the effect of sample size on the fluctuation of the D-value. In addition, a small drift in the analysis of F-tests for large samples was detected and corrected. It was also shown that conclusions based on P-values usually flip simply by increasing the sample size, whereas conclusions based on D-values are more consistent and reliable, independently of the sample size. One may commonly assume that the result of a hypothesis test improves as the sample size increases; however, for extremely large samples numerical difficulties arise which may lead to inconclusive results.