Comparison of multivariate distributions using quantile–quantile plots and related tests
Related papers
Multivariate quantile-quantile plots and related tests using spatial quantiles
2010
The univariate quantile-quantile (Q-Q) plot is a well-known graphical tool for examining whether two data sets are generated from the same distribution. It is also used to determine how well a specified probability distribution fits a given sample. In this article, we consider an extension of the Q-Q plot to multivariate data based on the spatial quantiles introduced and studied by Chaudhuri (1996) and Koltchinskii (1997). The usefulness of the proposed graphical tool is illustrated on different real and simulated data sets, some of which are fairly high dimensional. We also propose some statistical tests for distributions of multivariate samples based on spatial quantiles and study their performance compared to some other tests available in the existing literature.
Visualizing multiple quantile plots
Journal of Computational and Graphical Statistics, 2012
Multiple-quantile plots provide a powerful graphical method for comparing the distributions of two or more populations. This article develops a method of visualizing triple-quantile plots and their associated confidence tubes, thus extending the notion of a quantile–quantile (QQ) plot to three dimensions. More specifically, we consider three independent one-dimensional random samples with corresponding quantile functions $Q_1$, $Q_2$, and $Q_3$. The triple-quantile (QQQ) plot is then defined as the three-dimensional curve $Q(p) = (Q_1(p), Q_2(p), Q_3(p))$, where $0 < p < 1$. The empirical likelihood method is used to derive simultaneous distribution-free confidence tubes for $Q$. We apply our method to an economic case study of strike durations and to an epidemiological study involving the comparison of cholesterol levels among three populations. These data as well as the Mathematica code for computation of the tubes are available in the online supplementary materials.
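The curve Q(p) defined in this abstract can be sketched numerically by evaluating three empirical quantile functions on a common grid of probabilities. The following is a minimal illustration only, not the authors' Mathematica code: the function names `empirical_quantile` and `qqq_curve`, the linear-interpolation quantile rule, and the evenly spaced grid of p values are all assumptions made here, and the empirical likelihood confidence tubes are not computed.

```python
def empirical_quantile(sorted_sample, p):
    """Empirical quantile of an already sorted sample for 0 < p < 1.

    Uses linear interpolation between order statistics (an assumed
    convention; other quantile definitions are equally valid).
    """
    n = len(sorted_sample)
    pos = p * (n - 1)          # fractional index into the sorted sample
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def qqq_curve(sample1, sample2, sample3, num_points=99):
    """Return points (p, Q1(p), Q2(p), Q3(p)) on the empirical QQQ curve.

    The probabilities are k / (num_points + 1) for k = 1..num_points,
    so they stay strictly inside the open interval (0, 1).
    """
    sorted_samples = [sorted(s) for s in (sample1, sample2, sample3)]
    probs = [(k + 1) / (num_points + 1) for k in range(num_points)]
    return [
        (p,) + tuple(empirical_quantile(s, p) for s in sorted_samples)
        for p in probs
    ]


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    a = [rng.gauss(0.0, 1.0) for _ in range(200)]
    b = [rng.gauss(0.0, 1.0) for _ in range(150)]
    c = [rng.gauss(1.0, 2.0) for _ in range(180)]
    for p, q1, q2, q3 in qqq_curve(a, b, c, num_points=9):
        print(f"p={p:.1f}  Q1={q1:+.2f}  Q2={q2:+.2f}  Q3={q3:+.2f}")
```

If the three samples came from the same distribution, the points would lie close to the line Q1 = Q2 = Q3 in three-dimensional space; systematic departures from that line indicate differences in location, scale, or shape. The samples may have different sizes, since each quantile function is evaluated separately.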
A Tutorial on Quantile-Quantile Plots
2018
This is a tutorial on quantile-quantile plots, a technique for determining if different data sets originate from populations with a common distribution. The technique can be used to determine if a data set is normally distributed, and to optimize the transformation parameter of variance-stabilizing Box-Cox transformation models. An Excel link to a reproducible example is provided.
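One common way to use a normal Q-Q plot for choosing a Box-Cox parameter is to pick the value of lambda whose transformed data give the straightest plot, measured by the correlation between the ordered data and the normal quantiles. The sketch below illustrates that idea under stated assumptions; it is not taken from the tutorial or its Excel example. The plotting positions (i + 0.5) / n, the candidate grid of lambda values, and all function names are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist, mean


def normal_qq_points(sample):
    """Pair each ordered observation with a standard normal quantile.

    Plotting positions (i + 0.5) / n are an assumed convention.
    """
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def qq_correlation(sample):
    """Pearson correlation of the normal Q-Q points (1.0 = perfectly straight)."""
    points = normal_qq_points(sample)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def box_cox(sample, lam):
    """Box-Cox transform of strictly positive data; log transform at lam = 0."""
    if abs(lam) < 1e-12:
        return [math.log(x) for x in sample]
    return [(x ** lam - 1.0) / lam for x in sample]


def best_box_cox_lambda(sample, lambdas):
    """Return the candidate lambda with the highest Q-Q correlation."""
    return max(lambdas, key=lambda lam: qq_correlation(box_cox(sample, lam)))


if __name__ == "__main__":
    import random

    rng = random.Random(1)
    data = [rng.lognormvariate(0.0, 1.0) for _ in range(300)]
    grid = [k / 10 for k in range(-20, 21)]
    print("selected lambda:", best_box_cox_lambda(data, grid))
```

For log-normally distributed data the selected value is expected to fall near zero, which corresponds to the log transform. The data must be strictly positive for the Box-Cox transform to be defined.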
Quantile Analysis: A Method for Characterizing Data Distributions
Applied Spectroscopy, 1988
... a common problem in chemistry. Quantile-quantile (QQ) plots provide a useful way to attack this problem. These graphs are often used in the form of the normal probability plot, to determine whether the residuals from a fitting process are randomly distributed and therefore whether an assumed model fits the data at hand. By comparing the integrals of two ...
Journal of Statistical Theory and Practice, 2017
We introduce some new approaches for the graphical assessment of the distribution of data, which supplement the existing graphical methods. Analogous to Q-Q plots and P-P plots, we introduce plots based on the arc length and the area of the surface of revolution of the density function. Thus our method indirectly makes use not only of the assumed density but also of its derivatives. We illustrate with several examples that these plots help identify the correct distribution and rule out incorrect candidates. We further consider the problem of assessing the behavior of the data in the tail and develop graphical tools to identify the closest candidate probability distribution for the tail. Examples based on real data are provided.
A modified Q-Q plot for large sample sizes
Comunicaciones en Estadística, 2015
The Q-Q plot is a graphical tool for assessing the fit of observed data to a theoretical distribution, in which every single observation in the data is represented by a symbol (usually a dot). On many occasions, due either to natural variations of the data or to a large sample size, the Q-Q plot could be interpreted as a sign of failure of the proposed model. An alternative is to consider a special set of characteristics of the data, such as the sample quantiles, that, jointly with their theoretical counterparts, allow the user to compare the two effectively. We propose and illustrate a modified Q-Q plot that helps to visualise the differences between the observed quantiles and their corresponding theoretical values, and overcomes some technical problems of the traditional Q-Q plot.
A modified Q-Q plot for large sample sizes (Gráfico Q-Q modificado para grandes tamaños de muestra)
The Q-Q plot is a graphical tool for assessing the goodness-of-fit of observed data to a theoretical distribution, in which every single observation in the data is represented by a symbol. On many occasions, due either to natural variations of the data or to a large sample size, the Q-Q plot could be interpreted as a sign of failure of the proposed model. One alternative is to consider a special set of characteristics of the data, such as the sample quantiles, that, jointly with their theoretical counterparts, allow the user to compare the two effectively. We propose and illustrate a modified Q-Q plot that helps to visualise the differences between the observed quantiles and their corresponding theoretical values, and overcomes some technical problems of the traditional Q-Q plot.
A multivariate control quantile test using data depth
Computational Statistics & Data Analysis, 2013
The objective of this article is to present a depth-based multivariate control quantile test using statistically equivalent blocks (DSEBS). Given a random sample $\{x_1, \ldots, x_m\}$ of $\mathbb{R}^d$-valued random vectors ($d \ge 1$) with a distribution function (DF) $F$, statistically equivalent blocks (SEBS), a multivariate generalization of the univariate sample spacings, can be constructed using a sequence of cutting functions $h_i(x)$ to order $x_i$, $i = 1, \ldots, m$. DSEBS are data-driven, center-outward layers of shells whose shapes reflect the underlying geometric features of the unknown distribution and provide a framework for selection and comparison of cutting functions. We propose a control quantile test, using DSEBS, to test the equality of two DFs in $\mathbb{R}^d$. The proposed test is distribution free under the null hypothesis and well defined when $d \ge \max(m, n)$. A simulation study compares the proposed statistic to the depth-based Wilcoxon rank sum test. We show that the new test is powerful in detecting differences in location, scale and shape (skewness or kurtosis) between two multivariate distributions.
Functional, randomized and smoothed multivariate quantile regions
Journal of Multivariate Analysis, 2021
The mass transportation approach to multivariate quantiles in Chernozhukov et al. [3] was modified in Faugeras and Rüschendorf [8] by a two-step procedure. In the first step, a mass transportation problem from a spherical reference measure to the copula is solved; in the second step, the solution is combined with a marginal quantile transformation in the sample space. Also, generalized quantiles given by suitable Markov morphisms are introduced there. In the present paper, this approach is further extended by a functional approach in terms of membership functions, and by the introduction of randomized quantile regions. In addition, in the case of continuous marginals, a smoothed version of the empirical quantile regions is obtained by smoothing the empirical copula. All three extended approaches give empirical quantile areas of exact level and improved stability. The resulting depth areas give a valid representation of the central quantile areas of a multivariate distribution and provide a valuable tool for their analysis.