Getting to Know The Scatter Plot

Where two features in a study can be measured accurately, a visual presentation such as a scatter plot may indicate an interesting relationship if the points do not appear random. This helps researchers understand the relationship between different variables in a particular dataset. It also helps investigators decide how to analyze the data further by applying the most suitable statistical models. The chief aim of the present article, therefore, is to examine one of the most powerful graphical diagrams for data visualization, i.e. the scatter plot, using a real public health dataset.

Scatterplots: Tasks, Data, and Designs

Transactions on Visualization and Computer Graphics, 2017

Traditional scatterplots fail to scale as the complexity and amount of data increase. In response, there exist many design options that modify or expand the traditional scatterplot design to meet these larger scales. This breadth of design options creates challenges for designers and practitioners who must select appropriate designs for particular analysis goals. In this paper, we help designers make design choices for scatterplot visualizations. We survey the literature to catalog scatterplot-specific analysis tasks. We look at how data characteristics influence design decisions. We then survey scatterplot-like designs to understand the range of design options. Building upon these three organizations, we connect data characteristics, analysis tasks, and design choices in order to generate challenges, open questions, and example best practices for the effective design of scatterplots.

The Many Faces of a Scatterplot

Journal of the American Statistical Association, 1984

The scatterplot is one of our most powerful tools for data analysis. Still, we can add graphical information to scatterplots to make them considerably more powerful. These graphical additions, faces of sorts, can enhance capabilities that scatterplots already have or can add whole new capabilities that faceless scatterplots do not have at all. The additions we discuss here, some new and some old, are (a) sunflowers, (b) category codes, (c) point cloud sizings, (d) smoothings for the dependence of y on x (middle smoothings, spread smoothings, and upper and lower smoothings), and (e) smoothings for the bivariate distribution of x and y (pairs of middle smoothings, sum-difference smoothings, scale-ratio smoothings, and polar smoothings). The development of these additions is based in part on a number of graphical principles that can be applied to the development of statistical graphics in general.
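As a rough illustration of what a "middle smoothing" of y on x can look like, the sketch below computes a running median over x-ordered points. This is a deliberately simple stand-in, not the smoother used in the paper; the function name `running_median` and the `window` parameter are hypothetical names chosen for this example.

```python
from statistics import median


def running_median(xs, ys, window=5):
    """Return (x, smoothed_y) pairs using a sliding median over x-ordered points.

    Illustrative only: a running median is one simple way to trace the
    "middle" of y as x changes. `window` must be a positive odd integer.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    pairs = sorted(zip(xs, ys))  # order the points by x (ties broken by y)
    half = window // 2
    smoothed = []
    for i, (x, _) in enumerate(pairs):
        lo = max(0, i - half)                # the window is truncated at both ends
        hi = min(len(pairs), i + half + 1)
        neighbours = [y for _, y in pairs[lo:hi]]
        smoothed.append((x, median(neighbours)))
    return smoothed
```

Plotting the returned pairs as a line over the raw points gives a curve that resists single outliers, which is the usual reason to prefer a median over a mean for this kind of summary.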

Scatter plotting in multivariate data analysis

Journal of Chemometrics, 2003

In data analysis, many situations arise where plotting and visualization are helpful or an absolute requirement for understanding. There are many techniques for plotting data, parameters, and residuals. These techniques have to be understood, and the visualizations have to be made clearly and interpreted correctly. In this paper the classical favourites in chemometrics, scatter plots, are looked into more deeply and some criticism based on recent literature references is formulated for situations of principal component analysis, PARAFAC three-way analysis and regression by partial least squares. Biplots are also afforded some attention. Examples from near-infrared spectroscopy are given as illustrations.

Scatterplots: Basics, enhancements, problems and solutions

The scatter plot is a basic tool for presenting information on two continuous variables. While the basic plot is good in many situations, enhancements can increase its utility. I also go over tools to deal with the problem of overplotting.
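One common way to cope with overplotting is to bin the points on a grid and show the count per cell (as colour, size, or a glyph) instead of drawing every point. The abstract does not say which tools the author covers, so the sketch below is only an example of the general idea; `bin_counts` and `cell_size` are hypothetical names.

```python
import math
from collections import Counter


def bin_counts(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell.

    Cells are indexed by (column, row) integers, where cell (0, 0) covers
    0 <= x < cell_size and 0 <= y < cell_size.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    counts = Counter()
    for x, y in points:
        # floor() keeps negative coordinates in the correct cell
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts
```

A plotting layer can then draw one mark per occupied cell, scaled or shaded by its count, so dense regions stay readable.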

Evaluation on interactive visualization data with scatterplots

Visual Informatics, 2020

Scatterplots and scatterplot matrix methods have been popularly used for showing statistical graphics and for exposing patterns in multivariate data. A recent technique, called Linkable Scatterplots, provides an interesting idea for interactive visual exploration: it provides a set of necessary plot panels on demand, together with interaction, linking and brushing. This article presents a controlled study with a mixed-model design to evaluate the effectiveness and user experience of visual exploration when using Sequential-Scatterplots, where a single plot is shown at a time; Multiple-Scatterplots, where the number of plots can be specified and shown; and Simultaneous-Scatterplots, where all plots are shown as a scatterplot matrix. Results from the study demonstrated higher accuracy using the Multiple-Scatterplots visualization, particularly in comparison with the Simultaneous-Scatterplots. While the time taken to complete tasks was longer with the Multiple-Scatterplots technique than with the simpler Sequential-Scatterplots, Multiple-Scatterplots is inherently more accurate. Moreover, the Multiple-Scatterplots technique was the most highly preferred and positively experienced technique in this study. Overall, results support the strength of Multiple-Scatterplots and highlight its potential as an effective data visualization technique for exploring multivariate data.

The early origins and development of the scatterplot

Journal of the History of the Behavioral Sciences, 2005

Of all the graphic forms used today, the scatterplot is arguably the most versatile, polymorphic, and generally useful invention in the history of statistical graphics. Its use by Galton led to the discovery of correlation and regression, and ultimately to much of present multivariate statistics. So, it is perhaps surprising that there is no one widely credited with the invention of this idea. Even more surprising is that there are few contenders for this title, and this question seems not to have been raised before. This article traces some of the developments in the history of this graphical method, the origin of the term scatterplot, the role it has played in the history of science, and some of its modern descendants. We suggest that the origin of this method can be traced to its unique advantage: the possibility to discover regularity in empirical data by smoothing and other graphic annotations to enhance visual perception.

How to visualize public health data? Part one: Box plot and map

Health care professionals, including family physicians, are increasingly becoming involved in public health data analyses. Data visualisation is a way to disclose complex structures within data. The chief aim of the present series of two articles is to discuss the pros and cons of two ways of data visualisation, i.e. the box plot and the map, using a real public health data example.

Generalized scatter plots

Information …, 2010

Scatter plots are one of the most powerful and most widely used techniques for visual data exploration. A well-known problem is that scatter plots often have a high degree of overlap, which may occlude a significant portion of the data values shown. In this paper, we propose the generalized scatter plot technique, which allows an overlap-free representation of large data sets to fit entirely into the display. The basic idea is to allow the analyst to optimize the degree of overlap and distortion to generate the best possible view. To allow effective use, we provide the capability to zoom smoothly between the traditional and our generalized scatter plots. We identify an optimization function that takes overlap and distortion of the visualization into account. We evaluate the generalized scatter plots according to this optimization function, and show that there usually exists an optimal compromise between overlap and distortion. Our generalized scatter plots have been applied successfully to a number of real-world IT services applications, such as server performance monitoring, telephone service usage analysis and financial data, demonstrating the benefits of the generalized scatter plots over traditional ones.
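To make "degree of overlap" concrete, the sketch below estimates the share of points that would be hidden if each point were drawn as a single pixel on a display of a given size. It is an assumption-laden illustration and not the optimization function from the paper, which the abstract does not spell out; `overlap_fraction`, `width`, and `height` are hypothetical names.

```python
def overlap_fraction(points, width, height):
    """Estimate the fraction of points hidden behind another point.

    Each (x, y) point is mapped linearly onto a width x height pixel grid;
    every point beyond the first in a pixel counts as occluded.
    """
    if not points:
        raise ValueError("points must not be empty")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def to_pixel(value, lo, hi, size):
        if hi == lo:                       # all values identical: single column/row
            return 0
        scaled = int((value - lo) / (hi - lo) * size)
        return min(size - 1, scaled)       # clamp the maximum value onto the last pixel

    occupied = {
        (to_pixel(x, min(xs), max(xs), width), to_pixel(y, min(ys), max(ys), height))
        for x, y in points
    }
    return 1 - len(occupied) / len(points)
```

A value near 0 means almost every point is visible; a value near 1 means most of the data is occluded and a layout with less overlap may be worth the added distortion.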