The Many Faces of a Scatterplot

Scatterplots: Tasks, Data, and Designs

Transactions on Visualization and Computer Graphics, 2017

Traditional scatterplots fail to scale as the complexity and amount of data increase. In response, there exist many design options that modify or expand the traditional scatterplot design to meet these larger scales. This breadth of design options creates challenges for designers and practitioners who must select appropriate designs for particular analysis goals. In this paper, we help designers in making design choices for scatterplot visualizations. We survey the literature to catalog scatterplot-specific analysis tasks. We look at how data characteristics influence design decisions. We then survey scatterplot-like designs to understand the range of design options. Building upon these three organizations, we connect data characteristics, analysis tasks, and design choices in order to generate challenges, open questions, and example best practices for the effective design of scatterplots.

The early origins and development of the scatterplot

Journal of the history of the behavioral sciences, 2005

Of all the graphic forms used today, the scatterplot is arguably the most versatile, polymorphic, and generally useful invention in the history of statistical graphics. Its use by Galton led to the discovery of correlation and regression, and ultimately to much of present multivariate statistics. So, it is perhaps surprising that there is no one widely credited with the invention of this idea. Even more surprising is that there are few contenders for this title, and this question seems not to have been raised before. This article traces some of the developments in the history of this graphical method, the origin of the term scatterplot, the role it has played in the history of science, and some of its modern descendants. We suggest that the origin of this method can be traced to its unique advantage: the possibility to discover regularity in empirical data by smoothing and other graphic annotations to enhance visual perception.

Applying Statistical Graphics to Multivariate Data

1986

Graphical techniques for displaying, examining, and analyzing multivariable observations are discussed. Graphical methods that reveal important features of data serve to complement and illuminate formal statistical inferences. Recently developed graphical displays having practical value for applied work with high-dimensional data are emphasized. Star plots, faces, and trees are examples of such methods. The strengths and weaknesses of these and other techniques for dealing with data from applied situations will be treated and compared.

On visualisation of statistical data

Proceedings. 1997 IEEE Conference on Information Visualization (Cat. No.97TB100165), 1997

In many applications a need often arises to represent numeric data in a form which has more visual impact. Whether the data consists of demographic information or is just a listing of financial business trends, their interpretation and meaning is simpler to comprehend through a pictorial representation than otherwise. In fact, the requirement in practice is such that visualisation needs to take place on-the-fly. This implies that the process of transforming static data into a diagrammatic form needs to be dynamic.

Scatterplots: Basics, enhancements, problems and solutions

The scatter plot is a basic tool for presenting information on two continuous variables. While the basic plot is good in many situations, enhancements can increase its utility. I also go over tools to deal with the problem of overplotting.

Generalized scatter plots

Information …, 2010

Scatter plots are one of the most powerful and most widely used techniques for visual data exploration. A well-known problem is that scatter plots often have a high degree of overlap, which may occlude a significant portion of the data values shown. In this paper, we propose the generalized scatter plot technique, which allows an overlap-free representation of large data sets to fit entirely into the display. The basic idea is to allow the analyst to optimize the degree of overlap and distortion to generate the best possible view. To allow effective usage, we provide the capability to zoom smoothly between the traditional and our generalized scatter plots. We identify an optimization function that takes overlap and distortion of the visualization into account. We evaluate the generalized scatter plots according to this optimization function, and show that there usually exists an optimal compromise between overlap and distortion. Our generalized scatter plots have been applied successfully to a number of real-world IT services applications, such as server performance monitoring, telephone service usage analysis and financial data, demonstrating the benefits of the generalized scatter plots over traditional ones.