multivariate visualization Research Papers - Academia.edu

2025, IEICE Technical Report; IEICE Tech. Rep.

2025, IEICE technical report. Speech

Antonymic adjectives in discourse can be used to express opposing properties of a meaning dimension. An example of this can be found in wine tasting notes, where the characteristics of the wine are described using a number of different dimensions. In this paper, we examine the change in use over time of antonym pairs in wine tasting notes. In particular, we analyze the change in the use of thick and thin by word co-occurrence frequency, generating visualizations for diachronic analysis.

2025, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis

One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.
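
The idea behind histogram-based parallel coordinates can be illustrated with a minimal sketch. Instead of drawing one polyline per record, each pair of adjacent axes gets a 2D histogram that counts how many records pass between each pair of bins. The function names `bin_index` and `pair_histograms`, the equal-width binning, and the sample records are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to an equal-width bin index in 0..bins-1."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * bins)
    return min(max(i, 0), bins - 1)

def pair_histograms(records, bins=4):
    """Return one Counter per adjacent axis pair, keyed by (bin_a, bin_b)."""
    n_axes = len(records[0])
    lows = [min(r[k] for r in records) for k in range(n_axes)]
    highs = [max(r[k] for r in records) for k in range(n_axes)]
    hists = []
    for k in range(n_axes - 1):
        counts = Counter()
        for r in records:
            a = bin_index(r[k], lows[k], highs[k], bins)
            b = bin_index(r[k + 1], lows[k + 1], highs[k + 1], bins)
            counts[(a, b)] += 1
        hists.append(counts)
    return hists

# Hypothetical records with three variables each.
records = [(0.0, 10.0, 1.0), (1.0, 20.0, 1.0), (1.0, 20.0, 3.0), (0.5, 10.0, 2.0)]
hists = pair_histograms(records, bins=2)
for k, h in enumerate(hists):
    print(f"axes {k}-{k + 1}:", dict(h))
```

A renderer would draw one band per non-empty `(bin_a, bin_b)` cell and shade it by its count, so the drawing cost depends on the number of bins rather than the number of records.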

2025, Journal of Archaeological Science

Simultaneous analysis of relationships between multiple artifact classes is required for characterization of many types of activity areas. This paper illustrates improved forms of multivariate visualization, spatial analysis and integration of experimental results that are possible with GIS based photomapping. Techniques are demonstrated through analysis of a hearth associated artifact scatter exposed during excavations of a Late Archaic pithouse at Jiskairumoko, Peru. A multivariate density raster is created and additive color visualization is used for simultaneous display of three artifact distributions. Performing unconstrained clustering in a GIS, space is classified by simultaneous relative density relationships between multiple object types.
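
Additive color display of three distributions can be sketched as follows, assuming each artifact class has already been converted to a density grid of the same shape. Each grid is scaled to 0..255 and assigned to one RGB channel, so cells where classes overlap show mixed colors. The class names `lithics`, `bone`, and `ceramics` and all values are hypothetical; the abstract does not name the three classes, and this is not the authors' GIS workflow.

```python
def normalize(grid):
    """Scale a 2D grid of non-negative densities to integers in 0..255."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0 for _ in row] for row in grid]
    return [[round(255 * v / peak) for v in row] for row in grid]

def additive_rgb(red_density, green_density, blue_density):
    """Combine three same-shaped density grids into one grid of (R, G, B) tuples."""
    r = normalize(red_density)
    g = normalize(green_density)
    b = normalize(blue_density)
    return [
        [(r[i][j], g[i][j], b[i][j]) for j in range(len(r[0]))]
        for i in range(len(r))
    ]

# Hypothetical 2x2 density rasters, one per artifact class.
lithics = [[0, 3], [3, 0]]
bone = [[0, 0], [2, 2]]
ceramics = [[0, 0], [0, 0]]

image = additive_rgb(lithics, bone, ceramics)
for row in image:
    print(row)
```

In this toy raster the lower-left cell comes out yellow because the red and green channels are both at full intensity there, which is how co-occurring densities become visible in a single image.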

2024, Visualization and Data Analysis 2011

Chromatography is a technique used to separate and quantify the components in a complex chemical mixture. We have created a 3D visualization system capable of comparing the chemical properties of chromatographic systems. The visualization system combines scatter plots, parallel coordinates, and specialized glyphs to assist in the analysis of chromatographic data and comparisons of multiple systems. Using this tool, numerous separation systems can be readily compared simultaneously, greatly facilitating the selection of systems that are likely to produce desired separations during method development.

2024

We present a practical and general-purpose approach to large and complex visual data analysis where visualization processing, rendering and subsequent human interpretation is constrained to the subset of data deemed interesting by the user. In many scientific data analysis applications, "interesting" data can be defined by compound Boolean range queries of the form (temperature > 1000) AND (70 < pressure < 90). As data sizes grow larger, a central challenge is to answer such queries as efficiently as possible. Prior work in the visualization community has focused on answering range queries for scalar fields within the context of accelerating the search phase of isosurface algorithms. In contrast, our work describes an approach that leverages state-of-the-art indexing technology from the scientific data management community called "bitmap indexing." Our implementation, which we call "DEX" (short for dextrous data explorer), uses bitmap indexing to efficiently answer multivariate, multidimensional data queries to provide input to a visualization pipeline. We present an analysis overview and benchmark results that show bitmap indexing offers significant storage and performance improvements when compared to previous approaches for accelerating the search phase of isosurface algorithms. More importantly, since bitmap indexing supports complex multidimensional, multivariate range queries, it is more generally applicable to scientific data visualization and analysis problems. In addition to benchmark performance and analysis, we apply DEX to a typical scientific visualization problem encountered in combustion simulation data analysis.
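
A compound range query of this kind can be answered with a binned bitmap index, sketched here under simplifying assumptions. Each variable has one bitmap per bin (a Python integer used as a bit set), a range condition is the OR of the bitmaps of the covered bins, and the conjunction is a bitwise AND. The function names, bin edges, and data are illustrative assumptions and do not reproduce the DEX implementation. To keep the sketch short, the query bounds are assumed to coincide with bin edges and intervals are half-open, so the example evaluates (temperature >= 1000) AND (70 <= pressure < 90); bounds that fall inside a bin would additionally require checking those candidate records against the raw values.

```python
def build_bitmap_index(values, bin_edges):
    """One bitmap per bin; bit i is set when record i falls in that bin.

    Bin k covers bin_edges[k] <= v < bin_edges[k + 1].
    """
    bitmaps = [0] * (len(bin_edges) - 1)
    for i, v in enumerate(values):
        for k in range(len(bin_edges) - 1):
            if bin_edges[k] <= v < bin_edges[k + 1]:
                bitmaps[k] |= 1 << i
                break
    return bitmaps

def range_query(bitmaps, bin_edges, lo, hi):
    """OR together the bitmaps of all bins lying inside [lo, hi).

    Assumes lo and hi coincide with bin edges.
    """
    result = 0
    for k in range(len(bin_edges) - 1):
        if lo <= bin_edges[k] and bin_edges[k + 1] <= hi:
            result |= bitmaps[k]
    return result

# Hypothetical records and bin edges.
temperature = [900, 1200, 1500, 1100, 800]
pressure = [75, 85, 95, 60, 80]
temp_edges = [0, 500, 1000, 1500, 2000]
pres_edges = [50, 60, 70, 80, 90, 100]

temp_index = build_bitmap_index(temperature, temp_edges)
pres_index = build_bitmap_index(pressure, pres_edges)

hot = range_query(temp_index, temp_edges, 1000, 2000)
mid_pressure = range_query(pres_index, pres_edges, 70, 90)
mask = hot & mid_pressure  # Boolean AND of the two conditions

hits = [i for i in range(len(temperature)) if (mask >> i) & 1]
print(hits)
```

Only the record indices in `hits` would be passed on to the visualization pipeline; the raw data is never scanned during the query itself.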

2023, 25th International Conference on Software Engineering, 2003. Proceedings.

This paper proposes automated support for classifying reported software failures in order to facilitate prioritizing them and diagnosing their causes. A classification strategy is presented that involves the use of supervised and unsupervised pattern classification and multivariate visualization. These techniques are applied to profiles of failed executions in order to group together failures with the same or similar causes. The resulting classification is then used to assess the frequency and severity of failures caused by particular defects and to help diagnose those defects. The results of applying the proposed classification strategy to failures of three large subject programs are reported. These results indicate that the strategy can be effective.

2023, 15th International Symposium on Software Reliability Engineering

Recent research has addressed the problem of providing automated assistance to software developers in classifying reported instances of software failures so that failures with the same cause are grouped together. In this paper, two new tree-based techniques are presented for refining an initial classification of failures. One of these techniques is based on the use of dendrograms, which are rooted trees used to represent the results of hierarchical cluster analysis. The second technique employs a classification tree constructed to recognize failed executions. With both techniques, the tree representation is used to guide the refinement process. We also report the results of experimentally evaluating these techniques on several subject programs.

2023

Aggregating items can simplify the display of huge quantities of data values at the cost of losing information about the attribute values of the individual items. We propose a distribution glyph, in both two- and three-dimensional forms, which specifically addresses the concept of how the aggregated data is distributed over the possible range of values. It is capable of displaying distribution, variability and extent information for up to four attributes at a time of multivariate, clustered data. User studies validate the concept, showing that both glyphs are just as good as raw data and the 3D glyph is better for answering some questions.

2023

As data is generated continuously around the world, the importance of data mining and visualization will keep increasing. Mining helps to extract significant insight from large volumes of data. That insight then needs to be presented in a way that everyone can understand, and visualization serves this purpose. The most common ways to visualize data are charts and tables. Visualization plays an important role in the decision-making process in industry. It makes better use of the human eye to assist the brain, so that datasets can be analyzed and visual presentations prepared. Visualization and data mining complement each other. In this paper we present the anatomy of the visualization process.

2023, IEEE Transactions on Knowledge and Data Engineering

In this study, a conceptually simple, yet flexible and extendable strategy to contrast two different color images is introduced. The proposed approach is based on the multivariate Wald-Wolfowitz test, a nonparametric test that assesses the commonality between two different sets of multivariate observations. It provides an aggregate gauge of the match between color images, taking into consideration all the (selected) low-level characteristics, while alleviating correspondence issues. We show that a powerful measure of similarity between two color images can emerge from the statistical comparison of their representations in a properly formed feature space. For the sake of simplicity, the RGB-space is selected as the feature space, and we experiment with different ways to represent the images within this space. By altering the feature-extraction implementation, complementary ways to portray the image content appear. The reported results, from the application on a diverse collection of images, clearly demonstrate the effectiveness of our method, its superiority over previous methods, and suggest that even further improvements can be achieved along the same line of research. It is not only the unifying character that makes our strategy appealing, but also the fact that the retrieval performance does not increase continuously with the amount of details in the image representation. The latter sets an upper limit to the computational demands and is reminiscent of the performance plateaus reached by novel approaches in information retrieval.

2023, Lecture Notes in Computer Science

We introduce an iterative feature-based transfer function design that extracts and systematically incorporates multivariate feature-local statistics into a texture-based volume rendering process. We argue that an interactive multivariate feature-local approach is advantageous when investigating ill-defined features, because it provides a physically meaningful, quantitatively rich environment within which to examine the sensitivity of the structure properties to the identification parameters. We demonstrate the efficacy of this approach by applying it to vortical structures in Taylor-Green turbulence. Our approach identified the existence of two distinct structure populations in these data, which cannot be isolated or distinguished via traditional transfer functions based on global distributions.

2023, Lecture Notes in Computer Science

Knowledge extraction from data volumes of ever increasing size requires ever more flexible tools to facilitate interactive query. Interactivity enables real-time hypothesis testing and scientific discovery, but can generally not be achieved without some level of data reduction. The approach described in this paper combines multi-resolution access, region-of-interest extraction, and structure identification in order to provide interactive spatial and statistical analysis of a terascale data volume. Unique aspects of our approach include the incorporation of both local and global statistics of the flow structures, and iterative refinement facilities, which combine geometry, topology, and statistics to allow the user to effectively tailor the analysis and visualization to the science. Working together, these facilities allow a user to focus the spatial scale and domain of the analysis and perform an appropriately tailored multivariate visualization of the corresponding data. All of these ideas and algorithms are instantiated in a deployed visualization and analysis tool called VAPOR, which is in routine use by scientists internationally. In data from a 1024^3 simulation of a forced turbulent flow, VAPOR allowed us to perform a visual data exploration of the flow properties at interactive speeds, leading to the discovery of novel scientific properties of the flow, in the form of two distinct vortical structure populations. These structures would have been very difficult (if not impossible) to find with statistical overviews or other existing visualization-driven analysis approaches. This kind of intelligent, focused analysis/refinement approach will become even more important as computational science moves towards petascale applications. 1 Challenges to Data Analysis A critical disparity is growing in the field of computational science: our ability to generate numerical data from scientific computations has in many cases

2023, Proceedings of the 12th International Symposium on Visual Information Communication and Interaction

2023, Information Visualization

Multidimensional projection techniques have become essential analytical tools. Typically, they map data from a high-dimensional space into a low-dimensional visual space, preserving distance or neighborhood structures on the produced layout. Despite the advances, with faster and highly precise techniques, existing methods still carry deficiencies that impair their use as exploratory tools. An example is the mismatching that can occur between what the user considers similar/dissimilar and what is conveyed by the visual representation. Recently, a class of projection techniques aims at addressing this limitation, allowing users to control the projection process by changing the distance relationships using small data samples. Among such methods, Local Affine Multidimensional Projection has proved to be the state-of-the-art regarding the effectiveness of user intervention. Although Local Affine Multidimensional Projection has attained a relative success, it is limited to certain applica...

2023, Lecture Notes in Computer Science

When faced with complex situations, it can often be hard to put them into words and express them accurately. This becomes increasingly difficult when specialist expressions are required that are not used in everyday language. The problem arises when trying to describe to another person a wine that you just drank, or to describe a wine that you want to drink to a waiter at a restaurant or a shop assistant. It requires expressing in words numerous senses, including complex flavors, smells, colors, and the personal emotion that is felt. These expressions are often subjective, with different people using different expressions for the same wine. In this paper, we propose the use of wine-related expressions collected from the internet and clustered to generate mind maps.

2023, Expert Systems with Applications

In statistics, machine learning, and related fields, feature selection is the process of choosing a smaller subset of features to work with. This is an important topic since selecting a subset of features can help analysts to interpret models and data, and to decrease computational runtimes. While many techniques are purely automatic, the data visualization community has produced a number of interactive approaches where users can make decisions taking into account their domain knowledge. In this paper we propose a new visualization technique based on radial axes that allows analysts to perform feature selection effectively, in contrast to previous radial axes methods. This is achieved by employing alternative scaled axes that provide insight regarding the features that have a smaller contribution to the visualizations. Therefore, analysts can use the technique to carry out interactive backwards feature elimination, by discarding the least relevant features according to the information on the plots and their expertise. Our approach can be coupled with any linear dimensionality reduction method, and can be used when performing analyses of cluster structure, correlations, class separability, etc. Specifically, in this paper we focus on combining the proposed technique with methods designed for classification. Lastly, we illustrate the effectiveness of our proposal through a case study analyzing high-dimensional medical chronic conditions data. In particular, clinicians have used the technique for determining the most important features that discriminate between patients with diabetes and high blood pressure.

2023, Vision Modeling and Visualization

The motion of a fluid is affected by several intertwined flow aspects. Analyzing one aspect at a time can only yield partial information about the flow behavior. More details can be revealed by studying their interactions. Our approach enables the investigation of these interactions by simultaneously visualizing meaningful flow aspects, such as swirling motion and shear strain. We adopt the notions of relevance and coherency. Relevance identifies locations where a certain flow aspect is deemed particularly important. The related piece of information is visualized by a specific visual entity, placed at the corresponding location. Coherency instead represents the homogeneity of a flow property in a local neighborhood. It is exploited in order to avoid visual redundancy and to reduce occlusion and cluttering. We have applied our approach to three CFD datasets, obtaining meaningful insights.

2023, Computer Methods and Programs in Biomedicine

The paper addresses the possibility to replace cluttered multi-group scatter-plots with augmented convex hull plots. By replacing scatter-plot points with convex hulls, space is gained for visualization of descriptive statistics with error bars or confidence ellipses within the convex hulls. An informative addition to the plot is calculation of the area of convex hull divided by corresponding group size as a bivariate dispersion measure. Marginal distributions can be depicted on the sides of the main plot in established ways. Bivariate density plots might be used instead of convex hulls in the presence of outliers. Like any scatter-plot type visualization, the technique is not limited to raw data: points can be derived from any dimension reduction technique, or simple functions can be used as axes instead of original dimensions. The limited possibilities for producing such plots in existing software are surveyed, and our general and flexible implementation in R, the publicly available chplot function, is presented. Examples based on our daily biostatistical consulting practice illustrate the technique with various options.
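
The dispersion measure mentioned in this abstract, hull area divided by group size, can be sketched in a few lines. The hull is computed here with Andrew's monotone chain algorithm and the area with the shoelace formula; both are standard choices swapped in for illustration. The function names and sample points are assumptions, and this Python sketch is unrelated to the authors' R function chplot.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def hull_dispersion(points):
    """Convex hull area divided by the number of points in the group."""
    return polygon_area(convex_hull(points)) / len(points)

# Hypothetical group: four corners of a 4 x 3 rectangle plus one interior point.
group = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
print(convex_hull(group))
print(hull_dispersion(group))
```

Because the measure depends only on the outermost points, a single outlier can inflate it considerably, which is consistent with the abstract's suggestion to use density plots when outliers are present.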

2023, Hygeia : Revista Brasileira de Geografia Médica e da Saúde

In recent decades, the Aedes aegypti mosquito, the vector of dengue, Zika, chikungunya and other diseases, has become an increasing concern in Brazil and other tropical countries. However, the spatial distribution of these diseases in Brazil is not homogeneous, implying that there are distinct risk levels and various guiding strategies for fighting these diseases. This paper presents a didactic exploratory multivariate spatial data analysis of dengue, Zika and chikungunya cases in Brazil in 2016 in the context of an undergraduate course. The students elaborated, interpreted and discussed multivariate maps using techniques such as proportional symbols, hachure density, bivariate and trivariate choropleth composition, contiguous cartograms, conditional maps and matrices of maps and graphs. The students investigated the various interpretative possibilities based on the intrinsic and extrinsic combinations of multivariate geovisualizations. Through their interpretations of the spatial r...

2022

Analyzing large amounts of complex movement data requires appropriate visual and analytical methods. This paper proposes a 2-D star-icon based visualization technique for the visual exploration of multivariate movement events in a space-time cube. To test the proposed method, we derive multivariate events from massive real-world floating car data and visually explore spatio-temporal patterns. The experimental results show that our proposed methods are helpful in identifying interesting locations or functional areas, and assist the understanding of dynamic patterns.

2022, Lecture Notes in Computer Science

Visualization is a key issue for multivariate data analysis. Multivariate visualization is an active research topic and many efforts have been made in order to find suitable and meaningful visual representations. In this paper we present a technique for data projection in multivariate datasets, named Target Data Projection (TDP). Through this technique a vector is created for each multivariate data item considering a subset of the available variables. A new scalar variable is generated projecting those vectors over a target vector that defines the direction of interest for visual analysis. End-users set up target vectors in order to explore particular relationships by means of application meaningful projections. Hence, it is possible to map a combination of multivariate data into one scalar variable for graphical representation and interaction. This technique has proved to be very flexible and useful in mine planning providing valuable information for decision making.
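
The projection step described in this abstract can be sketched as a scalar projection: for each data item, the values of the chosen variables form a vector, and the new scalar is the length of that vector's component along the target vector. Reading "projecting over a target vector" as the dot product divided by the target's norm is an assumption, as are the function name `target_projection`, the variable names, and the sample values.

```python
import math

def target_projection(items, variables, target):
    """Scalar projection of each item's selected variables onto the target vector."""
    norm = math.sqrt(sum(t * t for t in target))
    if norm == 0:
        raise ValueError("target vector must be non-zero")
    return [
        sum(item[name] * t for name, t in zip(variables, target)) / norm
        for item in items
    ]

# Hypothetical data items; only 'grade' and 'hardness' enter the projection.
items = [
    {"grade": 3.0, "hardness": 4.0, "depth": 120.0},
    {"grade": 4.0, "hardness": -3.0, "depth": 80.0},
    {"grade": 1.0, "hardness": 0.0, "depth": 95.0},
]
variables = ("grade", "hardness")
target = (3.0, 4.0)  # direction of interest chosen by the end user

scores = target_projection(items, variables, target)
print(scores)
```

The resulting list holds one scalar per item, which can then be mapped to color or size in an ordinary univariate display. An item orthogonal to the target direction, like the second one here, scores zero.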

2022

To extend the scope of multivariate data visualization, the notion of comparative visualization is introduced: it allows the comparison of visualization methods by interconnecting several different graphic displays. This linking of visualizations, together with the possibility to interactively manipulate data, enables an analyst to display the same data set with a number of conceptually different visualization methods simultaneously and to carry out graphical operations across them. Graphical effects in different displays not only reveal information about the data themselves, they also provide the basis to investigate how the different visualization methods relate to each other. With the "VisuLab", we developed a software tool for personal computers to investigate comparative multivariate data visualization.

2022

We present a set of visualization methods for the analysis of multivariate data recorded from the measurement of the performance of athletes during training. We use a modified training device to measure the force, acceleration, displacement, and speed of the athlete's feet and arms while performing a certain training exercise. We are interested in visually measuring and comparing the performance over several training sessions of the same and/or different athletes. For this, we adapt and extend several visualization methods for multivariate data. First, we use an enhanced signal plot and statistics plot to visualize the regularity of repetitions within a given exercise. Second, we use a novel texture-based signal plot to eliminate signal noise and emphasize the average repetitive pattern of the exercise. Finally, we use a signal clustering technique, visualized with a matrix plot, to detect similar exercises over long periods of time. We demonstrate our approaches with actual data from training sessions of several athletes.

2022

We present LitViz, a web-based tool for visualizing literary data which utilizes the text2voronoi algorithm (Mehler et al., 2016b) to map natural language texts onto Voronoi diagrams. These diagrams can be used, for example, to visually differentiate between (groups of) authors. Text2voronoi utilizes the paradigm of text visualization to reconstruct text classification (e.g., authorship attribution) as a task of image classification. This means that, in contrast to conventional approaches to text classification, we do not directly use linguistic features, but explore visual features derived from the texts' visualizations to perform operations on texts. We illustrate LitViz by means of 18 authors, each of whom is represented by 5 literary works.

2022

Antonymic adjectives in discourse can be used to express opposing properties of a meaning dimension. An example of this can be found in wine tasting notes, where the characteristics of the wine are described using a number of different dimensions. In this paper, we examine the change in use patterns over time of antonym pairs in wine tasting notes. We examine the change in the use of thick and thin by analyzing words that co-occur in the same tasting note as thick or thin and generate visualizations for diachronic analysis.
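
The co-occurrence analysis described here can be sketched as follows. This is a minimal illustration, assuming each tasting note is available as a (year, text) pair and that whitespace tokenization is good enough; the function name, the sample notes, and the per-year grouping are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def cooccurrence_by_year(notes, targets=("thick", "thin")):
    """Count words that share a tasting note with each target word, per year."""
    counts = defaultdict(Counter)  # (year, target) -> Counter of co-occurring words
    for year, text in notes:
        vocabulary = set(text.lower().split())
        for target in targets:
            if target in vocabulary:
                # Each note contributes at most once per co-occurring word.
                counts[(year, target)].update(vocabulary - {target})
    return counts


# Hypothetical tasting notes, for illustration only.
notes = [
    (1995, "thick and jammy fruit"),
    (1995, "thin acidic finish"),
    (2005, "thick texture with ripe fruit"),
]
counts = cooccurrence_by_year(notes)
```

Plotting the counts for a given word across years gives a simple diachronic view of how the company kept by thick and thin changes.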

2022

The purpose of the article is to shed light on how experiences of sensory perceptions in the domains of VISION, SMELL, TASTE and TOUCH are recast into text and discourse in the genre of wine reviews. Because of the alleged paucity of sensory vocabularies, in particular in the olfactory domain, it is of particular interest to investigate what resources language has to offer in order to describe those experiences. We show that the main resources are, on the one hand, words evoking properties that are applicable ...

2022, Journal of Visual Languages & Computing

Generating the "right" visual representation for the data and task at hand remains a standing challenge in visualization research and practice. A variety of different approaches to produce visual representations have been proposed in the past, including such noteworthy instances as visualization by example and visualization by analogy. With this paper, we add a new twist to creating visual representations by proposing a way to construct new visualization designs by blending together a number of existing visual representations, called presets. We embed this novel blending approach in suitable visual interfaces, such as a gridded canvas to be used by the casual user in the style of a palette for mixing colors, or a range of sliders to be used by the expert user in the style of a studio mixer for audio tracks. These can be employed for rapid prototyping of a specific visual representation, as well as to explore the overall design space of visual representations captured by our approach. We showcase our preset-based blending and its interfaces with examples of the design of 2D tree visualizations and product plots.
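
One simple way to picture the blending of presets is as a weighted average of numeric design parameters, with slider positions acting as the weights. The sketch below assumes exactly that; the paper may define blending differently, and the parameter names (`nesting`, `edge_width`) and preset values are purely illustrative.

```python
def blend_presets(presets, weights):
    """Weighted average of numeric preset parameters; weights are normalized."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    keys = presets[0].keys()
    return {
        key: sum(w * preset[key] for preset, w in zip(presets, weights)) / total
        for key in keys
    }


# Hypothetical 2D tree-layout presets; parameter names are illustrative only.
node_link = {"nesting": 0.0, "edge_width": 1.0}
treemap = {"nesting": 1.0, "edge_width": 0.0}

# Slider positions 3:1 in favor of the node-link preset.
mixed = blend_presets([node_link, treemap], [3, 1])
```

Normalizing the weights keeps every blended parameter inside the range spanned by the presets, which matches the palette and studio-mixer metaphors.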

2022

Aggregating items can simplify the display of huge quantities of data values at the cost of losing information about the attribute values of the individual items. We propose a distribution glyph, in both two- and three-dimensional forms, which specifically addresses the concept of how the aggregated data is distributed over the possible range of values. It is capable of displaying distribution, variability and extent information for up to four attributes at a time of multivariate, clustered data. User studies validate the concept, showing that both glyphs are just as good as raw data and the 3D glyph is better for answering some questions.

2022, The International Journal of High Performance Computing Applications

In this paper we present the findings and recommendations that emerged from a one-day workshop held at Lawrence Berkeley National Laboratory (LBNL) on June 5, 2002, in conjunction with the National Energy Research Scientific Computing (NERSC) User Group (NUG) Meeting. The motivation for this workshop was to solicit direct input from the application science community on the subject of visualization. The workshop speakers and participants included computational scientists from a cross-section of disciplines that use the NERSC facility, as well as visualization researchers from across the country. We asked the workshop contributors how they currently visualize their results, and how they would like to do visualization in the future. We were especially interested in each individual's view of how visualization tools and services could be improved in order to better meet the needs of future computational science projects. The outcome of this workshop is a set of findings and recommend...

2022

In recent decades, the Aedes aegypti mosquito, the vector of dengue, Zika, chikungunya and other diseases, has become an increasing concern in Brazil and other tropical countries. However, the spatial distribution of these diseases in Brazil is not homogeneous, implying that there are distinct risk levels and various guiding strategies for fighting these diseases. This paper presents a didactic exploratory multivariate spatial data analysis of dengue, Zika and chikungunya cases in Brazil in 2016 in the context of an undergraduate course. The students elaborated, interpreted and discussed multivariate maps using techniques such as proportional symbols, hachure density, bivariate and trivariate choropleth composition, contiguous cartograms, conditional maps and matrices of maps and graphs. The students investigated the various interpretative possibilities based on the intrinsic and extrinsic combinations of multivariate geovisualizations. Through their interpretations of the spatial r...

2022, International Journal of Statistics in Medical Research

High-throughput genomic assays are used in molecular biology to explore patterns of joint expression of thousands of genes. These methodologies saw relevant developments in the last decade, and concurrently there was a need for appropriate methods for analyzing the massive data generated. Identifying sets of genes and samples characterized by similar values of expression and validating these results are two critical issues related to these investigations because of their clinical implications. From a statistical perspective, unsupervised class discovery methods like Cluster Analysis are generally adopted. However, Cluster Analysis is mostly applied through hierarchical techniques, without considering other methods. This is partially due to software availability and to the ease of representing results through a heatmap, which allows simultaneous visualization of the clustering of genes and samples on the same graphical device. One drawback of this strategy is that cluster stability is often neglected, thus leading to over-interpretation of results. Moreover, validation of results using external datasets is still a subject of discussion, since it is well known that batch effects may condition gene expression results even after normalization. In this paper we compare several clustering algorithms (hierarchical, k-means, model-based, Affinity Propagation) and stability indices to discover common patterns of expression and to assess clustering reliability, and propose a rank-based passive projection of Principal Components for validation purposes. Results are presented from a study involving 23 tumor cell lines and 76 genes related to a specific biological pathway, derived from a publicly available dataset.

2022, Behavior Research Methods, Instruments, & Computers

The complexity of psychological science often requires the collection and analysis of multidimensional data. Such data bring about a corresponding cognitive load that has led scientists to develop techniques of scientific visualization to ease the burden. This paper provides an introduction to scientific visualization techniques, a framework for understanding those techniques, and an assessment of the suitability of this approach for psychology. The framework employed builds on the notion of balancing noise and smooth in statistical analysis. Widespread availability of desktop computing allows psychologists to develop and manipulate complex multivariate data sets. While researchers in the physical and engineering sciences have dealt with increasing data complexity by using scientific visualization, researchers in the behavioral sciences have been slower to adopt these tools. To address this discrepancy, this paper defines scientific visualization, presents a theoretical framework for understanding visualization, and reviews a number of multivariate visualization techniques in light of this framework. Because all graphics and animations available to illustrate the concepts discussed here cannot be incorporated in this print version, a hypertext version of this paper containing these illustrations is available through World-Wide Web browsers. The primary document and supporting software can be found in the ASU resources section of the server at . ed.asu.edu/-Behrens/. We define scientific visualization as the process of exploring and displaying data in a manner that builds a visual analogy to the physical world in the service of user insight and learning. This entails finding a balance between the detail of the raw data and the parsimony of statistical summary. Each component of this definition will now be addressed.
Although most statistical training in psychology focuses on confirmatory data analysis (see , there is in statistics a well-established tradition called exploratory data analysis (EDA). Pioneered by the work of John , this tradition emphasizes the seeking of unexpected structure and the

2022, Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces

Facilitating vocabulary knowledge is a challenging aspect for language learners. Although current corpus-based reference tools provide authentic contextual clues, the plain text format is not conducive to fully illustrating some lexical phenomena. Thus, this paper proposes GLANCE, a text visualization tool, to present a large amount of lexical phenomena using charts and graphs, aimed at helping language learners understand a word quickly and intuitively. To evaluate the effectiveness of the system, we designed interfaces to allow comparison between text and graphics presentation, and conducted a preliminary user study with ESL students. The results show that the visualized display is of greater benefit to the understanding of word characteristics than textual display.

2022

We present a novel technique based on a multi-resolutional cluster analysis of earthquake patterns to investigate observed and synthetic seismic catalogs. The observed data represent seismic activities around the Japanese islands from 1997-2003. The synthetic data were generated by numerical simulations for various cases of a heterogeneous fault governed by 3-D elastic dislocation and power-law creep. At the highest resolution, we analyze the local cluster structures in the data space of seismic events for the two types of catalogs by using an agglomerative clustering algorithm. We demonstrate that small magnitude events produce local spatio-temporal patches corresponding to neighboring large events. Seismic events, quantized in space and time, generate the multi-dimensional feature space characterized by the earthquake parameters. Using a non-hierarchical clustering algorithm and multi-dimensional scaling, we explore the multitudinous earthquakes by real-time 3-D visualization and ...

2022

Knowledge extraction from data volumes of ever increasing size requires ever more flexible tools to facilitate interactive query. Interactivity enables real-time hypothesis testing and scientific discovery, but can generally not be achieved without some level of data reduction. The approach described in this paper combines multi-resolution access, region-of-interest extraction, and structure identification in order to provide interactive spatial and statistical analysis of a terascale data volume. Unique aspects of our approach include the incorporation of both local and global statistics of the flow structures, and iterative refinement facilities, which combine geometry, topology, and statistics to allow the user to effectively tailor the analysis and visualization to the science. Working together, these facilities allow a user to focus the spatial scale and domain of the analysis and perform an appropriately tailored multivariate visualization of the corresponding da...

2022

This study investigates the relationship between evidentiality, temporality and epistemic control through detailed interpretive analysis of wine reviews written by Robert Parker, whose outstanding authority in this particular discourse field provides an exceptionally fruitful backdrop for the exploration of credibility in discourse. The material consists of 200 entire reviews, which are divided into units based on differences in temporality, evidentiality and modes of knowing. The analysis takes into consideration linguistic markers realized in the texts as well as implicitness that emanates from general world knowledge and more specific contextual awareness. It is shown in detail how the construction of credibility in this particular instance of persuasive discourse relies on complex interrelations between explicit and implicit features of texts as well as combinations of socio-cultural factors, which taken together result in epistemic control of the depicted events, i.e. an impres...

2022

Visualization of set-valued attributes in multi-dimensional information visualization systems remains a relatively unexplored problem. Here we introduce a novel method for visualizing set-valued attributes that we call the singleton set distribution view and integrate it into an interactive multi-dimensional attribute visualization tool utilizing bar-grams (aka equal-height histograms) as its main visual motif.
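
To make the bar-gram motif concrete, the sketch below builds an equal-height histogram in the usual equal-frequency sense: each bin holds roughly the same number of items, so bins differ in value range rather than in height. It is a minimal illustration under that reading, not the tool's implementation; the function name and sample values are hypothetical, and tied values may straddle two bins.

```python
def equal_height_bins(values, n_bins):
    """Split sorted values into n_bins groups of near-equal size.

    Returns a list of (low, high, count) tuples, one per non-empty bin.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(n_bins):
        start = i * n // n_bins
        stop = (i + 1) * n // n_bins
        chunk = ordered[start:stop]
        if chunk:
            bins.append((chunk[0], chunk[-1], len(chunk)))
    return bins


# Hypothetical skewed attribute values.
bins = equal_height_bins([1, 2, 2, 3, 10, 50, 51, 100], 4)
```

With skewed data, an equal-width histogram would crowd most items into the first bar, whereas the equal-height bins above each hold two items and differ only in the range they cover.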