Correlation Coordinate Plots: Efficient Layouts For Correlation Tasks
Abstract
Correlation is a powerful measure of relationships that assists in estimating trends and making forecasts. Its use is widespread, being a critical data analysis component of fields including science, engineering, and business. Unfortunately, visualization methods used to identify and estimate correlation are designed to be general, supporting many visualization tasks. Due in large part to this generality, they do not provide the most efficient interface, in terms of speed and accuracy, for identifying correlation. To address this shortcoming, we first propose a new correlation task-specific visual design called Correlation Coordinate Plots (CCPs). CCPs transform data into a powerful coordinate system for estimating the direction and strength of correlation. To extend this approach to datasets with multiple attributes, we propose two designs. The first is the Snowflake Visualization, a focus+context layout for exploring all pairwise correlations. The second enhances the CCP by using principal component analysis to project multiple attributes. We validate CCPs by applying them to real-world datasets and test their performance in correlation-specific tasks through an extensive user study, which showed improvement in both the accuracy and the speed of correlation identification.
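For readers unfamiliar with the quantity being visualized, the sketch below computes Pearson's correlation coefficient for every pair of attributes in a small table; the sign of each coefficient gives the direction of the correlation and its magnitude the strength. This is only an illustration of "all pairwise correlations", not the CCP transform or the Snowflake layout, neither of which is specified in the abstract. The function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(table):
    """Map each unordered pair of attribute names to its Pearson r."""
    return {
        (a, b): pearson_r(table[a], table[b])
        for a, b in combinations(table, 2)
    }


# Hypothetical toy data: three attributes, four records each.
data = {
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 4.0, 6.0, 8.0],
    "price": [8.0, 6.0, 4.0, 2.0],
}

for pair, r in pairwise_correlations(data).items():
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    print(pair, round(r, 3), direction)
```

A dataset with n attributes yields n(n-1)/2 such pairs, which is why a layout for browsing all of them at once becomes useful as n grows.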
Citation
Hoa Nguyen and Paul Rosen. Correlation Coordinate Plots: Efficient Layouts for Correlation Tasks. Communications in Computer and Information Science: Computer Vision, Imaging and Computer Graphics Theory and Applications, 2016.
Bibtex
@inproceedings{nguyen2016correlation,
  title = {Correlation Coordinate Plots: Efficient Layouts for Correlation Tasks},
  author = {Nguyen, Hoa and Rosen, Paul},
  booktitle = {Communications in Computer and Information Science: Computer Vision, Imaging and Computer Graphics Theory and Applications},
  volume = {693},
  pages = {264--286},
  year = {2016},
  abstract = {Correlation is a powerful measure of relationships that assists in estimating trends and making forecasts. Its use is widespread, being a critical data analysis component of fields including science, engineering, and business. Unfortunately, visualization methods used to identify and estimate correlation are designed to be general, supporting many visualization tasks. Due in large part to this generality, they do not provide the most efficient interface, in terms of speed and accuracy, for identifying correlation. To address this shortcoming, we first propose a new correlation task-specific visual design called Correlation Coordinate Plots (CCPs). CCPs transform data into a powerful coordinate system for estimating the direction and strength of correlation. To extend this approach to datasets with multiple attributes, we propose two designs. The first is the Snowflake Visualization, a focus+context layout for exploring all pairwise correlations. The second enhances the CCP by using principal component analysis to project multiple attributes. We validate CCPs by applying them to real-world datasets and test their performance in correlation-specific tasks through an extensive user study, which showed improvement in both the accuracy and the speed of correlation identification.}
}