Automatic Scatterplot Design Optimization For Clustering Identification

Ghulam Jilani Quadri, Jennifer Adorno Nieves, Brenton M. Wiernik, and Paul Rosen
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

Abstract

Scatterplots are among the most widely used visualization techniques. Compelling scatterplot visualizations improve understanding of data by leveraging visual perception to boost awareness when performing specific visual analytic tasks. Design choices in scatterplots, such as graphical encodings or data aspects, can directly impact decision-making quality for low-level tasks like clustering. Hence, constructing frameworks that consider both the perceptions of the visual encodings and the task being performed enables optimizing visualizations to maximize efficacy. In this paper, we propose an automatic tool to optimize the design factors of scatterplots to reveal the most salient cluster structure. Our approach leverages the merge tree data structure to identify the clusters and optimize the choice of subsampling algorithm, sampling rate, marker size, and marker opacity used to generate a scatterplot image. We validate our approach with user and case studies that show it efficiently provides high-quality scatterplot designs from a large parameter space.

Citation

Ghulam Jilani Quadri, Jennifer Adorno Nieves, Brenton M. Wiernik, and Paul Rosen. Automatic Scatterplot Design Optimization For Clustering Identification. IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023.

Bibtex


@article{quadri2022automatic,
  title = {Automatic Scatterplot Design Optimization for Clustering Identification},
  author = {Quadri, Ghulam Jilani and Nieves, Jennifer Adorno and Wiernik, Brenton M. and
    Rosen, Paul},
  journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)},
  year = {2023},
  note = {\textit{Presented at IEEE VIS 2023.}},
  abstract = {Scatterplots are among the most widely used visualization techniques.
    Compelling scatterplot visualizations improve understanding of data by leveraging visual
    perception to boost awareness when performing specific visual analytic tasks. Design
    choices in scatterplots, such as graphical encodings or data aspects, can directly
    impact decision-making quality for low-level tasks like clustering. Hence, constructing
    frameworks that consider both the perceptions of the visual encodings and the task being
    performed enables optimizing visualizations to maximize efficacy. In this paper, we
    propose an automatic tool to optimize the design factors of scatterplots to reveal the
    most salient cluster structure. Our approach leverages the merge tree data structure to
    identify the clusters and optimize the choice of subsampling algorithm, sampling rate,
    marker size, and marker opacity used to generate a scatterplot image. We validate our
    approach with user and case studies that show it efficiently provides high-quality
    scatterplot designs from a large parameter space.}
}