Modeling The Influence Of Visual Density On Cluster Perception In Scatterplots Using Topology

Ghulam Jilani Quadri and Paul Rosen
IEEE Transactions on Visualization and Computer Graphics (IEEE InfoVis), 2021

Abstract

Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on 4 factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed 2 models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.

Citation

Ghulam Jilani Quadri and Paul Rosen. Modeling The Influence Of Visual Density On Cluster Perception In Scatterplots Using Topology. IEEE Transactions on Visualization and Computer Graphics (IEEE InfoVis), 2021.

BibTeX

@article{quadri2021multi,
  title = {Modeling the Influence of Visual Density on Cluster Perception in Scatterplots
    Using Topology},
  author = {Quadri, Ghulam Jilani and Rosen, Paul},
  journal = {IEEE Transactions on Visualization and Computer Graphics (IEEE InfoVis)},
  year = {2021},
  note = {\textit{Presented at IEEE VIS 2020.}},
  abstract = {Scatterplots are used for a variety of visual analytics tasks, including
    cluster identification, and the visual encodings used on a scatterplot play a deciding
    role in the level of visual separation of clusters. For visualization designers,
    optimizing the visual encodings is crucial to maximizing the clarity of data. This
    requires accurately modeling human perception of cluster separation, which remains
    challenging. We present a multi-stage user study focusing on 4 factors---distribution
    size of clusters, number of points, size of points, and opacity of points---that
    influence cluster identification in scatterplots. From these parameters, we have
    constructed 2 models, a distance-based model and a density-based model, using the merge
    tree data structure from Topological Data Analysis. Our analysis demonstrates that these
    factors play an important role in the number of clusters perceived, and it verifies that
    the distance-based and density-based models can reasonably estimate the number of
    clusters a user observes. Finally, we demonstrate how these models can be used to
    optimize visual encodings on real-world data.}
}