Information
- Publication Type: PhD-Thesis
- Workgroup(s)/Project(s):
- Date: May 2011
- Date (Start): October 2008
- Date (End): May 2011
- 1st Reviewer: Eduard Gröller
- 2nd Reviewer: Prof. Helwig Hauser
- Rigorosum: 9 September 2011
- First Supervisor: Eduard Gröller
- Keywords: high dimensionality, Visualization, Scalability, Interaction, Data analysis, multi-threading, scatter plots
Abstract
In many areas of science and industry, the amount of data is growing fast and often already exceeds the ability to evaluate it. On the other hand, the unprecedented amount of available data bears an enormous potential for supporting decision-making. Turning data into comprehensible knowledge is thus a key challenge of the 21st century.
The power of the human visual system makes visualization an appropriate method to comprehend large data. In particular, interactive visualization enables a discourse between the human brain and the data that can transform a cognitive problem into a perceptual one. However, the visual analysis of large and complex datasets involves both visual and computational challenges. Visual limits involve perceptual and cognitive limitations of the user and restrictions of the display devices, while computational limits are related to the computational complexity of the involved algorithms.
The goal of this thesis is to advance the state of the art in visual analysis with respect to the scalability to large datasets. Due to the multifaceted nature of scalability, the contributions span a broad range: enhancing computational scalability, improving the visual scalability of selected visualization approaches, and supporting the analysis of high-dimensional data.
Concerning computational scalability, this thesis describes a generic architecture to facilitate the development of highly interactive visual analysis tools using multi-threading. The architecture builds on the separation of the main application thread and dedicated visualization threads, which can be cancelled early due to user interaction. A quantitative evaluation shows fast visual feedback during continuous interaction even for millions of entries.
Two variants of scatterplots address the visual scalability of different types of data and tasks. For continuous data, a combination of 2D and 3D scatterplots aims to combine the advantages of 2D interaction and 3D visualization. Several extensions improve the depth perception in 3D and address the problem of unrecognizable point densities in both 2D and 3D. For partly categorical data, the thesis contributes Hierarchical Difference Scatterplots to relate multiple hierarchy levels and to explicitly visualize differences between them in the context of the absolute position of pivoted values.
While comparisons in Hierarchical Difference Scatterplots are only qualitative, this thesis also contributes an approach for quantifying subsets of the data by means of statistical moments for a potentially large number of dimensions. This approach has proven useful as an initial overview as well as for a quantitative comparison of local features like clusters.
As an important application of visual analysis, the validation of regression models also involves the scalability to multi-dimensional data. This thesis describes a design study of an approach called HyperMoVal for this task. The key idea is to visually relate n-dimensional scalar functions to known validation data within a combined visualization. The integration with other multivariate views is a step towards a user-centric workflow for model building.
Being the result of collaboration with experts in engine design, HyperMoVal demonstrates that visual analysis can significantly improve real-world tasks. Positive user feedback suggests a high impact of the contributions of this thesis also outside the visualization research community. Moreover, most contributions of this thesis have been combined in a commercially distributed software framework for engineering applications that will hopefully raise awareness and promote the use of visual analysis in multiple application domains.
BibTeX
@phdthesis{PH-2011-LDS,
title = "Large Data Scalability in Interactive Visual Analysis",
author = "Harald Piringer",
year = "2011",
abstract = "In many areas of science and industry, the amount of data is
growing fast and often already exceeds the ability to
evaluate it. On the other hand, the unprecedented amount of
available data bears an enormous potential for supporting
decision-making. Turning data into comprehensible knowledge
is thus a key challenge of the 21st century. The power of
the human visual system makes visualization an appropriate
method to comprehend large data. In particular, interactive
visualization enables a discourse between the human brain
and the data that can transform a cognitive problem into a
perceptual one. However, the visual analysis of large and
complex datasets involves both visual and computational
challenges. Visual limits involve perceptual and cognitive
limitations of the user and restrictions of the display
devices, while computational limits are related to the
computational complexity of the involved algorithms. The
goal of this thesis is to advance the state of the art in
visual analysis with respect to the scalability to large
datasets. Due to the multifaceted nature of scalability, the
contributions span a broad range: enhancing computational
scalability, improving the visual scalability of selected
visualization approaches, and supporting the analysis of
high-dimensional data. Concerning computational scalability,
this thesis describes a generic architecture to facilitate
the development of highly interactive visual analysis tools
using multi-threading. The architecture builds on the
separation of the main application thread and dedicated
visualization threads, which can be cancelled early due to
user interaction. A quantitative evaluation shows fast
visual feedback during continuous interaction even for
millions of entries. Two variants of scatterplots address
the visual scalability of different types of data and tasks.
For continuous data, a combination of 2D and 3D scatterplots
aims to combine the advantages of 2D interaction and 3D
visualization. Several extensions improve the depth
perception in 3D and address the problem of unrecognizable
point densities in both 2D and 3D. For partly categorical
data, the thesis contributes Hierarchical Difference
Scatterplots to relate multiple hierarchy levels and to
explicitly visualize differences between them in the context
of the absolute position of pivoted values. While
comparisons in Hierarchical Difference Scatterplots are only
qualitative, this thesis also contributes an approach for
quantifying subsets of the data by means of statistical
moments for a potentially large number of dimensions. This
approach has proven useful as an initial overview as well as
for a quantitative comparison of local features like
clusters. As an important application of visual analysis,
the validation of regression models also involves the
scalability to multi-dimensional data. This thesis describes
a design study of an approach called HyperMoVal for this
task. The key idea is to visually relate n-dimensional
scalar functions to known validation data within a combined
visualization. The integration with other multivariate views
is a step towards a user-centric workflow for model
building. Being the result of collaboration with experts in
engine design, HyperMoVal demonstrates that visual analysis
can significantly improve real-world tasks.
Positive user feedback suggests a high impact of the
contributions of this thesis also outside the visualization
research community. Moreover, most contributions of this
thesis have been combined in a commercially distributed
software framework for engineering applications that will
hopefully raise awareness and promote the use of visual
analysis in multiple application domains.",
month = may,
address = "Favoritenstrasse 9-11/E193-02, A-1040 Vienna, Austria",
school = "Institute of Computer Graphics and Algorithms, Vienna
University of Technology",
keywords = "high dimensionality, Visualization, Scalability,
Interaction, Data analysis, multi-threading, scatter plots",
URL = "https://www.cg.tuwien.ac.at/research/publications/2011/PH-2011-LDS/",
}