Just-in-time annotation of clusters, outliers, and trends in point-based data visualizations | IEEE Conference Publication | IEEE Xplore

Abstract:

We introduce the concept of just-in-time descriptive analytics as a novel application of computational and statistical techniques performed at interaction time to help users easily understand the structure of data as seen in visualizations. Fundamental to just-in-time descriptive analytics is (a) automatically identifying visual features, such as clusters, outliers, and trends, that users might observe in visualizations, (b) determining the semantics of such features by performing statistical analysis as the user is interacting, and (c) enriching visualizations with annotations that not only describe the semantics of visual features but also facilitate interaction to support high-level understanding of data. In this paper, we demonstrate just-in-time descriptive analytics applied to a point-based multi-dimensional visualization technique to identify and describe clusters, outliers, and trends. By demonstrating its application on a few use cases, we argue that it provides a novel user experience in which computational techniques work alongside users, allowing them to build qualitative mental models of data faster. The techniques used to facilitate just-in-time descriptive analytics are described in detail, along with their runtime performance characteristics. We believe this is just a starting point and that much remains to be researched; we discuss open issues and opportunities in improving accessibility and collaboration.
Date of Conference: 14-19 October 2012
Date Added to IEEE Xplore: 03 January 2013
Conference Location: Seattle, WA, USA

1 Introduction

A good visualization reveals structure and patterns in data, and facilitates exploration of relationships between variables. The challenge is that as the data gets more complex (e.g., multiple dimensions, multiple datasets), representation and interaction inevitably become more complex as well. For example, for high-dimensional data, the representation may exhibit clutter and interactive exploration may become tedious [1]. To effectively support exploratory activities, techniques should support (1) qualitative understanding of the high-level structure of data, (2) development of hypotheses for deep analysis of relationships between variables, and (3) provenance and collaboration on qualitative insight (see also [2], [3]). Our focus in this paper is (1).

References

1. Peng, W., Ward, M. O., Rundensteiner, E. A. Clutter Reduction in Multi-dimensional Data Visualization using Dimension Reordering. Proc. of the IEEE Information Visualization, pp. 89-96, 2004.
2. Seo, J., Shneiderman, B. A Rank-by-feature Framework for Unsupervised Multidimensional Data Exploration Using Low Dimensional Projections. Proc. of the IEEE Information Visualization, pp. 65-72, 2004.
3. Guo, D., Gahegan, M., Peuquet, D., MacEachren, A. Breaking Down Dimensionality: An Effective Feature Selection Method for High Dimensional Clustering. Workshop on Clustering High Dimensional Data and its Applications, Proc. 3rd SIAM International Conference on Data Mining, 2003.
4. Guo, D. Coordinating Computational and Visual Approaches for Interactive Feature Selection and Multivariate Clustering. Information Visualization 2, pp. 232-246, 2003.
5. Jong, H., Rip, A. The Computer Revolution in Science: Steps Toward the Realization of Computer-supported Discovery Environments. Artificial Intelligence 91, pp. 225-256, 1997.
6. Wong, P. C. Visual Data Mining. IEEE Computer Graphics and Applications 19, pp. 20-31, 1999.
7. Ankerst, M., Ester, M., Kriegel, H.-P. Towards an Effective Cooperation of the User and the Computer for Classification. Proc. 6th International Conf. on Knowledge Discovery and Data Mining (KDD 00), pp. 179-188, 2000.
8. Kandogan, E. Visualizing Multi-dimensional Clusters, Trends, and Outliers using Star Coordinates. Proc. of the 7th International Conference on Knowledge Discovery and Data Mining (KDD 01), pp. 107-116, 2001.
9. Jain, A., Murty, M. N., Flynn, P. J. Data Clustering: A Review. ACM Computing Surveys 31(3), pp. 264-323, 1999.
10. Han, J., Kamber, M. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, 2001.
11. MacQueen, J. B. Some Methods for Classification and Analysis of Multivariate Observations. Proc. of 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, 1967.
12. Sheikholeslami, G., Chatterjee, S., Zhang, A. WaveCluster: A Multi-resolution Clustering Approach for Very Large Spatial Databases. Proc. of Very Large Databases Conference, pp. 428-439, 1998.
13. Abul, A. L., Alhajj, R., Polat, F., Barker, K. Cluster Validity Analysis Using Subsampling. Proc. of IEEE International Conference on Systems, Man, and Cybernetics 2, pp. 1435-1440, 2003.
14. Shneiderman, B. Inventing Discovery Tools: Combining Information Visualization with Data Mining. Information Visualization 1(1), pp. 5-12, 2002.
15. Keim, D. A. Information Visualization and Visual Data Mining. IEEE Transactions on Visualization and Computer Graphics 8(1), pp. 1-8, January 2002.
16. Ankerst, M., Elsen, C., Ester, M., Kriegel, H.-P. Visual Classification: An Interactive Approach to Decision Tree Construction. Proc. 5th Intl. Conf. on Knowledge Discovery and Data Mining (KDD 99), pp. 392-396, 1999.
17. Teoh, S. T., Ma, K.-L. StarClass: Interactive Visual Classification using Star Coordinates. Proc. of 3rd SIAM International Conference on Data Mining, pp. 178-185, 2003.
18. Borg, I., Groenen, P. Modern Multidimensional Scaling: Theory and Applications (2nd ed.). New York: Springer-Verlag, 2005.
19. Kohonen, T. Self-Organizing Maps (2nd ed.). Berlin: Springer, 1997.
20. Jolliffe, I. T. Principal Component Analysis. Springer Press, 2002.
21. Paulovich, F. V., Nonato, L. G., Minghim, R., Levkowitz, H. Least Square Projection: A Fast High-precision Multidimensional Projection Technique and its Application to Document Mapping. IEEE Trans. Vis. Comput. Graph. 14(3), pp. 564-575, 2008.
22. Joia, P., Paulovich, F., Coimbra, D., Cuminato, J. A., Nonato, L. G. Local Affine Multidimensional Projection. IEEE Transactions on Visualization and Computer Graphics 17, pp. 2563-2571, 2011.
23. Ingram, S., Munzner, T., Olano, M. Glimmer: Multilevel MDS on the GPU. IEEE Transactions on Visualization and Computer Graphics 15, pp. 249-261, 2009.
24. Yang, J., Ward, M. O., Rundensteiner, E. A., Huang, S. Visual Hierarchical Dimension Reduction for Exploration of High Dimensional Datasets. Proc. of the Joint Eurographics - IEEE TCVG Symposium on Visualization, pp. 19-28, 2003.
25. Asimov, D. The Grand Tour: A Tool for Viewing Multidimensional Data. SIAM Journal of Scientific and Statistical Computing 6(1), pp. 128-143, 1985.
26. Friedman, J., Tukey, J. W. A Projection Pursuit Algorithm for Exploratory Data Analysis. IEEE Transactions on Computers 23, pp. 881-890, 1974.
27. Cook, D., Buja, A., Cabrera, J., Hurley, C. Grand Tour and Projection Pursuit. Journal of Computational and Graphical Statistics 23, pp. 155-172, 1995.
28. Bertini, E., Tatu, A., Keim, D. Quality Metrics in High-Dimensional Data Visualization: An Overview and Systematization. IEEE Transactions on Visualization and Computer Graphics 17(12), pp. 2203-2212, 2011.
29. Johansson, S., Johansson, J. Interactive Dimensionality Reduction Through User-defined Combinations of Quality Metrics. IEEE Trans. on Visualization and Computer Graphics 15(6), pp. 993-1000, 2009.
30. Yang, J., Peng, W., Ward, M. O., Rundensteiner, E. A. Interactive Hierarchical Dimension Ordering, Spacing and Filtering for Exploration of High Dimensional Datasets. IEEE Symposium on Information Visualization 2003 (InfoVis 03), pp. 105-112, 2003.