10.3 Concepts of Interactive and Dynamic Graphics

This section takes a closer look at the concepts of interactive and dynamic graphics mentioned in the previous sections. [23] contains a taxonomy of interactive data visualization based on the notions of focusing, linking, and arranging views of data. [170] discusses some of the main concepts in the context of interactive graphics software.

10.3.1 Scatterplots and Scatterplot Matrices

Perhaps the most basic concepts for statistical graphics are scatterplots (see Figs. 10.1, 10.2, 10.3, and 10.4). In a simple scatterplot, we place different symbols (sometimes also called glyphs) at $x$- and $y$-positions in a two-dimensional plot area. These positions are determined by two of the variables. The type, size, and color of the symbols may depend on additional variables. Usually, explanatory information such as axes, labels, legends, and titles is added to a scatterplot. Additional information such as a regression line or a smoothed curve can be added as well.

**Figure 10.1:** Screenshot of the ''Places'' data in ArcView/XGobi. A map view of the spatial locations is displayed in ArcView at the top. The two XGobi windows at the bottom are showing scatterplots of Crime (horizontal) vs. Education (vertical) (*left*) and Recreation (horizontal) vs. Arts (vertical) (*right*). Locations of high Crime have been brushed and identified, representing some of the big cities in the U.S. Also, locations of high Education (above ) have been brushed, mostly representing locations in the northeastern U.S. All displays have been linked

**Figure 10.2:** Screenshot of the ''Places'' data in GGobi. A scatterplot of Crime (horizontal) vs. Education (vertical) is displayed at the top right, a scatterplot matrix of five of the variables is displayed at the bottom right, and a density (1D) plot of Population is displayed at the bottom left. The data has been brushed with respect to Population: one group for a Population less than , one group for a Population between and , and one group for a Population above . The scatterplot of Crime and Education seems to reveal that higher Population is associated with higher Crime and higher Education. The scatterplot matrix seems to reveal that higher Population is also associated with higher Arts and higher HealthCare. All displays have been linked

If the data consist of more than two variables (e.g., somewhere between three and ten), the data can be displayed by a scatterplot matrix (see Figs. 10.2 and 10.3) that shows all pairwise scatterplots of the variables. The essential property of a scatterplot matrix is that any adjacent pair of plots has one axis in common. When plotting the full array of all $n \times (n-1)$ pairwise scatterplots of $n$ variables, each plot in the upper triangle has a matching plot in the lower triangle, with the axes of the two plots flipped. Therefore, sometimes only the upper or lower triangle of scatterplots is displayed, which gains plotting speed and allows each individual plot to be somewhat larger. Early examples of scatterplot matrices can be found in [46] and [48]. [46] initially called an array of pairwise scatterplots for three variables a draftsman's display and for four (or more) variables a generalized draftsman's display. In their (generalized) draftsman's display, each point is plotted with the same symbol. When encoding additional information through the use of different plotting symbols, [46] speak of symbolic (generalized) draftsman's displays. Today, we hardly distinguish between these types of displays and simply speak of scatterplot matrices.
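The panel count described above can be checked with a short sketch. The function name `scatterplot_matrix_panels` is hypothetical, and the variable names are borrowed from the Places data purely for illustration; they are not meant to reproduce the panels of any figure.

```python
from itertools import combinations, permutations


def scatterplot_matrix_panels(variables, upper_only=False):
    """Return the (row variable, column variable) pairs of a scatterplot matrix."""
    if upper_only:
        # Each unordered pair appears once: n * (n - 1) / 2 panels.
        return list(combinations(variables, 2))
    # Full array without the diagonal: n * (n - 1) panels.
    return list(permutations(variables, 2))


# Illustrative variable names only.
variables = ["Crime", "Education", "Arts", "Recreation", "HealthCare"]
full = scatterplot_matrix_panels(variables)
upper = scatterplot_matrix_panels(variables, upper_only=True)
print(len(full), len(upper))  # 20 10
```

Every panel `(a, b)` in the upper triangle has its flipped counterpart `(b, a)` in the full array, which is why one triangle can be dropped without losing information.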

[121] and [171] discuss features that good scatterplots and related interactive software should provide, e.g., meaningful axes and scales, features for rescaling and reformatting, good handling of overlapping points and missing data, panning and zooming, and querying of points. [33] describe techniques for scatterplot matrices that are particularly useful for large numbers of observations.

10.3.2 Brushing and Linked Brushing/Linked Views

Brushing, as introduced in [13] and [14], was initially considered a collection of several dynamic graphical methods for analyzing data displayed in a scatterplot matrix. The central idea behind brushing is a brush, usually a rectangular area on the computer screen, that is moved by the data analyst to different positions on the scatterplot or any other graphical display. Four brushing operations were introduced in [13]: highlight, shadow highlight, delete, and label. The most commonly used brushing technique is highlighting, often in the context of linked brushing, i.e., for linked views. All points that are inside the brush in the currently selected display are highlighted, i.e., marked with a different symbol or color. Simultaneously, the corresponding points are automatically highlighted with the same symbol/color in all linked views.

A very useful brushing technique is the transient paint mode. As the brush is moved, the new points that come inside the brush are highlighted while points that move outside the brush are no longer highlighted.
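A minimal sketch of highlighting with a rectangular brush in transient paint mode follows. It assumes that the linked views share the row index of one data array, which is one common way to implement linking but not necessarily how any of the packages mentioned here do it. The function name `brush` and the synthetic data are hypothetical.

```python
import numpy as np


def brush(x, y, xmin, xmax, ymin, ymax):
    """Boolean mask of the points that lie inside a rectangular brush."""
    return (x >= xmin) & (x <= xmax) & (y >= ymin) & (y <= ymax)


rng = np.random.default_rng(0)
data = rng.normal(size=(200, 4))  # 200 cases, 4 variables (synthetic)

# View A shows variables 0 and 1; view B shows variables 2 and 3.
# The two views are linked through the shared row index of `data`.
highlighted = brush(data[:, 0], data[:, 1], 0.5, 2.0, -1.0, 1.0)

# Transient paint: moving the brush recomputes the mask from scratch,
# so points that leave the brush are no longer highlighted.
highlighted_after_move = brush(data[:, 0], data[:, 1], -2.0, -0.5, -1.0, 1.0)

# The same mask selects the corresponding points in the linked view B.
view_b_highlighted = data[highlighted_after_move][:, 2:4]
```

A persistent paint mode would instead combine the old and new masks with a logical OR.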

While brushing was initially developed only for scatterplot matrices, it was quickly adapted to other types of linked graphical displays. Linked brushing among different displays is one of the most useful techniques in dynamic statistical graphics. Linked brushing can be applied to graphical representations of continuous data, summary displays such as histograms ([140]), or even displays of categorical data such as mosaic plots ([89], [90]). These days, all dynamic statistical graphics software packages support linked brushing among different types of graphical displays.

When dealing with massive data sets, it is often beneficial to focus on particular subgroups of the data and to be able to return quickly to a previous stage of the analysis. Selection sequences ([165,91]) extend the conventional linked-highlighting paradigm: they store the whole hierarchical path of a selection and allow easy editing, redefinition, and interrogation of each selection in the path of the analysis. In a selection sequence, we can easily jump from one branch of the hierarchical selection tree to another.

10.3.3 Focusing, Zooming, Panning, Slicing, Rescaling, and Reformatting

Focusing techniques, as introduced in [25], are based on the idea that it is often easier for a human analyst to understand several individual displays, each focused on a particular aspect of the underlying data, than to look at the full data set at once. Focusing techniques include subset selection techniques, e.g., panning and zooming or slicing, and dimensionality reduction techniques, e.g., projection. Methods for focusing can be automatic, interactive, or a combination of both. Because focusing shows only part of the data at a time, it is important to display multiple linked views of the data, perhaps each focusing on a different aspect, to maintain the full picture of the data.

Zooming is a technique for inspecting details of the data when overplotting arises. Zooming can be done via some kind of magnifying glass or by manually selecting subsections of the visible axes, e.g., via sliders. The main idea behind zooming is that when several points overplot in the full display, zooming into the neighborhood of these points may reveal that they are indeed exactly the same or, as happens most frequently, that they are not identical but have a particular structure.

Panning is closely related to zooming: once a display has been zoomed, panning moves the visible window across the data. Because the analyst should always know which subset of the data is currently visible, an information plot should indicate the subregion into which we have zoomed.

Slicing, as described in [79] and [80], is a technique that takes sections (or slices) of a high-dimensional data set. While slicing and projections are useful means for exploratory data analysis, these techniques also have their limitations. However, these limitations may be overcome by combining slicing and projections in so-called prosections ([80]). An extension of individual prosection views is the prosection matrix ([168]), a kind of density plot summarizing multi-dimensional volumetric information. The prosection matrix is a useful representation for engineering design, allowing an analyst to interactively find a design that leads to a maximal manufacturing yield.

Rescaling is a technique that allows a user to quickly change the scale of the displayed variables, e.g., by taking the log or square root, by standardizing, or by mapping to a fixed scale. When looking at multiple variables, it might also be beneficial to have a common scale (from the minimum across all variables to the maximum across all variables). By interactively rescaling variables, an analyst may identify useful transformations for a follow-up modeling step of the data.
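The rescaling options listed above can be sketched as follows. The function names are hypothetical, and the unit range $[0, 1]$ is assumed here as the target of the range mapping; the common scale is implemented as one shared minimum and maximum taken across all variables.

```python
import numpy as np


def rescale(x, method):
    """Rescale one variable; `method` names are illustrative, not a real API."""
    x = np.asarray(x, dtype=float)
    if method == "log":
        return np.log(x)  # requires x > 0
    if method == "sqrt":
        return np.sqrt(x)  # requires x >= 0
    if method == "standardize":
        return (x - x.mean()) / x.std()
    if method == "unit":
        # Assumed target range [0, 1].
        return (x - x.min()) / (x.max() - x.min())
    raise ValueError(f"unknown method: {method}")


def common_scale(data):
    """Map all columns with one shared minimum and maximum."""
    data = np.asarray(data, dtype=float)
    return (data - data.min()) / (data.max() - data.min())


x = np.array([1.0, 10.0, 100.0, 1000.0])
unit = rescale(x, "unit")
standardized = rescale(x, "standardize")
logged = rescale(x, "log")
```

On this example the log transformation turns the multiplicative spacing of `x` into equal steps, the kind of pattern an analyst might look for before a modeling step.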

Reformatting includes features as simple as swapping the $x$ and $y$ axes in a scatterplot or changing the order of coordinate axes in a parallel coordinate plot.

10.3.4 Rotations and Projections

Rotation, as introduced in [73] and later refined in [14], is a very powerful tool for understanding relationships among three or more variables. The familiar planar scatterplot is enhanced by rotation to give the illusion of a third dimension. We typically rotate plots in search of interesting views that do not align with the plot axes and therefore cannot be seen in a scatterplot matrix. Usually, a three-dimensional point cloud representing three of the variables is shown rotating on a computer screen. The rotation shows us different views of the points and produces a 3D effect while moving, allowing us to see depth. Basic rotation controls with a mouse were introduced in [14].

Mathematically speaking, each view of a rotating 3D point cloud on a 2D computer screen is based on a projection. It is likewise mathematically possible to project high-dimensional data onto low-dimensional subspaces and gain insights into the underlying data through dynamic visualizations of such projections. One particular example of a continuous sequence of projections, the grand tour, will be discussed in the next section. [52] discuss methods for manually controlling high-dimensional data projections. [50] provides a variety of training data sets that help new users to get a visual feeling for the underlying high-dimensional data set when seen as a projection into low-dimensional space.
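The link between rotation and projection can be made concrete with a small sketch. It assumes rotation about the vertical screen axis followed by an orthogonal projection that simply drops the depth coordinate; the function name `rotate_and_project` is hypothetical.

```python
import numpy as np


def rotate_and_project(points, angle):
    """Rotate a 3D point cloud about the vertical (y) axis, then drop depth."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])
    rotated = points @ rotation.T
    # Orthogonal projection onto the screen plane: keep x and y, drop z.
    return rotated[:, :2]


rng = np.random.default_rng(0)
cloud = rng.normal(size=(100, 3))  # synthetic data, three variables

# A sequence of small angle increments gives the frames of the animation.
frames = [rotate_and_project(cloud, a) for a in np.linspace(0.0, np.pi / 2, 10)]
```

At angle zero the view is the ordinary scatterplot of the first two variables; after a quarter turn the horizontal screen axis shows the third variable instead of the first.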

10.3.5 Grand Tour

Often, simple plot rotation, as discussed in the previous section, does not suffice to see all interesting views of the data. To produce a plethora of possibly interesting views, the grand tour was introduced in [7] and [21]. In [7], the grand tour is described as ''a method for viewing multivariate statistical data via orthogonal projections onto a sequence of two-dimensional subspaces. The sequence of subspaces is chosen so that it is dense in the set of all two-dimensional subspaces.'' Among other things, the grand tour can be used for examining the overall structure and for finding clusters or outliers in high-dimensional data sets.
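As a rough illustration of "orthogonal projections onto a sequence of two-dimensional subspaces", the sketch below draws random orthonormal 2-frames and projects the data onto each. This is a deliberately crude stand-in: it is not the winding algorithm of [7] or any other published tour algorithm, and it omits the smooth interpolation between planes that makes a real tour continuous. All function names are hypothetical.

```python
import numpy as np


def random_frame(d, rng):
    """Random d x 2 matrix with orthonormal columns (one projection plane)."""
    q, _ = np.linalg.qr(rng.normal(size=(d, 2)))
    return q


def tour_views(data, n_views, seed=0):
    """Yield (frame, projected data) pairs for a sequence of random planes."""
    rng = np.random.default_rng(seed)
    for _ in range(n_views):
        frame = random_frame(data.shape[1], rng)
        yield frame, data @ frame


rng = np.random.default_rng(1)
data = rng.normal(size=(150, 5))  # synthetic 5-dimensional data
views = list(tour_views(data, n_views=4))
```

Each projected array has two columns and can be drawn as an ordinary scatterplot; a continuous tour would additionally interpolate between successive frames.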

In the context of the grand tour, an alternating sequence of brushing, looking at additional projections from the grand tour, brushing, and so on, is referred to as the brush-tour strategy in the remainder of this chapter. We can only be sure that a cluster visible in one projection of the grand tour really is a cluster if its points remain close to each other in a series of projections and these points move similarly when the grand tour is activated. If points move apart, we probably found several subclusters instead of one larger cluster.

[181] discusses a form of the grand tour for general $d$-dimensional space. The algorithms for computing a grand tour are relatively computationally intensive. [190] discuss an approximate one- and two-dimensional grand tour algorithm that is much more computationally efficient than the Asimov winding algorithm. That algorithm was motivated in part by a discussion of the Andrews (multidimensional data) plot (see Sect. 10.3.9), which can also be regarded as a highly restricted pseudo tour.

**Figure 10.3:** Screenshot of the ''Places'' data in CrystalVision. A parallel coordinate plot of all variables is shown as the main plot. A scatterplot matrix of all variables with a scatterplot of Crime (horizontal) vs. Education (vertical) is shown as a popup in the top right. The data has been brushed according to high and low Population. According to the parallel coordinate plot, higher Population is associated with higher Arts and HousingCost. The scatterplot of Crime and Education seems to reveal that higher Population is also associated with higher Crime and higher Education. All displays have been linked

10.3.6 Parallel Coordinate Plots

Parallel coordinate plots ([102,180]) (see Fig. 10.3) are a geometric device for displaying points in high-dimensional spaces, in particular, for dimensions greater than three. The idea is to sacrifice orthogonal axes by drawing the axes parallel to each other, resulting in a planar diagram where each $d$-dimensional point $(x_{1}, \ldots, x_{d})$ is uniquely represented by a continuous line. The parallel coordinate representation enjoys some elegant duality properties with the usual Cartesian coordinates and allows interpretations of statistical data in a manner quite analogous to two-dimensional Cartesian scatterplots. This duality of lines in Cartesian plots and points in parallel coordinates extends to conic sections. This means that an ellipse in Cartesian coordinates maps into a hyperbola in parallel coordinates. Similarly, rotations in Cartesian coordinates become translations in parallel coordinates.
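The line-to-point duality can be verified numerically for two variables. The sketch assumes two vertical parallel axes placed at horizontal positions 0 and 1, an arbitrary layout choice. Points on the Cartesian line $x_{2} = m x_{1} + b$ with $m \neq 1$ then become segments that all pass through the single point $(1/(1-m),\, b/(1-m))$.

```python
import numpy as np


def parallel_segment(x1, x2, t):
    """Height at horizontal position t of the segment from (0, x1) to (1, x2)."""
    return x1 + t * (x2 - x1)


m, b = -2.0, 3.0  # Cartesian line x2 = m * x1 + b (illustrative values)
x1 = np.array([-1.0, 0.0, 0.5, 2.0])
x2 = m * x1 + b

# Horizontal position of the dual point; undefined for m = 1, where the
# segments are parallel and never meet.
t_star = 1.0 / (1.0 - m)
heights = parallel_segment(x1, x2, t_star)
```

All entries of `heights` coincide with $b/(1-m)$. For negative slopes the common point lies between the two axes, which is why negatively correlated variables produce the familiar crossing pattern.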

The individual parallel coordinate axes represent one-dimensional projections of the data. We can isolate clusters by looking for separation between data points on any axis or between any pair of axes. Because of the connectedness of the multidimensional parallel coordinate diagram, it is usually easy to see whether or not this clustering propagates through other dimensions.

The use of parallel coordinate plots for a $d$-dimensional grand tour sequence, sometimes called a parallel coordinate grand tour, has been described in [181] and [184]. By using such a parallel coordinate grand tour, an analyst can find orientations where one or more clusters are evident. The general strategy for detecting clusters is the following: We begin with a static plot of the data in parallel coordinates. If there are any gaps along a horizontal axis (which incidentally does not need to coincide with the coordinate axes), then we color the individual clusters with distinct colors. Once all clusters are identified in the original coordinate system, we run the grand tour until an orientation of the axes is found in which another gap in one of the horizontal axes is found. Again we color the individual subclusters with distinct colors. This procedure is repeated until no further subclusters can be identified. This is another example of the brush-tour strategy referred to in Sect. 10.3.5. Indeed, when to stop is a matter of judgement, since the procedure can be repeated until practically every data point is individually colored. The crucial issue, which really depends on the dynamic graphics, is to see that clusters identified in this manner track coherently with the grand tour animation. That is, data points of the same color stay together as the grand tour rotation proceeds. If they do not, then there are likely to be substructures that can be identified through further grand tour exploration.

Slopes of parallel coordinate line segments can also be used to distinguish clusters. That is, if one group of line segments slopes at, say, $45^{\circ}$ to the horizontal and another group slopes at, say, $135^{\circ}$ to the horizontal, then even though the lines fully overlap in both adjacent parallel coordinate axes and there is no horizontal gap, these sets of lines represent two distinct clusters of points. Fortunately, when such an indication of clustering exists, the grand tour will also find an orientation of axes in which there is a horizontal gap. Thus the general strategy is to alternate color brushing of newly discovered clusters with grand tour rotations until no further clusters can be easily identified.

In some software packages, the parallel axes in a parallel coordinate plot are drawn as horizontal lines (e.g., in ExplorN) while in other software packages they are drawn as vertical lines (e.g., in XGobi). While it may be argued that this makes no difference from a mathematical point of view, the wider aspect ratio in the horizontal mode coupled with a more usual sense of plotting data along an abscissa rather than along the ordinate tends to allow for an easier human interpretation. Detailed interpretations are given in [180].

10.3.7 Projection Pursuit and Projection Pursuit Guided Tours

While the grand tour, as discussed in Sect. 10.3.5, is a dynamic tool, projection pursuit ([110,78,92]), see also Chap. III.6, is a static tool. Projection pursuit results in a series of static plots of projections that are classified as ''interesting'' with respect to a particular projection pursuit index. Many projection pursuit indexes, e.g., the ones discussed in [105], [76], [83], [119], [120], [53], and [129], are based on the idea of searching for the most non-normal projections. Usually, each projection pursuit index, a function of all possible projections of the data, results in many hills and valleys. [76] suggests a projection pursuit algorithm that initially searches for relatively high values of the function and then starts derivative-based searches to find the global maximum.

The combination of grand tour and projection pursuit, called projection pursuit guided tour ([54]), helps to direct the grand tour towards ''interesting'' projections. This combination of the two methods into an interactive and dynamic framework not only shows the ''interesting'' projections but also maintains the motion, so the user gets a feeling for how successive ''interesting'' projections have been obtained.
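To illustrate what "searching for the most non-normal projections" can look like, the sketch below uses the absolute excess kurtosis of a one-dimensional projection as a stand-in index and a plain random search over directions. Both are assumptions made for illustration: the index is not one of the cited indexes, and the search omits the derivative-based refinement step described for [76]. All names are hypothetical.

```python
import numpy as np


def kurtosis_index(data, direction):
    """Absolute excess kurtosis of the projection; 0 for a normal distribution."""
    z = data @ direction
    z = (z - z.mean()) / z.std()
    return abs(np.mean(z ** 4) - 3.0)


def random_search(data, n_candidates=500, seed=0):
    """Coarse search: keep the best of many random unit directions."""
    rng = np.random.default_rng(seed)
    best_direction, best_value = None, -np.inf
    for _ in range(n_candidates):
        direction = rng.normal(size=data.shape[1])
        direction /= np.linalg.norm(direction)
        value = kurtosis_index(data, direction)
        if value > best_value:
            best_direction, best_value = direction, value
    return best_direction, best_value


# Synthetic data: two clusters separated along the first variable only.
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 3))
data[:, 0] += np.where(rng.random(1000) < 0.5, -3.0, 3.0)

direction, value = random_search(data)
```

On this example the index is much larger along the first coordinate axis, where the two clusters separate, than along the purely normal axes, so the index surface has the hills and valleys mentioned above.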

10.3.8 Pixel or Image Grand Tours

The idea of the pixel or image grand tour (IGT) evolved from an initial application of one-dimensional tours to image data. Multiple registered images can be regarded as a multidimensional image in which each pixel location has a vector attached to it. For example, ordinary red, green, and blue (RGB) color images are vector-valued images. The basic idea of the image tour is to apply the same one-dimensional grand tour to each vector for all pixel locations in an image. This combines the vectors into a scalar function of time, which can be rendered as a time-varying gray-scale image. The [190] algorithm generalizes easily to two dimensions, so that an alternate approach to the IGT is to project the multidimensional vector into two dimensions and render the image as a false color image with two complementary colors such as red and cyan. It should be noted that red and cyan are complementary colors in the RGB color model used for most computer monitors, whereas red and green are complementary colors in the conventional color model introduced by the Commission Internationale de l'Éclairage (CIE) in 1931. A detailed comparison of these two and other color models can be found in [74], Chap. 13. The initial discussion of the IGT was given by [189]. Additional examples of the IGT can be found in [160].

Currently, the IGT software, written in C++ by Qiang Luo, is available for Silicon Graphics, Inc. (SGI) workstations. To obtain a fast rendering rate for large images, the software makes intensive use of SGI hardware features such as the $\alpha$-channel hardware. There also exists a MATLAB version of the IGT written by Wendy Martinez. Neither version of the IGT software is accessible through a Web site, but both can be obtained from the corresponding software developers.
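The gray-scale variant described above amounts to one inner product per pixel. The sketch assumes the image is stored as a `rows x cols x bands` array and uses a hypothetical, hand-picked path of unit directions for three bands; this path is not the algorithm of [190] and is not dense in all directions.

```python
import numpy as np


def image_tour_frame(image, direction):
    """Project every pixel vector onto one direction; return gray levels in [0, 1]."""
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    gray = image @ direction  # shape (rows, cols)
    lo, hi = gray.min(), gray.max()
    if hi == lo:
        return np.zeros_like(gray)
    return (gray - lo) / (hi - lo)


def direction_at(t):
    """Hypothetical smooth path of unit vectors in three dimensions."""
    return np.array([np.cos(t),
                     np.sin(t) * np.cos(np.sqrt(2.0) * t),
                     np.sin(t) * np.sin(np.sqrt(2.0) * t)])


rng = np.random.default_rng(0)
image = rng.random(size=(4, 5, 3))  # tiny synthetic three-band image

# One gray-scale frame per time step; shown in sequence they form the tour.
frames = [image_tour_frame(image, direction_at(t)) for t in np.linspace(0.1, 2.0, 5)]
```

The two-dimensional variant would project each pixel vector onto two orthonormal directions and map the two results to a pair of complementary color channels.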

10.3.9 Andrews Plots

The Andrews (multidimensional data) plot, as introduced in [1], is based on a series of Fourier interpolations of the coordinates of multi-dimensional data points. Points that are close in some metric will tend to have similar Fourier interpolations and therefore will tend to cluster in the Andrews plot. Thus, the Andrews plot is an informative graphical tool that is most useful for detecting clustering.

The ideas underlying the Andrews plot and the grand tour are quite similar. However, the Andrews plot is a static plot, whereas the grand tour is dynamic. Although dynamic renditions of the Andrews plot exist, and these are sometimes (incorrectly) referred to as a one-dimensional grand tour ([61]), the Andrews plot is not a grand tour since it cannot sweep out all possible directions, as pointed out in [190]. Three-dimensional generalizations of the Andrews plot and other pseudo grand tours have been introduced in [190] as well.
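The Fourier interpolation behind the Andrews plot maps a point $(x_{1}, \ldots, x_{d})$ to the curve $f_{x}(t) = x_{1}/\sqrt{2} + x_{2}\sin t + x_{3}\cos t + x_{4}\sin 2t + x_{5}\cos 2t + \ldots$ for $-\pi \le t \le \pi$. A minimal sketch follows; the function name `andrews_curve` is hypothetical.

```python
import numpy as np


def andrews_curve(x, t):
    """Evaluate the Andrews curve of the data point x at the values in t."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    curve = np.full_like(t, x[0] / np.sqrt(2.0))
    for j in range(1, len(x)):
        k = (j + 1) // 2  # frequency: 1, 1, 2, 2, 3, 3, ...
        # Odd positions get a sine term, even positions a cosine term.
        curve += x[j] * (np.sin(k * t) if j % 2 == 1 else np.cos(k * t))
    return curve


t = np.linspace(-np.pi, np.pi, 201)
point_a = [1.0, 0.5, -0.3, 0.2, 0.1]
point_b = [1.1, 0.4, -0.3, 0.2, 0.0]  # close to point_a
curve_a = andrews_curve(point_a, t)
curve_b = andrews_curve(point_b, t)
```

Because the mapping is linear in the coordinates, nearby points such as `point_a` and `point_b` give curves that stay close for every $t$, which is what makes clusters visible as bundles of curves.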

10.3.10 Density Plots, Binning, and Brushing with Hue and Saturation

[33] present techniques for visualizing data in scatterplots and scatterplot matrices when the data consist of a large number of observations, i.e., when standard techniques frequently lead to overplotting of points. A key idea for visualizing a large number of observations is the estimation and representation of densities. For this purpose, the data are often binned into an $n \times n$ matrix for a two-dimensional representation (or an $n \times n \times n$ matrix for a three-dimensional representation). The number of data points in each bin can be visualized through gray-scale (or color) density representations or through symbol area, e.g., differently sized hexagon symbols where the area of the plot symbol is proportional to the count in each bin. [27] further extends these ideas and presents additional low-dimensional displays for data that consist of a large number of observations. [137] provides a general overview of techniques for density estimation, such as averaged shifted histograms (ASH) and kernel density estimators, together with possible visualization techniques via contour surfaces, (transparent) $\alpha$-level contours, and contour shells. Further details on multivariate density estimation and visualization can be found in Chap. III.4.
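Binning into an $n \times n$ matrix can be sketched with NumPy's `histogram2d`. Square bins are used here for simplicity rather than the hexagonal symbols mentioned above, and the mapping of counts to gray levels is one simple choice among many; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # synthetic, heavily overplotted data

n = 32
# counts[i, j] is the number of points in the i-th x-bin and j-th y-bin.
counts, x_edges, y_edges = np.histogram2d(x, y, bins=n)

# Gray-scale density representation: 0 for empty bins, 1 for the fullest bin.
gray = counts / counts.max()

# Symbol-area representation: area proportional to the count means the
# side length of a square symbol is proportional to the square root.
side = np.sqrt(counts / counts.max())
```

Instead of 100,000 overlapping symbols, the display then needs at most $n^{2}$ cells or symbols, regardless of the number of observations.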

[187] use hue and saturation for plotting and brushing. For each individual point, the hue is almost fully desaturated with black. When points are overplotted, the hue components are added. The saturation level should be interactively adjustable by the analyst. If many points overplot, the pixel will be fully saturated. If fewer points overplot, the pixel will be shown in a less saturated color. Often, computer hardware devices such as the $\alpha$ -channel allow the blending of pixel intensities with no speed penalties. When using saturation for parallel coordinate plots and the level of saturation corresponds with the degree of overplotting, this creates a kind of parallel coordinate density plot ([187], [188]).
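The additive use of saturation can be imitated in software as follows. This is a simplified model under stated assumptions, not the implementation of [187]: each point contributes a fixed, analyst-chosen amount `per_point` to the saturation of its pixel, contributions add up, and the result is clipped at full saturation. The function name and the pixel grid are hypothetical.

```python
import numpy as np


def saturation_image(x, y, bins, per_point):
    """Saturation per pixel: overplotting counts times the per-point level, clipped."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    return np.clip(counts * per_point, 0.0, 1.0)


rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = rng.normal(size=50_000)

# A low per-point level keeps sparse regions pale and reserves full
# saturation for heavily overplotted pixels.
pale = saturation_image(x, y, bins=64, per_point=0.02)

# With per_point = 1 every occupied pixel is fully saturated, which
# corresponds to ordinary plotting without density information.
flat = saturation_image(x, y, bins=64, per_point=1.0)
```

Letting the analyst change `per_point` interactively corresponds to the adjustable saturation level described above.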

**Figure 10.4:** Screenshot of the ''Places'' data in Mondrian. The variables Crime, Education, and Population have been discretized for this figure. A mosaic plot of Crime (first vertical division, grouped as below (*left*) and above (*right*)), Education (first horizontal division, grouped as to (*top*), below (*middle*), and above (*bottom*)), and Population (second vertical division, grouped as to (*left*), below (*middle*), and above (*right*)) is displayed at the top right. A histogram of Transportation is shown at the *bottom left*, boxplots of HealthCare and Arts are shown at the bottom middle, and a scatterplot of Climate (horizontal) vs. HousingCost (vertical) is shown at the *bottom right*. The mosaic plot shows that Crime, Education, and Population are not independent. The different displays show how average Transportation (that has been brushed in the histogram) is related to the other variables. All displays have been linked

10.3.11 Interactive and Dynamic Graphics for Categorical Data

Although categorical data are quite common in the real world, little research has been done on their analysis and visualization when compared to quantitative data. However, there exist useful interactive and dynamic graphics for categorical data ([127,166]). For example, brushing and linking of categorical data represented via bar charts and pie charts can be as useful as for quantitative data ([96]). Modified bar charts where the same height is used for each category and the width is varied according to the number of counts are called spine plots ([96]). When interactively highlighting a category of interest, spine plots allow the analyst to visually compare the proportions in the different subcategories by looking at the heights of the highlighted areas. Examples of interactive graphics for categorical data such as spine plots and interactive mosaic plots (see Fig. 10.4) can be found in Hofmann ([89], [90]). [175] discuss spreadplots (and their implementation in ViSta), a method for laying out and simultaneously controlling graphics for categorical data.
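The geometry of a spine plot follows directly from the description above: bar widths are proportional to the category counts, and the highlighted height within each bar is the proportion of highlighted cases in that category. The function name and the counts below are made up for illustration.

```python
def spine_plot_geometry(counts, highlighted):
    """Return {category: (relative width, highlighted height)} for a spine plot."""
    total = sum(counts.values())
    geometry = {}
    for category, count in counts.items():
        width = count / total  # bar width reflects the category count
        # Highlighted height is the proportion of highlighted cases.
        height = highlighted.get(category, 0) / count if count else 0.0
        geometry[category] = (width, height)
    return geometry


# Hypothetical counts per category and number of brushed cases in each.
counts = {"low": 50, "medium": 30, "high": 20}
highlighted = {"low": 5, "medium": 15, "high": 18}
geometry = spine_plot_geometry(counts, highlighted)
```

Because all bars share the same total height, the highlighted heights (here 0.1, 0.5, and 0.9) can be compared directly across categories even though the bars differ in width.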

[15] present a collection of papers dealing with the visualization of categorical data. Main topics include graphics for visualization, correspondence analysis, multidimensional scaling and biplots, and visualization and modeling. Several of these approaches benefit from interactive and dynamic graphics.