



10.6 Applications in Geography, Medicine, and Environmental Sciences


10.6.1 Geographic Brushing and Exploratory Spatial Data Analysis

Linking statistical plots with geography for analyzing spatially referenced data has been discussed widely in recent years. Monmonier ([117], [118]) describes a conceptual framework for geographical representations in statistical graphics and introduces the term geographic brushing in reference to interacting with the map view of geographically referenced data. However, geographic brushing means more than pure interaction with the map: the term also covers, e.g., finding neighboring points and spatial structure in a geographic setting.
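As a rough illustration of what geographic brushing involves, the following Python sketch selects sites with a rectangular brush on the map, finds the neighbors of a site, and passes the selection on to a linked statistical view. It is a minimal sketch under stated assumptions, not the method of any system cited here: the `Site` record, the function names, the planar distance, and the sample data are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """One geographically referenced observation (hypothetical layout)."""
    name: str
    lon: float
    lat: float
    value: float  # attribute displayed in the linked statistical plot


def brush(sites, lon_min, lon_max, lat_min, lat_max):
    """Return the indices of all sites inside the rectangular map brush."""
    return {
        i for i, s in enumerate(sites)
        if lon_min <= s.lon <= lon_max and lat_min <= s.lat <= lat_max
    }


def neighbors(sites, index, radius):
    """Return the indices of sites within `radius` of site `index`.

    Assumption: plain planar distance in coordinate units; a real
    application would likely use a geodesic distance instead.
    """
    center = sites[index]
    return {
        i for i, s in enumerate(sites)
        if i != index
        and math.hypot(s.lon - center.lon, s.lat - center.lat) <= radius
    }


def linked_values(sites, selected):
    """Values a linked plot (e.g., a histogram) would highlight."""
    return [sites[i].value for i in sorted(selected)]


# Made-up sample data, for illustration only.
SITES = [
    Site("A", 0.0, 0.0, 1.0),
    Site("B", 1.0, 1.0, 2.0),
    Site("C", 5.0, 5.0, 3.0),
]

selected = brush(SITES, -0.5, 1.5, -0.5, 1.5)
print(sorted(selected), linked_values(SITES, selected))
```

The selection is kept as a set of indices so that every linked view can highlight the same cases without copying the data.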

The idea of applying interactive and dynamic graphics for EDA in a spatial (geographic) context resulted in the term exploratory spatial data analysis (ESDA). However, ESDA is more than just EDA applied to spatial data. Specialized ESDA methods have been developed that take the special nature of spatial data explicitly into account. ESDA is discussed in more detail in Anselin ([2], [3]) and [75], Chap. 4. [70] provides examples of the use of dynamic and interactive parallel coordinate plots for the exploration of large spatial and spatio-temporal databases.

Many software solutions have been developed that link geographic displays with interactive statistical software packages. In [116], a grand tour is linked to an image to assess the clustering of landscape types in the band space of a LandSat image taken over Manaus, Brazil. In [33] and [118], a scatterplot matrix is linked to a map view. In REGARD, map views are linked with histograms and scatterplots; diagnostic plots for assessing spatial dependence are also available. Another exploratory system that links histograms and scatterplots with latitude and longitude (and depth) coordinates is discussed in [113]. In [40], (bivariate) ray-glyph maps have been linked with scatterplots. [122] discusses the interactive analysis of spatial data, mostly environmental and disease data, under ISP. [106] report on an interface between the image program MTID and XGobi, used for the exploratory analysis of agricultural images. [67] provide an overview of existing multivariate (statistical) displays for geographic data. Other developments are the cartographic data visualizer, cdv ([69]), where a variety of plots are linked with geography; the Space-Time-Attribute Creature/Movie, STAC/M ([126]), which searches for patterns in Geographic Information System (GIS) databases under the control of a Genetic Algorithm; and an exploratory spatial analysis system in XLisp-Stat ([19]).

In combination with the GIS ArcView, XGobi and XploRe have also been used to detect structure and abnormalities in geographically referenced data sets such as satellite imagery, forest health monitoring, and precipitation data (Cook et al., [57], [58]; Symanzik et al., [158], [152], [155]) (see Fig. 10.1). In addition to the ArcView/XGobi/XploRe environment, there are several other examples where GISs and (graphical) statistical packages have been linked. [197] demonstrate how S and the GRASS GIS can be jointly used for archaeological site classification and analysis. [138] links STATA with ArcView. The spatial data analysis software SpaceStat has been linked with ARC/INFO ([6]) and with ArcView ([4], [5]). [82] discusses the design of a software system for interactive exploration of spatial data by linking to ARC/INFO, and [202] discusses a spatial statistical analysis module implemented in ArcView using Avenue. [115] describes the S+GISLink, a bidirectional link between ARC/INFO and S-PLUS, and [9] describes the S+Grassland link between S-PLUS and the Grassland GIS. Finally, a comparison of the operational issues of the SpaceStat/ArcView link and the S+Grassland link is given in [10].


10.6.2 Interactive Micromaps

Over the last decade, researchers have developed many improvements to make statistical graphics more accessible to the general public. These improvements include making statistical summaries more visual and providing more information at a time. Research in this area involved converting statistical tables into plots ([28,37]), new ways of displaying geographically referenced data ([40]), and, in particular, the development of linked micromap (LM) plots (see Fig. 10.6), often simply called micromaps ([41]; Carr et al. [38], [39]). LM plots were first presented in a poster session sponsored by the ASA Section on Statistical Graphics at the 1996 Joint Statistical Meetings in Chicago (''Presentation of Data in Linked Attribute and Geographic Space'' by Anthony R. Olsen, Daniel B. Carr, Jean-Yves P. Courbois, and Suzanne Pierson). More details on the history of LM plots and their connection to other research can be found in these early references on micromaps. Recent references on LM plots ([43,29]) focus on their use for communicating summary data from health and environmental studies. Sample S-PLUS code, data files, and resulting plots from Daniel B. Carr's early micromap articles can be accessed at ftp://galaxy.gmu.edu/pub/dcarr/newsletter/micromap/.

Figure 10.6: Linked micromap plot of the ''Places'' data, adapted from Daniel B. Carr's sample S-PLUS code. The variables Education and Crime have been summarized at the state level for this figure. For each of the 50 states (plus Washington, D.C.), the minimum, median, and maximum of the Education and Crime indexes have been obtained for the cities that geographically belong to a state. It should be noted that for several of the states data exist for only one city. The map and statistical displays have been sorted with respect to decreasing median Education Index. The zig-zag curve of the related median Crime Index is an indicator of little correlation between these two variables. Numerically, the ecological correlation between median Education Index and median Crime Index is almost equal to zero.

Linked micromap plots provide a new statistical paradigm for viewing geographically referenced statistical summaries in the corresponding spatial context. The main idea behind LM plots is to focus the viewer's attention on the statistical information presented in a graphical display. Multiple small maps are used to provide the appropriate geographic reference for the statistical data.
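The data preparation behind a display such as Fig. 10.6 can be sketched in a few lines of Python: city-level values are reduced to a minimum, median, and maximum per state, the states are ordered by decreasing median of one variable, and the ecological correlation is computed from the state medians. This is only a sketch under assumptions. The record layout, the function names, and the numbers are invented for illustration, are not taken from the ''Places'' data, and are not expected to reproduce the near-zero correlation reported for that data.

```python
import math
from collections import defaultdict
from statistics import median


def summarize_by_state(rows, variable):
    """Map each state to (minimum, median, maximum) of `variable`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["state"]].append(row[variable])
    return {
        state: (min(values), median(values), max(values))
        for state, values in groups.items()
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up city-level records (hypothetical states "AA", "BB", "CC").
CITIES = [
    {"state": "AA", "education": 3000, "crime": 900},
    {"state": "AA", "education": 3400, "crime": 1100},
    {"state": "AA", "education": 3200, "crime": 700},
    {"state": "BB", "education": 2800, "crime": 1200},  # single-city state
    {"state": "CC", "education": 3600, "crime": 800},
    {"state": "CC", "education": 3000, "crime": 1000},
]

education = summarize_by_state(CITIES, "education")
crime = summarize_by_state(CITIES, "crime")

# Row order of the display: decreasing median Education Index.
order = sorted(education, key=lambda s: education[s][1], reverse=True)

# Ecological correlation: computed on state medians, not on cities.
r = pearson([education[s][1] for s in order], [crime[s][1] for s in order])
print(order, round(r, 3))
```

For a state with a single city, the minimum, median, and maximum coincide, which matches the remark in the figure caption that several states have data for only one city.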

Initially, LM plots were only static representations on paper. The next stage of LM plots was aimed at interactive displays on the Web. Eventually, generalized maps for all states in the U.S. and several counties were automatically created for use on the U.S. Environmental Protection Agency (EPA) Cumulative Exposure Project (CEP) Web site ([161]). Most current applications of interactive LM plots on the Web make use of these generalized maps.

The idea of using micromaps on the Web was first considered for the EPA CEP Web site (formerly accessible at http://www.epa.gov/CumulativeExposure/). Initially, the EPA wanted to provide fast and convenient Web-based access to its hazardous air pollutant (HAP) data for 1990. Unfortunately, no part of the interactive CEP Web site was ever published due to concerns that the 1990 data was outdated at the intended release date in 1998. Only a static version of the CEP Web site without tables and micromaps was accessible. More details on the work related to the planned interactive CEP Web site can be found in Symanzik ([150], [151], [161]).

The U.S. Department of Agriculture (USDA) - National Agricultural Statistics Service (NASS) Research and Development Division released a Web site (http://www.nass.usda.gov/research/sumpant.htm) in September 1999 that uses interactive micromaps to display data from the 1997 Census of Agriculture. While the end user who accesses this Web site gets the impression of fully interactive graphics, this is not the case. The 10 micromaps (5 crops $\times$ 2 arrangements) plus one overview micromap were precalculated in S-PLUS and were stored as jpg images. It is not possible to create any new micromap display ''on the fly'' on this Web site.

The National Cancer Institute (NCI) released a Web site in April 2003 that provides interactive access to its cancer data via micromaps. This Web site is Java-based and creates micromaps ''on the fly''. [179] and [30] provide more details on the design of the NCI Web site that is accessible at http://www.statecancerprofiles.cancer.gov/micromaps.

nViZn (read ''envision'') is a Java-based software development kit (SDK), currently developed and distributed by SPSS (http://spss.com/nvizn/). It is the follow-up to the Graphics Production Library (GPL), described in [42], developed within the BLS. nViZn ([196]) is based on a formal grammar for the specification of statistical graphics ([195]); see also Chap. II.11. In addition to the capabilities already present in the original GPL, nViZn has many additional features. Most useful for the display of data in a geographic context are the capabilities that enable a programmer to create interactive tables and linked micromaps in nViZn. Experiences with nViZn, its advantages and current problems, and its capabilities for the display of Federal data via LM plots are described in more detail in [104], [157], and [156].

Micromap implementations that allow the user to create new LM plots ''on the fly'' often provide features to switch from one geographic region or subregion to another, choose among several variables, re-sort the data in increasing or decreasing order according to different statistics (such as mean, median, minimum, or maximum of the data values in the underlying geographic region), and display different graphical summaries of the data (e.g., dotplots, boxplots, confidence intervals, or even time series). So, in an interactive environment, a user might want to create an LM plot of Education and Arts (sorted by increasing maximum Education) after having studied the LM plot in Fig. 10.6, and then immediately re-sort the display by decreasing maximum Arts.
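The re-sorting interaction described above amounts to recomputing the row order of the display from a chosen variable, statistic, and direction. The following Python sketch shows one way this might look. It is not taken from any of the implementations mentioned in this section; the function name, the dictionary layout, and the values are hypothetical.

```python
from statistics import mean, median

# Statistics a user might pick from a menu (names are assumptions).
STATISTICS = {
    "mean": mean,
    "median": median,
    "minimum": min,
    "maximum": max,
}


def region_order(values_by_region, statistic, decreasing=False):
    """Order regions by the chosen statistic of their data values."""
    stat = STATISTICS[statistic]
    return sorted(
        values_by_region,
        key=lambda region: stat(values_by_region[region]),
        reverse=decreasing,
    )


# Made-up values per region for two variables.
EDUCATION = {"AA": [3000, 3400, 3200], "BB": [2800], "CC": [3600, 3000]}
ARTS = {"AA": [500, 4000], "BB": [900], "CC": [1200, 700]}

# First view: sorted by increasing maximum Education.
first = region_order(EDUCATION, "maximum")
# User interaction: re-sort by decreasing maximum Arts.
second = region_order(ARTS, "maximum", decreasing=True)
print(first, second)
```

Only the order of the regions changes between the two views; the maps and statistical panels would then be redrawn in that order.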


10.6.3 Conditioned Choropleth Maps

Conditioned choropleth maps (CCmaps), described in Carr et al. ([43], [30]), focus on spatial displays that involve one dependent variable and two independent variables. CCmaps promote interactive hypothesis generation, common for epidemiological and environmental applications. In fact, applications from the National Center for Health Statistics (NCHS) and the EPA motivated the development of CCmaps. CCmaps are written in Java and can be freely obtained from http://www.galaxy.gmu.edu/~dcarr/ccmaps.

The main interactive components of CCmaps are partitioning sliders that allow the user to dynamically partition the study units into a $3 \times 3$ layout of maps. The sliders allow a user to create, examine, and contrast subsets for the purpose of generating hypotheses about patterns in spatially referenced data. For example, in a medical application one of the sliders might control the age intervals and the second slider might control the years of active smoking in a study on cancer mortality rates across the U.S. The resulting 9 maps will allow an analyst to develop hypotheses on spatial patterns within panels or among panels. Additional features of this CCmaps implementation are dynamic quantile-quantile (QQ) plots and pan and zoom widgets to allow closer inspection of data at the U.S. county level.
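The partitioning step can be illustrated with a short Python sketch: each slider contributes two cut points that split its conditioning variable into three classes, and the pair of classes determines the panel of the $3 \times 3$ layout in which a study unit is drawn. This is a sketch under assumptions and not the CCmaps code itself. The function names, the record layout, the treatment of values that fall exactly on a cut point, and the data are all hypothetical.

```python
from bisect import bisect_right


def slider_class(value, cuts):
    """Return 0, 1, or 2 for the low, middle, or high class of `value`.

    `cuts` is the (lower, upper) pair of slider positions. Assumption:
    a value equal to a cut point is assigned to the higher class.
    """
    return bisect_right(cuts, value)


def partition(units, row_var, row_cuts, col_var, col_cuts):
    """Assign each study unit to one panel of the 3 x 3 layout."""
    panels = {(r, c): [] for r in range(3) for c in range(3)}
    for unit in units:
        row = slider_class(unit[row_var], row_cuts)
        col = slider_class(unit[col_var], col_cuts)
        panels[(row, col)].append(unit["id"])
    return panels


# Made-up study units with two conditioning variables.
UNITS = [
    {"id": "u1", "age": 35, "smoking_years": 5},
    {"id": "u2", "age": 50, "smoking_years": 15},
    {"id": "u3", "age": 70, "smoking_years": 30},
    {"id": "u4", "age": 52, "smoking_years": 28},
]

before = partition(UNITS, "age", (40, 60), "smoking_years", (10, 25))
# Moving the upper smoking slider from 25 to 29 moves unit u4
# from the high to the middle smoking class.
after = partition(UNITS, "age", (40, 60), "smoking_years", (10, 29))
print(before[(1, 2)], after[(1, 1)])
```

Each slider movement simply repeats the partition with new cut points, which is what makes the display dynamic: every unit stays on the map but may move to a different panel.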

