GOG presumes no connection between a statistical method and a geometric representation. Histogram bins need not be represented by histograms. Tukey schematic plots (his original word for box plots) need not be represented by boxes and whiskers. Regressions need not be represented by lines or curves. Separating geometry from data (and from other graphical aspects such as coordinate systems) is what gives GOG its expressive power. We choose geometric representation objects independently of statistical methods, coordinate systems, or aesthetic attributes.
As Fig. 11.1 indicates, the geometry component of GOG receives a varset and outputs a geometric graph. A geometric graph is a subset of $\mathbb{R}^n$. For our purposes, we will be concerned with geometric graphs for which $1 \le n \le 3$. Geometric graphs are enclosed in bounded regions:
$$B^n \subset [a_1, b_1] \times \cdots \times [a_n, b_n]$$
Geometric graphs are produced by graphing functions that have geometric names like line or tile. A geometric graph is the image of a graphing function. And a graphic, as used in the title of this chapter, is the image of a graph under one or more aesthetic functions. Geometric graphs are not visible. As [6] points out, visible elements have features not present in their geometric counterparts.
Figures 11.9 and 11.10 illustrate the exchangeability of geometry and statistical methods. The graphics are based on UN data involving 1990 estimates of female life expectancy and birth rates for selected world countries. Figure 11.9 shows four different geometric graphs - point, line, area, and bar - used to represent a confidence interval on a linear regression. Figure 11.10 shows one geometric graph used to represent four different statistical methods - local mean, local range, quadratic regression, and linear regression confidence interval.
This exchangeability produces a rich set of graphic forms with a relatively small number of geometric graphs. Table 11.3 contains these graphing methods. The point graphing function produces a geometric point, which is an $n$-tuple. This function can also produce a finite set of points, called a multipoint or a point cloud. The set of points produced by the point graphing function is called a point graph.
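To make the exchangeability of statistics and geometry concrete, the following is a minimal sketch in Python, not GOG's actual implementation. The names `stat_mean`, `stat_range`, `point`, and `bar`, as well as the dictionary layout of the returned graphs, are assumptions made for illustration. The only point is that any statistic producing (x, y) tuples can be handed to any graphing function.

```python
from statistics import mean

# Hypothetical statistical methods: each reduces grouped data to (x, y) tuples.
def stat_mean(groups):
    """One (x, y) pair per group, where y is the mean of the group's values."""
    return [(x, mean(ys)) for x, ys in sorted(groups.items())]

def stat_range(groups):
    """One (x, y) pair per group, where y is the range of the group's values."""
    return [(x, max(ys) - min(ys)) for x, ys in sorted(groups.items())]

# Hypothetical graphing functions: each turns tuples into a geometric graph.
def point(pairs):
    """A point graph: a finite set of n-tuples (here n = 2)."""
    return {"geom": "point", "points": list(pairs)}

def bar(pairs, base=0.0):
    """A bar graph: one closed interval per x, anchored at a common base."""
    return {
        "geom": "bar",
        "intervals": [(x, min(base, y), max(base, y)) for x, y in pairs],
    }

groups = {1: [2, 4], 2: [10], 3: [1, 5, 9]}

# Any statistic can be paired with any geometry.
graphs = [geom(stat(groups)) for stat in (stat_mean, stat_range)
          for geom in (point, bar)]
```

Two statistics and two graphing functions already yield four distinct graphs, which is the combinatorial effect the text describes.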
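The line graphing function is a bit more complicated. Let $B^m$ be a bounded region in $\mathbb{R}^m$. Consider the function $F: B^m \to \mathbb{R}^n$, where $n = m + 1$, with the following additional properties: (1) the image of $F$ is bounded, and (2) $F(x) = (v, f(v))$, where $f: B^m \to \mathbb{R}$ and $v = (x_1, \ldots, x_m) \in B^m$.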
The area graphing function produces a graph containing all points within the region under the line graph. The bar graphing function produces a set of closed intervals. An interval has two ends. Ordinarily, however, bars are used to denote a single value through the location of one end. The other end is anchored at a common reference point (usually zero). The histobar graphing function produces a histogram element. This element behaves like a bar, except that a value maps to the area of a histobar rather than to its extent. Also, histobars are glued to each other. They cover an interval or region, unlike bars.
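The difference between bars and histobars can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function names `bar` and `histobars` and the tuple layouts are hypothetical, and the sketch covers only the one-dimensional case.

```python
def bar(x, value, base=0.0):
    """A bar is a closed interval at position x.

    One end encodes the value; the other is anchored at a common base.
    Returns (x, low_end, high_end).
    """
    low, high = sorted((base, value))
    return (x, low, high)

def histobars(edges, values):
    """Histobars are contiguous rectangles whose AREA encodes the value.

    edges  : increasing bin boundaries, len(values) + 1 of them
    values : one value per bin
    Returns a list of (left, right, height) with height = value / width.
    """
    if len(edges) != len(values) + 1:
        raise ValueError("need exactly one more edge than values")
    rects = []
    for left, right, value in zip(edges, edges[1:], values):
        width = right - left
        if width <= 0:
            raise ValueError("edges must be strictly increasing")
        rects.append((left, right, value / width))
    return rects

# Two bins holding the same value: the wider bin gets a shorter histobar.
rects = histobars([0, 1, 3], [2, 2])
```

Because each histobar's right edge is the next one's left edge, the rectangles cover the whole interval without gaps, whereas bars are free-standing intervals.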
A schema is a diagram that includes both general and particular features in order to represent a distribution. We have taken this usage from [39], who invented the schematic plot, which has come to be known as the box plot because of its physical appearance. The schema graphing function produces a collection of one or more points and intervals.
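As an illustration of a schema as "points and intervals", here is a minimal Python sketch of a box-plot-like schema. It is an assumption-laden approximation: the function name `schema` and its return layout are hypothetical, quartiles from `statistics.quantiles` stand in for Tukey's hinges (the two can differ slightly), and the 1.5 multiplier for the fences is the conventional choice rather than something stated in this section.

```python
from statistics import median, quantiles

def schema(batch, k=1.5):
    """Reduce a batch of numbers to intervals (box, whiskers) and points.

    Assumption: quartiles approximate Tukey's hinges; k = 1.5 sets the fences.
    """
    data = sorted(batch)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    lower_fence, upper_fence = q1 - k * spread, q3 + k * spread
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median(data),                       # a point
        "box": (q1, q3),                              # an interval
        "whiskers": ((inside[0], q1), (q3, inside[-1])),  # two intervals
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }

s = schema([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

The general features of the distribution are carried by the intervals, and the particular features by the individually plotted outlying points.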
The tile graphing function tiles a surface or space. A tile graph covers and partitions the bounded region defined by a frame; there can be no gaps or overlaps between tiles. The Latinate tessellation (for tiling) is often used to describe the appearance of the tile graphic.
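The "no gaps or overlaps" condition can be checked mechanically. The sketch below is a simplified Python illustration restricted to axis-aligned rectangles; the function name `is_tiling` and the `(x0, y0, x1, y1)` rectangle encoding are assumptions, and real tile graphs may use other shapes such as hexagons or Voronoi cells.

```python
from itertools import combinations
from math import isclose

def area(r):
    """Area of a rectangle given as (x0, y0, x1, y1)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def overlaps(a, b):
    """True if the interiors of two rectangles intersect (shared edges are fine)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def is_tiling(frame, tiles):
    """Check that rectangular tiles partition the frame.

    If every tile lies inside the frame, no two tiles overlap, and the tile
    areas sum to the frame area, then there can be no gaps either.
    """
    inside = all(
        frame[0] <= t[0] and frame[1] <= t[1]
        and t[2] <= frame[2] and t[3] <= frame[3]
        for t in tiles
    )
    disjoint = not any(overlaps(a, b) for a, b in combinations(tiles, 2))
    covered = isclose(sum(area(t) for t in tiles), area(frame))
    return inside and disjoint and covered

frame = (0, 0, 2, 2)
tiles = [(0, 0, 1, 2), (1, 0, 2, 1), (1, 1, 2, 2)]
ok = is_tiling(frame, tiles)
```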
A contour graphing function produces contours, or level curves. A contour graph is used frequently in weather and topographic maps. Contours can be used to delineate any continuous surface.
The network graphing function joins points with line segments (edges). Networks are representations that resemble the edges in diagrams of graph-theoretic graphs. Although networks join points, a point graph is not needed in a frame in order for a network graphic to be visible.
Finally, the path graphing function produces a path that connects points such that each point touches no more than two line segments. Thus, a path visits every point in a collection of points only once. If a path is closed (every point touches two line segments), we call it a circuit. Paths often look like lines. There are several important differences between the two, however. First, lines are functional; there can be only one point on a line for any value in the domain. Paths may loop, zigzag, and even cross themselves inside a frame. Second, paths consist of segments that correspond to edges, or links between nodes. This means that a variable may be used to determine an attribute of every segment of a path.
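The distinctions drawn here between lines, paths, and circuits can be expressed as simple checks. The following Python sketch is illustrative only: the function names are hypothetical, edges are given as pairs of node labels, and connectivity of the segments is assumed rather than verified.

```python
from collections import Counter

def is_functional(points):
    """A line is functional: at most one y for any x in the domain."""
    seen = {}
    for x, y in points:
        if seen.setdefault(x, y) != y:
            return False
    return True

def degrees(edges):
    """Count how many segments touch each node."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count

def is_path(edges):
    """Each point touches no more than two segments (connectivity assumed)."""
    return all(d <= 2 for d in degrees(edges).values())

def is_circuit(edges):
    """A closed path: every point touches exactly two segments."""
    return bool(edges) and all(d == 2 for d in degrees(edges).values())

# A trajectory that doubles back in x can be a path but not a line.
loop_points = [(0, 0), (1, 1), (0, 2)]
open_edges = [(0, 1), (1, 2)]
closed_edges = [(0, 1), (1, 2), (2, 0)]

# Because a path is made of segments, each segment can carry its own
# attribute, for example a width driven by a variable (values are made up).
segment_width = dict(zip(open_edges, [0.5, 2.0]))
```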
Figure 11.11 contains two geometric objects for representing the regression we computed in Fig. 11.8. We use a point for representing the data and a line for representing the regression line.