If we are given data in numerical form, we tend to display it numerically as well. This was done in the preceding sections: an observation was plotted as a point in a two-dimensional coordinate system. In multivariate analysis we want to understand data in low dimensions (e.g., on a 2D computer screen) although the structures are hidden in high dimensions. A display based on coordinates therefore reaches its limit once the dimension exceeds three.
If we are interested in condensing a structure into 2D elements, we have to consider alternative graphical techniques. The Chernoff-Flury faces, for example, condense high-dimensional information into a simple ``face''. The sizes of face elements such as pupils, eyes, upper and lower hair lines, etc., are assigned to certain variables. The idea of using faces goes back to Chernoff (1973) and was further developed by Bernhard Flury. We follow the design described in Flury and Riedwyl (1988), which uses the following characteristics.
1 | right eye size |
2 | right pupil size |
3 | position of right pupil |
4 | right eye slant |
5 | horizontal position of right eye |
6 | vertical position of right eye |
7 | curvature of right eyebrow |
8 | density of right eyebrow |
9 | horizontal position of right eyebrow |
10 | vertical position of right eyebrow |
11 | right upper hair line |
12 | right lower hair line |
13 | right face line |
14 | darkness of right hair |
15 | right hair slant |
16 | right nose line |
17 | right size of mouth |
18 | right curvature of mouth |
19-36 | like 1-18, only for the left side. |
First, every variable that is to be coded into a characteristic face element is transformed into a $[0,1]$ scale, i.e., the minimum of the variable corresponds to $0$ and the maximum to $1$. The extreme positions of the face elements therefore correspond to a certain ``grin'' or ``happy'' face element. Dark hair might be coded as $1$, and blond hair as $0$, and so on.
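The rescaling step can be illustrated with a short sketch: each variable is shifted and divided by its range, so that its minimum maps to 0 and its maximum to 1. The function name `minmax_scale`, the handling of constant columns, and the toy numbers are assumptions made for this illustration; they are not taken from Flury and Riedwyl (1988) or from the bank data.

```python
import numpy as np

def minmax_scale(data):
    """Rescale every column of `data` so its minimum is 0 and its maximum is 1."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Assumption: a constant column (zero range) is mapped to 0
    # rather than causing a division by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (data - col_min) / safe_range

# Made-up measurements (three observations, two variables), not the bank data.
toy = np.array([[214.8, 141.0],
                [215.0, 139.5],
                [214.6, 140.2]])
scaled = minmax_scale(toy)
```

After this step every column of `scaled` lies between 0 and 1, so each value can be read directly as the position of a face element between its two extremes.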
As an example, consider observations 91 to 110 of the bank data. Recall that the bank data set consists of 200 observations of dimension 6 where, for example, $X_6$ is the diagonal of the note.
Each of the six variables is assigned to one or more face elements; the diagonal, for instance, is coded both as the face line and as the darkness of the hair.
What happens if we include all 100 genuine and all 100 counterfeit bank notes in the Chernoff-Flury face technique? Figures 1.16 and 1.17 show the faces of the genuine bank notes with the same assignments as before, and Figures 1.18 and 1.19 show the faces of the counterfeit bank notes. Comparing Figure 1.16 with Figure 1.18, one clearly sees that the diagonal (face line) is longer for genuine bank notes. The diagonal is also coded as hair darkness, which is lighter for the counterfeit bank notes because their diagonals are shorter. The faces of the genuine bank notes therefore have a much darker appearance and broader face lines, and the faces in Figures 1.16-1.17 are clearly different from those in Figures 1.18-1.19.
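The assignment of variables to face elements can be sketched as a lookup from each variable to the numbered characteristics in the table, where element k + 18 is the left-side mirror of element k. Only the coding of the diagonal as face line and hair darkness is stated in the text; the other entries of `ASSIGNMENT`, the function name `face_parameters`, and the neutral value 0.5 for unassigned elements are assumptions made for illustration.

```python
import numpy as np

N_ELEMENTS = 36  # 18 right-side characteristics plus their 18 left-side mirrors

# Column index of the variable (0-based) -> face element numbers (1-based, as in the table).
ASSIGNMENT = {
    0: [1, 19],            # hypothetical: eye sizes
    1: [2, 20],            # hypothetical: pupil sizes
    2: [4, 22],            # hypothetical: eye slants
    3: [11, 29],           # hypothetical: upper hair lines
    4: [12, 30],           # hypothetical: lower hair lines
    5: [13, 14, 31, 32],   # diagonal: face lines and darkness of hair
}

def face_parameters(scaled_row, assignment=ASSIGNMENT, neutral=0.5):
    """Build the 36 face parameters for one observation already scaled to [0, 1]."""
    # Assumption: elements without an assigned variable stay at a neutral mid value.
    params = np.full(N_ELEMENTS, neutral)
    for column, elements in assignment.items():
        for element in elements:
            params[element - 1] = scaled_row[column]
    return params

# Made-up scaled observation with six variables, not taken from the bank data.
example_row = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.9])
params = face_parameters(example_row)
```

Because one variable can drive several elements at once, a single large value of the diagonal changes both the face line and the hair darkness on both sides of the face, which is why differences in that variable are so visible when groups of faces are compared.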