The data set comes from a survey in which 12,388 contacts with various media were identified (Lebart, L., Morineau, A., and Piron, M.; 1995). These contacts are cross-tabulated by activity (the statistical units are the media contacts). In addition, they are cross-tabulated with some supplementary variables: sex, age and education level.
The active data are stored in the file m.dat, which contains six media items (columns) and eight activities (rows):
      96  118    2   71   50   17
     122  136   11   76   49   41
     193  184   74   63  103   79
     360  365   63  145  141  184
     511  593   57  217  172  306
     385  457   42  174  104  220
     156  185    8   69   42   85
    1474 1931  181  852  642  782

The column labels are stored in the file mctxt.dat, as shown below:
    RADIO TV N_NEWS R_NEWS MAGAZ TVMAG

The vector of row labels is stored in the file mltxt.dat:
    la_Farmer s_busin h_manag i_manag empl skil unsk Nowork

Supplementary row data are stored in the file msl.dat:
    1630 1900  285  854  621  776
    1667 2069  152  815  683  938
     660  713   69  216  234  360
     640  719   84  230  212  380
     888 1000  130  429  345  466
     617  774   84  391  262  263
     491  761   70  402  251  245
     908 1307   73  642  360  435
     869 1008  107  408  336  494
     901 1035   80  140  311  504
     619  612  177  209  298  281

The eleven supplementary row labels are stored in the file msltxt.dat:
MALE FEMALE A14-24 A25-34 A35-49 A50-64 A65+ PRIMARY SECOND H_TECH UNIVER
The following code calls the quantlet corresp to analyze the data set m.dat.
    library("stats")
    corresp("m.dat","msl.dat","null","MEDIA","mltxt.dat", "mctxt.dat","msltxt.dat","null")
We obtain the following output.
    [1,] EIGENVALUES AND PERCENTAGES
    Contents of seig
    [1,]   0.0139   62.1982   62.1982
    [2,]   0.0072   32.3650   94.5632
    [3,]   0.0008    3.7018   98.2650
    [4,]   0.0003    1.3638   99.6288
    [5,]   0.0001    0.3712  100.0000

The first two axes together account for about 94.6% of the total variation and are clearly dominant. This cumulative percentage indicates the share of information captured by the first two principal axes.
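The percentage and cumulative-percentage columns can be recomputed from the eigenvalues alone. The following is a minimal sketch in plain Python (not XploRe), using the eigenvalues as printed; because they are rounded to four decimals, the recomputed percentages agree with the output only approximately. The variable names are illustrative.

```python
# Eigenvalues as printed in the seig output (rounded to four decimals).
eigenvalues = [0.0139, 0.0072, 0.0008, 0.0003, 0.0001]

# Total inertia is the sum of the eigenvalues.
total_inertia = sum(eigenvalues)

# Share of each axis in the total inertia, in percent.
percent = [100.0 * ev / total_inertia for ev in eigenvalues]

# Running (cumulative) share.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

for k, (ev, p, c) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"axis {k}: eigenvalue={ev:.4f}  percent={p:6.2f}  cumulative={c:6.2f}")
```

The small differences from the printed percentages come from rounding of the eigenvalues, not from a different definition.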
Coordinates on the different axes and other indices helpful for interpreting the results are shown in the following output, which also includes the coordinates and the squared correlations of the supplementary items.
    [1,] Row relative weights and distances to the origin
    Contents of spdai
    [1,]   0.0286   0.0032
    [2,]   0.0351   0.0016
    [3,]   0.0562   0.0039
    [4,]   0.1015   0.0011
    [5,]   0.1498   0.0009
    [6,]   0.1116   0.0011
    [7,]   0.0440   0.0014
    [8,]   0.4732   0.0005

    [1,] Coordinates of the rows
    Contents of scoordi
    [1,]  -0.0015  -0.0028   0.0006   0.0001  -0.0002
    [2,]  -0.0006  -0.0013   0.0006  -0.0002   0.0002
    [3,]   0.0039  -0.0005   0.0000  -0.0002  -0.0001
    [4,]   0.0010   0.0003   0.0003   0.0002   0.0001
    [5,]  -0.0001   0.0009   0.0000   0.0002   0.0000
    [6,]  -0.0004   0.0009   0.0002  -0.0003   0.0000
    [7,]  -0.0011   0.0009   0.0004   0.0000  -0.0002
    [8,]  -0.0003  -0.0003  -0.0002   0.0000   0.0000

In the following output we note, for instance, that the relative frequency of national newspapers (N_NEWS, the 3rd active column item) is very small (3.54%).
    [1,] Column relative weights and distances to the origin
    Contents of spdaj
    [1,]   0.2661   0.0005
    [2,]   0.3204   0.0005
    [3,]   0.0354   0.0049
    [4,]   0.1346   0.0014
    [5,]   0.1052   0.0015
    [6,]   0.1384   0.0015

    [1,] Coordinates of the columns
    Contents of scoordj
    [1,]   0.0001   0.0002   0.0004   0.0000   0.0000
    [2,]  -0.0005   0.0000  -0.0001  -0.0001  -0.0001
    [3,]   0.0049  -0.0001  -0.0002  -0.0004   0.0001
    [4,]  -0.0010  -0.0010   0.0000  -0.0001   0.0001
    [5,]   0.0009  -0.0012  -0.0002   0.0003   0.0000
    [6,]  -0.0001   0.0015  -0.0002   0.0001   0.0001

However, the distance of N_NEWS to the origin is the largest of all columns (0.0049), which indicates that its profile across activities is very specific. As a result, it contributes 74.6% to the construction of the first axis, as can be seen from the following output. Geometrically it is very close to this axis (squared correlation 0.99).
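The relative weight and the distance to the origin of N_NEWS can be checked directly from the active table. The sketch below is plain Python rather than XploRe, and the helper name `column_distance_sq` is illustrative. It uses the usual chi-square distance between a column profile and the average profile, and the standard identities contribution = mass × coordinate² / eigenvalue and squared correlation = coordinate² / distance². The distances printed by corresp appear to be on a different scale: dividing the chi-square distance by the square root of the table total reproduces 0.0049 to the printed precision, but that reading is an inference from the numbers, not something the output states.

```python
from math import sqrt

# Active table from m.dat: 8 activity rows x 6 media columns.
columns = ["RADIO", "TV", "N_NEWS", "R_NEWS", "MAGAZ", "TVMAG"]
table = [
    [96, 118, 2, 71, 50, 17],
    [122, 136, 11, 76, 49, 41],
    [193, 184, 74, 63, 103, 79],
    [360, 365, 63, 145, 141, 184],
    [511, 593, 57, 217, 172, 306],
    [385, 457, 42, 174, 104, 220],
    [156, 185, 8, 69, 42, 85],
    [1474, 1931, 181, 852, 642, 782],
]

n = sum(sum(row) for row in table)                # total number of contacts
row_mass = [sum(row) / n for row in table]        # row relative weights
col_total = [sum(row[j] for row in table) for j in range(len(columns))]
col_mass = [t / n for t in col_total]             # column relative weights


def column_distance_sq(j):
    """Squared chi-square distance between column j's profile and the average profile."""
    return sum((row[j] / col_total[j] - r) ** 2 / r for row, r in zip(table, row_mass))


j = columns.index("N_NEWS")
mass = col_mass[j]
d2 = column_distance_sq(j)

# Values read off the printed output (rounded): first eigenvalue and the
# squared correlation of N_NEWS with the first axis.
lambda1 = 0.0139
cos2 = 0.9930

# coordinate^2 = cos2 * d2, so the contribution in percent is:
approx_contribution = 100.0 * mass * d2 * cos2 / lambda1

print(f"relative weight of N_NEWS: {mass:.4f}")
print(f"squared chi-square distance to the origin: {d2:.4f}")
print(f"distance divided by sqrt(n): {sqrt(d2 / n):.4f}")
print(f"approximate contribution to axis 1 (%): {approx_contribution:.1f}")
```

Because the eigenvalue and squared correlation are taken from the rounded output, the recomputed contribution matches the printed 74.6% only to within about one percentage point.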
    [1,] Contributions of the columns
    Contents of scontrj
    [1,]   0.4287   1.8037  70.3836   0.6207   0.1489
    [2,]   6.5641   0.0192  10.5160  13.2700  37.5915
    [3,]  74.5877   0.0189   1.8090  18.1763   1.8723
    [4,]  11.5011  22.4356   0.4460   7.5324  44.6282
    [5,]   6.8233  25.6080   4.4877  50.8035   1.7592
    [6,]   0.0950  50.1145  12.3576   9.5970  13.9999

    [1,] Squared correlations of the columns
    Contents of scorrj
    [1,]   0.0770   0.1685   0.7520   0.0024   0.0002
    [2,]   0.8508   0.0013   0.0811   0.0377   0.0291
    [3,]   0.9930   0.0001   0.0014   0.0053   0.0001
    [4,]   0.4866   0.4940   0.0011   0.0070   0.0113
    [5,]   0.3168   0.6186   0.0124   0.0517   0.0005
    [6,]   0.0035   0.9587   0.0270   0.0077   0.0031

The first axis is largely explained by the 3rd active row item, high manager (h_manag), as the following output shows:
    [1,] Contributions of the rows
    Contents of scontri
    [1,]   5.6928  37.9892  17.8813   1.9590  15.8850
    [2,]   1.1848   9.9793  17.6701   4.7954  28.0180
    [3,]  74.9579   2.8872   0.0622   5.2257   8.5732
    [4,]   8.3279   1.4964  11.7552  21.4483  17.5522
    [5,]   0.2675  18.9376   0.4701  20.3081   2.1711
    [6,]   1.5383  15.9009   5.0508  46.0393   0.4038
    [7,]   4.4054   5.4906   8.4193   0.1767  26.8961
    [8,]   3.6255   7.3188  38.6910   0.0476   0.5005

    [1,] Squared correlations of the rows
    Contents of scorri
    [1,]   0.2135   0.7414   0.0399   0.0016   0.0036
    [2,]   0.1538   0.6742   0.1366   0.0137   0.0217
    [3,]   0.9782   0.0196   0.0000   0.0015   0.0007
    [4,]   0.8022   0.0750   0.0674   0.0453   0.0101
    [5,]   0.0252   0.9289   0.0026   0.0420   0.0012
    [6,]   0.1383   0.7437   0.0270   0.0907   0.0002
    [7,]   0.5557   0.3604   0.0632   0.0005   0.0202
    [8,]   0.3722   0.3910   0.2364   0.0001   0.0003
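The two row tables are read in different directions: contributions are shares of an axis, so each axis column should add up to 100, whereas squared correlations are shares of an item, so each row should add up to 1 when all five axes are kept. A short Python sketch (illustrative names, values copied from the printed output) checks both properties and picks out the row that dominates the first axis.

```python
rows = ["la_Farmer", "s_busin", "h_manag", "i_manag", "empl", "skil", "unsk", "Nowork"]

# Contributions of the rows (scontri), in percent, one column per axis.
scontri = [
    [5.6928, 37.9892, 17.8813, 1.9590, 15.8850],
    [1.1848, 9.9793, 17.6701, 4.7954, 28.0180],
    [74.9579, 2.8872, 0.0622, 5.2257, 8.5732],
    [8.3279, 1.4964, 11.7552, 21.4483, 17.5522],
    [0.2675, 18.9376, 0.4701, 20.3081, 2.1711],
    [1.5383, 15.9009, 5.0508, 46.0393, 0.4038],
    [4.4054, 5.4906, 8.4193, 0.1767, 26.8961],
    [3.6255, 7.3188, 38.6910, 0.0476, 0.5005],
]

# Squared correlations of the rows (scorri), one column per axis.
scorri = [
    [0.2135, 0.7414, 0.0399, 0.0016, 0.0036],
    [0.1538, 0.6742, 0.1366, 0.0137, 0.0217],
    [0.9782, 0.0196, 0.0000, 0.0015, 0.0007],
    [0.8022, 0.0750, 0.0674, 0.0453, 0.0101],
    [0.0252, 0.9289, 0.0026, 0.0420, 0.0012],
    [0.1383, 0.7437, 0.0270, 0.0907, 0.0002],
    [0.5557, 0.3604, 0.0632, 0.0005, 0.0202],
    [0.3722, 0.3910, 0.2364, 0.0001, 0.0003],
]

n_axes = len(scontri[0])

# Column sums of the contributions: one total per axis, expected near 100.
contribution_totals = [sum(r[k] for r in scontri) for k in range(n_axes)]

# Row sums of the squared correlations: one total per item, expected near 1.
quality_totals = [sum(r) for r in scorri]

# Row with the largest contribution to the first axis.
first_axis = [r[0] for r in scontri]
dominant_row = rows[first_axis.index(max(first_axis))]

print("contribution totals per axis:", [round(t, 4) for t in contribution_totals])
print("squared-correlation totals per row:", [round(q, 4) for q in quality_totals])
print("row dominating axis 1:", dominant_row)
```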
    [1,] SUPPLEMENTARY ITEMS

    [1,] Row relative weights and distances to the origin
    Contents of spdsl
    [ 1,]   0.1644   0.0006
    [ 2,]   0.1714   0.0006
    [ 3,]   0.0610   0.0012
    [ 4,]   0.0614   0.0012
    [ 5,]   0.0883   0.0004
    [ 6,]   0.0648   0.0010
    [ 7,]   0.0602   0.0016
    [ 8,]   0.1010   0.0015
    [ 9,]   0.0873   0.0004
    [10,]   0.0805   0.0024
    [11,]   0.0595   0.0026

The 11th supplementary row item, university education (UNIVER), is closely linked to factor 1, as the following output shows:
    [1,] Squared correlations of the rows
    Contents of scontrsi
    [ 1,]   0.4813   0.1104   0.0215   0.3239   0.0629
    [ 2,]   0.4910   0.1025   0.0213   0.3261   0.0591
    [ 3,]   0.0150   0.5609   0.0762   0.2102   0.1377
    [ 4,]   0.0542   0.8704   0.0100   0.0350   0.0304
    [ 5,]   0.6140   0.1026   0.0726   0.0316   0.1791
    [ 6,]   0.0478   0.8030   0.0011   0.1184   0.0296
    [ 7,]   0.1438   0.5840   0.1552   0.0894   0.0275
    [ 8,]   0.6289   0.2446   0.0209   0.1034   0.0023
    [ 9,]   0.0002   0.6872   0.0001   0.2908   0.0218
    [10,]   0.0132   0.4614   0.0187   0.1283   0.3783
    [11,]   0.9882   0.0033   0.0024   0.0025   0.0037

    [1,] Coordinates of the rows
    Contents of scodsi
    [ 1,]   0.0004  -0.0002   0.0001  -0.0004   0.0002
    [ 2,]  -0.0004   0.0002  -0.0001   0.0004  -0.0002
    [ 3,]   0.0001   0.0009   0.0003   0.0006  -0.0004
    [ 4,]   0.0003   0.0011   0.0001   0.0002  -0.0002
    [ 5,]   0.0003   0.0001   0.0001   0.0001   0.0001
    [ 6,]  -0.0002  -0.0009   0.0000  -0.0003   0.0002
    [ 7,]  -0.0006  -0.0012  -0.0006  -0.0005   0.0003
    [ 8,]  -0.0012  -0.0007  -0.0002  -0.0005   0.0001
    [ 9,]   0.0000   0.0004   0.0000   0.0002   0.0001
    [10,]   0.0003   0.0017   0.0003   0.0009  -0.0015
    [11,]   0.0026  -0.0002   0.0001   0.0001  -0.0002
It is clear from this analysis that the main trait (first axis) is that contact with national newspapers corresponds very strongly to high managers and/or people with university education.
The second axis mostly characterizes an opposition between TV magazines (TVMAG), associated with employees (empl), workers (skil, unsk), and younger people, and magazines (MAGAZ) and regional newspapers (R_NEWS), associated with farmers, small business (s_busin), and older people (A50-64, A65+). Figure 13.2 summarizes this set of associations.
The positions of the items in Figure 13.2 allow a more nuanced interpretation of the second axis: employees and workers, together with people of middle-level education (SECOND), are associated in particular with the young (A14-24, A25-34) and with media such as TV magazines. They are opposed to small business and farmers, who are primarily older (A50-64, A65+), have less education (PRIMARY), and have contact with media such as magazines (MAGAZ) and regional newspapers (R_NEWS).
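These associations can also be read off the raw profiles without any factor computation. The sketch below, in plain Python with an illustrative helper `share`, compares the share of selected media within a row with the share in the whole active table. It is only a descriptive cross-check of the interpretation, not a replacement for the correspondence analysis.

```python
media = ["RADIO", "TV", "N_NEWS", "R_NEWS", "MAGAZ", "TVMAG"]

# Active rows (m.dat) and supplementary rows (msl.dat), keyed by their labels.
active = {
    "la_Farmer": [96, 118, 2, 71, 50, 17],
    "s_busin": [122, 136, 11, 76, 49, 41],
    "h_manag": [193, 184, 74, 63, 103, 79],
    "i_manag": [360, 365, 63, 145, 141, 184],
    "empl": [511, 593, 57, 217, 172, 306],
    "skil": [385, 457, 42, 174, 104, 220],
    "unsk": [156, 185, 8, 69, 42, 85],
    "Nowork": [1474, 1931, 181, 852, 642, 782],
}
supplementary = {
    "MALE": [1630, 1900, 285, 854, 621, 776],
    "FEMALE": [1667, 2069, 152, 815, 683, 938],
    "A14-24": [660, 713, 69, 216, 234, 360],
    "A25-34": [640, 719, 84, 230, 212, 380],
    "A35-49": [888, 1000, 130, 429, 345, 466],
    "A50-64": [617, 774, 84, 391, 262, 263],
    "A65+": [491, 761, 70, 402, 251, 245],
    "PRIMARY": [908, 1307, 73, 642, 360, 435],
    "SECOND": [869, 1008, 107, 408, 336, 494],
    "H_TECH": [901, 1035, 80, 140, 311, 504],
    "UNIVER": [619, 612, 177, 209, 298, 281],
}
all_rows = {**active, **supplementary}

# Column totals of the active table give the average profile.
overall = [sum(counts[j] for counts in active.values()) for j in range(len(media))]


def share(counts, items):
    """Fraction of a row's contacts that fall in the given media items."""
    return sum(counts[media.index(m)] for m in items) / sum(counts)


# First axis: national newspapers among high managers and university graduates.
print(f"N_NEWS overall: {share(overall, ['N_NEWS']):.3f}")
for label in ["h_manag", "UNIVER"]:
    print(f"N_NEWS in {label}: {share(all_rows[label], ['N_NEWS']):.3f}")

# Second axis: TV magazines versus magazines and regional newspapers.
second_axis_rows = ["empl", "skil", "unsk", "A14-24", "A25-34",
                    "la_Farmer", "s_busin", "A50-64", "A65+"]
for label in second_axis_rows:
    counts = all_rows[label]
    print(f"{label:10s} TVMAG={share(counts, ['TVMAG']):.3f}  "
          f"R_NEWS+MAGAZ={share(counts, ['R_NEWS', 'MAGAZ']):.3f}")
```

Rows whose share of an item lies well above the overall share are the ones pulled toward that item in the factor plane.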