Once you have clicked on the XploRe icon, three windows open up on the screen.
These windows are
Let's carry out some elementary calculations to get familiar with XploRe. Assume that you wish to calculate the sum of two numbers, e.g. 1 and 2. Then you have to follow these steps:
The screen changes as follows:
The outcomes of the above sequence of operations are
The typical basic steps for a statistical data analysis in XploRe are
Let's load our first data set. The ASCII file pullover.dat contains data on pullover sales in 10 time periods. The four columns of pullover.dat correspond to four variables: the number of pullovers sold, the price (in DM), the costs for advertising (in DM), and the presence of a shop assistant (in hours per period). See Data Sets (B.10) for more information.
We read the data file pullover.dat into XploRe by entering

    x=read("pullover")

at the command line. With this instruction, we have assigned the contents of the data file pullover.dat to the XploRe variable x. We can print the contents of x by issuing

    x

at the command line. This shows

    Contents of x
    [ 1,]  230  125  200  109
    [ 2,]  181   99   55  107
    [ 3,]  165   97  105   98
    [ 4,]  150  115   85   71
    [ 5,]   97  120    0   82
    [ 6,]  192  100  150  103
    [ 7,]  181   80   85  111
    [ 8,]  189   90  120   93
    [ 9,]  172   95  110   86
    [10,]  170  125  130   78

As an example of a statistical analysis of the data, let's compute the mean of each variable. The instruction

    mean(x)

returns

    Contents of mean
    [1,]  172.7  104.6  104  93.8

in the output window. This shows that during the 10 periods considered, 172.7 pullovers were sold on average per period, the average price was 104.6 DM, the average advertising costs were 104 DM, and the shop assistant was present for 93.8 hours on average.
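The column means reported above can be cross-checked outside XploRe. The following is a minimal Python sketch, not XploRe code: the data are copied from the listing of x, and the helper name column_means is a hypothetical name chosen for this illustration.

```python
# Pullover data copied from the listing of x: one row per period.
# Columns: sales, price (DM), advertising costs (DM), assistant hours.
pullover = [
    [230, 125, 200, 109],
    [181,  99,  55, 107],
    [165,  97, 105,  98],
    [150, 115,  85,  71],
    [ 97, 120,   0,  82],
    [192, 100, 150, 103],
    [181,  80,  85, 111],
    [189,  90, 120,  93],
    [172,  95, 110,  86],
    [170, 125, 130,  78],
]


def column_means(rows):
    """Return the arithmetic mean of each column of a list of rows.

    Hypothetical helper for this sketch; it is not part of XploRe.
    """
    n = len(rows)
    # zip(*rows) transposes the rows so that each tuple is one column.
    return [sum(column) / n for column in zip(*rows)]


means = column_means(pullover)
print([round(m, 1) for m in means])  # [172.7, 104.6, 104.0, 93.8]
```

The four values agree with the output of mean(x) shown above.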
In the previous example, we applied the XploRe built-in function mean, which provides the sample mean of the data. Apart from the built-in functions, XploRe offers libraries (= quantlibs) of functions (= quantlets) that must be loaded before usage.
In general, the statistical analysis of data comprises the following steps:
We continue our analysis with the pullover data. The first column of this data set contains measurements on the sales of "classic blue" pullovers in different shops whereas the second column contains the corresponding prices. Let's say we are interested in the relation between prices and sales.
We read the data again and now select the price and sales columns (second and first columns) only:

    x=read("pullover")
    x=x[,2|1]

One of the strengths of XploRe is the graphical exploration of data. A scatter plot of the data should give us a first impression of the relation between the two variables. We will show the scatter plot by means of the function plot:

    library("plot")
    plot(x)

The last instruction creates a display, i.e. a new graphics window, which contains the scatter plot:
Looking at this scatter plot, it is difficult to find a clear tendency in the relation between price and sales. It is the task of regression analysis, discussed in Regression (4), to determine the appropriate functional relation between variables. We will now use one of the regression methods introduced there:

    regx=grlinreg(x)
    plot(x,regx)

The resulting plot in Figure 1.1 shows the regression line regx and the data x as circles. The regression line has a negative slope. We can conclude that (on average) the number of pullovers sold decreases if the price of the pullover increases. However, this result may be influenced by the two extreme observations in the upper right and lower right of the figure. XploRe can easily identify such "outliers". For example, the instruction

    x=paf(x,(100<x[,2])&&(x[,2]<200))

would only keep those rows of x where the sales observation is above 100 and below 200. You could now redo the previous regression in order to see how the regression line changes.
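The effect of dropping the two extreme observations can be sketched outside XploRe. The block below is Python, not XploRe code, and it rests on an assumption: that grlinreg fits an ordinary least squares line, which the text above does not state. The helper name ols_line is hypothetical; the data are the price and sales columns copied from the listing of x.

```python
# (price, sales) pairs: columns 2 and 1 of the pullover listing.
data = [(125, 230), (99, 181), (97, 165), (115, 150), (120, 97),
        (100, 192), (80, 181), (90, 189), (95, 172), (125, 170)]


def ols_line(points):
    """Return (intercept, slope) of the least squares line y = a + b*x.

    Hypothetical helper; assumed, not confirmed, to match what
    grlinreg computes.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centered cross-product and sum of squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Same condition as the paf call: keep rows with 100 < sales < 200.
filtered = [(price, sales) for price, sales in data if 100 < sales < 200]

a_all, b_all = ols_line(data)
a_flt, b_flt = ols_line(filtered)
print(len(data), round(a_all, 2), round(b_all, 3))
print(len(filtered), round(a_flt, 2), round(b_flt, 3))
```

In this sketch the filter removes two of the ten observations, and both fitted slopes are negative, so dropping the extreme observations does not reverse the direction of the relation between price and sales.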
XploRe offers several ways to produce quality graphics for publication. You can modify the objects in a plot (point and line style, title and axes labels) and finally save the graphical display in different file formats.
Let's continue with the regression example from the previous subsection. We can improve the graphic with several graphical tools. For example,

    x=setmask(x,"large","blue","cross")
    plot(x)
    setgopt(plotdisplay,1,1,"xlabel","price","ylabel","sales")

will show the data points as blue crosses and label the axes with the names of the variables. We can set a title for the display in the same way:

    setgopt(plotdisplay,1,1,"title","Pullover Data")

The final plot is shown in Figure 1.2.
Graphical displays can be printed or saved to a file. If you click on the display plotdisplay, the Print menu will appear in the menu bar of XploRe. This menu offers three choices: "to Printer" prints the display directly on your default printer, "to Bitmap file ..." saves the display to a Windows Bitmap file, and "to PostScript file ..." saves the display to a PostScript file. The two latter menu items open a file manager box, where you can enter the file name. Here you see the resulting PostScript plot:
PostScript files can also be created by an XploRe instruction:

    print(plotdisplay,"Plot1.ps")

will save the display plotdisplay into the file Plot1.ps.