15.1 Reading and Writing Data Files


x = read ("file")
reads numeric data from the file file.dat into a matrix x; each column of the file is interpreted as a column vector of x
x = readm ("file")
reads mixed text and numeric data from the file file.dat into a list x, which holds the text and numeric matrices separately
write (x, "file.dat" {,mode {,format}})
writes text or numeric arrays to the file file.dat

This section introduces reading and writing simple ASCII data sets. All XploRe code for this section can be found in the quantlet XLGiofmt01.xpl.

The function read reads numeric data from a file. Each column of the file is interpreted as a column vector of the resulting matrix. Consider the data file geo.dat, which has the following contents:

  86     5
  87    14
  88    24
  89    64
  90   100
  91   129
  92   200
  93   176
  94   177
We can read this data file and store it in the matrix x with the following instruction (note that the file name is given without the extension .dat):
  x=read("geo")
Typing just x at the command line will show the contents of x as follows:
  Contents of x
  [1,]       86        5 
  [2,]       87       14 
  [3,]       88       24 
  [4,]       89       64 
  [5,]       90      100 
  [6,]       91      129 
  [7,]       92      200 
  [8,]       93      176 
  [9,]       94      177
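As a small sketch of how the matrix can be used after reading, the columns of x can be picked out by index. The names c1 and c2 are chosen here for illustration only, and the sketch assumes that a semicolon starts a comment in your XploRe version:
  x=read("geo")    ; reads geo.dat into a matrix with 9 rows and 2 columns
  c1=x[,1]         ; first column of the file (86, 87, ..., 94)
  c2=x[,2]         ; second column of the file (5, 14, ..., 177)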
Data files often contain mixed text and numeric columns, as in popul.dat:
  778 New_York
  355 Chicago
  248 Los_Angeles
  200 Philadelphia
  167 Detroit
   94 Baltimore
   94 Houston
   88 Cleveland
   76 Washington_DC
   75 Saint_Louis
   74 Milwaukee
   74 San_Francisco
   70 Boston
   68 Dallas
   63 New_Orleans
which stores population data about US cities. Loading this data file with read would give missing values (NaN) in the second column, since read cannot decode the text strings. This file should be read with readm instead. Since readm is part of the xplore library, we load this library first:
  library("xplore") 
  x=readm("popul")
This creates a list x which consists of three components. The instruction names(x) shows
  Contents of names
  [1,] "type"
  [2,] "double"
  [3,] "text"
in the output window. The first component, the vector x.type, indicates the type of each column in the original data file:
  Contents of type
  [1,]        0 
  [2,]        1
Here, 0 stands for numeric and 1 for text. The second and third components contain the data: x.double is the numeric column, x.text is the text column.
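As a brief sketch, the three components can be inspected one at a time by typing their names, in the same way that typing x shows a matrix. Only the calls introduced above are used; the comments (assuming that a semicolon starts a comment) describe what each component holds for popul.dat:
  library("xplore")
  x=readm("popul")
  x.type      ; 0 for the numeric column, 1 for the text column
  x.double    ; the numeric column (778, 355, ...)
  x.text      ; the text column (New_York, Chicago, ...)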

In addition to read and readm, XploRe provides the command readascii to read any type of ASCII data. readm is actually based on readascii. We will have a closer look at readascii in the next section when we discuss input format strings.

To write data to a data file, the function write can be used. write can save both numeric and text data. Consider the following example:

  x=(1:4)~(2:5)
  write(x,"mydata.dat")
This writes the matrix x to the data file mydata.dat, which then has the following contents:
  1.000000 2.000000
  2.000000 3.000000
  3.000000 4.000000
  4.000000 5.000000
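Since mydata.dat is a plain numeric ASCII file, it should be possible to read it back with read. The following is a minimal sketch of such a round trip; the name y is chosen for illustration, and the sketch assumes that read locates mydata.dat in the same directory that write used:
  x=(1:4)~(2:5)
  write(x,"mydata.dat")   ; the full file name, including .dat, is given here
  y=read("mydata")        ; read is called without the extension, as for geo.dat
  y                       ; shows the contents of y, which should match x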
Note that write optionally accepts a writing mode and a format string to produce formatted output. The possible modes and format strings will be explained in detail when we come to string output, see Section 15.3.