11.3 Estimating a DPLS-Model


11.3.1 The Computer Program DPLS

The very sophisticated computer program LVPLS for static partial least squares models was developed by Lohmöller (1984, 1989). Unfortunately, it cannot be applied effectively to dynamic path models.

The computer program DPLS (Strohe and Geppert; 1997) is available both as PC-ISP macros and as XploRe (Härdle, Klinke, and Turlach; 1995) quantlets. The syntax that controls the program is nearly identical in both versions. In XploRe it consists of three basic modules. The first one runs the DPLS algorithm within the XploRe environment, the second one calculates a redundancy measure, and the last one is a tool that simplifies creating the input variables. This tool is based on several menus and dialogs, so it is not necessary to know all input matrices exactly in advance. The only input that has to be prepared with ordinary XploRe commands is the indicator matrix.

The remaining input matrices must first be created and defined, either with the tool mentioned above (the third basic module) or ``by hand''. However, a user who wants to estimate a model several times in succession and to modify parameters on the fly, for example in a programmed simulation, benefits from calling the basic programs directly. In that case one has to know more about the details of the input and output of the corresponding module.


11.3.2 Creating Design-Matrices


design = makedesign (y)
generates the design matrices by a dialogue

To run a DPLS session, XploRe has to be started first. Then the quantlib metrics must be loaded into memory with the library command. XploRe users have to type:

 library("metrics")

A design-making session (Figure 11.1) is started by the command mentioned above, where y is an $ n \times l$ matrix of manifest variables (indicators). The output design contains several variables. Among them, dy is the $ l \times k$ matrix of outer designs (entries 0 or 1), whose rows correspond to the manifest variables. Furthermore, d is the $ k \times k$ matrix of inner unlagged designs (0 or 1); nonzero diagonal entries are not allowed. The $ k \times k$ matrix dl contains the inner lagged designs (0 or 1); its diagonal elements indicate autoregression. The matrix w holds the start weights and has the same dimensions as dy. It is simply a copy of dy.

Figure 11.1: A session for creating design matrices in XploRe
\includegraphics[scale=0.6]{dplsfig1}

These variables can be addressed through the container variable, with a point separating the name of the container from the name of the variable. In the case above:

  design.dy
  design.d
  design.dl
  design.w
The syntax of makedesign is shown below:

Input parameters:

y
$ n \times l$ matrix, manifest variables (indicators)

Output parameters:

out.dy
$ l \times k$ matrix, outer designs (0 or 1, rows correspond to manifest variables)
out.d
$ k \times k$ matrix, inner unlagged designs (0 or 1, no diagonals allowed)
out.dl
$ k \times k$ matrix, inner lagged designs (0 or 1, diagonal elements indicate autoregression)
out.w
$ l \times k$ matrix, start weights, same as dy
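
The constraints listed above can be made concrete with a small consistency check. The following is only a sketch in Python, not part of XploRe or of the metrics quantlib: the helper name check_design and the toy model with 7 indicators and 2 latent variables are assumptions chosen for illustration.

```python
# Hypothetical helper (not an XploRe quantlet): verifies the constraints
# stated for the design matrices dy, d, dl and the start weights w.

def has_shape(m, r, c):
    """True if m is a list of r rows with c entries each."""
    return len(m) == r and all(len(row) == c for row in m)

def is_binary(m):
    """True if every entry of m is 0 or 1."""
    return all(v in (0, 1) for row in m for v in row)

def check_design(dy, d, dl, w):
    l = len(dy)       # number of manifest variables (rows of dy)
    k = len(dy[0])    # number of latent variables (columns of dy)
    if not (has_shape(dy, l, k) and has_shape(w, l, k)):
        return False
    if not (has_shape(d, k, k) and has_shape(dl, k, k)):
        return False
    if not (is_binary(dy) and is_binary(d) and is_binary(dl)):
        return False
    # The unlagged inner design must have an empty diagonal;
    # dl may have diagonal entries (autoregression).
    return all(d[i][i] == 0 for i in range(k))

# Toy model (assumed): indicators 1-4 belong to the first latent variable,
# indicators 5-7 to the second; the first variable affects the second.
dy = [[1, 0] for _ in range(4)] + [[0, 1] for _ in range(3)]
w = [row[:] for row in dy]      # start weights: a copy of dy
d = [[0, 0], [1, 0]]            # unlagged path: variable 1 -> variable 2
dl = [[0, 0], [1, 0]]           # lagged path:   variable 1 -> variable 2

print(check_design(dy, d, dl, w))
```

A check of this kind is useful mainly when the matrices are typed in by hand rather than produced by the dialogue.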

Besides that, it is possible to create all matrices by hand. A simple example is shown below:

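  library("metrics")
  randomize(13409)
  b1=0.3                  ; unlagged path coefficient
  c1=0.6                  ; lagged path coefficient
  s=500                   ; number of observations
  n1=normal(s+1)
  n1lag=n1[1:s,]          ; first latent variable, lagged by one period
  n1=n1[2:rows(n1),]
  ; inner model
  n2=b1*n1+c1*n1lag+normal(rows(n1))/5
  n=n1~n2
  nn=n./sqrt(var(n))
  ; loadings matrix
  p=(1|2|3|4|0|0|0)~(0|0|0|0|5|6|7)
  y=nn*p'+normal(rows(n),rows(p))/8
  d=(0|1)~(0|0)           ; unlagged path from n1 to n2
  dl=(0|1)~(0|0)          ; lagged path from n1 to n2
  w=(1|1|1|1|0|0|0)~(0|0|0|0|1|1|1)
  ; w is passed twice: as start weights and as outer design dy
  myfit=dpls(w,d,w,dl,y,1,3)
  myfit.b
  myfit.sk
  myfit.skl
XAGdpls01.xpl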
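
For readers without access to XploRe, the data-generating part of the example can be mirrored in Python. This is a sketch only: it reproduces the simulation of the latent variables and indicators, not the dpls estimation itself. The random numbers differ from those produced by randomize(13409) in XploRe, and the use of the sample standard deviation for scaling is an assumption about how var behaves.

```python
import random
import statistics

random.seed(13409)   # fixed seed; the stream differs from XploRe's

b1, c1, s = 0.3, 0.6, 500

# First latent variable and its one-period lag
raw = [random.gauss(0, 1) for _ in range(s + 1)]
n1lag = raw[:s]      # value at time t-1
n1 = raw[1:]         # value at time t

# Inner model: n2 depends on n1 (unlagged) and on n1lag (lagged)
n2 = [b1 * a + c1 * b + random.gauss(0, 1) / 5 for a, b in zip(n1, n1lag)]

def scale(x):
    """Divide by the sample standard deviation (no centering)."""
    sd = statistics.stdev(x)
    return [v / sd for v in x]

# s rows, 2 columns: the scaled latent variables
nn = list(zip(scale(n1), scale(n2)))

# Loadings matrix p: 7 indicators (rows) by 2 latent variables (columns)
p = [(1, 0), (2, 0), (3, 0), (4, 0), (0, 5), (0, 6), (0, 7)]

# Indicators: y = nn * p' + noise / 8, giving s rows and 7 columns
y = [[row[0] * pj[0] + row[1] * pj[1] + random.gauss(0, 1) / 8 for pj in p]
     for row in nn]

print(len(y), len(y[0]))
```

The resulting y has one row per observation and one column per indicator, which is the layout the input description of dpls asks for.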


11.3.3 Estimating with DPLS


myfit = dpls (w, d, dy, dl, y, lag, acc)
estimates the weights and loadings matrices

As shown above, a common XploRe session of DPLS (Figure 11.2) is started by the command described here. In it, dpls denotes the program used and w holds the weights the algorithm starts with. The term d denotes the inner and dy the outer design matrix. Furthermore, dl is also an inner design matrix, but it contains the lagged connections in the model. The matrices d, dy and dl represent our idea of the model structure. The scalar lag determines the lag order the algorithm has to take into account. The matrix y contains the time series of all indicators (manifest variables) and thus represents virtually all the empirical information available. The scalar acc stands for ``accuracy'' and controls when the iteration process stops. After every iteration, the quantlet dpls uses this number to check whether the newly calculated values differ significantly from the previously calculated ones, with acc specifying how many decimals are taken into consideration.
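
The role of acc can be illustrated with a small sketch. The exact comparison rule used inside dpls is not documented here, so the code below assumes one plausible reading: two successive sets of values count as equal when they differ by less than $ 10^{-acc}$ in every component. The function names and the toy update step (a Newton iteration for $ \sqrt{2}$) are hypothetical and only stand in for one DPLS iteration.

```python
def converged(old, new, acc):
    """True if old and new agree to `acc` decimals (assumed rule:
    absolute difference below 10**(-acc) in every component)."""
    tol = 10.0 ** (-acc)
    return all(abs(a - b) < tol for a, b in zip(old, new))

def iterate_until_stable(step, start, acc, max_iter=1000):
    """Apply `step` repeatedly until successive results agree to
    `acc` decimals; returns the last values and the iteration count."""
    old = start
    for it in range(1, max_iter + 1):
        new = step(old)
        if converged(old, new, acc):
            return new, it
        old = new
    return old, max_iter

# Toy update step standing in for one iteration of the algorithm
def toy_step(values):
    return [0.5 * (x + 2.0 / x) for x in values]

values_lo, iters_lo = iterate_until_stable(toy_step, [1.0], acc=1)
values_hi, iters_hi = iterate_until_stable(toy_step, [1.0], acc=3)
print(iters_lo, iters_hi)
```

Under this reading, a larger acc can only increase the number of iterations, which matches the behaviour one would expect from the reported myfit.iter.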

The syntax of the quantlet dpls has the following structure:

Input parameters:

w
$ l \times k$ matrix, start weights
d
$ k \times k$ matrix, inner unlagged designs (0 or 1, no diagonal values allowed)
dy
$ l \times k$ matrix, outer designs (0 or 1, rows correspond to manifest variables)
dl
$ k \times k$ matrix, inner lagged designs (0 or 1, diagonal elements indicate autoregression)
y
$ n \times l$ matrix, manifest variables (indicators)
lag
scalar, lag order
acc
scalar, stopping criterion (number of decimals compared between iterations)

Output parameters:

myfit.wg
matrix, weights
myfit.b
matrix, loadings
myfit.sk
matrix, path coefficients
myfit.skl
matrix, lagged path coefficients
myfit.lk
matrix, latent variables
myfit.iter
scalar, number of iterations

The matrices myfit.wg and myfit.b correspond to $ W$ and $ P$ in (11.2) and (11.3), respectively, and myfit.sk and myfit.skl are the $ B$ and $ C$ in (11.5).

All matrices are structured in a comparable way. In the design matrices, the rows correspond to variables. Take as an example the following model, which contains 41 manifest variables with 74 observations and 5 latent variables. The data matrix y is an $ n \times l$ matrix, with observations in rows and indicators in columns, and therefore has the shape $ 74\times 41$. The design matrix dy, which records the connections between every manifest and every latent variable (``1'' for a connection and ``0'' for none), must have the shape $ 41\times 5$. The matrix w has the same structure, except that the positions marked ``1'' may hold any other value that the algorithm should use as a starting weight.

The matrices d and dl are square; both their rows and their columns stand for latent variables. Since the following model is designed with all connections leading to the fifth variable, all rows are filled with zeros except the last. This row describes which of the variables are connected to the fifth variable. The same logic applies when reading the output variable sk, which contains the unlagged path coefficients in the first part followed by the lagged ones.
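
The row-wise reading of an inner design matrix can be sketched as follows. The helper name predecessors is hypothetical, and the concrete matrix is an assumption: the text only says that all connections lead to the fifth variable, so the sketch supposes that each of the other four variables has an unlagged path into it.

```python
def predecessors(design, j):
    """Indices (0-based) of the latent variables that have a path
    into latent variable j, read from row j of the design matrix."""
    return [i for i, flag in enumerate(design[j]) if flag == 1]

k = 5  # number of latent variables

# Assumed inner design: only the last row is filled,
# i.e. variables 1 to 4 all point to variable 5.
d = [[0] * k for _ in range(k)]
d[k - 1] = [1, 1, 1, 1, 0]

for j in range(k):
    print("variable", j + 1, "<-", [i + 1 for i in predecessors(d, j)])
```

The same function applies unchanged to dl, where an entry on the diagonal would mean that a variable depends on its own lagged value.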

Figure 11.2: DPLS session in XploRe
\includegraphics[scale=0.6]{dplsfig2}


11.3.4 Measuring the Forecasting Validity


myredun = redun (b, sk, lk, skl, y)
estimates the redundancy matrices

The last tool for a DPLS session is the module redun. Based on formula (11.27), it calculates the redundancy, a criterion for evaluating the goodness of fit of the model. The overview below briefly shows how this module is used.

The syntax of redun is shown below:

Input parameters:

b
matrix, loadings
sk
matrix, path coefficients
lk
matrix, latent variables
skl
matrix, lagged path coefficients
y
matrix, manifest variables (indicators)

Output parameters:

red
scalar, redundancy value
redm
vector, redundancy values