18.1 State-Space Models and Outliers


{X, Y} = kemitor2 (T, x0, H, F, ErrY, ErrX)
simulates observations and states of a time-invariant state-space model

For the definition and notation of a state-space model we refer to Härdle, Klinke, and Müller (2000, Section 10.1); we, too, will confine ourselves to discrete time; in particular we make the same assumptions on the vectors $ v_t$ and $ w_t$ and the initial state $ x_0$ given in equations (2) and (3) there.

For our purposes, however, the state is also of interest, so we modified the quantlet kemitor by adding an extra output X for it. This means that wherever you have used the command line

  y = kemitor(T, x0, H, F, ey, ex)
you may as well use
  erg = kemitor2(T, x0, H, F, ey, ex)
  y=erg.Y
and additionally, you get the simulated states as a $ T\times n$ matrix x, using
  x=erg.X

With these slight modifications, the examples for kemitor remain valid. An example that also uses the state simulations is presented in Subsection 18.1.2.
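For readers without XploRe at hand, the mechanics of kemitor2 can be sketched in Python. The recursion below is an assumption about its internals (row-wise error matrices, states started at $x_1 = F x_0 + w_1$, output returned as a pair rather than a list), not the quantlet itself:

```python
import numpy as np

def kemitor2(T, x0, H, F, err_y, err_x):
    """Simulate a time-invariant state-space model
         x_t = F x_{t-1} + w_t,   y_t = H x_t + v_t,
    with w_t, v_t taken row-wise from err_x, err_y.
    Returns (X, Y), each with T rows."""
    H, F = np.atleast_2d(H), np.atleast_2d(F)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    X, Y = [], []
    for t in range(T):
        x = F @ x + np.atleast_1d(err_x[t])        # state equation
        X.append(x)
        Y.append(H @ x + np.atleast_1d(err_y[t]))  # observation equation
    return np.array(X), np.array(Y)
```

With zero observation errors and unit state errors in the scalar steady state model, the state and observation paths are simply 1, 2, 3, ...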


18.1.1 Outliers and Robustness Problems


{X, Ind} = epscontnorm (T, eps, mid, Cid, mcont, Ccont, DirNorm)
simulates observations from an $ \varepsilon $-contaminated multivariate normal distribution

Complications for the state-space model and the classical Kalman filter arise if we allow for deviations from the model assumptions. In the i.i.d. situation, such deviations led to the development of robustness theory, for which we refer the reader e.g. to Huber (1981), Hampel et al. (1986), and Rieder (1994).

A common way to model such deviations are the $ \varepsilon $-contaminations introduced by Huber (1964). The considered variable $ X$ no longer always comes from a fixed distribution $ P^{\rm ideal}$, but rather from a neighborhood around this central/ideal distribution, i.e. $ X \sim P^{\rm real}$ with

$\displaystyle P^{\rm real}=(1-\varepsilon )P^{\rm ideal}+\varepsilon K$ (18.1)

for some arbitrary distribution $ K$.
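As a minimal Python sketch of drawing from (18.1) with a normal contaminating distribution (mirroring what epscontnorm does in the normal case; the function name and interface here are illustrative, not the quantlet's):

```python
import numpy as np

def eps_cont_normal(T, eps, m_id, C_id, m_cont, C_cont, rng=None):
    """Draw T samples from (1-eps)*N(m_id, C_id) + eps*N(m_cont, C_cont).

    Returns the samples X and an indicator Ind which is 1 where the
    contaminating distribution was used."""
    rng = np.random.default_rng(rng)
    ind = rng.random(T) < eps                      # Bernoulli(eps) mixture labels
    clean = rng.multivariate_normal(m_id, C_id, T)
    cont = rng.multivariate_normal(m_cont, C_cont, T)
    X = np.where(ind[:, None], cont, clean)        # pick row-wise by label
    return X, ind.astype(int)

# the parameters of Example 1 below
X, ind = eps_cont_normal(500, 0.1, [0, 0], [[2, 1], [1, 1]],
                         [3, 3], [[3, 0], [0, 0.2]], rng=0)
```

About 10% of the 500 rows then carry the indicator 1 and stem from the contaminating $ {N}_2((3,3)^\top,{\rm diag}(3,0.2))$.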

In our context we will consider only multivariate normal distributions as central distributions, i.e. $ P^{\rm ideal}={\cal N}_2(\mu_{\rm id},C_{\rm id})$. As contaminating distributions we allow for Dirac distributions, symmetric Dirac distributions, and normal distributions. This is done using the quantlet epscontnorm .


18.1.1.1 Example 1

How it works may be seen from the following example, where we generate $ 500$ observations from

$\displaystyle 0.9\, {N}_2
\left(\left(\begin{array}{c}0\\ 0
\end{array}\right), \left(\begin{array}{cc}2&1\\ 1&1\end{array}\right)\right)
+ 0.1\, {N}_2
\left(\left(\begin{array}{c}3\\ 3
\end{array}\right), \left(\begin{array}{cc}3&0\\ 0&0.2\end{array}\right)\right)
$

and then plot them, in green those coming from the ideal, in red those from the contaminating distribution, see Figure 18.1.

Figure 18.1: 500 observations from $ \varepsilon $-contaminated Normal distribution; ideal observations are green squares, contaminated observations are red triangles.
\includegraphics[scale=0.6]{kontbsp}

  library("xplore")
  library("plot")
  library("kalman")
  randomize(0)
  T = 500
  eps = 0.1
  mid=#(0,0)
  Cid = #(2,1)~#(1,1)
  mcont=#(3,3)
  Ccont = #(3,0)~#(0,0.2)
  erg=epscontnorm(T,eps,mid,Cid,mcont,Ccont,0)
  color=2*erg.Ind+2  
                       ; sets color to 2 (green) for "clean" data 
                       ; and 4 (red) for contaminated data
  data=erg.X
  setmaskp(data,color, 3, 8) 
  disp = createdisplay(1,1)
  show(disp,1,1,data)
XAGrkalm01.xpl

In the situation of state-space models deviations can have quite different effects, depending on where they enter. In the time-series context there is a common terminology due to Fox (1972) distinguishing between additive outliers (AO) and innovation outliers (IO).


18.1.1.2 Additive Outliers

AO's enter in the observation equation $ y_t=Hx_t+v_t$, that is, $ v_t$ is contaminated. As a consequence, individual observations will be erroneously large; note, however, that only a single observation is affected, as the error does not enter the state of the model.

An example of this can be seen in satellite navigation. The state of the system is the position of the satellite in three-dimensional space, of which ground control receives a possibly noisy measurement. An AO could be caused by a brief defect in the observation device.

So the task of the estimator is to down-weight the influence of large observations on the state estimation. This is what our robust Kalman filters rLS and rIC are designed for.


18.1.1.3 Innovation Outliers

Contrary to AO's, IO's enter in the state equation $ x_t=Fx_{t-1}+w_t$, that is, $ w_t$ is contaminated. Here an outlier also affects all subsequent states and hence the observations, too.

To return to the satellite example, an IO could be caused by an asteroid hitting the satellite and thereby deflecting it from its ``should-be'' track.

This time the task of the estimator is to recognize the change and to adapt itself as fast as possible to the new situation. Of course, down-weighting the observations' influence makes detecting state deviations even harder, so simultaneous treatment of AO's and IO's will in general be less effective than treating either of the two alone. Moreover, for detecting a deviation from the ``should-be'' track, a single observation plus the state estimate based on the observations up to then is generally not sufficient. For this task it is better to drop strict recursivity and to base the estimation on the last $ p$ observations and the smoothed/filtered/predicted values $ x_{t-i\vert t}$, $ i=-1,\ldots,p-2$. Simultaneous robust smoothers/filters designed for this purpose have not yet been implemented in XploRe.


18.1.1.4 Other Types of Outliers

This distinction between AO and IO is by no means the only possible one; other types have been considered, such as patchy outliers (PO) and substitutive outliers (SO), which will be mentioned later in this chapter.


18.1.2 Examples of AO's and IO's in XploRe

To give you an impression of how AO's and IO's affect data in state space models, we have simulated data from $ 10\%$-AO/IO-contaminated, normal setups.


18.1.2.1 Example 2

We realize AO-contamination by simulating $ v_t$ coming from an $ \varepsilon $-contaminated $ {N}_m(0,Q)$. An example of the effects of an AO-contaminated model is generated by the following XploRe instructions using the quantlets epscontnorm and kemitor2 , which produce a simulation of length $ 100$ of a steady state model (i.e. $ H=R=Q=F=1$) under a convexly contaminated $ v_t$, with contamination radius $ 0.1$ and $ K={N}(10,0.1)$.

First we set the system parameters:

  library("xplore")
  library("plot")
  library("kalman")
  randomize(0)
  T = 100
  mu=0
  H = 1
  F = 1
  mid=0
  Cid=1
  mcont=10
  Ccont=0.1
  eps=0.1
Then we simulate data from this situation.
  ErrX = normal(T)
  ErrY = epscontnorm(T,eps,mid,Cid,mcont,Ccont,0)
  sim = kemitor2(T,mu,H,F,ErrY,ErrX)

  state = (1:100)~(sim.X)
  obs= (1:100)~(sim.Y)
  ind = ErrY.Ind
Flags are set for the instances where we have contamination.
  ind=(1:100)~ind
  ind=paf(ind, ind[,2]==1)
Finally we plot the path of the state (blue) and the corresponding observations (green) and set red flags at the instances where we have contamination, see Figure 18.2.
  setmaskp(ind,4, 3, 8)  
  state=setmask(state,"line","blue","thin")          
  obs=setmask(obs,"line","green","thin")          
  disp = createdisplay(1,1)
  show(disp,1,1,state,obs,ind)
  setgopt(disp,1,1, "title", "1-dim Steady State Model under AO")
  setgopt(disp,1,1, "xlabel", "t") 
  setgopt(disp,1,1, "ylabel", "x, y")
XAGrkalm02.xpl


18.1.2.2 Example 3

IO-contamination is realized by simulating $ w_t$ coming from an $ \varepsilon $-contaminated $ {N}_n(0,R)$. The effects of an IO-contaminated model may be seen by simply interchanging the roles of $ w_t$ and $ v_t$ in the previous example. The following instructions generate a simulation of length $ 100$ of a steady state model (i.e. $ H=R=Q=F=1$) under a convexly contaminated $ w_t$, with radius $ 0.1$ and $ K={N}(10,0.1)$.

  ErrY = normal(T)
  ErrX = epscontnorm(T,eps,mid,Cid,mcont,Ccont,0)
  sim = kemitor2(T,mu,H,F,ErrY,ErrX)

  state = (1:100)~(sim.X)
  obs= (1:100)~(sim.Y)
  ind = ErrX.Ind
XAGrkalm03.xpl
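The qualitative difference between the two examples can also be reproduced outside XploRe. The following Python sketch (all names illustrative) contaminates the observation errors for the AO case and the state errors for the IO case of the scalar steady state model:

```python
import numpy as np

rng = np.random.default_rng(0)
T, eps = 100, 0.1

def contaminated(rng, T, eps, shift=10.0, sd=np.sqrt(0.1)):
    """Scalar eps-contaminated errors: N(0,1) vs. N(shift, 0.1)."""
    ind = rng.random(T) < eps
    return np.where(ind, rng.normal(shift, sd, T), rng.normal(0, 1, T)), ind

# AO: contaminate the observation errors v_t of y_t = x_t + v_t
w = rng.normal(size=T)
v, ind_ao = contaminated(rng, T, eps)
x = np.cumsum(w)              # steady state model: x_t = x_{t-1} + w_t
y_ao = x + v                  # isolated spikes where ind_ao is set

# IO: contaminate the state errors w_t of x_t = x_{t-1} + w_t
w_io, ind_io = contaminated(rng, T, eps)
x_io = np.cumsum(w_io)        # each IO shifts all later states: level change
y_io = x_io + rng.normal(size=T)
```

Plotting `y_ao` against `x` shows single peaks at the contaminated instances, while `x_io` (and with it `y_io`) exhibits level changes, as in Figure 18.2.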

Figure 18.2: Examples 2 and 3 displayed simultaneously: AO's cause single peaks, while IO's result in a level change
\includegraphics[scale=0.65]{simAOIO}


18.1.3 Problem Setup

To summarize, we want to solve the following problem. In a given ideal normal state-space model, that is, $ x_0\sim {N}_n(\mu,\Sigma)$, $ v_t\sim {N}_m(0,Q)$, $ w_t\sim {N}_n(0,R)$, all stochastically independent, with $ F$, $ H$, $ Q$, $ R$ known and of ``correct'' dimensions, we want to find recursive estimates for $ x_t$ based on $ y_{t}$ and a preliminary estimate $ x_{t\vert t-1}$ for $ x_t$ based on all observations $ y_{t-i}$, $ i=1, \ldots,t-1$. The quality of this estimator is measured in terms of the mean squared error (MSE) $ \mathop{\rm {{}E{}}}\nolimits [\vert x_t-f(y_t,x_{t\vert t-1})\vert^2]$. For robustness reasons we want the influence of $ y_t$ on $ f(y_t,x_{t\vert t-1})$ to be bounded, in order to protect the estimate against AO's.
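For reference, the classical recursion that attains the optimal MSE in the ideal model can be sketched as follows (scalar version, using the chapter's convention $ v_t\sim N(0,Q)$, $ w_t\sim N(0,R)$; the function name is illustrative). Note that the correction term is linear in $ y_t$, so the influence of $ y_t$ is unbounded:

```python
import numpy as np

def kalman_steady(y, mu=0.0, Sigma=1.0, F=1.0, H=1.0, Q=1.0, R=1.0):
    """Classical Kalman filter for the scalar model
         x_t = F x_{t-1} + w_t,  w_t ~ N(0, R),
         y_t = H x_t + v_t,      v_t ~ N(0, Q),
    with x_0 ~ N(mu, Sigma).  Returns the filtered estimates x_{t|t}."""
    x, P = mu, Sigma
    out = []
    for yt in y:
        x, P = F * x, F * P * F + R     # prediction step: x_{t|t-1}
        S = H * P * H + Q               # innovation variance
        K = P * H / S                   # Kalman gain
        x = x + K * (yt - H * x)        # correction: linear in y_t, hence
        P = (1.0 - K * H) * P           # of unbounded influence
        out.append(x)
    return np.array(out)
```

The robust filters of this chapter replace the linear correction by a bounded function of the innovation $ y_t - Hx_{t\vert t-1}$.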

Of course, we pay a price for this robustness: in general we cannot achieve the optimal MSE, which is attained by the classical Kalman filter $ x_{t\vert t}^0$. This price is measured by the relative efficiency loss $ [{\rm MSE}(f)-{\rm MSE}(x_{t\vert t}^0)]/{\rm MSE}(x_{t\vert t}^0)$.
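As a numerical illustration of this efficiency loss, the following Monte Carlo sketch compares the classical filter with a crude bounded-influence variant that simply clips the innovation at $ \pm b$. The clipping bound $ b=1$ is an arbitrary choice for illustration; this is a stand-in for a robust filter, not the rLS or rIC quantlets themselves:

```python
import numpy as np

# Monte Carlo sketch of the relative efficiency loss
# [MSE(f) - MSE(x^0)] / MSE(x^0) in the ideal steady state model.
rng = np.random.default_rng(1)
T, n_rep, b = 50, 500, 1.0          # b: arbitrary clipping bound
se_class = 0.0
se_rob = 0.0
for _ in range(n_rep):
    w = rng.normal(size=T)          # state errors w_t ~ N(0,1)
    v = rng.normal(size=T)          # observation errors v_t ~ N(0,1)
    x = np.cumsum(w)                # x_t = x_{t-1} + w_t (steady state)
    y = x + v                       # y_t = x_t + v_t
    xc = xr = 0.0
    P = Pr = 1.0
    for t in range(T):
        P += 1.0                    # prediction variance (F = R = 1)
        K = P / (P + 1.0)           # classical gain (H = Q = 1)
        xc += K * (y[t] - xc)       # classical: linear correction
        P *= 1.0 - K
        Pr += 1.0
        Kr = Pr / (Pr + 1.0)
        xr += Kr * np.clip(y[t] - xr, -b, b)   # clipped correction
        Pr *= 1.0 - Kr
    se_class += (xc - x[-1]) ** 2
    se_rob += (xr - x[-1]) ** 2
rel_loss = (se_rob - se_class) / se_class      # > 0: price of robustness
```

In the ideal model the clipped filter loses efficiency (rel_loss is positive); the point of the robust filters is that this loss is repaid under AO-contamination.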