18.4 Robustified Regression: the rIC filter


{filtX, KG, PreviousPs, clipInd} = rICfil(y, mu, Sig, H, F, Q, R, cliptyp, As, bs)
calculates the rIC filter for a (multivariate) time series
{A, b, ctrl} = calibrIC(T, Sig, H, F, Q, R, typ, A0, b0, e, N, eps, itmax, expl, fact0, aus)
calibrates the rIC filter to a given efficiency loss in the ideal model.
The idea of treating the filter problem as a ``regression'' problem stems from Boncelet and Dickinson (1984) and Cipra and Romera (1991). We write ``regression'' in quotes because the parameter in this model is stochastic, whereas one component of the observation is deterministic, so this regression is not covered by the robustness theory for common regression. The robustification, however, uses techniques of optimal influence curves for regression, to be found in chapter VII of Rieder (1994): instead of taking M-estimators to get a robust regression estimator with a prescribed influence curve, we use a one-step procedure corresponding to a Hampel-Krasker influence curve.


18.4.1 Filtering Problem as a Regression Problem

The model suggested by Boncelet and Dickinson (1984) and Cipra and Romera (1991) uses the innovational representation (18.2) and reads

$\displaystyle \left(\!\!\begin{array}{c}x^0_{t\vert t-1}\\ y_t\end{array}\!\!\right)\sim{\cal N}\left(\left(\!\!\begin{array}{c}{\rm\bf I}_n\\ H\end{array}\!\!\right)x_t,\;\left(\!\!\begin{array}{cc}\Sigma_{t\vert t-1}&0\\ 0&Q \end{array}\!\!\right)\right),$ (18.9)

where $ x^0_{t\vert t-1}$ denotes the classical Kalman filter, which in this procedure has to be calculated alongside, too. As already indicated, assuming normality, the classical Kalman filter is the ML estimator for this model, with scores function

$\displaystyle \Lambda_x\left(\!\!\begin{array}{c}x^0_{t\vert t-1}\\ y_t\end{array}\!\!\right)=\Lambda\left(\!\!\begin{array}{c}x^0_{t\vert t-1}-x\\ y_t-Hx\end{array}\!\!\right)$

with

$\displaystyle \Lambda\left(\!\!\begin{array}{c}s\\ u\end{array}\!\!\right):=\Lambda_1+\Lambda_2=\Sigma_{t\vert t-1}^{-1}s+H'Q^{-1}u$ (18.10)

and Fisher information

$\displaystyle {\cal I}=\mathop{\rm {{}E{}}}\nolimits [ \Lambda\Lambda']=\Sigma_{t\vert t}^{-1}.$ (18.11)
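Since $ \Lambda_1$ and $ \Lambda_2$ are independent, (18.11) amounts to the information-form identity $ \Sigma_{t\vert t}^{-1}=\Sigma_{t\vert t-1}^{-1}+H'Q^{-1}H$. As a minimal numerical sanity check of this, here is a Python/numpy sketch (not XploRe; the matrices are made up for illustration):

```python
import numpy as np

# Illustrative (made-up) matrices; Sig_pred plays the role of Sigma_{t|t-1},
# Q is the observation error covariance as in model (18.9).
n = 2
Sig_pred = np.array([[2.0, 0.5],
                     [0.5, 1.0]])
H = np.array([[1.0, 0.0],
              [0.3, 1.0]])
Q = np.array([[0.8, 0.0],
              [0.0, 0.5]])

# Fisher information (18.11): since Lambda_1, Lambda_2 are independent,
# I = E[Lambda Lambda'] = Sigma_{t|t-1}^{-1} + H' Q^{-1} H
I_fisher = np.linalg.inv(Sig_pred) + H.T @ np.linalg.inv(Q) @ H

# Sigma_{t|t} from the classical Kalman update: (I - K H) Sigma_{t|t-1}
K = Sig_pred @ H.T @ np.linalg.inv(H @ Sig_pred @ H.T + Q)
Sig_filt = (np.eye(n) - K @ H) @ Sig_pred

# The inverse Fisher information agrees with the filter error covariance
print(np.allclose(np.linalg.inv(I_fisher), Sig_filt))
```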


18.4.2 Robust Regression Estimates

To get a robust estimator for model (18.9) we use an influence curve (IC) of Hampel-Krasker form:

$\displaystyle \psi(s,u)=A \Lambda(s,u) \min\{1,\frac{b}{\Vert A\Lambda(s,u)\Vert}\}$ (18.12)

where $ A$ is a Lagrange multiplier guaranteeing that the correlation condition $ \mathop{\rm {{}E{}}}\nolimits [\psi\Lambda']={\rm\bf I}_n$ is fulfilled; due to the symmetry of $ {\cal L}(\Lambda)$, a $ \psi$ of form (18.12) is centered automatically; furthermore, $ b$ bounds the influence of $ y_t$ on $ x_{t\vert t}$.

The reader not familiar with the notion of an influence curve may refer to section 18.5 and look up some of the references given there.

Instead of replacing the ML equation by an M-equation $ \psi\stackrel{!}{=}0$ to be solved, we use a one-step estimator with the same asymptotic properties:

$\displaystyle x_{t\vert t}=x_{t\vert t-1}+\psi_{x_{t\vert t-1}}\left(\!\!\begin{array}{c}x^0_{t\vert t-1}\\ y_t\end{array}\!\!\right)=$ (18.13)

$\displaystyle \phantom{x_{t\vert t}}=x_{t\vert t-1}+\psi\left(\!\!\begin{array}{c}x^0_{t\vert t-1}-x_{t\vert t-1}\\ y_t-Hx_{t\vert t-1}\end{array}\!\!\right).$ (18.14)
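For illustration, the Hampel-Krasker IC (18.12) and the one-step update (18.13)-(18.14) can be sketched in a few lines of Python/numpy (not XploRe; the function names are ours, not the quantlib's):

```python
import numpy as np

def hampel_krasker_psi(s, u, A, b, Sig_pred_inv, H, Q_inv):
    """Hampel-Krasker IC (18.12): A*Lambda, clipped to norm b (simultaneous)."""
    Lam = Sig_pred_inv @ s + H.T @ Q_inv @ u          # scores (18.10)
    v = A @ Lam
    nv = np.linalg.norm(v)
    return v * min(1.0, b / nv) if nv > 0 else v

def rIC_step(x_pred, x0_pred, y, A, b, Sig_pred, H, Q):
    """One-step rIC update (18.13)-(18.14); x0_pred is the classical
    Kalman filter value x^0_{t|t-1}, computed in parallel."""
    s = x0_pred - x_pred
    u = y - H @ x_pred
    psi = hampel_krasker_psi(s, u, A, b,
                             np.linalg.inv(Sig_pred), H, np.linalg.inv(Q))
    return x_pred + psi
```

Note that with $ b=\infty$, $ A=\Sigma_{t\vert t}$ and the robust prediction coinciding with the classical one, the correction term reduces to the classical Kalman update $ K(y_t-Hx_{t\vert t-1})$, since $ \Sigma_{t\vert t}H'Q^{-1}$ equals the Kalman gain.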

We note that, just as for the rLS filter, the rIC filter preserves the crucial features of the classical Kalman filter.


18.4.3 Variants: Separate Clipping

As already seen in (18.10), $ \Lambda$ is decomposed into a sum of two independent variables $ \Lambda_1$ and $ \Lambda_2$. They may be interpreted as estimating $ v_t$ and $ \xi_t$, and thus each of them represents, in some sense, the point sensitive to one of the two outlier types, IO's and AO's. Instead of clipping both summands simultaneously, clipping only the ``IO-part'' $ \Lambda_1$ or only the ``AO-part'' $ \Lambda_2$ will therefore lead to a robustified version specialized to IO's or AO's, respectively. For the AO-specialization we get

$\displaystyle \tilde \psi(s,u)=A \left(\Lambda_1+\Lambda_2(s,u) \min\left\{1,\frac{b}{\Vert A\Lambda_2(s,u)\Vert}\right\}\right)$ (18.15)

As to the IO-variant, we have to admit that the robustification against IO's that is possible with rIC-sep-IO is limited; here one should rather take into account more information on the process history up to that point. Encouraging results, however, have been obtained in a situation with an unfavorable signal-to-noise ratio, which in one dimension is the quotient $ R/Q$.
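For illustration, (18.15) translates into a few lines of Python/numpy (the function name is ours, not the quantlib's): only the AO-part $ \Lambda_2$ is clipped, so the influence of an outlying $ y_t$ on the correction is bounded by $ b$ while the IO-part passes unclipped.

```python
import numpy as np

def psi_sep_AO(s, u, A, b, Sig_pred_inv, H, Q_inv):
    """rIC-sep-AO variant (18.15): only the AO-part Lambda_2 is clipped;
    the IO-part Lambda_1 enters unclipped."""
    Lam1 = Sig_pred_inv @ s          # "IO-part"
    Lam2 = H.T @ Q_inv @ u           # "AO-part", driven by the observation y_t
    v2 = A @ Lam2
    n2 = np.linalg.norm(v2)
    w = min(1.0, b / n2) if n2 > 0 else 1.0
    return A @ Lam1 + v2 * w         # = A (Lambda_1 + Lambda_2 * w)
```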


18.4.4 Criterion for the Choice of $ b$

As in the rLS case, we propose the assurance criterion for the choice of $ b$: we adjust the procedure to a given relative efficiency loss in the ideal model. This loss is quantified as the relative degradation of the ``asymptotic'' variance of our estimator, which in our situation is just $ \mathop{\rm {{}E{}}}\nolimits \Vert\psi\Vert^2$ and which, in the ideal model, again gives the MSE.

Of course, the lower the clipping bound $ b$, the higher the relative efficiency loss, so that we may solve

$\displaystyle \mathop{\rm {{}E{}}}\nolimits \Vert\psi\Vert^2 \stackrel{!}{=}(1+\delta) \mathop{\rm tr}\nolimits \Sigma_{t\vert t}=(1+\delta) \mathop{\rm {{}E{}}}\nolimits \Vert\hat\psi\Vert^2$ (18.16)

in $ b$ for a given efficiency loss $ \delta$; the solution $ b$ is monotone in $ \delta$.
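The quantlet calibrIC solves (18.16) by a simultaneous Newton procedure. Purely for illustration, here is a crude one-dimensional Monte-Carlo sketch in Python/numpy (all names are ours) that exploits the monotonicity: bisection in $ b$, with a fixed-point iteration enforcing the consistency condition $ \mathop{\rm {{}E{}}}\nolimits [\psi\Lambda]=1$ for $ A$ at each candidate $ b$.

```python
import numpy as np

def calibrate_b_1d(fisher_info, delta, n_mc=50_000, seed=1):
    """Crude 1-D Monte-Carlo calibration of (A, b) for criterion (18.16):
    E[psi * Lambda] = 1 (consistency) and E[psi^2] = (1 + delta) / fisher_info.
    This is only a sketch; calibrIC uses a simultaneous Newton procedure."""
    rng = np.random.default_rng(seed)
    Lam = rng.normal(0.0, np.sqrt(fisher_info), n_mc)   # L(Lambda) = N(0, I)

    def solve_A(b):
        A = 1.0 / fisher_info                            # classical value
        for _ in range(40):
            w = np.minimum(1.0, b / np.abs(A * Lam))
            A = 1.0 / np.mean(Lam**2 * w)                # enforce E[psi*Lambda]=1
        return A

    def eff_loss(b):
        A = solve_A(b)
        psi = A * Lam * np.minimum(1.0, b / np.abs(A * Lam))
        return np.mean(psi**2) * fisher_info - 1.0       # relative variance loss

    # Below b_min = 1/E|Lambda| no A can satisfy the consistency condition.
    lo = 1.0 / np.mean(np.abs(Lam)) + 1e-6
    hi = 100.0 / np.sqrt(fisher_info)
    for _ in range(40):                                  # eff_loss decreases in b
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if eff_loss(mid) > delta else (lo, mid)
    b = 0.5 * (lo + hi)
    return solve_A(b), b
```

For $ {\cal I}=1$ and $ \delta=0.05$ this yields a clipping bound of roughly $ b\approx 1.7$ with $ A$ slightly above its classical value $ 1/{\cal I}$.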


18.4.5 Examples

For better comparability, the examples for the rIC use the same setups as those for the rLS, so we only write down the modifications necessary to get from the rLS to the rIC example.


18.4.5.1 Example 7

As the first example is one-dimensional, calibrIC uses a simultaneous Newton procedure to determine $ A,b$, so neither a number of grid points nor a MC sample size is needed, and the parameter N is without meaning, as are fact and expl. Nevertheless you have to pass them to calibrIC, and besides the rLS setting we write

  fact=1.2
  expl=2
  A0=0     
  b0=-1    
  typ= 0  ; rIC-sim
Next we calibrate the influence curve $ \psi$ to $ \textrm{\tt e}=0.05$.
  ergIC=calibrIC(T,Sig,H,F,Q,R,typ,A0,b0,e,N,eps,itmax,
                                              expl,fact,aus)

  A=ergIC.A
  b=ergIC.b
Calling rICfil is then very much like calling rlsfil -- just with some extra parameters:
  res= rICfil(y,mu,Sig,H,F,Q,R,typ,A,b)
  frx = res.filtX
XAGrkalm07.xpl

The graphical output is then done just as in Example 4.

Figure 18.6: Actual observations $ \hat=$ solid line, the classical Kalman filter $ \hat=$ dashed / a bit lighter line, the rIC filter $ \hat=$ dotted line; the clipped instances are marked by circles on top of the graphic.
\includegraphics[scale=0.6]{rICbsp1}


18.4.5.2 Example 8

The second example goes through analogously with the following modifications with respect to Example 5:
  N=300 
  eps=0.01
  itmax=15
  aus=4
  fact=1.2
  expl=2
  A0=0     
  b0=-1    
  typ= 0  ; rIC-sim
Note that as we are in $ 2$ dimensions, integration along the directions is one-dimensional and is done by a Romberg procedure; so N=300 might even be a bit larger than necessary. The next modifications are straightforward:
  ergIC=calibrIC(T,Sig,H,F,Qid,R,typ,A0,b0,e,N,eps,itmax,
                                                expl,fact,aus)

  A=ergIC.A
  b=ergIC.b

  res = kfilter2(y,mu,Sig, H,F,Qid,R)
  fx = res.filtX
  res= rICfil(y,mu,Sig,H,F,Qid,R,typ,A,b)
  frx = res.filtX
XAGrkalm08.xpl

The graphical result is displayed in Figure 18.7.

Figure 18.7: Simulated data according to Example 2 from Petr Franek: The actual states $ \hat=$ solid line, the classical Kalman filter $ \hat=$ dashed / a bit lighter line, the rIC filter $ \hat=$ dotted line; the clipped instances are marked by circles on bottom of the graphic, the AO-instances by circles on top.
\includegraphics[scale=0.55]{rICbsp2}


18.4.5.3 Example 9

Again, as in the third rLS example, the next example shows that we really lose some efficiency in the ideal model when using the rIC filter instead of the Kalman filter; the following modifications are to be made with respect to Example 6:

  e=0.05
  N=300
  eps=0.01
  itmax=15
  aus=4
  fact=1.2
  expl=2
  A0=0     
  b0=-1    
  typ= 1   ; rIC-sep-AO

  ergIC=calibrIC(T,Sig,H,F,Q,R,typ,A0,b0,e,N,eps,itmax,
                                              expl,fact,aus)

  A=ergIC.A
  b=ergIC.b

  res = kfilter2(y,mu,Sig, H,F,Q,R)
  fx = res.filtX
  res= rICfil(y,mu,Sig,H,F,Q,R,typ,A,b)
  frx = res.filtX
  fry=(H*frx')'
XAGrkalm09.xpl

All this produces the graphics in Figure 18.8.

Figure 18.8: Actual observations $ \hat=$ solid line, the classical Kalman filter $ \hat=$ dashed / a bit lighter line, the rIC filter $ \hat=$ dotted line; the clipped instances are marked by circles.
\includegraphics[scale=0.6]{rICbsp3}


18.4.6 Possible Extensions

As a sort of outlook, we only want to mention here the possibility of using different norms to assess the quality of our procedures. The most important norms besides the Euclidean one are, in our context, those derived from the Fisher information of the ideal model (information-standardization) and from the asymptotic covariance of $ \psi$ itself (self-standardization). Generally speaking, these two have some nice properties compared to the Euclidean norm; among others, optimal influence curves in these norms stay invariant under smooth transformations of the parameter space, cf. Rieder (1994). In the context of normal scores, they even lead to a drastic simplification of the calibration problem in higher dimensions, cf. Ruckdeschel (1999). Nevertheless, the use of these norms has to be justified by the application, and they have not yet been included in the XploRe quantlib kalman.