11.2 Chart characteristics

Consider the change point model (11.1). For fixed $ m$, denote by $ P_m(\cdot)$ and $ E_m(\cdot)$ the corresponding probability measure and expectation, respectively. Here, $ m=\infty$ stands for the case of no change, i.e. the so-called in-control case. Then the Average Run Length (ARL), the expectation of the run length $ L$, is defined as

$\displaystyle {\cal L}_\mu = \begin{cases}E_\infty (L) & ,\; \mu=\mu_0 \\ E_1 (L) & ,\; \mu\ne\mu_0 \end{cases}\,.$ (11.7)

Thus, the ARL denotes the average number of observations until a signal for a sequence with constant expectation. $ \mu=\mu_0$ or $ m=\infty$ stands for no change, while $ \mu\ne\mu_0$ and $ m=1$ mean that a change from $ \mu_0$ to $ \mu$ takes place right at the first time point (or earlier). Therefore, the ARL evaluates only the special scenario $ m=1$ of the SPC scheme. Other measures, which take into account that usually $ 1<m<\infty$, were introduced by Lorden (1971) and Pollak and Siegmund (1975). Here, we use a performance measure which was first proposed by Roberts (1959). The so-called (conditional) Average Delay (AD, also known as steady-state ARL) is defined as


$\displaystyle {\cal D}_\mu = \lim_{m\to\infty} {\cal D}^{(m)}_\mu \,,$ (11.8)

$\displaystyle {\cal D}^{(m)}_\mu = E_m \big( L-m+1\vert L\ge m \big)\,,$

where $ \mu$ is the value of $ \mu_1$ in (11.1), i.e. the expectation after the change. While $ {\cal L}_\mu$ measures the delay for the case $ m=1$, $ {\cal D}_\mu$ determines the delay for an SPC scheme which has run for a long time without a signal. Usually, the convergence in (11.8) is very fast: even for quite small $ m$ the difference between $ {\cal D}^{(m)}_\mu$ and $ {\cal D}_\mu$ is already very small. $ {\cal L}_\mu$ and $ {\cal D}_\mu$ are average values for the random variable $ L$. Unfortunately, $ L$ is characterized by a large standard deviation. Therefore, one might be interested in the whole distribution of $ L$. Again, we restrict ourselves to the special cases $ m=1$ and $ m=\infty$. We consider the probability mass function (PMF) $ P_\mu(L=n)$ and the cumulative distribution function (CDF) $ P_\mu(L\le n)$. Based on the CDF, one is able to compute quantiles of the run length $ L$.

For normally distributed random variables it is not possible to derive exact solutions for the above characteristics. There are a couple of approximation techniques. Besides very rough approximations based on the Wald approximation known from sequential analysis, Wiener process approximations and similar methods, three main methods can be distinguished:

  1. Markov chain approach due to Brook and Evans (1972): Replacement of the continuous statistic $ Z_t$ by a discrete one

  2. Quadrature of the integral equations which are derived for the ARL, Vance (1986) and Crowder (1986), and for the eigenfunctions which lead to the AD

  3. Waldmann (1986) approach: Iterative computation of $ P(L=n)$ by using quadrature and exploiting monotone bounds for the considered characteristics

Here we use the first approach, which has the advantage that all considered characteristics can be presented in a straightforward way. Next, the Markov chain approach is briefly described. Roughly speaking, the continuous statistic $ Z_t$ is approximated by a discrete Markov chain $ M_t$. The transition $ Z_{t-1}=x\to Z_t=y$ is approximated by the transition $ M_{t-1}=i\,w\to M_t=j\,w$ with $ x\in[i\,w-w/2,i\,w+w/2]$ and $ y\in[j\,w-w/2,j\,w+w/2]$. That is, given an integer $ r$, the continuation region of the scheme, $ [-c,c]$, $ [\mbox{zreflect},c]$, or $ [0,c]$, is separated into $ 2\,r+1$ or $ r+1$ intervals of the kind $ [i\,w-w/2,i\,w+w/2]$ (one exception is $ [0,w/2]$ as the first subinterval of $ [0,c]$). Then, the transition kernel $ f$ of $ Z_t$ is approximated by the discrete kernel of $ M_t$, i.e.

$\displaystyle f(x,y) \approx P(i\,w\to[j\,w-w/2,j\,w+w/2])/w $

for all $ x\in[i\,w-w/2,i\,w+w/2]$ and $ y\in[j\,w-w/2,j\,w+w/2]$. Eventually, we obtain a Markov chain $ \{M_t\}$ with $ 2\,r+1$ or $ r+1$ transient states and one absorbing state. The last one corresponds to the alarm (signal) of the scheme.

Denote by $ Q=(q_{ij})$ the matrix of transition probabilities of the Markov chain $ \{M_t\}$ on the transient states, by $ \underline{1}$ a vector of ones, and by $ \underline{L}=(L_i)$ the ARL vector. $ L_i$ stands for the ARL of an SPC scheme which starts in the point $ i\,w$ (corresponding to $ z_0$). In the case of a one-sided CUSUM scheme with $ z_0=0\in[0,w/2]$ the value $ L_0$ approximates the original ARL. By using $ \underline{L}$ we generalize the original schemes to schemes with possibly different starting values $ z_0$. Now, the following linear equation system is valid, Brook and Evans (1972):

$\displaystyle (I-Q)\,\underline{L} = \underline{1}\,,$ (11.9)

where $ I$ denotes the identity matrix. By solving this equation system we get the ARL vector $ \underline{L}$ and an approximation of the ARL of the considered SPC scheme. Note that the larger $ r$, the better the approximation. In the days of Brook and Evans (1972) the maximal matrix dimension $ r+1$ (they considered cusum1) was 15 because of the restrictions of the available computing facilities. Nowadays, one can use dimensions of several hundred. By comparing the results for different $ r$ one can find a suitable value. The quantlet XFGrarl.xpl demonstrates this effect for the Brook and Evans (1972) example. Nine different values of $ r$ from 5 to 500 are used to approximate the in-control ARL of a one-sided CUSUM chart with $ k=0.5$ and $ c_3=3$ (variance $ \sigma^2=1$). We get


$ r$            5       10      20      30      40      50      100     200     500
$ {\cal L}_0$   113.47  116.63  117.36  117.49  117.54  117.56  117.59  117.59  117.60

XFGrarl.xpl
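The computation behind such a table can be sketched in a few lines of Python. This is not the XploRe quantlet, only a minimal illustration under stated assumptions: the function names `cusum1_q`, `solve` and `cusum1_arl` are hypothetical, the data are taken as $ N(\mu,1)$, the interval width is chosen as $ w=2c/(2r+1)$ so that $ c=r\,w+w/2$, and only the standard library is used.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def cusum1_q(k, c, r, mu=0.0):
    """Transition matrix Q on the r+1 transient states 0, w, ..., r*w
    of the one-sided CUSUM Z_t = max(0, Z_{t-1} + X_t - k), X_t ~ N(mu, 1)."""
    w = 2.0 * c / (2.0 * r + 1.0)   # assumption: c = r*w + w/2
    q = [[0.0] * (r + 1) for _ in range(r + 1)]
    for i in range(r + 1):
        # state 0 collects everything reflected at zero: Z_t <= w/2
        q[i][0] = norm_cdf(k - mu - i * w + w / 2.0)
        for j in range(1, r + 1):
            mid = k - mu + (j - i) * w
            q[i][j] = norm_cdf(mid + w / 2.0) - norm_cdf(mid - w / 2.0)
    return q

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda s: abs(m[s][col]))
        m[col], m[piv] = m[piv], m[col]
        for s in range(col + 1, n):
            f = m[s][col] / m[col][col]
            for t in range(col, n + 1):
                m[s][t] -= f * m[col][t]
    x = [0.0] * n
    for s in range(n - 1, -1, -1):
        acc = sum(m[s][t] * x[t] for t in range(s + 1, n))
        x[s] = (m[s][n] - acc) / m[s][s]
    return x

def cusum1_arl(k, c, r, mu=0.0):
    """L_0 from (I - Q) L = 1, i.e. the ARL for the start in z_0 = 0."""
    q = cusum1_q(k, c, r, mu)
    n = r + 1
    a = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
         for i in range(n)]
    return solve(a, [1.0] * n)[0]

# Brook and Evans (1972) example: k = 0.5, c = 3, in control
for r in (5, 10, 20, 50):
    print(r, round(cusum1_arl(0.5, 3.0, r), 2))
```

The loop stops at $ r=50$ because the pure Python solver becomes slow for large matrices; with a numerical library, larger $ r$ are no problem. Under the assumptions above the printed values should approach the true ARL as $ r$ grows.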

The true value is 117.59570 (obtainable via a very large $ r$ or by using the quadrature methods with a suitably large number of abscissas). The computation of the average delay (AD) requires more extensive calculations. For details see, e.g., Knoth (1998) on CUSUM for Erlang distributed data. Here we apply the Markov chain approach again, Crosier (1986). Given one of the considered schemes and normally distributed data, the matrix $ Q$ is primitive, i.e. there exists a power of $ Q$ which is positive. Then $ Q$ has one single eigenvalue which is larger in magnitude than the remaining eigenvalues. Denote this eigenvalue by $ \varrho$. The corresponding left eigenvector $ \underline{\psi}$ is strictly positive, i.e.

$\displaystyle \underline{\psi}\,Q = \varrho\,\underline{\psi}\;,\,\underline{\psi}>0 \,.$ (11.10)

It can be shown, Knoth (1998), that the conditional density $ f(\cdot\vert L\ge m)$ of both the continuous statistic $ Z_t$ and the Markov chain $ M_t$ tends for $ m\to\infty$ to the normalized left eigenfunction and eigenvector, respectively, which correspond to the dominant eigenvalue $ \varrho$. Therefore, the approximation of $ {\cal D}=\lim\limits_{m\to\infty} E_m ( L-m+1\vert L\ge m)$ can be constructed by

$\displaystyle D = (\underline{\psi}^T \underline{L})/(\underline{\psi}^T \underline{1})\,.$    

Note that the left eigenvector $ \underline{\psi}$ is computed for the in-control mean $ \mu_0$, while the ARL vector $ \underline{L}$ is computed either for a specific out-of-control mean or for $ \mu_0$ again.
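The construction of $ D$ can be illustrated with a toy example. The following Python sketch uses two hypothetical $ 2\times2$ matrices (`Q_IN` for the in-control case, `Q_OUT` for the out-of-control case) instead of a real chart, a plain power iteration for the left eigenvector, and the series $ \underline{L}=\sum_n Q^n\,\underline{1}$ instead of a linear solver. All names are assumptions made for this illustration. In the in-control case, multiplying (11.9) by $ \underline{\psi}$ from the left and using (11.10) gives $ (1-\varrho)\,\underline{\psi}^T\underline{L}=\underline{\psi}^T\underline{1}$, so that $ D=1/(1-\varrho)$, which the example reproduces.

```python
def left_dominant(q, iters=500):
    """Power iteration for psi Q = rho psi; psi is normalised to sum 1."""
    n = len(q)
    psi = [1.0 / n] * n
    rho = 0.0
    for _ in range(iters):
        new = [sum(psi[i] * q[i][j] for i in range(n)) for j in range(n)]
        rho = sum(new)              # psi sums to 1, so this estimates rho
        psi = [v / rho for v in new]
    return rho, psi

def arl_vector(q, iters=500):
    """Iterate L <- 1 + Q L from L = 0, i.e. sum the series Q^n 1."""
    n = len(q)
    arl = [0.0] * n
    for _ in range(iters):
        arl = [1.0 + sum(q[i][j] * arl[j] for j in range(n))
               for i in range(n)]
    return arl

def average_delay(q_in, q_out):
    """D = (psi^T L) / (psi^T 1), psi from q_in and L from q_out."""
    _, psi = left_dominant(q_in)
    arl = arl_vector(q_out)
    return sum(p * l for p, l in zip(psi, arl)) / sum(psi)

# hypothetical sub-stochastic matrices on two transient states
Q_IN = [[0.6, 0.3], [0.2, 0.5]]     # dominant eigenvalue 0.8
Q_OUT = [[0.3, 0.4], [0.1, 0.3]]

RHO, PSI = left_dominant(Q_IN)
print(RHO, PSI)
print(average_delay(Q_IN, Q_IN), 1.0 / (1.0 - RHO))   # both equal 5
print(average_delay(Q_IN, Q_OUT))
```

For the matrices of a real chart the series converges slowly, since $ \varrho$ is close to 1 in the in-control case; solving (11.9) directly is then the better choice.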

If we replace arl by ad in the above quantlet (XFGrarl.xpl), then we obtain the following output, which again demonstrates the effect of the parameter $ r$.


$ r$            5       10      20      30      40      50      100     200     500
$ {\cal D}_0$   110.87  114.00  114.72  114.85  114.90  114.92  114.94  114.95  114.95

XFGrad.xpl

Fortunately, good accuracy is already obtained for smaller values of $ r$ than in the ARL case. Note that in the case of cusum2 the value of $ r$ has to be smaller (less than 30) than for the other charts, since the computation is based on the dominant eigenvalue of a very large matrix. The approximation for a combination of two one-sided schemes needs a two-dimensional approximating Markov chain. For the ARL only, a more suitable approach exists. As shown by, e.g., Lucas and Crosier (1982), it is possible to use the following relation between the ARLs of the one- and the two-sided schemes. Here, the two-sided scheme is a combination of two symmetric one-sided schemes which both start at $ z_0=0$. Therefore, we get a very simple formula for the ARL $ {\cal L}$ of the two-sided scheme and the ARLs $ {\cal L}_{upper}$ and $ {\cal L}_{lower}$ of the upper and lower one-sided CUSUM scheme

$\displaystyle {\cal L} = \frac{{\cal L}_{upper}\cdot{\cal L}_{lower}} {{\cal L}_{upper}+{\cal L}_{lower}} \,.$ (11.11)

Finally, we consider the distribution function of the run length $ L$ itself. By using the Markov chain approach and denoting by $ p_i^n$ the approximated probability of $ (L>n)$ for an SPC scheme started in $ i\,w$, such that $ \underline{p}^n=(p_i^n)$, we obtain

$\displaystyle \underline{p}^n=\underline{p}^{n-1}\,Q=\underline{p}^0\,Q^n \,.$ (11.12)

The vector $ \underline{p}^0$ is initialized with $ p_i^0=1$ for the starting point $ z_0\in [i\,w-w/2,i\,w+w/2]$ and $ p_j^0=0$ otherwise. For large $ n$ we can replace the above equation by

$\displaystyle p_i^n \approx g_i\,\varrho^n \,.$ (11.13)

The constant $ g_i$ is defined as

$\displaystyle g_i = \phi_i/( \underline{\phi}^T \underline{\psi})\,,$    

where $ \underline{\phi}$ denotes the right eigenvector of $ Q$, i.e. $ Q\,\underline{\phi}=\varrho\,\underline{\phi}$. Based on (11.12) and (11.13) the probability mass function and the cumulative distribution function of the run length $ L$ can be approximated. (11.12) is used up to a certain $ n$. As soon as the difference between (11.12) and (11.13) is smaller than $ 10^{-9}$, only (11.13) is used. Note that the same restriction applies as for the AD: for the two-sided CUSUM scheme (cusum2) the parameter $ r$ has to be small ($ \le30$).
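The recursion (11.12) and the geometric tail behaviour behind (11.13) can be made concrete with a short Python sketch for the one-sided CUSUM chart. It rests on several assumptions: the names `cusum1_q`, `survival`, `pmf_cdf` and `quantile` are hypothetical, the data are $ N(\mu,1)$, $ w=2c/(2r+1)$, and $ \underline{p}^n$ is read as the row vector of state probabilities at time $ n$ for the start in $ z_0=0$, so that its entries sum up to $ P(L>n)$.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def cusum1_q(k, c, r, mu=0.0):
    """Transition matrix on the r+1 transient states of a one-sided CUSUM."""
    w = 2.0 * c / (2.0 * r + 1.0)   # assumption: c = r*w + w/2
    q = [[0.0] * (r + 1) for _ in range(r + 1)]
    for i in range(r + 1):
        q[i][0] = norm_cdf(k - mu - i * w + w / 2.0)
        for j in range(1, r + 1):
            mid = k - mu + (j - i) * w
            q[i][j] = norm_cdf(mid + w / 2.0) - norm_cdf(mid - w / 2.0)
    return q

def survival(q, n_max, start=0):
    """P(L > n), n = 0..n_max, via p^n = p^(n-1) Q with p^0 a unit vector."""
    size = len(q)
    p = [0.0] * size
    p[start] = 1.0
    surv = [1.0]
    for _ in range(n_max):
        p = [sum(p[i] * q[i][j] for i in range(size)) for j in range(size)]
        surv.append(sum(p))
    return surv

def pmf_cdf(surv):
    """P(L = n) and P(L <= n) for n = 1..n_max from the survival function."""
    pmf = [surv[n - 1] - surv[n] for n in range(1, len(surv))]
    cdf = [1.0 - surv[n] for n in range(1, len(surv))]
    return pmf, cdf

def quantile(cdf, level):
    """Smallest n with P(L <= n) >= level (None if not reached)."""
    for n, value in enumerate(cdf, start=1):
        if value >= level:
            return n
    return None

Q = cusum1_q(0.5, 3.0, 20)          # k = 0.5, c = 3, r = 20, in control
SURV = survival(Q, 400)
PMF, CDF = pmf_cdf(SURV)
RHO = SURV[400] / SURV[399]          # ratio settles at the dominant eigenvalue
print("rho approx.", RHO)
print("P(L <= 100) approx.", CDF[99])
print("median run length", quantile(CDF, 0.5))
```

Once the ratio of successive survival probabilities has stabilised, further values can be obtained by multiplying with $ \varrho$ instead of with $ Q$, which is the idea of switching from (11.12) to (11.13).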


11.2.1 Average Run Length and Critical Values

The spc quantlib provides the quantlets spcewma1arl, ..., spccusumCarl for computing the ARL of the corresponding SPC scheme. All routines need the actual value of $ \mu$ as a scalar or as a vector of several $ \mu$, two scheme parameters, and the integer $ r$ (see the beginning of the section). The XploRe example XFGarl.xpl demonstrates all ...arl routines for $ k=0.5$, $ \lambda=0.1$, $ \mbox{zreflect}=-4$, $ r=50$, $ c=3$, in-control and out-of-control means $ \mu_0=0$ and $ \mu_1=1$, respectively. The next table summarizes the ARL results

chart           ewma1   ewma2   cusum1  cusum2  cusumC
$ {\cal L}_0$   1694.0  838.30  117.56  58.780  76.748
$ {\cal L}_1$   11.386  11.386  6.4044  6.4036  6.4716

XFGarl.xpl

Remember that the ARL of the two-sided CUSUM (cusum2) scheme is based on the one-sided one, i.e. $ 58.78 = 117.56/2$ and $ 6.4036 = (6.4044\cdot49716)/(6.4044+49716)$ with $ 49716={\cal L}_{-1}$.
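Both numbers follow directly from (11.11). As a small arithmetic check in Python (the function name `two_sided_arl` is chosen for this illustration only):

```python
def two_sided_arl(arl_upper, arl_lower):
    """Relation (11.11) between the one-sided ARLs and the two-sided ARL."""
    return arl_upper * arl_lower / (arl_upper + arl_lower)

# in control: both one-sided schemes have the same ARL 117.56
print(round(two_sided_arl(117.56, 117.56), 2))    # 58.78
# mu = 1: upper scheme 6.4044, lower scheme 49716
print(round(two_sided_arl(6.4044, 49716.0), 4))   # 6.4036
```

In the out-of-control case the lower scheme hardly ever signals first, so the two-sided ARL is nearly identical to that of the upper scheme.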

For the setup of the SPC scheme it is usual to give the design parameters $ \lambda$ and $ k$ for EWMA and CUSUM, respectively, and a value $ \xi$ for the in-control ARL. Then, the critical value $ c$ ($ c_2$ or $ c_3$) is the solution of the equation $ {\cal L}_{\mu_0}(c)=\xi$. Here, the regula falsi is used with an accuracy of $ \vert{\cal L}_{\mu_0}(c)-\xi\vert<0.001$. The quantlet XFGc.xpl demonstrates the computation of the critical values for SPC schemes with in-control ARLs of $ \xi=300$, reference value $ k=0.5$ (CUSUM), smoothing parameter $ \lambda=0.1$ (EWMA), $ \mbox{zreflect}=-4$, and the Markov chain parameter $ r=50$.

chart   ewma1   ewma2   cusum1  cusum2  cusumC
$ c$    2.3081  2.6203  3.8929  4.5695  4.288

XFGc.xpl
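The search for the critical value can be sketched for the one-sided CUSUM chart as follows. This is an illustration under assumptions, not the code of the quantlib: the names `cusum1_arl` and `regula_falsi` are hypothetical, the data are $ N(0,1)$, $ w=2c/(2r+1)$, and the bracketing interval $ [3,5]$ is simply picked by hand.

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def cusum1_arl(k, c, r, mu=0.0):
    """Markov chain ARL of a one-sided CUSUM started in 0: (I - Q) L = 1."""
    n = r + 1
    w = 2.0 * c / (2.0 * r + 1.0)   # assumption: c = r*w + w/2
    # augmented matrix [I - Q | 1]
    m = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(n):
            mid = k - mu + (j - i) * w
            low = norm_cdf(mid - w / 2.0) if j > 0 else 0.0
            m[i][j] = (1.0 if i == j else 0.0) - (norm_cdf(mid + w / 2.0) - low)
        m[i][n] = 1.0
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda s: abs(m[s][col]))
        m[col], m[piv] = m[piv], m[col]
        for s in range(col + 1, n):
            f = m[s][col] / m[col][col]
            for t in range(col, n + 1):
                m[s][t] -= f * m[col][t]
    x = [0.0] * n
    for s in range(n - 1, -1, -1):
        acc = sum(m[s][t] * x[t] for t in range(s + 1, n))
        x[s] = (m[s][n] - acc) / m[s][s]
    return x[0]

def regula_falsi(f, lo, hi, tol=1e-3, max_iter=200):
    """Root of f in [lo, hi] by false position; stops when |f(x)| < tol."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    x = lo
    for _ in range(max_iter):
        x = hi - f_hi * (hi - lo) / (f_hi - f_lo)
        f_x = f(x)
        if abs(f_x) < tol:
            break
        if f_x * f_lo < 0.0:
            hi, f_hi = x, f_x
        else:
            lo, f_lo = x, f_x
    return x

XI = 300.0                           # target in-control ARL
CRIT = regula_falsi(lambda c: cusum1_arl(0.5, c, 50) - XI, 3.0, 5.0)
print("critical value c approx.", round(CRIT, 4))
```

The stopping rule mirrors the accuracy $ \vert{\cal L}_{\mu_0}(c)-\xi\vert<0.001$ stated above; the result should be close to the cusum1 entry of the table.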

The parameter $ r=50$ guarantees fast computation and suitable accuracy. Depending on the power of the computer one can try values of $ r$ up to 1000 or larger (see XFGrarl.xpl at the beginning of the section).


11.2.2 Average Delay

The usage of the routines for computing the Average Delay (AD) is similar to that of the ARL routines: simply replace arl by ad in the routine names. Be aware that the computing time is larger than in the case of the ARL because of the computation of the dominant eigenvalue. It is therefore advisable to choose a smaller $ r$, especially in the case of the two-sided CUSUM. Unfortunately, there is no relation between the one- and two-sided schemes as for the ARL in (11.11). Therefore, the library computes the AD for the two-sided CUSUM based on a two-dimensional Markov chain with dimension $ (r+1)^2\times(r+1)^2$. Thus, for values of $ r$ larger than 30 the computing time becomes quite large. Here the results follow for the above quantlet XFGrarl.xpl with ad instead of arl and $ r=30$ for spccusum2ad:

chart           ewma1   ewma2   cusum1  cusum2  cusumC
$ {\cal D}_0$   1685.8  829.83  114.92  56.047  74.495
$ {\cal D}_1$   11.204  11.168  5.8533  5.8346  6.2858

XFGad.xpl


11.2.3 Probability Mass and Cumulative Distribution Function

The computation of the probability mass function (PMF) and of the cumulative distribution function (CDF) is implemented in two different types of routines. The first one with the syntax spcchartpmf returns the values of the PMF $ P(L=n)$ and CDF $ P(L\le n)$ at given single points of $ n$, where chart has to be replaced by ewma1, ..., cusumC. The second one written as spcchartpmfm computes the whole vectors of the PMF and of the CDF up to a given point $ n$, i.e. $ \big(P(L=1),P(L=2),\ldots,P(L=n)\big)$ and the similar one of the CDF.

Note that the same restriction applies as for the Average Delay (AD). In the case of the two-sided CUSUM scheme the computations are based on a two-dimensional Markov chain, so a value of $ r$ less than 30 keeps the computing time reasonable.

With the quantlet XFGpmf1.xpl the 5 different schemes ($ r=50$, for cusum2 $ r=25$) are compared according to their in-control PMF and CDF ($ \mu=\mu_0=0$) at the positions $ n$ in $ \{1,10,20,30,50,100,200,300\}$. Note that the in-control ARL of all schemes is chosen as 300.


chart             ewma1             ewma2             cusum1            cusum2            cusumC
$ P(L=1)$         $ 6\cdot10^{-8}$  $ 2\cdot10^{-9}$  $ 6\cdot10^{-6}$  $ 4\cdot10^{-7}$  $ 2\cdot10^{-6}$
$ P(L=10)$        0.00318           0.00272           0.00321           0.00307           0.00320
$ P(L=20)$        0.00332           0.00324           0.00321           0.00325           0.00322
$ P(L=30)$        0.00315           0.00316           0.00310           0.00314           0.00311
$ P(L=50)$        0.00292           0.00296           0.00290           0.00294           0.00290
$ P(L=100)$       0.00246           0.00249           0.00245           0.00248           0.00245
$ P(L=200)$       0.00175           0.00177           0.00175           0.00176           0.00175
$ P(L=300)$       0.00125           0.00126           0.00124           0.00125           0.00125
$ P(L\le1)$       $ 6\cdot10^{-8}$  $ 2\cdot10^{-9}$  $ 6\cdot10^{-6}$  $ 4\cdot10^{-7}$  $ 2\cdot10^{-6}$
$ P(L\le10)$      0.01663           0.01233           0.02012           0.01675           0.01958
$ P(L\le20)$      0.05005           0.04372           0.05254           0.04916           0.05202
$ P(L\le30)$      0.08228           0.07576           0.08407           0.08109           0.08358
$ P(L\le50)$      0.14269           0.13683           0.14402           0.14179           0.14360
$ P(L\le100)$     0.27642           0.27242           0.27728           0.27658           0.27700
$ P(L\le200)$     0.48452           0.48306           0.48480           0.48597           0.48470
$ P(L\le300)$     0.63277           0.63272           0.63272           0.63476           0.63273

XFGpmf1.xpl

The quantlet XFGpmf2.xpl provides a more appropriate, graphical representation. Figure 11.4 shows the corresponding graphs.

Figure 11.4: CDF for two-sided EWMA and Crosier's CUSUM for $ \mu =0$ (in control) and $ \mu =1$ (out of control)
\includegraphics[width=1.4\defpicwidth]{pmf2fig1.ps} \includegraphics[width=1.4\defpicwidth]{pmf2fig2.ps}