4.4 Random Vectors, Dependence, Correlation

A random vector $(X_1,\ldots,X_n)$ taking values in $\mathbb{R}^n$ is useful for describing the mutual dependencies of several random variables $X_1,\ldots,X_n$, for example of several underlying stocks. The joint distribution of the random variables $X_1,\ldots,X_n$ is, as in the univariate case, uniquely determined by the probabilities

$\displaystyle \P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) \, , \quad -\infty < a_i \le b_i < \infty \, , \quad i=1,\ldots,n \, .$

If the random vector $(X_1,\ldots,X_n)$ has a density $p(x_1, \ldots, x_n)$, these probabilities can be computed by means of the following integral:

$\displaystyle \P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int^{b_n}_{a_n} \ldots \int^{b_1}_{a_1} p(x_1, \ldots, x_n) \, dx_1 \ldots dx_n \, .$
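A minimal numerical sketch of this formula for $n=2$, assuming NumPy and SciPy are available; the bivariate normal density with correlation $0.5$ is only an illustrative choice, not a distribution taken from the text.

from scipy import integrate
from scipy.stats import multivariate_normal

rho = 0.5                                   # illustrative correlation
rv = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def p(x1, x2):
    # joint density p(x1, x2) of the chosen bivariate normal
    return rv.pdf([x1, x2])

a1, b1 = -1.0, 1.0                          # bounds for X1
a2, b2 = 0.0, 2.0                           # bounds for X2

# dblquad integrates the inner variable x2 over [a2, b2] and the outer variable x1 over [a1, b1]
prob, _ = integrate.dblquad(lambda x2, x1: p(x1, x2), a1, b1, a2, b2)
print(prob)                                 # P(a1 <= X1 <= b1, a2 <= X2 <= b2)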

The univariate or marginal distribution of $X_j$ can be computed from the joint density by integrating out the variables that are not of interest:

$\displaystyle \P(a_j \le X_j \le b_j) = \int^{\infty}_{-\infty} \ldots \int^{b_j}_{a_j} \ldots \int^{\infty}_{-\infty} p(x_1, \ldots, x_n) \, dx_1 \ldots dx_n \, .$
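Continuing the numerical sketch above (still with the illustrative bivariate normal density): integrating the second variable out over the whole real line leaves the marginal probability for $X_1$, which here must agree with the univariate standard normal result.

from scipy import integrate
from scipy.stats import multivariate_normal, norm

rho = 0.5
rv = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def p(x1, x2):
    return rv.pdf([x1, x2])                 # joint density p(x1, x2)

a1, b1 = -1.0, 1.0
# integrate x2 out; [-10, 10] is numerically exhaustive for a normal density
marg, _ = integrate.dblquad(lambda x2, x1: p(x1, x2), a1, b1, -10.0, 10.0)
print(marg, norm.cdf(b1) - norm.cdf(a1))    # both approximately 0.6827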

The intuitive notion of independence of two random variables $ X_1, X_2$ is formalized by requiring:

$\displaystyle \P(a_1 \le X_1 \le b_1, \, a_2 \le X_2 \le b_2) = \P(a_1 \le X_1 \le b_1) \cdot \P(a_2 \le X_2 \le b_2) \, ,$

i.e. the joint probability of the two events depending on the random vector $(X_1, X_2)$ factorizes into the product of the individual probabilities; it is sufficient to know the univariate distributions of $X_1$ and $X_2$. If the random vector $(X_1, X_2)$ has a density $p(x_1, x_2)$, then $X_1$ and $X_2$ have densities $p_1(x)$ and $p_2(x)$ as well. In this case, independence of the two random variables is equivalent to a joint density which factorizes:

$\displaystyle p(x_1, x_2) = p_1(x_1) p_2(x_2).$
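A small numerical check of the factorization criterion, assuming NumPy and SciPy; both densities below are illustrative choices: for independent standard normal components the joint density equals the product of the marginals, while a correlated bivariate normal violates the factorization.

import numpy as np
from scipy.stats import norm, multivariate_normal

x1, x2 = 0.3, -1.2                          # an arbitrary evaluation point
indep = multivariate_normal(mean=[0, 0], cov=[[1, 0.0], [0.0, 1]])
corr  = multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]])

print(np.isclose(indep.pdf([x1, x2]), norm.pdf(x1) * norm.pdf(x2)))   # True
print(np.isclose(corr.pdf([x1, x2]),  norm.pdf(x1) * norm.pdf(x2)))   # False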

The dependence of two random variables $X_1, X_2$ can be very complicated. If $X_1, X_2$ are jointly normally distributed, however, their dependence structure can be quantified quite easily by their covariance:

$\displaystyle \mathop{\text{\rm Cov}}(X_1, X_2) = \mathop{\text{\rm\sf E}}[(X_1 - \mathop{\text{\rm\sf E}}[X_1])(X_2 - \mathop{\text{\rm\sf E}}[X_2])],$

as well as by their correlation:

$\displaystyle \mathop{\text{\rm Corr}}(X_1, X_2) = \frac{\mathop{\text{\rm Cov}}(X_1, X_2)}{\sigma(X_1) \cdot \sigma(X_2)} \, .$
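A simulation sketch (NumPy assumed; the true correlation of $0.5$ and the unit variances are illustrative choices): the sample covariance and sample correlation of simulated jointly normal data estimate the quantities defined above.

import numpy as np

rng = np.random.default_rng(0)
rho = 0.5
sample = rng.multivariate_normal(mean=[0, 0],
                                 cov=[[1, rho], [rho, 1]], size=100_000)
x1, x2 = sample[:, 0], sample[:, 1]

cov_hat  = np.cov(x1, x2)[0, 1]             # sample covariance, close to 0.5
corr_hat = np.corrcoef(x1, x2)[0, 1]        # sample correlation, close to 0.5
print(cov_hat, corr_hat)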

The correlation has the advantage of being scale invariant and of always taking values between $-1$ and $+1$. For jointly normally distributed random variables, independence is equivalent to zero correlation, while complete dependence is equivalent to a correlation of either $+1$ ($X_1$ is large when $X_2$ is large) or $-1$ ($X_1$ is large when $X_2$ is small).

In general, for independent random variables $X_1,\ldots,X_n$ it holds that

$\displaystyle \mathop{\text{\rm Cov}}(X_i, X_j) = 0 \qquad \text{\rm for\ } \, i \not= j \, .$

This implies a useful computation rule:

$\displaystyle \mathop{\text{\rm Var}}\big( \sum^n_{j=1} \, X_j \big) = \sum^n_{j=1} \, \mathop{\text{\rm Var}}(X_j) \, . $
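A quick simulation check of this rule (NumPy assumed; the three independent normal variables with variances $1$, $4$ and $9$ are hypothetical choices, so the variance of their sum should be close to $14$).

import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = np.column_stack([rng.normal(0, 1, n),    # Var = 1
                     rng.normal(0, 2, n),    # Var = 4
                     rng.normal(0, 3, n)])   # Var = 9, columns independent

print(np.var(X.sum(axis=1)))                 # approximately 14
print(np.var(X, axis=0).sum())               # approximately 14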

If $X_1,\ldots,X_n$ are independent and all have the same distribution, i.e.

$\displaystyle \P(a \le X_i \le b) = \P(a \le X_j \le b) \qquad \text{for all } i, j \, ,$

we call them independent and identically distributed (i.i.d.).
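For i.i.d. random variables with common variance $\sigma^2 = \mathop{\text{\rm Var}}(X_j)$, the computation rule above specializes, for example, to

$\displaystyle \mathop{\text{\rm Var}}\big( \sum^n_{j=1} \, X_j \big) = n \, \sigma^2 \, .$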