2.4 Derivatives

For later sections of this book, it will be useful to introduce matrix notation for derivatives of a scalar function of a vector $x$ with respect to $x$. Consider $f:\mathbb{R}^p \to \mathbb{R}$ and a $(p \times 1)$ vector $x$. Then $\frac{\partial f(x)}{\partial x}$ is the column vector of partial derivatives $\left\{\frac{\partial f(x)}{\partial x_j}\right\}$, $j=1,\ldots ,p$, and $\frac{\partial f(x)}{\partial x^{\top}}$ is the row vector of the same derivatives. The vector $\frac{\partial f(x)}{\partial x}$ is called the gradient of $f$.

We can also introduce second-order derivatives: $\frac{\partial^2 f(x)}{\partial x\partial x^{\top}}$ is the $(p \times p)$ matrix with elements $\frac{\partial^2 f(x)}{\partial x_i\partial x_j}$, $i=1,\ldots ,p$ and $j=1,\ldots ,p$. The matrix $\frac{\partial^2 f(x)}{\partial x\partial x^{\top}}$ is called the Hessian of $f$.

Suppose that $a$ is a $(p \times 1)$ vector and that ${\cal A}= {\cal A}^{\top}$ is a $(p \times p)$ matrix. Then

\begin{displaymath}
\frac{\partial a^{\top}x}{\partial x}
= \frac{\partial x^{\top}a}{\partial x} = a,
\end{displaymath} (2.23)

\begin{displaymath}
\frac{\partial x^{\top} {\cal{A}} x}{\partial x}
= 2 {\cal{A}} x.
\end{displaymath} (2.24)
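
A short componentwise sketch shows where the factor 2 in (2.24) comes from. The symbol $a_{ij}$ for the elements of ${\cal A}$ is notation assumed here for illustration. Writing the quadratic form as a double sum and differentiating with respect to a single component $x_k$ gives

\begin{displaymath}
x^{\top} {\cal{A}} x = \sum_{i=1}^{p}\sum_{j=1}^{p} a_{ij} x_i x_j,
\qquad
\frac{\partial x^{\top} {\cal{A}} x}{\partial x_k}
= \sum_{j=1}^{p} a_{kj} x_j + \sum_{i=1}^{p} a_{ik} x_i
= 2 \sum_{j=1}^{p} a_{kj} x_j ,
\end{displaymath}

where the last equality uses the symmetry $a_{ik}=a_{ki}$. The right-hand side is the $k$-th element of $2 {\cal{A}} x$. Without the symmetry assumption the same calculation yields $({\cal{A}} + {\cal{A}}^{\top})x$.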

The Hessian of the quadratic form $Q(x)= x^{\top} {\cal{A}} x$ is:

\begin{displaymath}
\frac{\partial^2 x^{\top} {\cal{A}} x}{\partial x \partial x^{\top}}
= 2 {\cal{A}}.
\end{displaymath} (2.25)

EXAMPLE 2.8   Consider the matrix

\begin{displaymath}{\cal A}=\left(\begin{array}{cc}1&2\\ 2&3\end{array}\right).\end{displaymath}

From formulas (2.24) and (2.25) it immediately follows that the gradient of $Q(x)= x^{\top} {\cal{A}} x$ is

\begin{displaymath}
\frac{\partial x^{\top} {\cal{A}} x}{\partial x}
= 2 {\cal{A}} x
= 2\left(\begin{array}{cc}1&2\\ 2&3\end{array}\right)x
= \left(\begin{array}{c}2x_1+4x_2\\ 4x_1+6x_2\end{array}\right),
\qquad x=(x_1,x_2)^{\top},
\end{displaymath}

and the Hessian is

\begin{displaymath}
\frac{\partial^2 x^{\top} {\cal{A}} x}{\partial x \partial x^{\top}}
= 2 {\cal{A}}
= 2\left(\begin{array}{cc}1&2\\ 2&3\end{array}\right)
=\left(\begin{array}{cc}2&4\\ 4&6\end{array}\right).
\end{displaymath}
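
The result of Example 2.8 can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text: it compares the closed forms $2{\cal A}x$ and $2{\cal A}$ with central finite-difference approximations. The helper names (`quad_form`, `numerical_gradient`, `numerical_hessian`), the evaluation point `x0` and the step sizes are illustrative assumptions.

```python
# Numerical check of Example 2.8 with central finite differences.
# All names, the point x0 and the step sizes are illustrative assumptions.

A = [[1.0, 2.0],
     [2.0, 3.0]]  # symmetric matrix from Example 2.8


def quad_form(x):
    """Return Q(x) = x^T A x for the matrix A defined above."""
    p = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(p) for j in range(p))


def numerical_gradient(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for j in range(len(x)):
        up = list(x)
        down = list(x)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian by differencing the numerical gradient."""
    p = len(x)
    hess = []
    for i in range(p):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        g_up = numerical_gradient(f, up, h)
        g_down = numerical_gradient(f, down, h)
        hess.append([(g_up[j] - g_down[j]) / (2.0 * h) for j in range(p)])
    return hess


x0 = [0.5, -1.5]  # arbitrary evaluation point (assumption of this sketch)

# Closed forms from (2.24) and (2.25): gradient 2 A x and Hessian 2 A.
exact_gradient = [2.0 * sum(A[i][j] * x0[j] for j in range(2)) for i in range(2)]
exact_hessian = [[2.0 * A[i][j] for j in range(2)] for i in range(2)]

approx_gradient = numerical_gradient(quad_form, x0)
approx_hessian = numerical_hessian(quad_form, x0)

print("gradient (exact, numerical):", exact_gradient, approx_gradient)
print("Hessian  (exact, numerical):", exact_hessian, approx_hessian)
```

Because $Q$ is quadratic, the central differences agree with the closed forms up to floating-point rounding. The Hessian approximation does not depend on `x0`, in line with (2.25).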