7.1 Estimating GLMs

It is known that the least squares estimator $ \widehat{\beta}$ in the classical linear model coincides with the maximum-likelihood estimator under the assumption of normally distributed errors. By imposing an appropriate distributional assumption on $ Y$, estimation in the GLM likewise remains within the maximum-likelihood framework.


7.1.1 Models

For maximum-likelihood estimation, one assumes that the distribution of $ Y$ belongs to an exponential family. Exponential families cover a broad range of distributions, both discrete, such as the Binomial and Poisson distributions, and continuous, such as the Gaussian (normal) and Gamma distributions.

A distribution is said to belong to an exponential family if its probability function (if $ Y$ discrete) or its density function (if $ Y$ continuous) has the structure

$\displaystyle f(y,\theta,\phi) = \exp\left\{\frac{y\theta-b(\theta)}{a(\phi)} + c(y,\phi)\right\},$ (7.1)

with known functions $ a(\bullet)$, $ b(\bullet)$ and $ c(\bullet)$ that are specific to each distribution in this model class.
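As an illustration, the Poisson distribution with mean $ \mu$ fits the form (7.1) with $ \theta=\log\mu$, $ b(\theta)=e^\theta$, $ a(\phi)=1$ and $ c(y,\phi)=-\log(y!)$. The following sketch (plain Python; the function names are illustrative, not from any particular library) checks this identity numerically:

```python
import math

# Sketch: write the Poisson pmf in the exponential-family form (7.1).
# For Poisson(mu): theta = log(mu), b(theta) = exp(theta),
# a(phi) = 1, c(y, phi) = -log(y!).

def poisson_pmf_direct(y, mu):
    # Standard form: mu^y * exp(-mu) / y!
    return mu**y * math.exp(-mu) / math.factorial(y)

def poisson_pmf_expfam(y, mu):
    # Exponential-family form: exp{(y*theta - b(theta))/a(phi) + c(y, phi)}
    theta = math.log(mu)
    b = math.exp(theta)                   # b(theta)
    c = -math.log(math.factorial(y))      # c(y, phi)
    return math.exp((y * theta - b) / 1.0 + c)

for y in range(6):
    assert abs(poisson_pmf_direct(y, 2.5) - poisson_pmf_expfam(y, 2.5)) < 1e-12
```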

Generally speaking, we are interested in estimating the canonical parameter $ \theta=\theta(x^T\beta)$; $ \phi$ is a nuisance parameter (analogous to the variance $ \sigma^2$ in linear regression). Apart from the distribution of $ Y$, the link function is the other essential ingredient of a generalized linear model. Recall the notation

$\displaystyle \eta = x^T\beta \quad\textrm{ and }\quad \mu = G(\eta).$

For each distribution, one special link function exists, namely the one satisfying

$\displaystyle x^T\beta = \eta =\theta. $

If this holds, the link function is called the canonical link function. For models with a canonical link, some theoretical and practical problems are easier to solve. Table 7.1 summarizes the characteristics of some exponential families together with their canonical parameters $ \theta$ and canonical link functions. Note that the Negative Binomial distribution only fits into the framework described above if we assume that the parameter $ k$ is known.


Table 7.1: Distributions implemented in GLM. For each family: range of $ y$, cumulant function $ b(\theta)$, mean $ \mu(\theta)$, canonical link $ \theta(\mu)$, variance function $ V(\mu)$ and dispersion term $ a(\phi)$.

Normal $ N(\mu,\sigma^2)$: range of $ y$: $ (-\infty,\infty)$; $ b(\theta)=\theta^2/2$; $ \mu(\theta)=\theta$; canonical link: identity; $ V(\mu)=1$; $ a(\phi)=\sigma^2$.

Poisson $ P(\mu)$: range of $ y$: integers in $ [0,\infty)$; $ b(\theta)=\exp(\theta)$; $ \mu(\theta)=\exp(\theta)$; canonical link: $ \log$; $ V(\mu)=\mu$; $ a(\phi)=1$.

Binomial $ B(m,\pi)$: range of $ y$: integers in $ [0,m]$; $ b(\theta)=m\log(1+e^\theta)$; $ \mu(\theta)=\dfrac{m\,e^\theta}{1+e^\theta}$; canonical link: logit; $ V(\mu)=m\pi(1-\pi)$; $ a(\phi)=1$.

Gamma $ G(\mu,\nu)$: range of $ y$: $ (0,\infty)$; $ b(\theta)=-\log(-\theta)$; $ \mu(\theta)=-1/\theta$; canonical link: reciprocal; $ V(\mu)=\mu^2$; $ a(\phi)=1/\nu$.

Inverse Gaussian $ IG(\mu,\sigma^2)$: range of $ y$: $ (0,\infty)$; $ b(\theta)=-(-2\theta)^{1/2}$; $ \mu(\theta)=\dfrac{1}{\sqrt{-2\theta}}$; canonical link: squared reciprocal; $ V(\mu)=\mu^3$; $ a(\phi)=\sigma^2$.

Negative Binomial $ N{\!}B(\mu,k)$: range of $ y$: integers in $ [0,\infty)$; $ b(\theta)=-\dfrac{\log(1-e^\theta)}{k}$; $ \mu(\theta)=\dfrac{e^\theta}{k(1-e^\theta)}$; canonical link: $ \theta(\mu)=\log\left(\dfrac{k\mu}{1+k\mu}\right)$; $ V(\mu)=\mu+k\mu^2$; $ a(\phi)=1$.
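The columns of Table 7.1 are linked by the standard exponential-family relations $ \mu(\theta)=b'(\theta)$ and $ V(\mu)=b''(\theta)$. The following sketch (an illustrative check, not part of any library) verifies both relations for the Gamma row using central finite differences:

```python
import math

# Sketch: check the Gamma row of Table 7.1 numerically.
# For the Gamma family, b(theta) = -log(-theta), so the mean
# mu(theta) = -1/theta should equal b'(theta), and the variance
# function V(mu) = mu^2 should equal b''(theta).

def b(theta):
    # Cumulant function of the Gamma family (theta < 0)
    return -math.log(-theta)

h = 1e-5
for theta in (-0.5, -1.0, -2.0):
    b1 = (b(theta + h) - b(theta - h)) / (2 * h)               # ~ b'(theta)
    b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2   # ~ b''(theta)
    mu = -1.0 / theta
    assert abs(b1 - mu) < 1e-8       # mu(theta) = b'(theta)
    assert abs(b2 - mu**2) < 1e-4    # V(mu) = b''(theta)
```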



7.1.2 Maximum-Likelihood Estimation

All models in the glm library are estimated by maximum-likelihood. The default numerical algorithm is the Newton-Raphson iteration (except for ordinary linear regression, where no iteration is necessary). Optionally, Fisher scoring can be chosen, which replaces the Hessian matrix by its expectation. In the case of a canonical link function, the Newton-Raphson and Fisher scoring algorithms coincide.
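The Fisher scoring iteration can be written as an iteratively reweighted least squares (IRLS) update. The sketch below fits a Poisson regression with the canonical log link (so Newton-Raphson and Fisher scoring coincide) on synthetic data; all names and data are illustrative, and it is a minimal sketch rather than the glm library's implementation:

```python
import numpy as np

# Sketch: Fisher scoring / IRLS for a Poisson GLM with canonical log link.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # design matrix
beta_true = np.array([0.5, 1.0])
y = rng.poisson(np.exp(X @ beta_true))                     # synthetic response

beta = np.zeros(X.shape[1])
for _ in range(25):
    eta = X @ beta              # linear predictor eta = X beta
    mu = np.exp(eta)            # inverse of the log link
    W = mu                      # IRLS weight 1/{V(mu) g'(mu)^2} = mu for log link
    z = eta + (y - mu) / mu     # working response eta + (y - mu) g'(mu)
    # Weighted least squares step: solve (X' W X) beta = X' W z
    XtW = X.T * W
    beta_new = np.linalg.solve(XtW @ X, XtW @ z)
    if np.max(np.abs(beta_new - beta)) < 1e-10:
        beta = beta_new
        break
    beta = beta_new

# At convergence the score function X'(y - mu) is numerically zero
assert np.max(np.abs(X.T @ (y - np.exp(X @ beta)))) < 1e-4
```

In each step the algorithm solves a weighted least squares problem; this is why GLM estimation is often described as iteratively reweighted least squares.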