

3.6 Estimation of Density Ordinates

We mention that if the full conditional densities are available, whether in the context of the multiple-block M-H algorithm or that of the Gibbs sampler, then the MCMC output can be used to estimate posterior marginal density functions ([58,23]). We exploit the fact that the marginal density of $ \boldsymbol{\psi}_{k}$ at the point $ \boldsymbol{\psi}_{k}^{\ast}$ is

$\displaystyle \pi (\boldsymbol{\psi}_{k}^{\ast})=\int \pi (\boldsymbol{\psi}_{k}^{\ast}\vert\boldsymbol{\psi}_{-k})\,\pi (\boldsymbol{\psi}_{-k})\,d\boldsymbol{\psi}_{-k}\;,$

where, as before, $ \boldsymbol{\psi}_{-k}=\boldsymbol{\psi}\backslash \boldsymbol{\psi}_{k}$. Provided the normalizing constant of $ \pi \left(\boldsymbol{\psi}_{k}^{\ast}\vert \boldsymbol{\psi}_{-k}\right)$ is known, an estimate of the marginal density is available as an average of the full conditional density over the simulated values of $ \boldsymbol{\psi}_{-k}$:

$\displaystyle \hat{\pi}(\boldsymbol{\psi}_{k}^{\ast})=M^{-1}\sum_{j=1}^{M}\pi \left(\boldsymbol{\psi} _{k}^{\ast}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\;.$    

Under the assumptions of Proposition 1,

$\displaystyle M^{-1}\sum_{j=1}^{M}\pi \left(\boldsymbol{\psi}_{k}^{\ast}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\rightarrow \pi (\boldsymbol{\psi}_{k}^{\ast})\;,$   as$\displaystyle \quad M\rightarrow \infty \;.$
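To make the construction concrete, the following minimal sketch (not from the original text) applies the estimator to a hypothetical bivariate normal target with correlation $ \rho$, where both full conditionals are known normal densities and the exact marginal is standard normal, so the estimate can be checked directly; all names and the value $ \rho=0.8$ are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical target: bivariate normal with correlation rho, so that
# psi_1 | psi_2 ~ N(rho * psi_2, 1 - rho^2) and the exact marginal of
# psi_1 is N(0, 1).  rho, M and all names are illustrative.
rng = np.random.default_rng(0)
rho, M = 0.8, 10_000
s = np.sqrt(1.0 - rho**2)              # conditional standard deviation

psi1, psi2 = 0.0, 0.0
draws2 = np.empty(M)
for j in range(M):                     # two-block Gibbs sampler
    psi1 = rng.normal(rho * psi2, s)   # draw psi_1 | psi_2
    psi2 = rng.normal(rho * psi1, s)   # draw psi_2 | psi_1
    draws2[j] = psi2

# Rao-Blackwellized estimate: average the full conditional density
# pi(psi_1* | psi_2^(j)) over the sampled psi_2^(j), on a grid of ordinates.
grid = np.linspace(-3.0, 3.0, 61)
pi_hat = norm.pdf(grid[:, None], loc=rho * draws2[None, :], scale=s).mean(axis=1)

# Check against the known N(0, 1) marginal; the error shrinks as M grows.
print(np.max(np.abs(pi_hat - norm.pdf(grid))))
```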

[23] refer to this approach as Rao-Blackwellization because of its connection with the Rao-Blackwell theorem in classical statistics. That connection is more clearly seen in the context of estimating, say, the mean of $ \boldsymbol{\psi}_{k}$, $ E(\boldsymbol{\psi}_{k})=\int \boldsymbol{\psi}_{k}\,\pi (\boldsymbol{\psi}_{k})\,d\boldsymbol{\psi}_{k}$. By the law of iterated expectations,

$\displaystyle E(\boldsymbol{\psi}_{k})=E\left\{E(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})\right\}$    

and therefore the estimates

$\displaystyle M^{-1}\sum_{j=1}^{M}\boldsymbol{\psi}_{k}^{(j)}$

and

$\displaystyle M^{-1}\sum_{j=1}^{M}E\left(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)$    

both converge to $ E(\boldsymbol{\psi}_{k})$ as $ M\rightarrow \infty$. Under i.i.d. sampling, and under Markov sampling subject to certain conditions (see [35], [6] and [50]), it can be shown that the variance of the latter estimate is smaller than that of the former. Thus, when the conditional mean $ E(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$ is available, it is better to average it over the draws than to average the draws of $ \boldsymbol{\psi}_{k}$ directly. [23] appeal to this analogy to argue that the Rao-Blackwellized estimate of the density is preferable to one based on kernel smoothing. [11] extends the Rao-Blackwellization approach to the estimation of reduced conditional ordinates, defined as the density of $ \boldsymbol{\psi}_{k}$ conditioned on one or more of the remaining blocks.
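A small simulation sketch of this variance comparison, again under the hypothetical bivariate normal setup used above, where $ E(\boldsymbol{\psi}_{1}\vert\boldsymbol{\psi}_{2})=\rho \boldsymbol{\psi}_{2}$ is available in closed form; the replication scheme and all constants are illustrative, and the variance reduction is what one typically observes in this example rather than a general guarantee.

```python
import numpy as np

# Variance comparison over R independent replications of the Gibbs chain:
# the ergodic average of the psi_1 draws versus the average of the
# conditional means E(psi_1 | psi_2) = rho * psi_2.  Illustrative setup.
rng = np.random.default_rng(1)
rho, M, R = 0.8, 2_000, 200
s = np.sqrt(1.0 - rho**2)

direct = np.empty(R)       # averages of the draws psi_1^(j)
rao_black = np.empty(R)    # averages of E(psi_1 | psi_2^(j))
for r in range(R):
    psi1 = psi2 = 0.0
    sum_draws = sum_cmeans = 0.0
    for j in range(M):
        psi1 = rng.normal(rho * psi2, s)
        psi2 = rng.normal(rho * psi1, s)
        sum_draws += psi1
        sum_cmeans += rho * psi2       # conditional mean, no extra draw
    direct[r] = sum_draws / M
    rao_black[r] = sum_cmeans / M

# Both estimate E(psi_1) = 0; the Rao-Blackwellized version shows the
# smaller sampling variance across replications (here by roughly rho^2).
print(direct.var(), rao_black.var())
```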

Finally, [9] provides an importance-weighted estimate of the marginal density for cases in which the conditional posterior density does not have a known normalizing constant. Chen's estimator is based on the identity

$\displaystyle \pi (\boldsymbol{\psi}_{k}^{\ast})=\int w(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})\,\frac{\pi (\boldsymbol{\psi}_{k}^{\ast},\boldsymbol{\psi}_{-k})}{\pi (\boldsymbol{\psi}_{k},\boldsymbol{\psi}_{-k})}\,\pi (\boldsymbol{\psi})\,d\boldsymbol{\psi}\;,$

where $ w(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$ is a completely known conditional density whose support is equal to the support of the full conditional density $ \pi (\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$. In this form the normalizing constant of the full conditional density is not required. Given a sample of draws $ \{\boldsymbol{\psi}^{(1)}, \ldots, \boldsymbol{\psi}^{(M)}\}$ from $ \pi (\boldsymbol{\psi})$, a Monte Carlo estimate of the marginal density is given by

$\displaystyle \hat{\pi}(\boldsymbol{\psi}_{k}^{\ast})=M^{-1}\sum_{j=1}^{M}w\left(\boldsymbol{\psi}_{k}^{(j)}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\frac{\pi \left(\boldsymbol{\psi}_{k}^{\ast},\boldsymbol{\psi}_{-k}^{(j)}\right)}{\pi \left(\boldsymbol{\psi}_{k}^{(j)},\boldsymbol{\psi}_{-k}^{(j)}\right)}\;.$

[9] discusses the choice of the conditional density $ w$. Since it depends on $ \boldsymbol{\psi}_{-k}$, the choice of $ w$ will vary from one sampled draw to the next.
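As an illustration, the following sketch (not from the original text) applies Chen's estimator in the same hypothetical bivariate normal setting, evaluating the joint density only up to its normalizing constant so that unknown constants cancel in the ratio; the weight density $ w$ is chosen here, for convenience, as the normal $ N(\rho \psi_{2}, 1-\rho^{2})$, one admissible option among many.

```python
import numpy as np
from scipy.stats import norm

# Chen's importance-weighted estimator in the illustrative setting,
# pretending the normalizing constant of pi(psi_1 | psi_2) is unknown:
# only the joint density is evaluated, up to a constant, so unknown
# constants cancel in the numerator/denominator ratio.
rng = np.random.default_rng(2)
rho, M = 0.8, 10_000
s = np.sqrt(1.0 - rho**2)

def log_joint(p1, p2):
    # log pi(psi_1, psi_2) for the bivariate normal, constant dropped
    return -(p1**2 - 2.0 * rho * p1 * p2 + p2**2) / (2.0 * (1.0 - rho**2))

# Gibbs draws from the joint distribution
p1 = p2 = 0.0
p1s, p2s = np.empty(M), np.empty(M)
for j in range(M):
    p1 = rng.normal(rho * p2, s)
    p2 = rng.normal(rho * p1, s)
    p1s[j], p2s[j] = p1, p2

star = 0.5                                     # ordinate psi_1*
# Weight density w(psi_1 | psi_2): a fully known normal on the right support
w = norm.pdf(p1s, loc=rho * p2s, scale=s)
ratio = np.exp(log_joint(star, p2s) - log_joint(p1s, p2s))
pi_hat = np.mean(w * ratio)

print(pi_hat, norm.pdf(star))                  # exact marginal is N(0, 1)
```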

