

3.6 Estimation of Density Ordinates

We mention that if the full conditional densities are available, whether in the context of the multiple-block M-H algorithm or that of the Gibbs sampler, then the MCMC output can be used to estimate posterior marginal density functions ([58,23]). We exploit the fact that the marginal density of $ \boldsymbol{\psi}_{k}$ at the point $ \boldsymbol{\psi}_{k}^{\ast}$ is

$\displaystyle \pi (\boldsymbol{\psi}_{k}^{\ast})=\int \pi (\boldsymbol{\psi}_{k}^{\ast}\vert\boldsymbol{\psi}_{-k})\,\pi (\boldsymbol{\psi}_{-k})\,d\boldsymbol{\psi}_{-k}\;,$

where, as before, $ \boldsymbol{\psi}_{-k}=\boldsymbol{\psi}\backslash \boldsymbol{\psi}_{k}$. Provided the normalizing constant of $ \pi \left(\boldsymbol{\psi}_{k}^{\ast}\vert \boldsymbol{\psi}_{-k}\right)$ is known, an estimate of the marginal density is available as an average of the full conditional density over the simulated values of $ \boldsymbol{\psi}_{-k}$:

$\displaystyle \hat{\pi}(\boldsymbol{\psi}_{k}^{\ast})=M^{-1}\sum_{j=1}^{M}\pi \left(\boldsymbol{\psi} _{k}^{\ast}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\;.$    

Under the assumptions of Proposition 1,

$\displaystyle M^{-1}\sum_{j=1}^{M}\pi \left(\boldsymbol{\psi}_{k}^{\ast}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\rightarrow \pi (\boldsymbol{\psi}_{k}^{\ast})\;,$   as$\displaystyle \quad M\rightarrow \infty \;.$
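To make the construction concrete, the following minimal sketch (not from the original text) applies the estimator to a hypothetical bivariate normal target with correlation $ \rho$, where both full conditionals are known normal densities and the exact marginal is standard normal, so the estimate can be checked directly; all names and the value $ \rho=0.8$ are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical target: bivariate normal with correlation rho, so that
# psi_1 | psi_2 ~ N(rho * psi_2, 1 - rho^2) and the exact marginal of
# psi_1 is N(0, 1).  rho, M and all names are illustrative.
rng = np.random.default_rng(0)
rho, M = 0.8, 10_000
s = np.sqrt(1.0 - rho**2)              # conditional standard deviation

psi1, psi2 = 0.0, 0.0
draws2 = np.empty(M)
for j in range(M):                     # two-block Gibbs sampler
    psi1 = rng.normal(rho * psi2, s)   # draw psi_1 | psi_2
    psi2 = rng.normal(rho * psi1, s)   # draw psi_2 | psi_1
    draws2[j] = psi2

# Rao-Blackwellized estimate: average the full conditional density
# pi(psi_1* | psi_2^(j)) over the sampled psi_2^(j), on a grid of ordinates.
grid = np.linspace(-3.0, 3.0, 61)
pi_hat = norm.pdf(grid[:, None], loc=rho * draws2[None, :], scale=s).mean(axis=1)

# Check against the known N(0, 1) marginal; the error shrinks as M grows.
print(np.max(np.abs(pi_hat - norm.pdf(grid))))
```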

[23] refer to this approach as Rao-Blackwellization because of its connection with the Rao-Blackwell theorem in classical statistics. That connection is more clearly seen in the context of estimating, say, the mean of $ \boldsymbol{\psi}_{k}$, $ E(\boldsymbol{\psi}_{k})=\int \boldsymbol{\psi}_{k}\,\pi (\boldsymbol{\psi}_{k})\,d\boldsymbol{\psi}_{k}$. By the law of iterated expectations,

$\displaystyle E(\boldsymbol{\psi}_{k})=E\left\{E(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})\right\}$    

and therefore the estimates

$\displaystyle M^{-1}\sum_{j=1}^{M}\boldsymbol{\psi}_{k}^{(j)}$

and

$\displaystyle M^{-1}\sum_{j=1}^{M}E\left(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)$    

both converge to $ E(\boldsymbol{\psi}_{k})$ as $ M\rightarrow \infty$. Under i.i.d. sampling, and under Markov sampling subject to certain conditions (see [35], [6] and [50]), it can be shown that the variance of the latter estimate is smaller than that of the former. Thus, when the conditional mean $ E(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$ is available, it is better to average it over the draws than to average the draws of $ \boldsymbol{\psi}_{k}$ directly. [23] appeal to this analogy to argue that the Rao-Blackwellized estimate of the density is preferable to one based on kernel smoothing. [11] extends the Rao-Blackwellization approach to the estimation of reduced conditional ordinates, defined as the density of $ \boldsymbol{\psi}_{k}$ conditioned on one or more of the remaining blocks.
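A small simulation sketch of this variance comparison, again under the hypothetical bivariate normal setup used above, where $ E(\boldsymbol{\psi}_{1}\vert\boldsymbol{\psi}_{2})=\rho \boldsymbol{\psi}_{2}$ is available in closed form; the replication scheme and all constants are illustrative, and the variance reduction is what one typically observes in this example rather than a general guarantee.

```python
import numpy as np

# Variance comparison over R independent replications of the Gibbs chain:
# the ergodic average of the psi_1 draws versus the average of the
# conditional means E(psi_1 | psi_2) = rho * psi_2.  Illustrative setup.
rng = np.random.default_rng(1)
rho, M, R = 0.8, 2_000, 200
s = np.sqrt(1.0 - rho**2)

direct = np.empty(R)       # averages of the draws psi_1^(j)
rao_black = np.empty(R)    # averages of E(psi_1 | psi_2^(j))
for r in range(R):
    psi1 = psi2 = 0.0
    sum_draws = sum_cmeans = 0.0
    for j in range(M):
        psi1 = rng.normal(rho * psi2, s)
        psi2 = rng.normal(rho * psi1, s)
        sum_draws += psi1
        sum_cmeans += rho * psi2       # conditional mean, no extra draw
    direct[r] = sum_draws / M
    rao_black[r] = sum_cmeans / M

# Both estimate E(psi_1) = 0; the Rao-Blackwellized version shows the
# smaller sampling variance across replications (here by roughly rho^2).
print(direct.var(), rao_black.var())
```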

Finally, [9] provides an importance-weighted estimate of the marginal density for cases in which the conditional posterior density does not have a known normalizing constant. Chen's estimator is based on the identity

$\displaystyle \pi (\boldsymbol{\psi}_{k}^{\ast})=\int w(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})\,\frac{\pi (\boldsymbol{\psi}_{k}^{\ast},\boldsymbol{\psi}_{-k})}{\pi (\boldsymbol{\psi}_{k},\boldsymbol{\psi}_{-k})}\,\pi (\boldsymbol{\psi})\,d\boldsymbol{\psi}\;,$

where $ w(\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$ is a completely known conditional density whose support is equal to the support of the full conditional density $ \pi (\boldsymbol{\psi}_{k}\vert\boldsymbol{\psi}_{-k})$. In this form the normalizing constant of the full conditional density is not required. Given a sample of draws $ \{\boldsymbol{\psi}^{(1)}, \ldots, \boldsymbol{\psi}^{(M)}\}$ from $ \pi (\boldsymbol{\psi})$, a Monte Carlo estimate of the marginal density is given by

$\displaystyle \hat{\pi}(\boldsymbol{\psi}_{k}^{\ast})=M^{-1}\sum_{j=1}^{M}w\left(\boldsymbol{\psi}_{k}^{(j)}\vert\boldsymbol{\psi}_{-k}^{(j)}\right)\frac{\pi \left(\boldsymbol{\psi}_{k}^{\ast},\boldsymbol{\psi}_{-k}^{(j)}\right)}{\pi \left(\boldsymbol{\psi}_{k}^{(j)},\boldsymbol{\psi}_{-k}^{(j)}\right)}\;.$

[9] discusses the choice of the conditional density $ w$. Since it depends on $ \boldsymbol{\psi}_{-k}$, the choice of $ w$ will vary from one sampled draw to the next.
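As an illustration, the following sketch (not from the original text) applies Chen's estimator in the same hypothetical bivariate normal setting, evaluating the joint density only up to its normalizing constant so that unknown constants cancel in the ratio; the weight density $ w$ is chosen here, for convenience, as the normal $ N(\rho \psi_{2}, 1-\rho^{2})$, one admissible option among many.

```python
import numpy as np
from scipy.stats import norm

# Chen's importance-weighted estimator in the illustrative setting,
# pretending the normalizing constant of pi(psi_1 | psi_2) is unknown:
# only the joint density is evaluated, up to a constant, so unknown
# constants cancel in the numerator/denominator ratio.
rng = np.random.default_rng(2)
rho, M = 0.8, 10_000
s = np.sqrt(1.0 - rho**2)

def log_joint(p1, p2):
    # log pi(psi_1, psi_2) for the bivariate normal, constant dropped
    return -(p1**2 - 2.0 * rho * p1 * p2 + p2**2) / (2.0 * (1.0 - rho**2))

# Gibbs draws from the joint distribution
p1 = p2 = 0.0
p1s, p2s = np.empty(M), np.empty(M)
for j in range(M):
    p1 = rng.normal(rho * p2, s)
    p2 = rng.normal(rho * p1, s)
    p1s[j], p2s[j] = p1, p2

star = 0.5                                     # ordinate psi_1*
# Weight density w(psi_1 | psi_2): a fully known normal on the right support
w = norm.pdf(p1s, loc=rho * p2s, scale=s)
ratio = np.exp(log_joint(star, p2s) - log_joint(p1s, p2s))
pi_hat = np.mean(w * ratio)

print(pi_hat, norm.pdf(star))                  # exact marginal is N(0, 1)
```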

