Normalization constant estimation via stepping stone

Alexandre Bouchard-Côté

Normalization constant

Goal computing

\[Z = \int \gamma(x) {\text{d}}x\]

Main motivation in Bayesian statistics comparing models via Bayes factors, i.e. ratios of marginal likelihood.

Recall from T8-model-selection-basics.html:

Note in contrast to say computing say a posterior mean, a marginal likelihood is not an expectation, in the sense that \(\ux x\) in not a probability distribution in general. This is why it requires its own set of computational methods.

Background

Suppose

\[\gamma(x) = |\sin(x) e^{-x^2}| {{\bf 1}}[x > 0]\]

We want to approximate

\[Z = \int \gamma(x) {\text{d}}x\]

Poll: which of these converge to \(Z\) (a.s.)

  1. \((1/n) \sum_i |\sin(X_i) e^{-X_i^2}|/e^{-X_i}\) for \(X_i \sim \text{Exp}(1)\)
  2. \((1/n) \sum_i |\sin(X_i) e^{-X_i^2}|/(2 e^{-2 X_i})\) for \(X_i \sim \text{Exp}(2)\)
  3. \((1/n) \sum_i |\sin(X_i) e^{-X_i^2}| {{\bf 1}}[X_i > 0] \sqrt{2\pi}/e^{-X_i^2/2}\) for \(X_i \sim \text{Normal}(0, 1)\)
  4. All of the above
  5. None of the above

Background: importance sampling

We want to approximate

\[Z = \int \gamma(x) {\text{d}}x\]

All of the choices follow the following structure: \(X_i \sim q\), and

\[ \begin{align*} Z = \int \gamma(x) {\text{d}}x &= \int \gamma(x) \frac{q(x)}{q(x)} {\text{d}}x \\ &= \int \frac{\gamma(x)}{q(x)} q(x) {\text{d}}x \\ &= {\mathbf{E}}\left[ \frac{\gamma(X_1)}{q(X_1)} \right] \leftarrow \frac{1}{n} \sum_{i=1}^n \frac{\gamma(X_i)}{q(X_i)} \end{align*} \]

This is just importance sampling!

Note for this to work, \(q\) is assumed to be properly normalized.

Next

Naive method to compute normalization constants with importance sampling

We want to approximate

\[Z = \int \gamma(x) {\text{d}}x\]

\[ \frac{1}{n} \sum_{i=1}^n \frac{\gamma(X_i)}{q(X_i)} = \frac{1}{n} \sum_{i=1}^n \frac{{\text{prior}}(X_i) {\text{likelihood}}(X_i)}{{\text{prior}}(X_i)} = \frac{1}{n} \sum_{i=1}^n {\text{likelihood}}(X_i) \]

Stepping stone: a better way to use importance sampling for normalization constants

Stepping stone: some details

Write:

\[Z = Z_N = \frac{Z_N}{Z_{N-1}} \frac{Z_{N-1}}{Z_{N-2}} \dots \frac{Z_{1}}{Z_{0}} Z_0\]

Estimating \(Z_{i+1}/Z_i\) based on importance sampling:

\[ \begin{align*} \frac{Z_{i+1}}{Z_i} = \frac{1}{C} Z = \int \gamma(x) {\text{d}}x &= \frac{1}{C} \int \gamma(x) \frac{\tilde q(x)}{\tilde q(x)} {\text{d}}x \\ &= \frac{1}{C} \int \frac{\gamma(x)}{\tilde q(x)} \tilde q(x) {\text{d}}x \\ &= \int \frac{\gamma(x)}{\tilde q(x)} q(x) {\text{d}}x \\ &= {\mathbf{E}}\left[ \frac{\gamma(X_1)}{\tilde q(X_1)} \right] \leftarrow \frac{1}{n} \sum_{i=1}^n \frac{\gamma(X_i)}{\tilde q(X_i)} \end{align*} \]

For more information, see: https://pubmed.ncbi.nlm.nih.gov/21187451/

This is the default method used in Blang, which you can find in Plots > monitoringPlots > logNormalizationContantProgress.pdf, as we have seen earlier in the course, see e.g. Model_selection.html