Design of MCMC samplers

Alexandre Bouchard-Côté

How MCMC methods are designed..

Goal build a Markov chain with transitions \(K\) such that:

  1. \(K\) is satisfies global balance with respect to the posterior \(\pi\) (\(K\) is “\(\pi\)-invariant”)
  2. \(K\) is irreducible (for each \(x\), \(x'\), there is a number of steps \(n\) such that we can jump from \(x\) to \(x'\) in \(n\) steps, mathematically, \(K^n(x'|x) > 0\))

If both conditions are satisfied, then we have a law of large numbers.

Main tools used to construct efficient algorithms satisfying conditions (1) and (2) above:

Mixture and alternation: example

Poll: which condition(s) \(K_1\) alone satisfies, if any

  1. \(\pi\)-invariance and irreducibility
  2. \(\pi\)-invariance only
  3. irreducibility only
  4. None

Mixing proposals

“Mixing proposals” consists in the following algorithm (applied iteratively)

Notation \(K = p K_1 + (1-p) K_2\)

Key result if both \(K_1\) and \(K_2\) are \(\pi\)-invariant, then their mixture \(K\) is \(\pi\)-invariant

Take a minute to prove this result. Convince yourself at the same time that the argument crucially depends on \(p\) not varying with the current state.

Proof

\[ \begin{align*} \sum_{x} \pi(x) K(x'|x) &= \sum_{x} \pi(x) (p K_1(x'|x) + (1-p) K_2(x'|x)) \\ &= p \sum_{x} \pi(x) K_1(x'|x) + (1-p) \sum_{x} \pi(x) K_2(x'|x)) \\ &= p \pi(x') + (1-p) \pi(x') = \pi(x') \end{align*} \]

Note we can also show that if \(K_1\) and \(K_2\) satisfy detailed balance, then \(K\) also does:

\[ \begin{align*} \pi(x) K(x'|x) &= \pi(x) (p K_1(x'|x) + (1-p) K_2(x'|x)) \\ &= p \pi(x) K_1(x'|x) + (1-p) \pi(x) K_2(x'|x) \\ &= p \pi(x') K_1(x|x') + (1-p) \pi(x') K_2(x|x') \\ &= \pi(x') K(x|x') \\ \end{align*} \]

Alternation proposals

“Alternation proposals” consists in the following algorithm (applied iteratively)

Notation \(K = K_1 K_2\)

Key result if both \(K_1\) and \(K_2\) are \(\pi\)-invariant, then their alternation \(K\) is \(\pi\)-invariant

Take a minute to prove this result. Hint: the notation \(K_1 K_2\) is inspired by matrix multiplication.

Proof

Note: \(K(x'' | x) = \sum_{x'} K_1(x'|x) K_2(x''|x')\) by the law of total probability. Hence:

\[ \begin{align*} \sum_{x} \pi(x) K(x''|x) &= \sum_{x} \pi(x) \sum_{x'} K_1(x'|x) K_2(x''|x') \\ &= \sum_{x'} K_2(x''|x') \sum_{x} \pi(x) K_1(x'|x) \\ &= \sum_{x'} K_2(x''|x') \pi(x') = \pi(x'') \\ \end{align*} \]

Note in contrast to mixtures, even if \(K_1\) and \(K_2\) satisfy detailed balance, then \(K\) may not necessarily satisfy detailed balance

The surprinsing richness of the “MCMC methodology game”

There is a very small number of rules:

  1. The Metropolis-Hastings algorithm is \(\pi\)-invariant
  2. Auxiliary variables/augmented targets
  3. Mixture and Alternation of kernels preserve \(\pi\)-invariance
  4. Marginalization

But a huge number of MCMC samplers can be constructed based on these very simple rules

Optional readings

For a good introduction to MCMC methodologies, see chapter 1–3 in Geyer’s notes