Lecture 10: Reversible jump MCMC

25 Mar 2015

Instructor: Alexandre Bouchard-Côté
Editor: TBA

Motivation for Reversible jump MCMC (RJMCMC)

Main motivation: model choice.

Recall: notation

  • I: an index over a discrete set of models.
  • \Zscr_i for i\in I: latent space for model i.
  • p_i, \ell_i, m_i: prior, likelihood, and marginal likelihood densities for model i.

Recall: key idea of model choice

Put a prior p on I, and make the uncertainty over models part of the probabilistic model.

The new joint probability density is given by:

\begin{eqnarray} p((i, z), x) = p(i) p_i(z) \ell_i(x | z), \end{eqnarray}

where (i, z) is a member of a new latent space given by:

\begin{eqnarray}\label{eq:new-latent-space} \Zscr = \bigcup_{i\in I} \left( \{i\} \times \Zscr_i \right). \end{eqnarray}
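
To make this concrete, here is a small illustrative instance (my own example, not from the original notes): take I = \{1, 2\}, with \Zscr_1 = (0, \infty) a one-parameter model and \Zscr_2 = (0, \infty)^2 a two-parameter model, so the union mixes components of different dimensions:

\begin{eqnarray} p((1, \lambda), x) = p(1)\, p_1(\lambda)\, \ell_1(x \mid \lambda), \qquad p((2, (\alpha, \beta)), x) = p(2)\, p_2(\alpha, \beta)\, \ell_2(x \mid \alpha, \beta). \end{eqnarray}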

Recall: model saturation

  • Instead of defining the global latent space as a union of each model's latent space, define it as a product space,
  • and add to that an indicator \mu that selects which model to use to explain the data. The event M_1 corresponds to \mu = 1 and M_2, to \mu = 2.

This creates the following auxiliary latent space:

\begin{eqnarray} \Zscr' = \{1, 2\} \times \Zscr_1 \times \Zscr_2. \end{eqnarray}
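
Written out (a sketch added here for clarity, assuming the unused block is padded with its own prior), the saturated joint density is

\begin{eqnarray} p'((\mu, z_1, z_2), x) = p(\mu)\, p_1(z_1)\, p_2(z_2)\, \ell_\mu(x \mid z_\mu), \end{eqnarray}

so the likelihood only looks at the block selected by \mu, and integrating out the unused block recovers the joint density of the previous section.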

Idea, and comparison to model saturation

  • Stay in the union space, but make the dimensionalities of the spaces in the union match.
  • "Pad" with auxiliary iid random variables.

Key advantages:

  • We do not need to instantiate all the auxiliary random variables.
  • Lazy computation: only sample these auxiliary random variables when they are needed.
  • This means we can have an infinite number of auxiliary variables!
  • This becomes important when I is countably infinite, e.g. for non-parametric models.

Towards RJMCMC: an alternative view of standard Metropolis-Hastings (MH).

Recall: the MH ratio allows us to transform a proposal into a Markov chain with a prescribed stationary distribution.

\begin{eqnarray} \frac{\pi(x')}{\pi(x)} \frac{q(x\mid x')}{q(x'\mid x)}. \end{eqnarray}
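
As a reference point for what follows, here is a minimal sketch of one MH step in Python (the names log_pi, propose and log_q are my own, not lecture notation):

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_step(x, log_pi, propose, log_q):
    """One Metropolis-Hastings step.

    log_pi(x)   -- log target density, up to an additive constant
    propose(x)  -- draws a candidate given the current state x
    log_q(a, b) -- log proposal density of a given b, i.e. log q(a | b)
    """
    x_new = propose(x)
    # log of  pi(x') q(x | x') / ( pi(x) q(x' | x) )
    log_ratio = (log_pi(x_new) - log_pi(x)) + (log_q(x, x_new) - log_q(x_new, x))
    if np.log(rng.uniform()) < log_ratio:
        return x_new  # accept
    return x          # reject: keep the current state
```

Both views of the exercise below reduce to this template; they differ only in what plays the role of the state and of the proposal.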

Exercise: (more error-prone than it first looks!)

  • Let the target \pi(v) be the density of an exponential random variable with rate 1.
  • Consider the proposal which, given the current value v, proposes the next candidate v^\star as follows:
    • Sample a multiplier m with density g(m) = 1/(\lambda m) on the interval [e^{-\lambda/2}, e^{\lambda/2}].
    • Return v^\star = m \cdot v.
  • Compute the MH ratio for this proposal.

First view: computing q(v^\star\mid v)...
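
The sketch below works through this first view for the exercise (my own code; LAM plays the role of \lambda and the target is the Exp(1) density). The key step is the change of variables from m to v^\star = m v, which gives q(v^\star \mid v) = g(v^\star / v)/v:

```python
import numpy as np

rng = np.random.default_rng(1)
LAM = 1.0  # the lambda in g(m) = 1/(lambda m) on [exp(-lambda/2), exp(lambda/2)]

def log_pi(v):
    # Exponential(1) target: pi(v) = exp(-v) for v > 0
    return -v

def sample_multiplier():
    # Inverse CDF of g: if U ~ Uniform(0, 1), then m = exp(lambda (U - 1/2)) has density 1/(lambda m)
    return np.exp(LAM * (rng.uniform() - 0.5))

def log_q(v_new, v_old):
    # Change of variables v_new = m v_old: q(v_new | v_old) = g(v_new / v_old) / v_old = 1 / (lambda v_new),
    # provided the implied multiplier lies in [exp(-lambda/2), exp(lambda/2)]
    m = v_new / v_old
    if np.exp(-LAM / 2) <= m <= np.exp(LAM / 2):
        return -np.log(LAM * v_new)
    return -np.inf

def step(v):
    v_new = sample_multiplier() * v
    log_ratio = (log_pi(v_new) - log_pi(v)) + (log_q(v, v_new) - log_q(v_new, v))
    return v_new if np.log(rng.uniform()) < log_ratio else v

# Sanity check: the long-run mean should be close to 1, the mean of Exp(1).
v, total, n = 1.0, 0.0, 100_000
for _ in range(n):
    v = step(v)
    total += v
print(total / n)  # roughly 1.0
```

Carrying out the same computation analytically gives the ratio m e^{-v(m - 1)}; the extra factor of m, coming from the asymmetry of the multiplicative proposal, is the part that is easy to miss.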

Second view:

  • Auxiliary space with states of the form x = (m, v).
  • Two moves:
    • Sample a new value for m \sim g(\cdot). (always accepted)
    • Propose one state according to a deterministic function \Psi(m, v) = (m^\star, v^\star), where m^\star = 1/m and v^\star = mv. (accept-reject)

Questions:

  • What is x^{\star\star} = \Psi(\Psi(x))?...
  • What is the acceptance ratio for the deterministic proposal?...
  • Conditions for that to work?
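
The sketch below (again my own code) implements this second view on the augmented space, with target g(m)\pi(v) for the same Exp(1) exercise; the comments record the answers: \Psi is an involution (\Psi(\Psi(x)) = x), the deterministic move is accepted with probability \min(1, [g(m^\star)\pi(v^\star)/(g(m)\pi(v))] \, |\det J_\Psi(m, v)|), and for this to be valid \Psi must be its own inverse (or be paired with a matching reverse move) and differentiable with non-vanishing Jacobian determinant.

```python
import numpy as np

rng = np.random.default_rng(2)
LAM = 1.0

def log_pi(v):
    return -v  # Exponential(1) target on v > 0

def log_g(m):
    # g(m) = 1/(lambda m) on [exp(-lambda/2), exp(lambda/2)]
    if np.exp(-LAM / 2) <= m <= np.exp(LAM / 2):
        return -np.log(LAM * m)
    return -np.inf

def Psi(m, v):
    # Deterministic proposal; applying it twice returns (m, v), so Psi is an involution.
    return 1.0 / m, m * v

def step(m, v):
    # Move 1: refresh the auxiliary variable m ~ g (a Gibbs update, always accepted).
    m = np.exp(LAM * (rng.uniform() - 0.5))
    # Move 2: deterministic proposal Psi, accepted with probability
    #   min(1, g(m*) pi(v*) / (g(m) pi(v)) * |det J_Psi(m, v)|),
    # where the Jacobian of (m, v) -> (1/m, m v) has absolute determinant 1/m.
    m_new, v_new = Psi(m, v)
    log_ratio = (log_g(m_new) + log_pi(v_new)) - (log_g(m) + log_pi(v)) - np.log(m)
    if np.log(rng.uniform()) < log_ratio:
        m, v = m_new, v_new
    return m, v

# The v-marginal of this chain again targets Exp(1): the long-run mean of v is roughly 1.
m, v, total, n = 1.0, 1.0, 0.0, 100_000
for _ in range(n):
    m, v = step(m, v)
    total += v
print(total / n)  # roughly 1.0
```

Multiplying out g(m^\star)/g(m) = m^2 and the Jacobian factor 1/m recovers exactly the acceptance ratio m e^{-v(m-1)} from the first view.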

RJMCMC

RJMCMC works similarly to the second view of MH, with the difference that:

  • We pad the latent spaces with a variable number of auxiliary variables in order to be able to build diffeomorphic mappings \Psi (more specifically, mappings with non-vanishing Jacobian determinants).
  • We may need more than one \Psi_j, selected at random according to some probabilities \rho_{\cdot\to j}.

Dimensionality matching: a necessary condition for the mapping to be diffeomorphic is that the input dimensionality of \Psi match its output dimensionality.

Consequence: let us say that we want to "jump" from a model with m_1 dimensions into one with m_2 dimensions. What constraints do we have on the number n_1 of auxiliary variables we add to the first model, and the number n_2 we add to the second?
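
For reference, the standard dimension-matching answer: a diffeomorphism must preserve dimension, so the padded spaces must satisfy

\begin{eqnarray} m_1 + n_1 = m_2 + n_2. \end{eqnarray}

In particular, when jumping into the larger model a common choice is n_2 = 0 and n_1 = m_2 - m_1, i.e. only the smaller model gets padded.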

Notation:

  • p(i) prior on model i
  • \pi_i posterior given model i
  • i,i' old and proposed model indices
  • x, x' old and proposed model parameters
  • u_i: auxiliary variables before the move, input into \Psi_j, with density g_i
  • u_{i'}: auxiliary variables after the move, output of \Psi_j, with density g_{i'}

Ratio for RJMCMC:

\begin{eqnarray} \frac{p(i')\pi_{i'}(x')}{p(i)\pi_i(x)} \frac{\rho_{i'\to i}}{\rho_{i\to i'}} \frac{g_{i'}(u_{i'})}{g_{i}(u_{i})} \left| \det J_{\Psi_j}(x, u_i) \right|, \end{eqnarray}

where J_{\Psi_j}(x, u_i) denotes the Jacobian matrix of the mapping (x, u_i) \mapsto (x', u_{i'}).
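
A minimal sketch of how this acceptance ratio might be assembled in code (the function and argument names are my own; the log absolute Jacobian determinant is passed in rather than derived):

```python
import math

def log_rjmcmc_ratio(log_p_new, log_post_new,    # log p(i'), log pi_{i'}(x')
                     log_p_old, log_post_old,    # log p(i),  log pi_i(x)
                     log_rho_rev, log_rho_fwd,   # log rho_{i' -> i}, log rho_{i -> i'}
                     log_g_new, log_g_old,       # log g_{i'}(u_{i'}), log g_i(u_i)
                     log_abs_det_jac):           # log |det J| of (x, u_i) -> (x', u_{i'})
    """Log of the RJMCMC ratio; accept the jump with probability min(1, exp(result))."""
    return ((log_p_new + log_post_new) - (log_p_old + log_post_old)
            + (log_rho_rev - log_rho_fwd)
            + (log_g_new - log_g_old)
            + log_abs_det_jac)

# Made-up numbers for a single proposed jump into a larger model with n_2 = 0
# (no u_{i'} term, so log_g_new = 0):
log_r = log_rjmcmc_ratio(log_p_new=math.log(0.5), log_post_new=-3.2,
                         log_p_old=math.log(0.5), log_post_old=-2.9,
                         log_rho_rev=math.log(0.5), log_rho_fwd=math.log(0.5),
                         log_g_new=0.0, log_g_old=-0.4,
                         log_abs_det_jac=math.log(2.0))
accept_prob = min(1.0, math.exp(log_r))
```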

Example: textbook, page 365.