Lecture 10: Reversible jump MCMC
Instructor: Alexandre Bouchard-Côté
Editor: TBA
Motivation for Reversible jump MCMC (RJMCMC)
Main motivation: model choice.
Recall: notation
- $I$: an index over a discrete set of models.
- $\Zscr_i$ for $i\in I$: latent space for model $i$.
- $p_i$, $\ell_i$, $m_i$: prior, likelihood, and marginal likelihood densities for model $i$.
Recall: key idea of model choice
Put a prior $p$ on $I$, and make the uncertainty over models part of the probabilistic model.
The new joint probability density is given by:
\begin{eqnarray} p((i, z), x) = p(i) p_i(z) \ell_i(x | z), \end{eqnarray}
where $(i, z)$ is a member of a new latent space given by:
\begin{eqnarray}\label{eq:new-latent-space} \Zscr = \bigcup_{i\in I} \left( \{i\} \times \Zscr_i \right), \end{eqnarray}
Recall: model saturation
- Instead of defining the global latent space as a union of each model's latent space, define it as a product space,
- and add to that an indicator $\mu$ that selects which model to use to explain the data. The event $M_1$ corresponds to $\mu = 1$ and $M_2$, to $\mu = 2$.
This creates the following auxiliary latent space:
\begin{eqnarray} \Zscr' = \{1, 2\} \times \Zscr_1 \times \Zscr_2. \end{eqnarray}
Idea, and comparison to model saturation
- Stay in the union space, but make the dimensionality of the space in the union match.
- "Pad" with auxiliary iid random variables.
Key advantage:
- We do not to instantiate all the auxiliary random variable.
- Lazy computation: only sample these auxiliary random variable when they will be needed.
- This means we can have an infinite number of auxiliary variables!
- This becomes important when $I$ is countable infinite, e.g. for non-parametric models.
Towards RJMCMC: an alternate view to standard Metropolis-Hastings (MH).
**Recall: the MH ratio allows to transform a proposal into a Markov chain with a prescribed stationary distribution.
\begin{eqnarray} \frac{\pi(x')}{\pi(x)} \frac{q(x\mid x')}{q(x'\mid x)}. \end{eqnarray}
Exercise: (more error-prone than it first looks!)
- Let $\pi(v)$ be an exponential random variable with rate 1.
- Consider the proposal, which, given a current value of $v^{(i)}$, propose the next candidate $v^\star$ as follows:
- Sample a multiplier $m$ with density $g(m) = 1/(\lambda m)$ on the interval $[1/e^{\lambda/2}, e^{\lambda/2}]$.
- Return $v^\star = m \cdot v$.
- Compute the MH ratio for this proposal.
First view: computing $q(v^\star\mid v)$...
Second view:
- Auxiliary space with states of the form $x = (m, v)$.
- Two moves:
- Sample a new value for $m \sim g(\cdot)$. (always accepted)
- Propose one state according to a deterministic function $\psi(m, v) = (m^\star, v^\star)$, where $m^\star = 1/m$ and $v^\star = mv$. (accept-reject)
Questions:
- What is $x^{\star\star} = \Psi(\Psi(x))$?...
- What is the acceptance ratio for the deterministic proposal?...
- Conditions for that to work?
RJMCMC
RJMCMC works similarly to the second view of MH, with the difference that:
- We pad a variable number of auxiliary variables in order to be able to build diffeomorphic mappings $\Psi$ (more specifically, mappings with non-vanishing Jacobians).
- We many need more than one $\Psi_j$, selected at random according to some probabilities $\rho_{\cdot\to j}$.
Dimensionality matching: a necessary conditions for the mapping to be diffeomorphic is that the input dimensionality of $\Psi$ should match the output dimensionality of $\Psi$.
Consequence: let us say that we want to "jump" from a model with $m_1$ dimensions into one with $m_2$ dimensions. What constraints do we have on the number $n_1$ of auxiliary variables we add to the first model, and the number $n_2$ we add to the second?
Notation:
- $p(i)$ prior on model $i$
- $\pi_i$ posterior given model $i$
- $i,i'$ old and proposed model indices
- $x, x'$ old and proposed model parameters
- $u_i$: auxiliary variables before the move, input into $\Psi_j$, with density $g_i$
- $u_{i'}$: auxiliary variables after the move, output of $\Psi_j$, with density $g_{i'}$
Ratio for RJMCMC:
\begin{eqnarray} \frac{p(i')\pi_{i'}(x')}{p(i)\pi_i(x)} \frac{\rho_{i'\to i}}{\rho_{i\to i'}} \frac{g_{i'}(u_{i'})}{g_{i}(u_{i})} \left| J(x', u_2) \right| \end{eqnarray}
Example: textbook, page 365.