STAT 520 - Bayesian Analysis

Alexandre Bouchard-Côté

4/1/2019

Today

Program

Logistics

Recall: examples of model selection

Recall: model selection notation

\[ m_i = \int p_i(z) \ell_i(z) {\text{d}}z \]

Notation warning: \(m\) is called \(Z\) in a Monte Carlo context, but here \(Z\) will be used for the random variable corresponding to the latent variable \(z\).

Synonyms used for “marginal likelihood”:

Recall: key idea for Bayesian model selection

Put a prior \(p\) on \(I\), and make the uncertainty over models part of the probabilistic model.

The new joint probability density is given by:

\[ p((i, z), x) = p(i) p_i(z) \ell_i(x | z), \]

where \((i, z)\) is a member of a new latent space given by:

\[ {\mathscr{Z}}= \bigcup_{i\in I} \left( \{i\} \times {\mathscr{Z}}_i \right), \]

Notation: denote the event that model \(i\) is the model explaining the data by \(M_i\).

Outcome: Using this construction, model choice can in principle be approached using the same methods as those used last week.

Observations

Graphical modelling: \({\mathscr{Z}}\) cannot be directly expressed as a non-trivial graphical model (since it is not a product space). How to transform it into a graphical model?

Non-regularity: even with the reductions introduced so far, model selection justifies special attentions because of non-regularities: the likelihood depends in a non-smooth way upon the model indicator variable. Importantly, different models have different dimensionality of the latent space. We will see that MCMC then requires special techniques called trans-dimensional MCMC.

Bayes factor

Ratio of the marginal likelihood for two models:

\[ B_{12} = \frac{m_1(x)}{m_2(x)} \]

Values of \(B_{12}\) greater than 1.0 favor model #1 over #2. Values smaller than 1.0 favor #2 over #1.

This is just a reparameterization of the Bayes estimator with an asymmetric 0-1 loss. Note that it is different from a likelihood ratio:

\[ \frac{\sup_{z_1} \ell_1(x|z_1)}{\sup_{z_2} \ell_2(x|z_2)}, \]

which is not used within the Bayesian framework.

Computation of Bayes factor

Many approaches exist, with pros and cons, see for example 19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology

I will outline some key methods:

Model saturation

The idea is to build an augmented model, which can be written as a graphical model, and from which we can still approximate \(m_i(x)\).

Construction of the auxiliary latent space:

This creates the following auxiliary latent space:

\[ {\mathscr{Z}}' = \{1, 2\} \times {\mathscr{Z}}_1 \times {\mathscr{Z}}_2. \]

Example: dim(\({\mathscr{Z}}_1\)) = 1, dim(\({\mathscr{Z}}_2\)) = 2.

Construction of the auxiliary joint distribution: suppose the current state is \((\mu, z_1, z_2, x)\). We need to define an auxiliary joint density \(\tilde p(\mu, z_1, z_2, y)\).

The idea is that when \(\mu = 1\), we explain the data \(x\) using \(z_1\), and when \(\mu = 2\), we explain the data \(x\) using \(z_2\).

In notation, if \(\mu = 1\), we set:

\[ \tilde p(\mu, z_1, z_2, x) = p(\mu) p_1(z_1) p_2(z_2) \ell_1(x | z_1), \]

and if \(\mu = 2\),

\[ \tilde p(\mu, z_1, z_2, x) = p(\mu) p_1(z_1) p_2(z_2) \ell_2(x | z_2). \]

Pro: can use existing MCMC methods that do not have marginal likelihood computation methods, such as JAGS/Stan .

Cons:

Thermodynamic integration

We start with the following identity, which holds under the assumption that we can swap a derivative and an integral (see for example Folland, Real Analysis):

\[ \begin{aligned} \log C_1 &= \log C_1 - \log C_0 \\ &=\int_0^1 \frac{{\text{d}}}{{\text{d}}t} \log C_t \\ &= \int_0^1 \frac{1}{C_t} \frac{{\text{d}}}{{\text{d}}t} C_t \\ &= \int_0^1 \frac{1}{C_t} \frac{{\text{d}}}{{\text{d}}t} \int p(z) \exp(t \log \ell(x|z)) \\ &= \int_0^1 \int \frac{1}{C_t} p(z) \frac{{\text{d}}}{{\text{d}}t} \exp(t \log \ell(x|z)) \\ &= \int_0^1 \int \frac{1}{C_t} p(z) \exp(t \log \ell(x|z)) \log \ell(x|z) \\ &= \int_0^1 \int \pi_t(z) \log \ell(x|z) \\ &= \int_0^1 g(t) \end{aligned} \]

SMC

See files/SMC.pdf

Reversible jump

Key advantage:

Reversible jump algorithm

Dimensionality matching: a necessary conditions for the mapping to be diffeomorphic is that the input dimensionality of \(\Psi\) should match the output dimensionality of \(\Psi\).

Consequence: let us say that we want to “jump” from a model with \(m_1\) dimensions into one with \(m_2\) dimensions. What constraints do we have on the number \(n_1\) of auxiliary variables we add to the first model, and the number \(n_2\) we add to the second?

Notation:

Ratio for RJMCMC:

\[ \frac{p(i')\pi_{i'}(x')}{p(i)\pi_i(x)} \frac{\rho_{i'\to i}}{\rho_{i\to i'}} \frac{g_{i'}(u_{i'})}{g_{i}(u_{i})} \left| J(x', u_2) \right| \]

Example: textbook, page 365.

Office hour materials

Set-up

Upgrading Blang (optional)

Hierarchical model for Ariane rockets

Setting up the code

--model.data data/failure_counts.csv 
--model.rocketTypes.name LV.Type
--postProcessor DefaultPostProcessor
--postProcessor.imageFormat pdf
--engine PT 
--engine.nChains 10

High-level overview of Blang

See files/blang-high-level.pdf

Running the code

./gradlew clean
./gradlew installDist
java -Xmx2g -cp build/install/blang520Assign/lib/\\* demo.ArianeHierarchical [COMMAND LINE OPTIONS]

Some quick experimentation with this model

Mini-exercise:

Interpreting the standard out messages

We will go over the meaning of the output below in the office hours.

Note: data source data/failure_counts.csv does not contain column failureprobabilities --- treating as missing
WARNING: There were files not up to date in the code repository (see /Users/bouchard/w/blangExample/results/all/2019-04-01-12-43-49-TokB5Hso.exec/executionInfo/code/dirty-files.txt)
Preprocess {
  17 samplers constructed with following prototypes:
    RealScalar sampled via: [RealSliceSampler]
  Initialization {
    Warning: small concentrations may cause numeric instability to Dirichlet and Beta distributions. Consider enforcing a lower bound of say 0.5 This message may also occur when slice samling outside of such constraint, you can then ignore this message. 
    Propagation [ annealParam=0.0 ess=0.900 ]
    Propagation [ annealParam=0.001 ess=0.736 ]
    Propagation [ annealParam=0.004 ess=0.563 ]
    Propagation [ annealParam=0.008 ess=0.410 ]
    Resampling [ iter=3 annealParam=0.014 logNormalization=-1.678 ]
    Propagation [ annealParam=0.014 ess=0.900 ]
    Propagation [ annealParam=0.022 ess=0.717 ]
    Propagation [ annealParam=0.034 ess=0.548 ]
    Propagation [ annealParam=0.051 ess=0.413 ]
    Resampling [ iter=7 annealParam=0.082 logNormalization=-4.323 ]
    Propagation [ annealParam=0.082 ess=0.958 ]
    Propagation [ annealParam=0.111 ess=0.793 ]
    Propagation [ annealParam=0.179 ess=0.677 ]
    Propagation [ annealParam=0.222 ess=0.463 ]
    Resampling [ iter=0 annealParam=0.305 logNormalization=-9.731 ]
    Propagation [ annealParam=0.305 ess=0.992 ]
    Propagation [ annealParam=0.333 ess=0.849 ]
    Propagation [ annealParam=0.434 ess=0.827 ]
    Propagation [ annealParam=0.444 ess=0.596 ]
    Propagation [ annealParam=0.556 ess=0.414 ]
    Resampling [ iter=0 annealParam=0.667 logNormalization=-17.196 ]
    Propagation [ annealParam=0.667 ess=0.900 ]
    Propagation [ annealParam=0.771 ess=0.889 ]
    Propagation [ annealParam=0.778 ess=0.707 ]
    Propagation [ annealParam=0.889 ess=0.548 ]
  } [ endingBlock=Initialization blockTime=338.4ms blockNErrors=1 ]
} [ endingBlock=Preprocess blockTime=587.1ms blockNErrors=1 ]
Inference {
  Round(1/9) {
    Performing 1220 moves... [ nScans=2 nChains=10 movesPerScan=61 ]
    swapSummaries [ round=0 lowest=0.0 average=0.375 ]
    logNormalizationContantProgress [ round=0 value=-142.950 ]
    estimatedLambda [ round=0 value=5.622 ]
    actualTemperedRestarts [ round=0 count=0 rate=0.0 ]
    asymptoticRoundTripBound [ round=0 count=0.151 rate=0.076 ]
  } [ endingBlock=Round(1/9) blockTime=71.62ms blockNErrors=0 ]
  Round(2/9) {
    Performing 2440 moves... [ nScans=4 nChains=10 movesPerScan=61 ]
    swapSummaries [ round=1 lowest=0.100 average=0.477 ]
    logNormalizationContantProgress [ round=1 value=-17.644 ]
    estimatedLambda [ round=1 value=4.710 ]
    actualTemperedRestarts [ round=1 count=0 rate=0.0 ]
    asymptoticRoundTripBound [ round=1 count=0.350 rate=0.088 ]
  } [ endingBlock=Round(2/9) blockTime=81.04ms blockNErrors=0 ]
...
  Round(8/9) {
    Performing 153110 moves... [ nScans=251 nChains=10 movesPerScan=61 ]
    swapSummaries [ round=7 lowest=0.595 average=0.668 ]
    logNormalizationContantProgress [ round=7 value=-20.780 ]
    estimatedLambda [ round=7 value=2.992 ]
    actualTemperedRestarts [ round=7 count=11 rate=0.044 ]
    asymptoticRoundTripBound [ round=7 count=31.436 rate=0.125 ]
  } [ endingBlock=Round(8/9) blockTime=811.1ms blockNErrors=0 ]
  Round(9/9) {
    Performing 305000 moves... [ nScans=500 nChains=10 movesPerScan=61 ]
    swapSummaries [ round=8 lowest=0.574 average=0.658 ]
    logNormalizationContantProgress [ round=8 value=-20.692 ]
    estimatedLambda [ round=8 value=3.075 ]
    actualTemperedRestarts [ round=8 count=34 rate=0.068 ]
    asymptoticRoundTripBound [ round=8 count=61.347 rate=0.123 ]
  } [ endingBlock=Round(9/9) blockTime=1.473s blockNErrors=0 ]
} [ endingBlock=Inference blockTime=3.505s blockNErrors=0 ]
Postprocess {
  Post-processing allLogDensities
  Post-processing energy
  Post-processing failureProbabilities
  Post-processing logDensity
  Post-processing m
  Post-processing s
  MC diagnostics
} [ endingBlock=Postprocess blockTime=27.55s blockNErrors=0 ]
executionMilliseconds : 31649
outputFolder : /Users/bouchard/w/blangExample/results/all/2019-04-01-12-43-49-TokB5Hso.exec

Looking at the files produced

Look at results/latest. We will go over those in the office hours. A copy of what is expected is available here.

Understanding the command line options

Add --help to the options. You should see the following, which we will go over:

Note: data source data/failure_counts.csv does not contain column failureprobabilities --- treating as missing
 --checkIsDAG <boolean> (default value: true)

 --engine <PosteriorInferenceEngine: SCM|PT|Forward|Exact|None|fully qualified> (default value: SCM)

 --engine.adaptFraction <double> (default value: 0.5)

 --engine.initialization <InitType: COPIES|FORWARD|SCM> (default value: SCM)

 --engine.ladder <TemperatureLadder: Geometric|EquallySpaced|Polynomial|UserSpecified|fully qualified> (default value: EquallySpaced)

 --engine.nChains <Integer> (optional)
   description: If unspecified, use the number of threads.

 --engine.nPassesPerScan <double> (default value: 3)

 --engine.nScans <int> (default value: 1_000)

 --engine.nThreads <Cores: Single|Max|Dynamic|Fixed|fully qualified> (default value: Dynamic)

 --engine.nThreads.fraction <double> (default value: 0.5)

 --engine.nThreads.ignoreUtilizedCores <boolean> (default value: true)

 --engine.nThreads.verbose <boolean> (default value: false)

 --engine.random <Random> (default value: 1)

 --engine.reversible <boolean> (default value: false)

 --engine.scmInit.maxAnnealingParameter <double> (default value: 1.0)
   description: Use higher values for likelihood maximization

 --engine.scmInit.nFinalRejuvenations <int> (default value: 5)
   description: Number of rejuvenation passes to do after the change of measure.

 --engine.scmInit.nParticles <int> (default value: 1_000)

 --engine.scmInit.nThreads <Cores: Single|Max|Dynamic|Fixed|fully qualified> (default value: Dynamic)

 --engine.scmInit.nThreads.fraction <double> (default value: 0.5)

 --engine.scmInit.nThreads.ignoreUtilizedCores <boolean> (default value: true)

 --engine.scmInit.nThreads.verbose <boolean> (default value: false)

 --engine.scmInit.random <Random> (default value: 1)
   description: Random seed used for proposals and resampling.

 --engine.scmInit.resamplingESSThreshold <double> (default value: 0.5)
   description: If the (relative) Effective Sample Size (ESS) falls below, perform a resampling round.

 --engine.scmInit.resamplingScheme <ResamplingScheme: STRATIFIED|MULTINOMIAL> (default value: STRATIFIED)

 --engine.scmInit.temperatureSchedule <TemperatureSchedule: AdaptiveTemperatureSchedule|FixedTemperatureSchedule|fully qualified> (default value: AdaptiveTemperatureSchedule)
   description: Algorithm selecting annealing parameter increments.

 --engine.scmInit.temperatureSchedule.nudgeFromZeroIfOutOfSupport <double> (default value: 1e-10)
   description: If all particles are out of support at first iteration, nudge the temperature a bit so that support constraints kick in.

 --engine.scmInit.temperatureSchedule.threshold <double> (default value: 0.9999)
   description: Annealing parameter is selected to get the (conditional) ESS decrease specified by this parameter.

 --engine.scmInit.temperatureSchedule.useConditional <boolean> (default value: true)
   description: See Zhou, Johansen and Aston (2013).

 --engine.targetAccept <Double> (optional)

 --engine.usePriorSamples <boolean> (default value: true)

 --excludeFromOutput <List: Space separated items or "file <path>" to load from newline separated file> (optional)

 --experimentConfigs.configFile <File> (optional)
   description: If set, use those arguments in provided file that do not appear in the provided arguments.

 --experimentConfigs.managedExecutionFolder <boolean> (default value: true)
   description: Automatically organize results into subdirectories of 'results/all'?

 --experimentConfigs.maxIndentationToPrint <int> (default value: inf)
   description: Use -1 to silence all HLogs output

 --experimentConfigs.recordExecutionInfo <boolean> (default value: true)
   description: Record information such as timing, main class, code version, etc for this run?

 --experimentConfigs.recordGitInfo <boolean> (default value: true)

 --experimentConfigs.saveStandardStreams <boolean> (default value: true)
   description: Save combined standard out and err into a file?

 --experimentConfigs.tabularWriter <TabularWriterFactory: CSV|Spark|fully qualified> (default value: CSV)

 --initRandom <Random> (default value: 1)

 --model <ModelBuilder: fully qualified>

 --model.data <GlobalDataSource: Path to the DataSource.>

 --model.data.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)

 --model.data.reader.commentCharacter <Character> (optional)

 --model.data.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)

 --model.data.reader.separator <char> (default value: ,)

 --model.data.reader.strictQuotes <boolean> (default value: false)

 --model.failureProbabilities.dataSource <DataSource: Path to the DataSource.>

 --model.failureProbabilities.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)

 --model.failureProbabilities.dataSource.reader.commentCharacter <Character> (optional)

 --model.failureProbabilities.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)

 --model.failureProbabilities.dataSource.reader.separator <char> (default value: ,)

 --model.failureProbabilities.dataSource.reader.strictQuotes <boolean> (default value: false)

 --model.failureProbabilities.name <ColumnName> (optional)
   description: Name of variable in the plate

 --model.m <RealVar: A number or NA> (optional)

 --model.numberOfFailures.dataSource <DataSource: Path to the DataSource.>

 --model.numberOfFailures.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)

 --model.numberOfFailures.dataSource.reader.commentCharacter <Character> (optional)

 --model.numberOfFailures.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)

 --model.numberOfFailures.dataSource.reader.separator <char> (default value: ,)

 --model.numberOfFailures.dataSource.reader.strictQuotes <boolean> (default value: false)

 --model.numberOfFailures.name <ColumnName> (optional)
   description: Name of variable in the plate

 --model.numberOfLaunches.dataSource <DataSource: Path to the DataSource.>

 --model.numberOfLaunches.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)

 --model.numberOfLaunches.dataSource.reader.commentCharacter <Character> (optional)

 --model.numberOfLaunches.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)

 --model.numberOfLaunches.dataSource.reader.separator <char> (default value: ,)

 --model.numberOfLaunches.dataSource.reader.strictQuotes <boolean> (default value: false)

 --model.numberOfLaunches.name <ColumnName> (optional)
   description: Name of variable in the plate

 --model.rocketTypes.dataSource <DataSource: Path to the DataSource.>

 --model.rocketTypes.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)

 --model.rocketTypes.dataSource.reader.commentCharacter <Character> (optional)

 --model.rocketTypes.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)

 --model.rocketTypes.dataSource.reader.separator <char> (default value: ,)

 --model.rocketTypes.dataSource.reader.strictQuotes <boolean> (default value: false)

 --model.rocketTypes.maxSize <Integer> (optional)

 --model.rocketTypes.name <ColumnName> (optional)

 --model.s <RealVar: A number or NA> (optional)

 --model.useHierarchical <Boolean> (optional)

 --postProcessor <PostProcessor: DefaultPostProcessor|NoPostProcessor|fully qualified> (default value: NoPostProcessor)

 --postProcessor.blangExecutionDirectory <File> (optional)
   description: When called from Blang, this will be the latest run, otherwise point to the .exec folder created by Blang

 --postProcessor.burnInFraction <double> (default value: 0.5)

 --postProcessor.essEstimator <EssEstimator: BATCH|ACT|AR> (default value: BATCH)

 --postProcessor.experimentConfigs.configFile <File> (optional)
   description: If set, use those arguments in provided file that do not appear in the provided arguments.

 --postProcessor.experimentConfigs.managedExecutionFolder <boolean> (default value: true)
   description: Automatically organize results into subdirectories of 'results/all'?

 --postProcessor.experimentConfigs.maxIndentationToPrint <int> (default value: inf)
   description: Use -1 to silence all HLogs output

 --postProcessor.experimentConfigs.recordExecutionInfo <boolean> (default value: true)
   description: Record information such as timing, main class, code version, etc for this run?

 --postProcessor.experimentConfigs.recordGitInfo <boolean> (default value: true)

 --postProcessor.experimentConfigs.saveStandardStreams <boolean> (default value: true)
   description: Save combined standard out and err into a file?

 --postProcessor.experimentConfigs.tabularWriter <TabularWriterFactory: CSV|Spark|fully qualified> (default value: CSV)

 --postProcessor.facetHeight <double> (default value: 2.0)
   description: In inches

 --postProcessor.facetWidth <double> (default value: 4.0)
   description: In inches

 --postProcessor.imageFormat <String> (default value: png)

 --postProcessor.rCmd <String> (default value: Rscript)

 --printAccessibilityGraph <boolean> (default value: false)

 --samplers.additional <SamplerSet: Fully qualified instances of blang.mcmc.Sampler>
   description: Samplers to be added.

 --samplers.excluded <SamplerSet: Fully qualified instances of blang.mcmc.Sampler>
   description: Samplers to be excluded (only useful if useAnnotation = true).

 --samplers.useAnnotation <boolean> (default value: true)
   description: If the arguments of the annotations @Samplers should be used to determine a starting set of sampler types.

 --stripped <boolean> (default value: false)
   description: Stripped means that the construction of forward simulators and annealers is skipped

 --treatNaNAsNegativeInfinity <boolean> (default value: false)

 --version <String> (optional)
   description: Version of the blang SDK to use (see https://github.com/UBC-Stat-ML/blangSDK/releases), of the form of a git tag x.y.z where x >= 2. If omitted, use the local SDK's 'master' version.