Alexandre Bouchard-Côté
4/1/2019
\[ m_i = \int p_i(z) \ell_i(z) {\text{d}}z \]
Notation warning: \(m\) is called \(Z\) in a Monte Carlo context, but here \(Z\) will be used for the random variable corresponding to the latent variable \(z\).
Synonyms used for “marginal likelihood”:
Put a prior \(p\) on \(I\), and make the uncertainty over models part of the probabilistic model.
The new joint probability density is given by:
\[ p((i, z), x) = p(i) p_i(z) \ell_i(x | z), \]
where \((i, z)\) is a member of a new latent space given by:
\[ {\mathscr{Z}}= \bigcup_{i\in I} \left( \{i\} \times {\mathscr{Z}}_i \right), \]
Notation: denote the event that model \(i\) is the model explaining the data by \(M_i\).
Outcome: Using this construction, model choice can in principle be approached using the same methods as those used last week.
Graphical modelling: \({\mathscr{Z}}\) cannot be directly expressed as a non-trivial graphical model (since it is not a product space). How to transform it into a graphical model?
Non-regularity: even with the reductions introduced so far, model selection justifies special attentions because of non-regularities: the likelihood depends in a non-smooth way upon the model indicator variable. Importantly, different models have different dimensionality of the latent space. We will see that MCMC then requires special techniques called trans-dimensional MCMC.
Ratio of the marginal likelihood for two models:
\[ B_{12} = \frac{m_1(x)}{m_2(x)} \]
Values of \(B_{12}\) greater than 1.0 favor model #1 over #2. Values smaller than 1.0 favor #2 over #1.
This is just a reparameterization of the Bayes estimator with an asymmetric 0-1 loss. Note that it is different from a likelihood ratio:
\[ \frac{\sup_{z_1} \ell_1(x|z_1)}{\sup_{z_2} \ell_2(x|z_2)}, \]
which is not used within the Bayesian framework.
Many approaches exist, with pros and cons, see for example 19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology
I will outline some key methods:
The idea is to build an augmented model, which can be written as a graphical model, and from which we can still approximate \(m_i(x)\).
Construction of the auxiliary latent space:
This creates the following auxiliary latent space:
\[ {\mathscr{Z}}' = \{1, 2\} \times {\mathscr{Z}}_1 \times {\mathscr{Z}}_2. \]
Example: dim(\({\mathscr{Z}}_1\)) = 1, dim(\({\mathscr{Z}}_2\)) = 2.
Construction of the auxiliary joint distribution: suppose the current state is \((\mu, z_1, z_2, x)\). We need to define an auxiliary joint density \(\tilde p(\mu, z_1, z_2, y)\).
The idea is that when \(\mu = 1\), we explain the data \(x\) using \(z_1\), and when \(\mu = 2\), we explain the data \(x\) using \(z_2\).
In notation, if \(\mu = 1\), we set:
\[ \tilde p(\mu, z_1, z_2, x) = p(\mu) p_1(z_1) p_2(z_2) \ell_1(x | z_1), \]
and if \(\mu = 2\),
\[ \tilde p(\mu, z_1, z_2, x) = p(\mu) p_1(z_1) p_2(z_2) \ell_2(x | z_2). \]
Pro: can use existing MCMC methods that do not have marginal likelihood computation methods, such as JAGS/Stan .
Cons:
We start with the following identity, which holds under the assumption that we can swap a derivative and an integral (see for example Folland, Real Analysis):
\[ \begin{aligned} \log C_1 &= \log C_1 - \log C_0 \\ &=\int_0^1 \frac{{\text{d}}}{{\text{d}}t} \log C_t \\ &= \int_0^1 \frac{1}{C_t} \frac{{\text{d}}}{{\text{d}}t} C_t \\ &= \int_0^1 \frac{1}{C_t} \frac{{\text{d}}}{{\text{d}}t} \int p(z) \exp(t \log \ell(x|z)) \\ &= \int_0^1 \int \frac{1}{C_t} p(z) \frac{{\text{d}}}{{\text{d}}t} \exp(t \log \ell(x|z)) \\ &= \int_0^1 \int \frac{1}{C_t} p(z) \exp(t \log \ell(x|z)) \log \ell(x|z) \\ &= \int_0^1 \int \pi_t(z) \log \ell(x|z) \\ &= \int_0^1 g(t) \end{aligned} \]
See files/SMC.pdf
Key advantage:
Dimensionality matching: a necessary conditions for the mapping to be diffeomorphic is that the input dimensionality of \(\Psi\) should match the output dimensionality of \(\Psi\).
Consequence: let us say that we want to “jump” from a model with \(m_1\) dimensions into one with \(m_2\) dimensions. What constraints do we have on the number \(n_1\) of auxiliary variables we add to the first model, and the number \(n_2\) we add to the second?
Notation:
Ratio for RJMCMC:
\[ \frac{p(i')\pi_{i'}(x')}{p(i)\pi_i(x)} \frac{\rho_{i'\to i}}{\rho_{i\to i'}} \frac{g_{i'}(u_{i'})}{g_{i}(u_{i})} \left| J(x', u_2) \right| \]
Example: textbook, page 365.
build.gradle
locate the line that starts with compile group: 'ca.ubc.stat', name: 'blangSDK', version: ...
./gradlew clean
then ./gradlew installDist
./gradlew eclipse
then in eclipse,
Refresh
Project > Clean
ArianeHierarchical.bl
the rootsrc/java/demo/ArianeHierarchical.bl
configuration.txt
--model.data data/failure_counts.csv
--model.rocketTypes.name LV.Type
--postProcessor DefaultPostProcessor
--postProcessor.imageFormat pdf
--engine PT
--engine.nChains 10
See files/blang-high-level.pdf
./gradlew clean
./gradlew installDist
java -Xmx2g -cp build/install/blang520Assign/lib/\\* demo.ArianeHierarchical [COMMAND LINE OPTIONS]
Mini-exercise:
--model.useHierarchical false
, the other from ArianeFlat.bl
available hereWe will go over the meaning of the output below in the office hours.
Note: data source data/failure_counts.csv does not contain column failureprobabilities --- treating as missing
WARNING: There were files not up to date in the code repository (see /Users/bouchard/w/blangExample/results/all/2019-04-01-12-43-49-TokB5Hso.exec/executionInfo/code/dirty-files.txt)
Preprocess {
17 samplers constructed with following prototypes:
RealScalar sampled via: [RealSliceSampler]
Initialization {
Warning: small concentrations may cause numeric instability to Dirichlet and Beta distributions. Consider enforcing a lower bound of say 0.5 This message may also occur when slice samling outside of such constraint, you can then ignore this message.
Propagation [ annealParam=0.0 ess=0.900 ]
Propagation [ annealParam=0.001 ess=0.736 ]
Propagation [ annealParam=0.004 ess=0.563 ]
Propagation [ annealParam=0.008 ess=0.410 ]
Resampling [ iter=3 annealParam=0.014 logNormalization=-1.678 ]
Propagation [ annealParam=0.014 ess=0.900 ]
Propagation [ annealParam=0.022 ess=0.717 ]
Propagation [ annealParam=0.034 ess=0.548 ]
Propagation [ annealParam=0.051 ess=0.413 ]
Resampling [ iter=7 annealParam=0.082 logNormalization=-4.323 ]
Propagation [ annealParam=0.082 ess=0.958 ]
Propagation [ annealParam=0.111 ess=0.793 ]
Propagation [ annealParam=0.179 ess=0.677 ]
Propagation [ annealParam=0.222 ess=0.463 ]
Resampling [ iter=0 annealParam=0.305 logNormalization=-9.731 ]
Propagation [ annealParam=0.305 ess=0.992 ]
Propagation [ annealParam=0.333 ess=0.849 ]
Propagation [ annealParam=0.434 ess=0.827 ]
Propagation [ annealParam=0.444 ess=0.596 ]
Propagation [ annealParam=0.556 ess=0.414 ]
Resampling [ iter=0 annealParam=0.667 logNormalization=-17.196 ]
Propagation [ annealParam=0.667 ess=0.900 ]
Propagation [ annealParam=0.771 ess=0.889 ]
Propagation [ annealParam=0.778 ess=0.707 ]
Propagation [ annealParam=0.889 ess=0.548 ]
} [ endingBlock=Initialization blockTime=338.4ms blockNErrors=1 ]
} [ endingBlock=Preprocess blockTime=587.1ms blockNErrors=1 ]
Inference {
Round(1/9) {
Performing 1220 moves... [ nScans=2 nChains=10 movesPerScan=61 ]
swapSummaries [ round=0 lowest=0.0 average=0.375 ]
logNormalizationContantProgress [ round=0 value=-142.950 ]
estimatedLambda [ round=0 value=5.622 ]
actualTemperedRestarts [ round=0 count=0 rate=0.0 ]
asymptoticRoundTripBound [ round=0 count=0.151 rate=0.076 ]
} [ endingBlock=Round(1/9) blockTime=71.62ms blockNErrors=0 ]
Round(2/9) {
Performing 2440 moves... [ nScans=4 nChains=10 movesPerScan=61 ]
swapSummaries [ round=1 lowest=0.100 average=0.477 ]
logNormalizationContantProgress [ round=1 value=-17.644 ]
estimatedLambda [ round=1 value=4.710 ]
actualTemperedRestarts [ round=1 count=0 rate=0.0 ]
asymptoticRoundTripBound [ round=1 count=0.350 rate=0.088 ]
} [ endingBlock=Round(2/9) blockTime=81.04ms blockNErrors=0 ]
...
Round(8/9) {
Performing 153110 moves... [ nScans=251 nChains=10 movesPerScan=61 ]
swapSummaries [ round=7 lowest=0.595 average=0.668 ]
logNormalizationContantProgress [ round=7 value=-20.780 ]
estimatedLambda [ round=7 value=2.992 ]
actualTemperedRestarts [ round=7 count=11 rate=0.044 ]
asymptoticRoundTripBound [ round=7 count=31.436 rate=0.125 ]
} [ endingBlock=Round(8/9) blockTime=811.1ms blockNErrors=0 ]
Round(9/9) {
Performing 305000 moves... [ nScans=500 nChains=10 movesPerScan=61 ]
swapSummaries [ round=8 lowest=0.574 average=0.658 ]
logNormalizationContantProgress [ round=8 value=-20.692 ]
estimatedLambda [ round=8 value=3.075 ]
actualTemperedRestarts [ round=8 count=34 rate=0.068 ]
asymptoticRoundTripBound [ round=8 count=61.347 rate=0.123 ]
} [ endingBlock=Round(9/9) blockTime=1.473s blockNErrors=0 ]
} [ endingBlock=Inference blockTime=3.505s blockNErrors=0 ]
Postprocess {
Post-processing allLogDensities
Post-processing energy
Post-processing failureProbabilities
Post-processing logDensity
Post-processing m
Post-processing s
MC diagnostics
} [ endingBlock=Postprocess blockTime=27.55s blockNErrors=0 ]
executionMilliseconds : 31649
outputFolder : /Users/bouchard/w/blangExample/results/all/2019-04-01-12-43-49-TokB5Hso.exec
Look at results/latest
. We will go over those in the office hours. A copy of what is expected is available here.
Add --help
to the options. You should see the following, which we will go over:
Note: data source data/failure_counts.csv does not contain column failureprobabilities --- treating as missing
--checkIsDAG <boolean> (default value: true)
--engine <PosteriorInferenceEngine: SCM|PT|Forward|Exact|None|fully qualified> (default value: SCM)
--engine.adaptFraction <double> (default value: 0.5)
--engine.initialization <InitType: COPIES|FORWARD|SCM> (default value: SCM)
--engine.ladder <TemperatureLadder: Geometric|EquallySpaced|Polynomial|UserSpecified|fully qualified> (default value: EquallySpaced)
--engine.nChains <Integer> (optional)
description: If unspecified, use the number of threads.
--engine.nPassesPerScan <double> (default value: 3)
--engine.nScans <int> (default value: 1_000)
--engine.nThreads <Cores: Single|Max|Dynamic|Fixed|fully qualified> (default value: Dynamic)
--engine.nThreads.fraction <double> (default value: 0.5)
--engine.nThreads.ignoreUtilizedCores <boolean> (default value: true)
--engine.nThreads.verbose <boolean> (default value: false)
--engine.random <Random> (default value: 1)
--engine.reversible <boolean> (default value: false)
--engine.scmInit.maxAnnealingParameter <double> (default value: 1.0)
description: Use higher values for likelihood maximization
--engine.scmInit.nFinalRejuvenations <int> (default value: 5)
description: Number of rejuvenation passes to do after the change of measure.
--engine.scmInit.nParticles <int> (default value: 1_000)
--engine.scmInit.nThreads <Cores: Single|Max|Dynamic|Fixed|fully qualified> (default value: Dynamic)
--engine.scmInit.nThreads.fraction <double> (default value: 0.5)
--engine.scmInit.nThreads.ignoreUtilizedCores <boolean> (default value: true)
--engine.scmInit.nThreads.verbose <boolean> (default value: false)
--engine.scmInit.random <Random> (default value: 1)
description: Random seed used for proposals and resampling.
--engine.scmInit.resamplingESSThreshold <double> (default value: 0.5)
description: If the (relative) Effective Sample Size (ESS) falls below, perform a resampling round.
--engine.scmInit.resamplingScheme <ResamplingScheme: STRATIFIED|MULTINOMIAL> (default value: STRATIFIED)
--engine.scmInit.temperatureSchedule <TemperatureSchedule: AdaptiveTemperatureSchedule|FixedTemperatureSchedule|fully qualified> (default value: AdaptiveTemperatureSchedule)
description: Algorithm selecting annealing parameter increments.
--engine.scmInit.temperatureSchedule.nudgeFromZeroIfOutOfSupport <double> (default value: 1e-10)
description: If all particles are out of support at first iteration, nudge the temperature a bit so that support constraints kick in.
--engine.scmInit.temperatureSchedule.threshold <double> (default value: 0.9999)
description: Annealing parameter is selected to get the (conditional) ESS decrease specified by this parameter.
--engine.scmInit.temperatureSchedule.useConditional <boolean> (default value: true)
description: See Zhou, Johansen and Aston (2013).
--engine.targetAccept <Double> (optional)
--engine.usePriorSamples <boolean> (default value: true)
--excludeFromOutput <List: Space separated items or "file <path>" to load from newline separated file> (optional)
--experimentConfigs.configFile <File> (optional)
description: If set, use those arguments in provided file that do not appear in the provided arguments.
--experimentConfigs.managedExecutionFolder <boolean> (default value: true)
description: Automatically organize results into subdirectories of 'results/all'?
--experimentConfigs.maxIndentationToPrint <int> (default value: inf)
description: Use -1 to silence all HLogs output
--experimentConfigs.recordExecutionInfo <boolean> (default value: true)
description: Record information such as timing, main class, code version, etc for this run?
--experimentConfigs.recordGitInfo <boolean> (default value: true)
--experimentConfigs.saveStandardStreams <boolean> (default value: true)
description: Save combined standard out and err into a file?
--experimentConfigs.tabularWriter <TabularWriterFactory: CSV|Spark|fully qualified> (default value: CSV)
--initRandom <Random> (default value: 1)
--model <ModelBuilder: fully qualified>
--model.data <GlobalDataSource: Path to the DataSource.>
--model.data.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)
--model.data.reader.commentCharacter <Character> (optional)
--model.data.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)
--model.data.reader.separator <char> (default value: ,)
--model.data.reader.strictQuotes <boolean> (default value: false)
--model.failureProbabilities.dataSource <DataSource: Path to the DataSource.>
--model.failureProbabilities.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)
--model.failureProbabilities.dataSource.reader.commentCharacter <Character> (optional)
--model.failureProbabilities.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)
--model.failureProbabilities.dataSource.reader.separator <char> (default value: ,)
--model.failureProbabilities.dataSource.reader.strictQuotes <boolean> (default value: false)
--model.failureProbabilities.name <ColumnName> (optional)
description: Name of variable in the plate
--model.m <RealVar: A number or NA> (optional)
--model.numberOfFailures.dataSource <DataSource: Path to the DataSource.>
--model.numberOfFailures.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)
--model.numberOfFailures.dataSource.reader.commentCharacter <Character> (optional)
--model.numberOfFailures.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)
--model.numberOfFailures.dataSource.reader.separator <char> (default value: ,)
--model.numberOfFailures.dataSource.reader.strictQuotes <boolean> (default value: false)
--model.numberOfFailures.name <ColumnName> (optional)
description: Name of variable in the plate
--model.numberOfLaunches.dataSource <DataSource: Path to the DataSource.>
--model.numberOfLaunches.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)
--model.numberOfLaunches.dataSource.reader.commentCharacter <Character> (optional)
--model.numberOfLaunches.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)
--model.numberOfLaunches.dataSource.reader.separator <char> (default value: ,)
--model.numberOfLaunches.dataSource.reader.strictQuotes <boolean> (default value: false)
--model.numberOfLaunches.name <ColumnName> (optional)
description: Name of variable in the plate
--model.rocketTypes.dataSource <DataSource: Path to the DataSource.>
--model.rocketTypes.dataSource.reader <DataSourceReader: CSV|fully qualified> (default value: CSV)
--model.rocketTypes.dataSource.reader.commentCharacter <Character> (optional)
--model.rocketTypes.dataSource.reader.ignoreLeadingWhiteSpace <boolean> (default value: true)
--model.rocketTypes.dataSource.reader.separator <char> (default value: ,)
--model.rocketTypes.dataSource.reader.strictQuotes <boolean> (default value: false)
--model.rocketTypes.maxSize <Integer> (optional)
--model.rocketTypes.name <ColumnName> (optional)
--model.s <RealVar: A number or NA> (optional)
--model.useHierarchical <Boolean> (optional)
--postProcessor <PostProcessor: DefaultPostProcessor|NoPostProcessor|fully qualified> (default value: NoPostProcessor)
--postProcessor.blangExecutionDirectory <File> (optional)
description: When called from Blang, this will be the latest run, otherwise point to the .exec folder created by Blang
--postProcessor.burnInFraction <double> (default value: 0.5)
--postProcessor.essEstimator <EssEstimator: BATCH|ACT|AR> (default value: BATCH)
--postProcessor.experimentConfigs.configFile <File> (optional)
description: If set, use those arguments in provided file that do not appear in the provided arguments.
--postProcessor.experimentConfigs.managedExecutionFolder <boolean> (default value: true)
description: Automatically organize results into subdirectories of 'results/all'?
--postProcessor.experimentConfigs.maxIndentationToPrint <int> (default value: inf)
description: Use -1 to silence all HLogs output
--postProcessor.experimentConfigs.recordExecutionInfo <boolean> (default value: true)
description: Record information such as timing, main class, code version, etc for this run?
--postProcessor.experimentConfigs.recordGitInfo <boolean> (default value: true)
--postProcessor.experimentConfigs.saveStandardStreams <boolean> (default value: true)
description: Save combined standard out and err into a file?
--postProcessor.experimentConfigs.tabularWriter <TabularWriterFactory: CSV|Spark|fully qualified> (default value: CSV)
--postProcessor.facetHeight <double> (default value: 2.0)
description: In inches
--postProcessor.facetWidth <double> (default value: 4.0)
description: In inches
--postProcessor.imageFormat <String> (default value: png)
--postProcessor.rCmd <String> (default value: Rscript)
--printAccessibilityGraph <boolean> (default value: false)
--samplers.additional <SamplerSet: Fully qualified instances of blang.mcmc.Sampler>
description: Samplers to be added.
--samplers.excluded <SamplerSet: Fully qualified instances of blang.mcmc.Sampler>
description: Samplers to be excluded (only useful if useAnnotation = true).
--samplers.useAnnotation <boolean> (default value: true)
description: If the arguments of the annotations @Samplers should be used to determine a starting set of sampler types.
--stripped <boolean> (default value: false)
description: Stripped means that the construction of forward simulators and annealers is skipped
--treatNaNAsNegativeInfinity <boolean> (default value: false)
--version <String> (optional)
description: Version of the blang SDK to use (see https://github.com/UBC-Stat-ML/blangSDK/releases), of the form of a git tag x.y.z where x >= 2. If omitted, use the local SDK's 'master' version.