Partial solutions for some of the recent modelling examples
Here is a simple hierarchical model to attack this problem:
ex.vax.Vaccines
package ex.vax
import static extension ex.vax.Utils.*
model Vaccines {
param GlobalDataSource data
param Plate<String> trials
param Plate<String> arms
param Plated<IntVar> groupSizes
random Plated<IntVar> numbersOfCases
random Plated<RealVar> efficacies
random Plated<RealVar> incidences
random RealVar a ?: latentReal, b ?: latentReal, c ?: latentReal, d ?: latentReal
laws {
a ~ Exponential(0.001)
b ~ Exponential(0.001)
c ~ Exponential(0.001)
d ~ Exponential(0.001)
for (Index<String> trial : trials.indices) {
efficacies.get(trial) | a,b ~ Beta(a, b)
incidences.get(trial) | c,d ~ Beta(c, d)
}
for (Index<String> trial : trials.indices) {
for (Index<String> arm : arms.indices) {
numbersOfCases.get(trial, arm) |
IntVar groupSize = groupSizes.get(trial, arm),
RealVar efficacy = if (arm.isControl) 0.0 else efficacies.get(trial),
RealVar incidence = incidences.get(trial)
~ Binomial(groupSize, incidence * (1.0 - efficacy))
}
}
}
}
Instead of assuming a logistic change in failure probability,
this model assumes that it is piecewise constant with an unknown change point:
ex.cp.ChallengerChangePoint
package ex.cp
model ChallengerChangePoint {
random List<RealVar> failureParameters ?: latentRealList(2)
random RealVar changePoint ?: latentReal
random IntVar prediction ?: latentInt
random List<IntVar> incidents
param List<RealVar> temperatures
laws {
for (RealVar failureParameter : failureParameters) {
failureParameter ~ Normal(0.0, 100.0)
}
changePoint ~ Normal(70.0, 100.0)
for (int i : 0 ..< incidents.size) {
incidents.get(i) | RealVar temperature = temperatures.get(i), failureParameters, changePoint
~ Bernoulli(logistic(failureParameters.get(if (temperature < changePoint) 0 else 1)))
}
prediction | failureParameters, changePoint ~ Bernoulli(logistic(failureParameters.get(if (31 < changePoint) 0 else 1)))
}
}
To summarize all the marginal likelihood for different models we tried on
the challenger data:
Description
|
Blang model
|
Marginal likelihood
|
Basic logistic regression
|
Challenger.bl
|
-18.8
|
Spike-and-slab logistic regression
|
ChallengerSpiked.bl
|
-17.7
|
Spike-and-slab logistic regression (modified prior)
|
ChallengerSpiked2.bl
|
-17.2
|
Change point model
|
ChallengerChangePoint.bl
|
-13.9
|
So we can see that the change point model outperforms the other ones according to the marginal likelihood
criterion.
I mentioned briefly the German Tank problem. Here is an implementation of a simple model
to approach this problem.
Read the example given in this wikipedia article,
In the code below, I am using the same observation as in the wikipedia example (summarized with the sufficient
statistics that 4 serial numbers were sampled, and that the maximum observed in these 4 is the serial number '60'):
ex.gt.SerialInference
package ex.gt
model SerialInference {
random IntVar observedMax ?: fixedInt(60)
random IntVar populationSize ?: latentInt
param IntVar nDraws ?: fixedInt(4)
laws {
// An example of a discrete prior: here in this historical example,
// the unknown quantity of
// interest was the number of tanks in the German WW2 army.
// The distribution DiscreteUniform comes from the standard library.
populationSize ~ DiscreteUniform(0, 101)
// In contrast, we will construct SerialSampling by ourself.
// "observedMax" will be matched to the variables marked as random in
// SerialSampling.bl, while "populationSize, nDraws" will be matched to
// the variables marked as param in SerialSampling.bl.
observedMax | populationSize, nDraws ~ SerialSampling(populationSize, nDraws)
}
}
We obtain a credible interval from 61 to 91.
The implementation of the likelihood, SerialSampling
, is shown below:
package ex.gt
import static extension java.util.Collections.shuffle
model SerialSampling {
random IntVar maxRealization
param IntVar populationSize
param IntVar nDraws
laws {
// This illustrates a second method from specifying a (conditional)
// distribution: writing a function that computes the (log) of the
// (conditional) probability.
logf(maxRealization; populationSize, nDraws) {
// The probability is 0 (i.e. the log probability is minus infinity)
// when the parameters are invalid.
if (populationSize < 1 || nDraws < 1 || maxRealization < 0) {
return NEGATIVE_INFINITY
}
if (maxRealization + 1 < nDraws) {
return NEGATIVE_INFINITY
}
if (populationSize <= maxRealization) {
return NEGATIVE_INFINITY
}
return logBinomial(maxRealization, nDraws - 1) - logBinomial(populationSize, nDraws)
}
}
// This part specifies how to *sample* from the probability model. You can think of rand
// as being an outcome, and the code below is then responsible to evaluate the random
// variable at that realization.
generate(rand) {
val List<Integer> list = (0 ..< populationSize).toList
list.shuffle(rand)
return list.subList(0, nDraws).max
}
}