Bayes

Partial solutions for some of the recent modelling examples

Here is a simple hierarchical model to attack the vaccine efficacy problem:

ex.vax.Vaccines
package ex.vax

import static extension ex.vax.Utils.*

model Vaccines {
  param GlobalDataSource data
  param Plate<String> trials
  param Plate<String> arms
  param Plated<IntVar> groupSizes
  random Plated<IntVar> numbersOfCases
  random Plated<RealVar> efficacies
  random Plated<RealVar> incidences
  random RealVar a ?: latentReal, b ?: latentReal, c ?: latentReal, d ?: latentReal

  laws {
    a ~ Exponential(0.001)
    b ~ Exponential(0.001)
    c ~ Exponential(0.001)
    d ~ Exponential(0.001)

    for (Index<String> trial : trials.indices) {
      efficacies.get(trial) | a, b ~ Beta(a, b)
      incidences.get(trial) | c, d ~ Beta(c, d)
    }

    for (Index<String> trial : trials.indices) {
      for (Index<String> arm : arms.indices) {
        numbersOfCases.get(trial, arm) |
          IntVar groupSize = groupSizes.get(trial, arm),
          RealVar efficacy = if (arm.isControl) 0.0 else efficacies.get(trial),
          RealVar incidence = incidences.get(trial)
          ~ Binomial(groupSize, incidence * (1.0 - efficacy))
      }
    }
  }
}
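In equation form, the laws block above reads as follows (this is just a transcription of the code, writing \(e_t\) for the efficacy and \(\iota_t\) for the incidence in trial \(t\), and \(y_{t,a}\), \(n_{t,a}\) for the number of cases and group size in arm \(a\) of trial \(t\)):

\begin{align*}
a, b, c, d &\sim \mathrm{Exponential}(0.001) \\
e_t \mid a, b &\sim \mathrm{Beta}(a, b) \\
\iota_t \mid c, d &\sim \mathrm{Beta}(c, d) \\
y_{t,a} \mid e_t, \iota_t &\sim \mathrm{Binomial}\big(n_{t,a},\ \iota_t \,(1 - e_t \,\mathbf{1}[a \neq \text{control}])\big)
\end{align*}

So in the control arm the case probability is just the incidence \(\iota_t\), while in a vaccinated arm it is reduced by the factor \(1 - e_t\).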

Instead of assuming a logistic change in failure probability, this model assumes that it is piecewise constant with an unknown change point:

ex.cp.ChallengerChangePoint
package ex.cp

model ChallengerChangePoint {
  random List<RealVar> failureParameters ?: latentRealList(2)
  random RealVar changePoint ?: latentReal
  random IntVar prediction ?: latentInt
  random List<IntVar> incidents
  param List<RealVar> temperatures

  laws {
    for (RealVar failureParameter : failureParameters) {
      failureParameter ~ Normal(0.0, 100.0)
    }
    changePoint ~ Normal(70.0, 100.0)

    for (int i : 0 ..< incidents.size) {
      incidents.get(i) |
        RealVar temperature = temperatures.get(i),
        failureParameters,
        changePoint
        ~ Bernoulli(logistic(failureParameters.get(if (temperature < changePoint) 0 else 1)))
    }

    prediction | failureParameters, changePoint
      ~ Bernoulli(logistic(failureParameters.get(if (31 < changePoint) 0 else 1)))
  }
}
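In equation form (again just transcribing the laws block, with \(\theta_0, \theta_1\) the two failure parameters, \(\tau\) the change point, and \(T_i\) the temperature at launch \(i\)):

\begin{align*}
\theta_0, \theta_1 &\sim \mathrm{Normal}(0, 100) \\
\tau &\sim \mathrm{Normal}(70, 100) \\
\text{incident}_i \mid \theta, \tau &\sim \mathrm{Bernoulli}\big(\mathrm{logistic}(\theta_{\mathbf{1}[T_i \ge \tau]})\big)
\end{align*}

The failure probability is constant, \(\mathrm{logistic}(\theta_0)\), below the change point, and jumps to \(\mathrm{logistic}(\theta_1)\) above it; the prediction is the same Bernoulli evaluated at the launch temperature of 31°F.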

To summarize, here are the log marginal likelihoods for the different models we tried on the Challenger data:

Description                                           Blang model                Log marginal likelihood
----------------------------------------------------  -------------------------  -----------------------
Basic logistic regression                             Challenger.bl              -18.8
Spike-and-slab logistic regression                    ChallengerSpiked.bl        -17.7
Spike-and-slab logistic regression (modified prior)   ChallengerSpiked2.bl       -17.2
Change point model                                    ChallengerChangePoint.bl   -13.9

So we can see that the change point model outperforms the others according to the marginal likelihood criterion.
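To make "outperforms" concrete, a difference in log marginal likelihoods translates into a Bayes factor. For example, comparing the change point model to the basic logistic regression (a quick back-of-the-envelope check using the values from the table above):

\[
\mathrm{BF} = \exp\big(\log Z_{\text{change point}} - \log Z_{\text{logistic}}\big) = \exp\big(-13.9 - (-18.8)\big) = \exp(4.9) \approx 134,
\]

i.e. the data favour the change point model over the basic logistic regression by a factor of roughly 130.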

I briefly mentioned the German tank problem. Here is an implementation of a simple model for this problem.

Read the example given in this Wikipedia article. In the code below, I use the same observation as in the Wikipedia example, summarized by its sufficient statistics: 4 serial numbers were sampled, and the maximum observed among these 4 is the serial number 60:

ex.gt.SerialInference
package ex.gt

model SerialInference {
  random IntVar observedMax ?: fixedInt(60)
  random IntVar populationSize ?: latentInt
  param IntVar nDraws ?: fixedInt(4)

  laws {
    // An example of a discrete prior: in this historical example,
    // the unknown quantity of interest was the number of tanks
    // in the German WW2 army.
    // The distribution DiscreteUniform comes from the standard library.
    populationSize ~ DiscreteUniform(0, 101)

    // In contrast, we will construct SerialSampling ourselves.
    // "observedMax" will be matched to the variables marked as random in
    // SerialSampling.bl, while "populationSize, nDraws" will be matched to
    // the variables marked as param in SerialSampling.bl.
    observedMax | populationSize, nDraws ~ SerialSampling(populationSize, nDraws)
  }
}

We obtain a credible interval from 61 to 91.
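As a point of comparison (this is outside the Blang output; it is the standard frequentist treatment from the Wikipedia article mentioned above), the minimum-variance unbiased estimator for the 1-indexed version of the problem, with sample maximum \(m = 60\) and \(k = 4\) draws, is

\[
\hat{N} = m\left(1 + \frac{1}{k}\right) - 1 = 60 \cdot \frac{5}{4} - 1 = 74,
\]

which falls well inside the credible interval reported above.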

The implementation of the likelihood, SerialSampling, is shown below:

ex.gt.SerialSampling
package ex.gt

import static extension java.util.Collections.shuffle

model SerialSampling {
  random IntVar maxRealization
  param IntVar populationSize
  param IntVar nDraws

  laws {
    // This illustrates a second method for specifying a (conditional)
    // distribution: writing a function that computes the log of the
    // (conditional) probability.
    logf(maxRealization; populationSize, nDraws) {
      // The probability is 0 (i.e. the log probability is minus infinity)
      // when the parameters are invalid.
      if (populationSize < 1 || nDraws < 1 || maxRealization < 0) {
        return NEGATIVE_INFINITY
      }
      if (maxRealization + 1 < nDraws) {
        return NEGATIVE_INFINITY
      }
      if (populationSize <= maxRealization) {
        return NEGATIVE_INFINITY
      }
      return logBinomial(maxRealization, nDraws - 1) - logBinomial(populationSize, nDraws)
    }
  }

  // This part specifies how to *sample* from the probability model. You can think
  // of rand as being an outcome, and the code below is then responsible for
  // evaluating the random variable at that realization.
  generate(rand) {
    val List<Integer> list = (0 ..< populationSize).toList
    list.shuffle(rand)
    return list.subList(0, nDraws).max
  }
}
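As a sanity check on the logf formula (this is my derivation, matching the 0-indexed serial numbers used in generate): drawing \(k\) = nDraws distinct values uniformly at random from \(\{0, 1, \ldots, N - 1\}\), the maximum equals \(m\) exactly when \(m\) itself is drawn together with \(k - 1\) of the \(m\) smaller values, so

\[
P(\max = m \mid N, k) = \frac{\binom{m}{k-1}}{\binom{N}{k}},
\]

whose logarithm is precisely logBinomial(maxRealization, nDraws - 1) - logBinomial(populationSize, nDraws). The guard clauses rule out exactly the cases where these binomial coefficients would be ill-defined or zero.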