Bayes estimator

\[\begin{align} \renewcommand{\Pr}{\mathbb{P}} \newcommand{\Bern}{\text{Bern}} \newcommand{\Unif}{\text{Unif}} \newcommand{\Bet}{\text{Beta}} \newcommand{\Bin}{\text{Bin}} \newcommand{\Categorical}{\text{Categorical}} \newcommand{\Gamm}{\text{Gamma}} \newcommand\prior{\text{prior}} \newcommand\like{\text{likelihood}} \newcommand{\E}{\mathbf{E}} \newcommand{\var}{\text{Var}} \newcommand{\sd}{\text{SD}} \newcommand{\R}{\mathbf{R}} \newcommand{\1}{{\bf 1}} \newcommand{\argmin}{\textrm{argmin}} \newcommand{\argmax}{\textrm{argmax}} \newcommand{\Ascr}{\mathcal{A}} \newcommand{\ud}{\text{d}} \newcommand\deq{\stackrel{\scriptscriptstyle d}{=}} \newcommand{\Xscr}{\mathscr{X}} \newcommand{\Zscr}{\mathscr{Z}} \newcommand{\Oscr}{\mathscr{O}} \newcommand{\randindex}{\textrm{rand}} \newcommand{\rhot}{\rho} \newcommand{\rhop}{{\rho'}} \newcommand{\best}{\delta_{\text{B}}} \newcommand{\oest}{\delta_{\text{o}}} \end{align}\]

Tutorial on the Bayes estimator

In this tutorial, we will go over a simple example of how to apply a Bayes estimator in practice. We will use a dataset of rocket launches to illustrate the concepts.

drawing

You are consulting for a satellite operator company. The company is about to about to send a $50M satellite on a Delta 7925H rocket. Data: as of November 2020 has been launched 3 times, with 0 failed launches. Should you recommend buying a $1M insurance policy?

Maximum likelihood would therefore give an estimate of 0% for the chance of failure, and hence that no insurance should be purchased. Obviously we will not use this crazy estimate today and instead use Bayesian methods.

Let us use the simplest Bayesian model for this task (in a later tutorial, we will improve this model): we assume the four launches are independent, biased coin flips, all with a shared probability of failure (bias) given by an unknown parameter failureProbability. In other words, we assume a binomial likelihood. We also assume a beta prior on the failureProbability, the parameter $p$ of the binomial (the parameter $n$ is known and set to the number of observed launches, numberOfLaunches).

Let us walk over the step involved into solving this problem using the Bayes estimator and the Blang Probabilistic Programming Language.

ex.be.LaunchPrediction

package ex.be model LaunchPrediction { random RealVar failureProbability ?: latentReal random IntVar launch1 ?: latentInt, launch2 ?: latentInt, launch3 ?: latentInt, nextLaunch ?: latentInt laws { launch1 | failureProbability ~ Bernoulli(failureProbability) launch2 | failureProbability ~ Bernoulli(failureProbability) launch3 | failureProbability ~ Bernoulli(failureProbability) nextLaunch | failureProbability ~ Bernoulli(failureProbability) failureProbability ~ ContinuousUniform(0, 1) } }

See output Reproduce

The laws { ... } block specifies a joint distribution ("law") over a collection of random variables, namely those marked with the keyword random when they are declared above. Each line starting in the form variable ~ ... declares a distribution or conditional distribution.
We haven't introduced observation yet (we will do this when conditioning, step 2), so for now we treat all variable as unobserved (marked using ?: latentReal and ?: latentInt).
Click on "See output" above to see the main plots created by running the code. You can find in particular a visualization of the marginal distribution on the random variable failureProbability by following the tab "Plots", then "posteriorPlot" located in the top right of the page linked by "See output". Right now, we are not conditioning on any of the random variables so the "posterior" reduces to the joint density.

Exercise 1

Look at the plot "failureProbability.pdf". Convince yourself that the approximation should match the prior. Check if it does.

Exercise 2

Look at the plot "nextLaunch.pdf". Use the law of total probability to find the true value of the marginal probability mass function over the nextLaunch variable. Check the quality of the approximation based on this.

We will now introduce the actual dataset (3 successes, 0 failures).

ex.be.Conditioning

package ex.be model Conditioning { random RealVar failureProbability ?: latentReal random IntVar launch1 ?: 0, launch2 ?: 0, launch3 ?: 0, nextLaunch ?: latentInt laws { launch1 | failureProbability ~ Bernoulli(failureProbability) launch2 | failureProbability ~ Bernoulli(failureProbability) launch3 | failureProbability ~ Bernoulli(failureProbability) nextLaunch | failureProbability ~ Bernoulli(failureProbability) failureProbability ~ ContinuousUniform(0, 1) } }

See output Reproduce

Note that we haven't changed the description of the joint distribution (in "laws"). Rather, we changed ?: latentInt into ?: 0. The effect of this is that running the code will now approximate the posterior distribution of the variable that are unknown (latent), given those that now have a known value. (launch1, launch2, launch3)

Look at the marginal distribution of nextLaunch. It is now a posterior marginal. Notice that it has been updated

Exercise 3

Optional: Compute the exact posterior distribution using Bayes rule (hint: to deal with the denominator, use the fact that the numerator Bayes rule has the same functional for as a Beta distribution with respect to the unknown quantity, this trick is called "conjugacy"). Plot the posterior density and compare it with the approximation obtained from Blang.

In the following, we let $x = (x_1, x_2, x_3)$ denote the observed data, and $z = (p, x_4)$, the unobserved data.

To make a decision, we need to encode the following into a loss function (all numbers in \$M):

With insurance $(a=1)$: loss of 0 if success, loss of 50 if failure.
Without $(a=0)$: loss of 1 if success, loss of 1 if failure.

We can encode this as $L(a, z) = \1[a = 1] (\$1M) + \1[a = 0] \1[x_4 = 1] (\$50M)$

The Bayes estimator tells us we should use

\[\begin{align} \newcommand\testddd{OK} \argmin \{ \E[L(a, Z) | X] : a \in \mathcal{A} \} \end{align}\]

This can be simplified into:

\[\begin{align} \argmin \{ \E[L(a, Z) | X] : a \in \Ascr \} &= \argmin \{ \E[\1[a = 1] + 50 \1[a = 0] \1[X_4 = 1] | X] : a \in \Ascr \} \\ &= \argmin \{ \E[\1[a = 1] | X] + 50 \E[ \1[a = 0] \1[X_4 = 1] | X] : a \in \Ascr \} \\ &= \argmin \{ \1[a = 1] + 50 \1[a = 0] \Pr[X_4 = 1 | X] : a \in \Ascr \} \\ &= \1[p_\text{fail}(X) > 1/50]], \end{align}\]

Exercise 4

Combine the posterior computed in the second step with the Bayes estimator to make a decision.

Exercise 5

Optional: does the decision change if a different prior is used?

Bayes

Tutorial on the Bayes estimator

Application context

Creating a joint distribution (step 1)

Conditioning (step 2)

Bayes estimator (step 3)