Hierarchical models

Alexandre Bouchard-Côté

Hierarchical model: motivation

package ex.be

model LaunchPrediction {
  
  random RealVar failureProbability ?: latentReal
  random IntVar 
    launch1 ?: latentInt, 
    launch2 ?: latentInt, 
    launch3 ?: latentInt, 
    nextLaunch ?: latentInt
  
  laws {
    launch1 | failureProbability ~ Bernoulli(failureProbability)
    launch2 | failureProbability ~ Bernoulli(failureProbability)
    launch3 | failureProbability ~ Bernoulli(failureProbability)
    
    nextLaunch | failureProbability ~ Bernoulli(failureProbability)
    
    failureProbability ~ ContinuousUniform(0, 1)
  }
}
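
As a worked illustration of what this model implies, suppose the three launches are observed and let \(k\) be the number of failures among them. The symbols \(p\) (standing for failureProbability), \(x_i\) (standing for launch \(i\), with 1 meaning failure) and \(k\) are notation introduced here for the sketch, not taken from the slides. Because the uniform prior is a \(\text{Beta}(1, 1)\) distribution, it is conjugate to the Bernoulli likelihood:

\[
p \mid x_1, x_2, x_3 \sim \text{Beta}(1 + k,\; 1 + 3 - k),
\qquad
\mathbb{P}(\text{nextLaunch} = 1 \mid x_1, x_2, x_3) = \frac{k + 1}{3 + 2}.
\]

For example, three successful launches (\(k = 0\)) still give a predicted failure probability of \(1/5\) for the next launch, which shows how much the prior matters when data are scarce.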

Poll: Do you know of other reasonable choices of prior here?

  1. No
  2. Yes, the Gamma distribution
  3. Yes, the Beta distribution
  4. Yes, the Weibull distribution
  5. Yes, the Beta distribution and many others

Many priors are possible

Choosing prior

General approach

Often, the choice of prior is approached in two stages

When does the choice of prior matter?

Strategies for constructing priors

Hierarchical models

Key idea: use “side data” to inform the prior

library(magrittr)  # provides the %>% pipe

# Per-vehicle launch and failure counts
data <- read.csv("failure_counts.csv")
data %>%
  head() %>%
  knitr::kable(floating.environment="sidewaystable")
LV.Type       numberOfLaunches  numberOfFailures
------------  ----------------  ----------------
Aerobee                      1                 0
Angara A5                    1                 0
Antares 110                  2                 0
Antares 120                  2                 0
Antares 130                  1                 1
Antares 230                  1                 0

How to (badly) use side data

First try

LV.Type       numberOfLaunches  numberOfFailures
------------  ----------------  ----------------
Aerobee                      1                 0
Angara A5                    1                 0
Antares 110                  2                 0
Antares 120                  2                 0
Antares 130                  1                 1
Antares 230                  1                 0

Why it is bad

Towards an improved way to use side data

An improved way to use side data (not quite full Bayesian yet!)

library(ggplot2)

# Histogram of the empirical failure rate of each launch vehicle type
counts <- read.csv("failure_counts.csv")
ggplot(counts, aes(x = numberOfFailures / numberOfLaunches)) +
  geom_histogram()
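
One common way to turn such a histogram into a prior is moment matching: compute the mean and variance of the observed failure rates and pick the Beta distribution with the same mean and variance. The sketch below assumes this approach for illustration only, since the slides do not state which fitting method is used. The helper name `fit_beta_moments` and the example rates are hypothetical.

```r
# Method-of-moments fit of a Beta(alpha, beta) distribution to observed rates.
# Assumption: moment matching is shown for illustration; it is not
# necessarily the method used in the slides.
fit_beta_moments <- function(rates) {
  m <- mean(rates)
  v <- var(rates)
  # A Beta distribution has variance strictly below m * (1 - m).
  if (v <= 0 || v >= m * (1 - m)) {
    stop("sample moments are not compatible with a Beta distribution")
  }
  s <- m * (1 - m) / v - 1  # concentration, s = alpha + beta
  list(mu = m, s = s, alpha = m * s, beta = (1 - m) * s)
}

# Hypothetical failure rates (not read from failure_counts.csv)
example_rates <- c(0.05, 0.10, 0.20, 0.15, 0.25)
prior <- fit_beta_moments(example_rates)
```

Raw rates from vehicles with only one or two launches are extreme (0 or 1 for a single launch), so a fit like this treats very noisy rates as if they were exact. That is one reason to regard this approach as a stepping stone rather than a full Bayesian treatment.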

Solution: go fully Bayesian

Recall:

  1. Construct a probability model including
    • random variables for what we will measure/observe
    • random variables for the unknown quantities
      • those we are interested in (“parameters”, “predictions”)
      • others that just help us formulate the problem (“nuisance”, “random effects”).
  2. Compute the posterior distribution conditionally on the actual data at hand
  3. Use the posterior distribution to:
    • make predictions (point estimates)
    • estimate uncertainty (credible intervals)
    • make a decision
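
The three steps above can be sketched numerically for the simplest case: a single failure probability with a uniform prior and Bernoulli observations. The sketch uses a grid approximation in place of exact or sampling-based inference, and the observations in `launches` are hypothetical.

```r
# Grid approximation of the posterior of one failure probability p.
# Assumptions: Uniform(0, 1) prior, Bernoulli observations (1 = failure),
# and hypothetical data chosen for illustration only.
launches <- c(0, 0, 1)

# Step 1: model = prior density times likelihood, evaluated on a grid
grid <- (seq_len(1000) - 0.5) / 1000  # midpoints of 1000 bins on (0, 1)
prior <- rep(1, length(grid))
likelihood <- grid^sum(launches) * (1 - grid)^sum(1 - launches)

# Step 2: posterior = normalized product, conditional on the data
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

# Step 3: summaries of the posterior
post_mean <- sum(grid * posterior)           # point estimate
cdf <- cumsum(posterior)
cred_int <- c(grid[which(cdf >= 0.025)[1]],  # approximate 95% credible interval
              grid[which(cdf >= 0.975)[1]])
```

With one failure in three launches, the exact posterior is Beta(2, 3), so `post_mean` should be close to 2/5 and `cred_int` close to the corresponding Beta quantiles.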

In our case: just make \(\mu\) and \(s\) random! (or equivalently, \(\alpha\) and \(\beta\))
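
A sketch of the structure this suggests, assuming \(\mu\) denotes the mean of the Beta prior and \(s\) its concentration (this section of the slides does not define the symbols, so treat these definitions as an assumption):

\[
\mu = \frac{\alpha}{\alpha + \beta}, \quad s = \alpha + \beta
\qquad \Longleftrightarrow \qquad
\alpha = \mu s, \quad \beta = (1 - \mu) s.
\]

Making \(\mu\) and \(s\) random then gives a model of the following shape, where \(i\) indexes launch vehicle types, \(n_i\) is numberOfLaunches, \(F_i\) is numberOfFailures and \(p_i\) is the failure probability of type \(i\). The indexing, the symbols \(p_i\), \(n_i\), \(F_i\) and the Binomial likelihood are illustrative assumptions:

\[
\begin{aligned}
\mu &\sim \text{a prior on } (0, 1), \qquad s \sim \text{a prior on } (0, \infty), \\
p_i \mid \mu, s &\sim \text{Beta}\big(\mu s,\; (1 - \mu) s\big), \\
F_i \mid p_i &\sim \text{Binomial}(n_i, p_i).
\end{aligned}
\]

All vehicle types share \(\mu\) and \(s\), so data from well-tested vehicles inform the prior applied to a vehicle with few or no launches.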

New higher-level hyperparameters = new problems? No, we are probably OK!

At first glance we have only moved the problem: we again have hyperparameters, namely those of the priors on \(\mu\) and \(s\). Here we picked \(\mu \sim \text{Beta}(1, 1) = \text{Unif}(0, 1)\) and \(s \sim \text{Exponential}(1/10000)\).

Key point: yes, but now we are less sensitive to these choices!

Why? Heuristic: say you have a random variable connected to some hyperparameters (grey squares) and random variables connected to data (circles).

Before going hierarchical: for maiden/early flights we had

[Figure: graphical model before going hierarchical]

After going hierarchical:

[Figure: graphical model after going hierarchical]

Using more information

full <- read.csv("processed.csv")
    X  X..Launch    Launch.Date..UTC.      COSPAR         PL.Name                        Orig.PL.Name                SATCAT   LV.Type                 LV.S.N            Site                              Suc   Ref                     Suc_bin  Family           Space.Port    Year   Launch.Index
-----  -----------  ---------------------  -------------  -----------------------------  --------------------------  -------  ----------------------  ----------------  --------------------------------  ----  ---------------------  --------  ---------------  -----------  -----  -------------
    1  1957 ALP     1957 Oct  4 1928:34    1957 ALP 2     1-y ISZ                        PS-1                        S00002   Sputnik 8K71PS          M1-PS             NIIP-5   LC1                      S     Energiya                      1  Sputnik          NIIP          1957              1
    2  1957-U01     1957 Oct 17 0505       1957-U01       USAF 88 Charge A               Poulter Pellet              A08258   Aerobee                 USAF 88           HADC     A                        S     EngSci1.58                    1  Aerobee          HADC          1957              1
    3  1957 BET     1957 Nov  3 0230:42    1957 BET 1     2-y ISZ                        PS-2                        S00003   Sputnik 8K71PS          M1-2PS            NIIP-5   LC1                      S     Grahn-WWW                     1  Sputnik          NIIP          1957              2
    4  1957-F01     1957 Dec  6 1644:35    1957-F01       Vanguard                       Vanguard Test Satellite     F00002   Vanguard                TV-3              CC       LC18A                    F     Vang-ER9948                   0  Vanguard         CC            1957              1

Taller hierarchies

Optional exercise

Review the concepts in this topic by going over this optional exercise set: Hierarchical_models.html