Lab 1

08 Apr 2015

Logistics

Setup (please do before the tutorial if possible)

In preparation for the tutorial, it would be great if you could install the following on your computer:

If you encounter problems, feel free to come by at the office hour (4pm), or to post the issue on Piazza.

References on java

But be assured that we are there to answer any questions you may have!

Setup at the beginning of the tutorial

At the beginning of the tutorial, you will able to clone the repository simplesmc-scaffold using git clone git@github.com:alexandrebouchard/simplesmc-scaffold.git.

Then, follow these steps:

  • Type gradle eclipse from the root of the repository
  • From eclipse:
    • Import in File menu
    • Import existing projects into workspace
    • Select the root
    • Deselect Copy projects into workspace to avoid having duplicates

High performance posterior inference with SMC

We will cover a prefix of the following (another lab can be organized to cover the rest if there is interest):

  • Efficient resampling
  • Modular design of proposal distributions
  • Parallel and correct SMC implementation
  • PMCMC

Efficient resampling

Wee src/test/java/simplesmc/resampling/TestPopulationResampling.java for instructions.

Modular design of proposal distributions

Eventually, we would like to apply our SMC algorithm to complex problems. However, to ensure correctness of the implementation, it is useful to first attack an easy problem (fully discrete HMM), where we can compute the data likelihood Z analytically (to check SMC's Z estimate converges to it). But how to make sure we do not introduce bugs when we then adapt the code back to a complex model?

The answer is to use interfaces, which will allow us to decouple the interaction of the SMC algorithm with the details of the problems we are trying to solve.

Look at src/main/java/simplesmc/ProblemSpecification.java for the interface we will be using in our algorithm. Read the description of each method (function that will be associated with objects).

Now make src/main/java/simplesmc/hmm/HMMProblemSpecification.java implement this interface (using the keyword implements after the class name, followed by the name of the interface, ProblemSpecification). This will enforce the implementation of the correct methods (right click, select Source - Override/Implement methods). Fill in the implementation by using the HMM's transition as the proposal, and the HMM's emission as you weight update.

After the assignment, if you want to use your algorithm for your own problems/project, you will just have to repeat these steps for your new problem.

Parallel and correct SMC implementation

We will now complete the implementation of SMC, in src/main/java/simplesmc/SMCAlgorithm.java.

Start by implementing a serial version of the proposal. To make it parallel, use BriefParallel.process(..). See an example at http://github.com/alexandrebouchard/briefj#briefparallel. To take into account a required library dependency update pushed to the scaffold on April 10, you may need to pull the latest version of the git repo, and then download the dependencies with gradle eclipse, and finally refresh eclipse.

To test your implementation, run the test in src/test/java/simplesmc/TestSMC.java. Your implementation should agree with the analytic version.

Also create a new test to ensure that the code outputs the same thing when ran twice with the same seed. This is especially important in the concurrent version of the algorithm.

PMCMC

Note: make sure to use git pull to pick up an additional scaffold file created on April 9.

We will now use PMCMC to resample the value of the global parameter selfTransitionProbability. Let us place a uniform prior on this parameter.

We will use artificial data in this question. You can find in the class HMMUtils a function generate(..), which creates synthetic data. We will be using an held-out value of 0.8 for the parameter selfTransitionPr, and a two-states HMM of length 100.

You have two possible avenues here. You can either write your own Metropolis-Hastings proposal that directly calls the SMC implementation you have completed in the previous part.

Alternatively, you use our home-brewed probabilistic programming blang. The main file, TestPMCMC has been written for you. It creates a model declaratively, and provides some explanations on blang. The main thing left to implement is PMCMCFactor, which in logDensity() should delegate likelihood approximation to the existing SMC class. The main tricky part is to ensure that the SMC is not reran when a parameter is rejected. To do this, we need some way to detect changes in the parameters. This is mediated by the cache object, which indexes configurations by calling the signature() method on the parameters (you will have to make HMMProblemSpecification implement WithSignature).

Going further

Some ideas for additional things to try if you are curious:

  • Create a new command line option to set an ess threshold, and only perform resampling when the ess goes below this threshold.
  • Add a command line for the type of resampling scheme as well.
  • Create a new ProblemSpecification containing a more challenging domain.