Blang

Input and output: overview

On this page, we cover:

  1. Input: how to load data. This is used for:

    1. fixing a random variable's value to a given observation (conditioning),

    2. setting the hyper-parameters of models,

    3. setting the tuning parameters of inference algorithms.

  2. Output: how to control the output of samples when custom types are used.

Input

Inputs are controlled using the injection framework inits, which is designed for dependency injection in the context of scientific models. This is entirely automatic for existing Blang types; read this section only if you want to create custom data types and condition on them, i.e. load data and fix the variable to the loaded value.

To summarize, instantiation of arbitrary types is approached recursively with these main schemes:

For more information, see the README.md file in the inits repository.

As a convention, we use the string NA to mean unobserved (latent). This string can be accessed in a type-safe manner via NA:SYMBOL.
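To illustrate the NA convention, here is a minimal, self-contained sketch of how a loader might distinguish an observation from a latent value. The class NaParsing and the method parseObservation are hypothetical names chosen for this example; this is not the code Blang or inits actually uses.

```java
import java.util.Optional;

public class NaParsing {
    // Mirrors the convention in the text: the literal string "NA" marks a latent value.
    // (Hypothetical constant for this sketch, not Blang's own symbol.)
    static final String NA_SYMBOL = "NA";

    /** Returns empty for "NA" (latent), otherwise the parsed observation. */
    static Optional<Double> parseObservation(String raw) {
        String trimmed = raw.trim();
        if (trimmed.equals(NA_SYMBOL)) {
            return Optional.empty();
        }
        return Optional.of(Double.parseDouble(trimmed));
    }

    public static void main(String[] args) {
        for (String raw : new String[] {"0.45", "NA", "1.2"}) {
            Optional<Double> obs = parseObservation(raw);
            System.out.println(raw + " -> " + (obs.isPresent() ? "observed " + obs.get() : "latent"));
        }
    }
}
```

Representing "latent" as an empty Optional keeps the distinction explicit in the type rather than relying on a sentinel number.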

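Argument parsing is taken care of automatically (by introspection of the injection framework's annotations). Switches are named hierarchically: the switch for a nested argument is its parent's switch followed by a dot and the name of the annotated field.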

Here is a concrete example to show how it works. In Blang's main class, there is an annotated field @Arg PosteriorInferenceEngine engine. This type declares the following implementations:

package blang.engines.internals;

import blang.engines.internals.factories.Exact;
import blang.engines.internals.factories.Forward;
import blang.engines.internals.factories.None;
import blang.engines.internals.factories.PT;
import blang.engines.internals.factories.SCM;
import blang.inits.Implementations;
import blang.runtime.SampledModel;
import blang.runtime.internals.objectgraph.GraphAnalysis;

@Implementations({SCM.class, PT.class, Forward.class, Exact.class, None.class})
public interface PosteriorInferenceEngine
{
  public void setSampledModel(SampledModel model);

  public void performInference();

  public void check(GraphAnalysis analysis);
}

Now let's look at one of those implementations, say SCM. SCM's parent class is AdaptiveJarzynski, which declares @Arg Cores nThreads.

In turn, Cores declares the following designated constructor:

@DesignatedConstructor
public Cores(
    @Input(formatDescription = "Integer - skip or " + MAX + " to use max available")
    Optional<String> input)
{
  ...
}
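Since the constructor body is elided above, the following self-contained sketch shows one plausible reading of the format description: an absent input or the string MAX selects all available cores, and anything else is parsed as an integer. The class CoresSketch, its field name, and the validation rule are assumptions made for illustration; the real Cores class may behave differently.

```java
import java.util.Optional;

public class CoresSketch {
    // Assumed value of the MAX constant referenced in the format description.
    static final String MAX = "MAX";

    final int numberOfThreads;

    /** Absent input or "MAX" means all available cores; otherwise parse an integer. */
    CoresSketch(Optional<String> input) {
        if (!input.isPresent() || input.get().trim().equals(MAX)) {
            numberOfThreads = Runtime.getRuntime().availableProcessors();
        } else {
            int parsed = Integer.parseInt(input.get().trim());
            if (parsed < 1) {
                // Hypothetical validation, added for this sketch only.
                throw new IllegalArgumentException("Need at least one thread: " + parsed);
            }
            numberOfThreads = parsed;
        }
    }

    public static void main(String[] args) {
        System.out.println(new CoresSketch(Optional.of("4")).numberOfThreads);
        System.out.println(new CoresSketch(Optional.of("MAX")).numberOfThreads);
        System.out.println(new CoresSketch(Optional.empty()).numberOfThreads);
    }
}
```

Taking an Optional<String> rather than an int lets a single constructor handle both the skipped switch and the symbolic MAX value.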

Together, these declarations create the following command-line options (shown here as a snippet of what is produced by --help):

--engine <PosteriorInferenceEngine: SCM|PT|Forward|Exact|None|fully qualified>
--engine.nThreads <Cores: Integer - skip or MAX to use max available>
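To make the hierarchical naming concrete, here is a small self-contained sketch that derives dotted switch names by walking annotated fields with reflection. The Arg annotation and the toy classes Main, Engine and Cores below are hypothetical stand-ins defined inside the example; they only imitate the shape of the real types and are not the inits implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class SwitchNames {
    // Hypothetical stand-in for the framework's @Arg annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Arg {}

    // Toy types loosely mirroring the example in the text.
    static class Cores {}
    static class Engine { @Arg Cores nThreads; }
    static class Main { @Arg Engine engine; }

    /** Collects "--a.b" style names by walking @Arg fields recursively. */
    static List<String> switches(Class<?> type, String prefix) {
        List<String> result = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isAnnotationPresent(Arg.class)) {
                continue; // only annotated fields become switches
            }
            String name = prefix.isEmpty() ? field.getName() : prefix + "." + field.getName();
            result.add("--" + name);
            // Recurse into the field's type to find nested arguments.
            result.addAll(switches(field.getType(), name));
        }
        return result;
    }

    public static void main(String[] args) {
        switches(Main.class, "").forEach(System.out::println);
    }
}
```

For the toy classes above, this lists --engine followed by --engine.nThreads, matching the naming pattern of the --help snippet. The sketch ignores implementation selection (the @Implementations mechanism), which the real framework also has to resolve.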

Output

Every Blang execution creates a unique directory. Its path is printed to standard output at the end of the run. The latest run is also symlinked at results/latest.

The directory has the following structure:

The samples are stored in tidy CSV files. For example, two samples for a list of three RealVars would look like:

index_0,sample,value
0,0,0.45370104866569855
1,0,0.38696647209956947
2,0,0.42871560465749226
0,1,0.5107038773755743
1,1,0.34488603941828144
2,1,0.40406618985385023
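Because each row holds exactly one value, such files are easy to post-process. As a minimal sketch, the following self-contained program averages the value column for each index_0. The class and method names are hypothetical, and the parser assumes a header row, comma separators and no quoted fields.

```java
import java.util.Map;
import java.util.TreeMap;

public class TidyCsvMeans {
    /** Averages the "value" column per "index_0", assuming columns index_0,sample,value. */
    static Map<Integer, Double> meanByIndex(String csv) {
        Map<Integer, Double> sums = new TreeMap<>();
        Map<Integer, Integer> counts = new TreeMap<>();
        String[] lines = csv.trim().split("\\R");
        for (int i = 1; i < lines.length; i++) { // start at 1 to skip the header
            String[] fields = lines[i].split(",");
            int index = Integer.parseInt(fields[0].trim());
            double value = Double.parseDouble(fields[2].trim());
            sums.merge(index, value, Double::sum);
            counts.merge(index, 1, Integer::sum);
        }
        Map<Integer, Double> means = new TreeMap<>();
        for (Map.Entry<Integer, Double> entry : sums.entrySet()) {
            means.put(entry.getKey(), entry.getValue() / counts.get(entry.getKey()));
        }
        return means;
    }

    public static void main(String[] args) {
        String csv = String.join("\n",
            "index_0,sample,value",
            "0,0,0.45370104866569855",
            "1,0,0.38696647209956947",
            "2,0,0.42871560465749226",
            "0,1,0.5107038773755743",
            "1,1,0.34488603941828144",
            "2,1,0.40406618985385023");
        meanByIndex(csv).forEach((index, mean) -> System.out.println(index + " -> " + mean));
    }
}
```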

By default, the method toString is used to create the last column (value). This behaviour can be customized to follow the tidy philosophy. To do so, implement the interface TidilySerializable (example available here).
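To show what following the tidy philosophy means for a custom type, here is a self-contained sketch that flattens a small matrix into one row per entry instead of packing the whole object into a single toString cell. It deliberately does not use the TidilySerializable interface, whose methods are not described on this page; the class TidyMatrixSketch and the column names row and col are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class TidyMatrixSketch {
    /** Emits a header plus one "row,col,value" line per matrix entry. */
    static List<String> toTidyRows(double[][] matrix) {
        List<String> rows = new ArrayList<>();
        rows.add("row,col,value");
        for (int r = 0; r < matrix.length; r++) {
            for (int c = 0; c < matrix[r].length; c++) {
                // One observation per line: the indices identify the entry.
                rows.add(r + "," + c + "," + matrix[r][c]);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        double[][] matrix = {{1.5, 2.0}, {3.25, 4.0}};
        toTidyRows(matrix).forEach(System.out::println);
    }
}
```

Each entry gets its own index columns, so the result can be filtered and aggregated in the same way as the built-in output.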

The following command line arguments can be used to tune the output: