Inferring Parameters for ODEs using Stan

Introduction

The equation of motion for a pendulum of unit length subject to Gaussian white noise is

\displaystyle  \frac{\mathrm{d}^2\alpha}{\mathrm{d}t^2} = -g\sin\alpha + w(t)

We can discretize this via the usual Euler method

\displaystyle  \begin{bmatrix} x_{1,i} \\ x_{2,i} \end{bmatrix} = \begin{bmatrix} x_{1,i-1} + x_{2,i-1}\Delta t \\ x_{2,i-1} - g\sin x_{1,i-1}\Delta t \end{bmatrix} + \mathbf{q}_{i-1}

where q_i \sim {\mathcal{N}}(0,Q) and

\displaystyle  Q = \begin{bmatrix} \frac{q^c \Delta t^3}{3} & \frac{q^c \Delta t^2}{2} \\ \frac{q^c \Delta t^2}{2} & {q^c \Delta t} \end{bmatrix}

The explanation of the precise form of the covariance matrix will be the subject of another blog post; for the purpose of exposition of using Stan and, in particular, Stan’s ability to handle ODEs, this detail is not important.

Instead of assuming that we know g let us take it to be unknown and that we wish to infer its value using the pendulum as our experimental apparatus.

Stan is a probabilistic programming language which should be welll suited to perform such an inference. We use its interface via the R package rstan.

A Stan and R Implementation

Let’s generate some samples using Stan but rather than generating exactly the model we have given above, instead let’s solve the differential equation and then add some noise. Of course this won’t quite give us samples from the model the parameters of which we wish to estimate but it will allow us to use Stan’s ODE solving capability.

Here’s the Stan

functions {
  real[] pendulum(real t,
                  real[] y,
                  real[] theta,
                  real[] x_r,
                  int[] x_i) {
    real dydt[2];
    dydt[1] <- y[2];
    dydt[2] <- - theta[1] * sin(y[1]);
    return dydt;
  }
}
data {
  int<lower=1> T;
  real y0[2];
  real t0;
  real ts[T];
  real theta[1];
  real sigma[2];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
model {
}
generated quantities {
  real y_hat[T,2];
  y_hat <- integrate_ode(pendulum, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T) {
    y_hat[t,1] <- y_hat[t,1] + normal_rng(0,sigma[1]);
    y_hat[t,2] <- y_hat[t,2] + normal_rng(0,sigma[2]);
  }
}

And here’s the R to invoke it

library(rstan)
library(mvtnorm)

qc1 = 0.0001
deltaT = 0.01
nSamples = 100
m0 = c(1.6, 0)
g = 9.81
t0 = 0.0
ts = seq(deltaT,nSamples * deltaT,deltaT)

bigQ = matrix(c(qc1 * deltaT^3 / 3, qc1 * deltaT^2 / 2,
                qc1 * deltaT^2 / 2,       qc1 * deltaT
                ),
              nrow = 2,
              ncol = 2,
              byrow = TRUE
              )

samples <- stan(file = 'Pendulum.stan',
                data = list (
                    T  = nSamples,
                    y0 = m0,
                    t0 = t0,
                    ts = ts,
                    theta = array(g, dim = 1),
                    sigma = c(bigQ[1,1], bigQ[2,2]),
                    refresh = -1
                ),
                algorithm="Fixed_param",
                seed = 42,
                chains = 1,
                iter =1
                )

We can plot the angle the pendulum subtends to the vertical over time. Note that this is not very noisy.

s <- extract(samples,permuted=FALSE)
plot(s[1,1,1:100])

Now let us suppose that we don’t know the value of g and we can only observe the horizontal displacement.

zStan <- sin(s[1,1,1:nSamples])

Now we can use Stan to infer the value of g.

functions {
  real[] pendulum(real t,
                  real[] y,
                  real[] theta,
                  real[] x_r,
                  int[] x_i) {
    real dydt[2];
    dydt[1] <- y[2];
    dydt[2] <- - theta[1] * sin(y[1]);
    return dydt;
  }
}
data {
  int<lower=1> T;
  real y0[2];
  real z[T];
  real t0;
  real ts[T];
}
transformed data {
  real x_r[0];
  int x_i[0];
}
parameters {
  real theta[1];
  vector<lower=0>[1] sigma;
}
model {
  real y_hat[T,2];
  real z_hat[T];
  theta ~ normal(0,1);
  sigma ~ cauchy(0,2.5);
  y_hat <- integrate_ode(pendulum, y0, t0, ts, theta, x_r, x_i);
  for (t in 1:T) {
    z_hat[t] <- sin(y_hat[t,1]);
    z[t] ~ normal(z_hat[t], sigma);
  }
}

Here’s the R to invoke it.

estimates <- stan(file = 'PendulumInfer.stan',
                  data = list (
                      T  = nSamples,
                      y0 = m0,
                      z  = zStan,
                      t0 = t0,
                      ts = ts
                  ),
                  seed = 42,
                  chains = 1,
                  iter = 1000,
                  warmup = 500,
                  refresh = -1
                  )
e <- extract(estimates,pars=c("theta[1]","sigma[1]","lp__"),permuted=TRUE)

This gives an estiamted valeu for g of 9.809999 which is what we would hope.

Now let’s try adding some noise to our observations.

set.seed(42)
epsilons <- rmvnorm(n=nSamples,mean=c(0.0),sigma=bigR)

zStanNoisy <- sin(s[1,1,1:nSamples] + epsilons[,1])

estimatesNoisy <- stan(file = 'PendulumInfer.stan',
                       data = list (T  = nSamples,
                                    y0 = m0,
                                    z  = zStanNoisy,
                                    t0 = t0,
                                    ts = ts
                                    ),
                       seed = 42,
                       chains = 1,
                       iter = 1000,
                       warmup = 500,
                       refresh = -1
                       )
eNoisy <- extract(estimatesNoisy,pars=c("theta[1]","sigma[1]","lp__"),permuted=TRUE)

This gives an estiamted value for g of 8.5871024 which is ok but not great.

Postamble

To build this page, download the relevant files from github and run this in R:

library(knitr)
knit('Pendulum.Rmd')

And this from command line:

pandoc -s Pendulum.md --filter=./Include > PendulumExpanded.html
Advertisements

One thought on “Inferring Parameters for ODEs using Stan

  1. Thanks for the examples, it’s great to see coded solutions. How would you adapt that model if your observations were only at a subset of time points? Similarly how would you adapt it if you observed noisy observations from say three swings of the pendulum?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s