# Introduction

In the 1920s, Lotka (1909) and Volterra (1926) developed a model of a very simple predator-prey ecosystem.

\displaystyle \begin{aligned} \frac{\mathrm{d}N_1}{\mathrm{d}t} & = & \rho_1 N_1 - c_1 N_1 N_2 \label{eq2a} \\ \frac{\mathrm{d}N_2}{\mathrm{d}t} & = & c_2 N_1 N_2 - \rho_2 N2 \label{eq2b} \end{aligned}

Although simple, it turns out that the Canadian lynx and showshoe hare are well represented by such a model. Furthermore, the Hudson Bay Company kept records of how many pelts of each species were trapped for almost a century, giving a good proxy of the population of each species.

We can capture the fact that we do not have a complete model by describing our state of ignorance about the parameters. In order to keep this as simple as possible let us assume that log parameters undergo Brownian motion. That is, we know the parameters will jiggle around and the further into the future we look the less certain we are about what values they will have taken. By making the log parameters undergo Brownian motion, we can also capture our modelling assumption that birth, death and predation rates are always positive. A similar approach is taken in Dureau, Kalogeropoulos, and Baguelin (2013) where the (log) parameters of an epidemiological model are taken to be Ornstein-Uhlenbeck processes (which is biologically more plausible although adds to the complexity of the model, something we wish to avoid in an example such as this).

Andrieu, Doucet, and Holenstein (2010) propose a method to estimate the parameters of such models (Particle Marginal Metropolis Hastings aka PMMH) and the domain specific probabilistic language LibBi (Murray (n.d.)) can be used to apply this (and other inference methods).

For the sake of simplicity, in this blog post, we only model one parameter as being unknown and undergoing Brownian motion. A future blog post will consider more sophisticated scenarios.

# A Dynamical System Aside

The above dynamical system is structurally unstable (more on this in a future post), a possible indication that it should not be considered as a good model of predator–prey interaction. Let us modify this to include carrying capacities for the populations of both species.

\displaystyle \begin{aligned} \frac{\mathrm{d}N_1}{\mathrm{d}t} & = & \rho_1 N_1 \bigg(1 - \frac{N_1}{K_1}\bigg) - c_1 N_1 N_2 \\ \frac{\mathrm{d}N_2}{\mathrm{d}t} & = & -\rho_2 N_2 \bigg(1 + \frac{N_2}{K_2}\bigg) + c_2 N_1 N_2 \end{aligned}

# Data Generation with LibBi

Let’s generate some data using LibBi.

// Generate data assuming a fixed growth rate for hares rather than
// e.g. a growth rate that undergoes Brownian motion.

model PP {
const h         = 0.1;    // time step
const delta_abs = 1.0e-3; // absolute error tolerance
const delta_rel = 1.0e-6; // relative error tolerance

const a  = 5.0e-1 // Hare growth rate
const k1 = 2.0e2  // Hare carrying capacity
const b  = 2.0e-2 // Hare death rate per lynx
const d  = 4.0e-1 // Lynx death rate
const k2 = 2.0e1  // Lynx carrying capacity
const c  = 4.0e-3 // Lynx birth rate per hare

state P, Z       // Hares and lynxes
state ln_alpha   // Hare growth rate - we express it in log form for
// consistency with the inference model
obs P_obs        // Observations of hares

sub initial {
P ~ log_normal(log(100.0), 0.2)
Z ~ log_normal(log(50.0), 0.1)
}

sub transition(delta = h) {
ode(h = h, atoler = delta_abs, rtoler = delta_rel, alg = 'RK4(3)') {
dP/dt =  a * P * (1 - P / k1) - b * P * Z
dZ/dt = -d * Z * (1 + Z / k2) + c * P * Z
}
}

sub observation {
P_obs ~ log_normal(log(P), 0.1)
}
}

We can look at phase space starting with different populations and see they all converge to the same fixed point.

Since at some point in the future, I plan to produce Haskell versions of the methods given in Andrieu, Doucet, and Holenstein (2010), let’s generate the data using Haskell.

> {-# OPTIONS_GHC -Wall                     #-}

> module LotkaVolterra (
>     solLv
>   , solPp
>   , h0
>   , l0
>   , baz
>   , logBM
>   , eulerEx
>   )where

> import Numeric.GSL.ODE
> import Numeric.LinearAlgebra

> import Data.Random.Source.PureMT
> import Data.Random hiding ( gamma )


Here’s the unstable model.

> lvOde :: Double ->
>          Double ->
>          Double ->
>          Double ->
>          Double ->
>          [Double] ->
>          [Double]
> lvOde rho1 c1 rho2 c2 _t [h, l] =
>   [
>     rho1 * h - c1 * h * l
>   , c2 * h * l - rho2 * l
>   ]
> lvOde _rho1 _c1 _rho2 _c2 _t vars =
>   error $"lvOde called with: " ++ show (length vars) ++ " variable"  > rho1, c1, rho2, c2 :: Double > rho1 = 0.5 > c1 = 0.02 > rho2 = 0.4 > c2 = 0.004  > deltaT :: Double > deltaT = 0.1  > solLv :: Matrix Double > solLv = odeSolve (lvOde rho1 c1 rho2 c2) > [50.0, 50.0] > (fromList [0.0, deltaT .. 50])  And here’s the stable model. > ppOde :: Double -> > Double -> > Double -> > Double -> > Double -> > Double -> > Double -> > [Double] -> > [Double] > ppOde a k1 b d k2 c _t [p, z] = > [ > a * p * (1 - p / k1) - b * p * z > , -d * z * (1 + z / k2) + c * p * z > ] > ppOde _a _k1 _b _d _k2 _c _t vars = > error$ "ppOde called with: " ++ show (length vars) ++ " variable"

> a, k1, b, d, k2, c :: Double
> a = 0.5
> k1 = 200.0
> b = 0.02
> d = 0.4
> k2 = 50.0
> c = 0.004

> solPp :: Double -> Double -> Matrix Double
> solPp x y = odeSolve (ppOde a k1 b d k2 c)
>                  [x, y]
>                  (fromList [0.0, deltaT .. 50])

> gamma, alpha, beta :: Double
> gamma = d / a
> alpha = a / (c * k1)
> beta  = d / (a * k2)

> fp :: (Double, Double)
> fp = ((gamma + beta) / (1 + alpha * beta), (1 - gamma * alpha) / (1 + alpha * beta))

> h0, l0 :: Double
> h0 = a * fst fp / c
> l0 = a * snd fp / b

> foo, bar :: Matrix R
> foo = matrix 2 [a / k1, b, c, negate d / k2]
> bar = matrix 1 [a, d]

> baz :: Maybe (Matrix R)
> baz = linearSolve foo bar


This gives a stable fixed point of

ghci> baz
Just (2><1)
[ 120.00000000000001
,               10.0 ]


Here’s an example of convergence to that fixed point in phase space.

## The Stochastic Model

Let us now assume that the Hare growth parameter undergoes Brownian motion so that the further into the future we go, the less certain we are about it. In order to ensure that this parameter remains positive, let’s model the log of it to be Brownian motion.

\displaystyle \begin{aligned} \frac{\mathrm{d}N_1}{\mathrm{d}t} & = & \rho_1 N_1 \bigg(1 - \frac{N_1}{K_1}\bigg) - c_1 N_1 N_2 \\ \frac{\mathrm{d}N_2}{\mathrm{d}t} & = & -\rho_2 N_2 \bigg(1 + \frac{N_2}{K_2}\bigg) + c_2 N_1 N_2 \\ \mathrm{d} \rho_1 & = & \rho_1 \sigma_{\rho_1} \mathrm{d}W_t \end{aligned}

where the final equation is a stochastic differential equation with $W_t$ being a Wiener process.

By Itô we have

$\displaystyle \mathrm{d} (\log{\rho_1}) = - \frac{\sigma_{\rho_1}^2}{2} \mathrm{d} t + \sigma_{\rho_1} \mathrm{d}W_t$

We can use this to generate paths for $\rho_1$.

$\displaystyle \rho_1(t + \Delta t) = \rho_1(t)\exp{\bigg(- \frac{\sigma_{\rho_1}^2}{2} \Delta t + \sigma_{\rho_1} \sqrt{\Delta t} Z\bigg)}$

where $Z \sim {\mathcal{N}}(0,1)$.

> oneStepLogBM :: MonadRandom m => Double -> Double -> Double -> m Double
> oneStepLogBM deltaT sigma rhoPrev = do
>   x <- sample $rvar StdNormal > return$ rhoPrev * exp(sigma * (sqrt deltaT) * x - 0.5 * sigma * sigma * deltaT)

> iterateM :: Monad m => (a -> m a) -> m a -> Int -> m [a]
> iterateM f mx n = sequence . take n . iterate (>>= f) $mx  > logBMM :: MonadRandom m => Double -> Double -> Int -> Int -> m [Double] > logBMM initRho sigma n m = > iterateM (oneStepLogBM (recip$ fromIntegral n) sigma) (return initRho) (n * m)

> logBM :: Double -> Double -> Int -> Int -> Int -> [Double]
> logBM initRho sigma n m seed =
>   evalState (logBMM initRho sigma n m) (pureMT $fromIntegral seed)  We can see the further we go into the future the less certain we are about the value of the parameter. Using this we can simulate the whole dynamical system which is now a stochastic process. > f1, f2 :: Double -> Double -> Double -> > Double -> Double -> > Double > f1 a k1 b p z = a * p * (1 - p / k1) - b * p * z > f2 d k2 c p z = -d * z * (1 + z / k2) + c * p * z  > oneStepEuler :: MonadRandom m => > Double -> > Double -> > Double -> Double -> > Double -> Double -> Double -> > (Double, Double, Double) -> > m (Double, Double, Double) > oneStepEuler deltaT sigma k1 b d k2 c (rho1Prev, pPrev, zPrev) = do > let pNew = pPrev + deltaT * f1 (exp rho1Prev) k1 b pPrev zPrev > let zNew = zPrev + deltaT * f2 d k2 c pPrev zPrev > rho1New <- oneStepLogBM deltaT sigma rho1Prev > return (rho1New, pNew, zNew)  > euler :: MonadRandom m => > (Double, Double, Double) -> > Double -> > Double -> Double -> > Double -> Double -> Double -> > Int -> Int -> > m [(Double, Double, Double)] > euler stateInit sigma k1 b d k2 c n m = > iterateM (oneStepEuler (recip$ fromIntegral n) sigma k1 b d k2 c)
>            (return stateInit)
>            (n * m)

> eulerEx :: (Double, Double, Double) ->
>            Double -> Int -> Int -> Int ->
>            [(Double, Double, Double)]
> eulerEx stateInit sigma n m seed =
>   evalState (euler stateInit sigma k1 b d k2 c n m) (pureMT $fromIntegral seed)  We see that the populations become noisier the further into the future we go. Notice that the second order effects of the system are now to some extent captured by the fact that the growth rate of Hares can drift. In our simulation, this is demonstrated by our decreasing lack of knowledge the further we look into the future. # Inference Now let us infer the growth rate using PMMH. Here’s the model expressed in LibBi. // Infer growth rate for hares model PP { const h = 0.1; // time step const delta_abs = 1.0e-3; // absolute error tolerance const delta_rel = 1.0e-6; // relative error tolerance const a = 5.0e-1 // Hare growth rate - superfluous for inference // but a reminder of what we should expect const k1 = 2.0e2 // Hare carrying capacity const b = 2.0e-2 // Hare death rate per lynx const d = 4.0e-1 // Lynx death rate const k2 = 2.0e1 // Lynx carrying capacity const c = 4.0e-3 // Lynx birth rate per hare state P, Z // Hares and lynxes state ln_alpha // Hare growth rate - we express it in log form for // consistency with the inference model obs P_obs // Observations of hares param mu, sigma // Mean and standard deviation of hare growth rate noise w // Noise sub parameter { mu ~ uniform(0.0, 1.0) sigma ~ uniform(0.0, 0.5) } sub proposal_parameter { mu ~ truncated_gaussian(mu, 0.02, 0.0, 1.0); sigma ~ truncated_gaussian(sigma, 0.01, 0.0, 0.5); } sub initial { P ~ log_normal(log(100.0), 0.2) Z ~ log_normal(log(50.0), 0.1) ln_alpha ~ gaussian(log(mu), sigma) } sub transition(delta = h) { w ~ normal(0.0, sqrt(h)); ode(h = h, atoler = delta_abs, rtoler = delta_rel, alg = 'RK4(3)') { dP/dt = exp(ln_alpha) * P * (1 - P / k1) - b * P * Z dZ/dt = -d * Z * (1 + Z / k2) + c * P * Z dln_alpha/dt = -sigma * sigma / 2 - sigma * w / h } } sub observation { P_obs ~ log_normal(log(P), 0.1) } } Let’s look at the posteriors of the hyper-parameters for the Hare growth parameter. The estimate for $\mu$ is pretty decent. For our generated data, $\sigma =0$ and given our observations are quite noisy maybe the estimate for this is not too bad also. # Appendix: The R Driving Code All code including the R below can be downloaded from github but make sure you use the straight-libbi branch and not master. install.packages("devtools") library(devtools) install_github("sbfnk/RBi",ref="master") install_github("sbfnk/RBi.helpers",ref="master") rm(list = ls(all.names=TRUE)) unlink(".RData") library('RBi') try(detach(package:RBi, unload = TRUE), silent = TRUE) library(RBi, quietly = TRUE) library('RBi.helpers') library('ggplot2', quietly = TRUE) library('gridExtra', quietly = TRUE) endTime <- 50 PP <- bi_model("PP.bi") synthetic_dataset_PP <- bi_generate_dataset(endtime=endTime, model=PP, seed="42", verbose=TRUE, add_options = list( noutputs=500)) rdata_PP <- bi_read(synthetic_dataset_PP) df <- data.frame(rdata_PP$P$nr, rdata_PP$P$value, rdata_PP$Z$value, rdata_PP$P_obs$value) ggplot(df, aes(rdata_PP$P$nr, y = Population, color = variable), size = 0.1) + geom_line(aes(y = rdata_PP$P$value, col = "Hare"), size = 0.1) + geom_line(aes(y = rdata_PP$Z$value, col = "Lynx"), size = 0.1) + geom_point(aes(y = rdata_PP$P_obs$value, col = "Observations"), size = 0.1) + theme(legend.position="none") + ggtitle("Example Data") + xlab("Days") + theme(axis.text=element_text(size=4), axis.title=element_text(size=6,face="bold")) + theme(plot.title = element_text(size=10)) ggsave(filename="diagrams/LVdata.png",width=4,height=3) synthetic_dataset_PP1 <- bi_generate_dataset(endtime=endTime, model=PP, init = list(P = 100, Z=50), seed="42", verbose=TRUE, add_options = list( noutputs=500)) rdata_PP1 <- bi_read(synthetic_dataset_PP1) synthetic_dataset_PP2 <- bi_generate_dataset(endtime=endTime, model=PP, init = list(P = 150, Z=25), seed="42", verbose=TRUE, add_options = list( noutputs=500)) rdata_PP2 <- bi_read(synthetic_dataset_PP2) df1 <- data.frame(hare = rdata_PP$P$value, lynx = rdata_PP$Z$value, hare1 = rdata_PP1$P$value, lynx1 = rdata_PP1$Z$value, hare2 = rdata_PP2$P$value, lynx2 = rdata_PP2$Z$value) ggplot(df1) + geom_path(aes(x=df1$hare,  y=df1$lynx, col = "0"), size = 0.1) + geom_path(aes(x=df1$hare1, y=df1$lynx1, col = "1"), size = 0.1) + geom_path(aes(x=df1$hare2, y=df1$lynx2, col = "2"), size = 0.1) + theme(legend.position="none") + ggtitle("Phase Space") + xlab("Hare") + ylab("Lynx") + theme(axis.text=element_text(size=4), axis.title=element_text(size=6,face="bold")) + theme(plot.title = element_text(size=10)) ggsave(filename="diagrams/PPviaLibBi.png",width=4,height=3) PPInfer <- bi_model("PPInfer.bi") bi_object_PP <- libbi(client="sample", model=PPInfer, obs = synthetic_dataset_PP) bi_object_PP$run(add_options = list(
"end-time" = endTime,
noutputs = endTime,
nsamples = 4000,
nparticles = 128,
seed=42,
## verbose = TRUE,
stdoutput_file_name = tempfile(pattern="pmmhoutput", fileext=".txt"))

bi_file_summary(bi_object_PP$result$output_file_name)

mu <- bi_read(bi_object_PP, "mu")$value g1 <- qplot(x = mu[2001:4000], y = ..density.., geom = "histogram") + xlab(expression(mu)) sigma <- bi_read(bi_object_PP, "sigma")$value
g2 <- qplot(x = sigma[2001:4000], y = ..density.., geom = "histogram") + xlab(expression(sigma))
g3 <- grid.arrange(g1, g2)
ggsave(plot=g3,filename="diagrams/LvPosterior.png",width=4,height=3)

df2 <- data.frame(hareActs = rdata_PP$P$value,
hareObs  = rdata_PP$P_obs$value)

ggplot(df, aes(rdata_PP$P$nr, y = value, color = variable)) +
geom_line(aes(y = rdata_PP$P$value, col = "Phyto")) +
geom_line(aes(y = rdata_PP$Z$value, col = "Zoo")) +
geom_point(aes(y = rdata_PP$P_obs$value, col = "Phyto Obs"))

ln_alpha <- bi_read(bi_object_PP, "ln_alpha")$value P <- matrix(bi_read(bi_object_PP, "P")$value,nrow=51,byrow=TRUE)
Z <- matrix(bi_read(bi_object_PP, "Z")$value,nrow=51,byrow=TRUE) data50 <- bi_generate_dataset(endtime=endTime, model=PP, seed="42", verbose=TRUE, add_options = list( noutputs=50)) rdata50 <- bi_read(data50) df3 <- data.frame(days = c(1:51), hares = rowMeans(P), lynxes = rowMeans(Z), actHs = rdata50$P$value, actLs = rdata50$Z\$value)

ggplot(df3) +
geom_line(aes(x = days, y = hares, col = "Est Phyto")) +
geom_line(aes(x = days, y = lynxes, col = "Est Zoo")) +
geom_line(aes(x = days, y = actHs, col = "Act Phyto")) +
geom_line(aes(x = days, y = actLs, col = "Act Zoo"))

# Bibliography

Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein. 2010. “Particle Markov chain Monte Carlo methods.” Journal of the Royal Statistical Society. Series B: Statistical Methodology 72 (3): 269–342. doi:10.1111/j.1467-9868.2009.00736.x.

Dureau, Joseph, Konstantinos Kalogeropoulos, and Marc Baguelin. 2013. “Capturing the time-varying drivers of an epidemic using stochastic dynamical systems.” Biostatistics (Oxford, England) 14 (3): 541–55. doi:10.1093/biostatistics/kxs052.

Lotka, Alfred J. 1909. “Contribution to the Theory of Periodic Reactions.” The Journal of Physical Chemistry 14 (3): 271–74. doi:10.1021/j150111a004.

Murray, Lawrence M. n.d. “Bayesian State-Space Modelling on High-Performance Hardware Using LibBi.”

Volterra, Vito. 1926. “Variazioni e fluttuazioni del numero d’individui in specie animali conviventi.” Memorie Della R. Accademia Dei Lincei 6 (2): 31–113. http://www.liberliber.it/biblioteca/v/volterra/variazioni{\_}e{\_}fluttuazioni/pdf/volterra{\_}variazioni{\_}e{\_}fluttuazioni.pdf.