# Introduction

In the 1920s, Lotka (1909) and Volterra (1926) developed a model of a very simple predator-prey ecosystem.

Although simple, it turns out that the Canadian lynx and showshoe hare are well represented by such a model. Furthermore, the Hudson Bay Company kept records of how many pelts of each species were trapped for almost a century, giving a good proxy of the population of each species.

We can capture the fact that we do not have a complete model by describing our state of ignorance about the parameters. In order to keep this as simple as possible let us assume that log parameters undergo Brownian motion. That is, we know the parameters will jiggle around and the further into the future we look the less certain we are about what values they will have taken. By making the log parameters undergo Brownian motion, we can also capture our modelling assumption that birth, death and predation rates are always positive. A similar approach is taken in Dureau, Kalogeropoulos, and Baguelin (2013) where the (log) parameters of an epidemiological model are taken to be Ornstein-Uhlenbeck processes (which is biologically more plausible although adds to the complexity of the model, something we wish to avoid in an example such as this).

Andrieu, Doucet, and Holenstein (2010) propose a method to estimate the parameters of such models (Particle Marginal Metropolis Hastings aka PMMH) and the domain specific probabilistic language LibBi (Murray (n.d.)) can be used to apply this (and other inference methods).

For the sake of simplicity, in this blog post, we only model one parameter as being unknown and undergoing Brownian motion. A future blog post will consider more sophisticated scenarios.

# A Dynamical System Aside

The above dynamical system is structurally unstable (more on this in a future post), a possible indication that it should not be considered as a good model of predator–prey interaction. Let us modify this to include carrying capacities for the populations of both species.

# Data Generation with LibBi

Let’s generate some data using LibBi.

```
// Generate data assuming a fixed growth rate for hares rather than
// e.g. a growth rate that undergoes Brownian motion.
model PP {
const h = 0.1; // time step
const delta_abs = 1.0e-3; // absolute error tolerance
const delta_rel = 1.0e-6; // relative error tolerance
const a = 5.0e-1 // Hare growth rate
const k1 = 2.0e2 // Hare carrying capacity
const b = 2.0e-2 // Hare death rate per lynx
const d = 4.0e-1 // Lynx death rate
const k2 = 2.0e1 // Lynx carrying capacity
const c = 4.0e-3 // Lynx birth rate per hare
state P, Z // Hares and lynxes
state ln_alpha // Hare growth rate - we express it in log form for
// consistency with the inference model
obs P_obs // Observations of hares
sub initial {
P ~ log_normal(log(100.0), 0.2)
Z ~ log_normal(log(50.0), 0.1)
}
sub transition(delta = h) {
ode(h = h, atoler = delta_abs, rtoler = delta_rel, alg = 'RK4(3)') {
dP/dt = a * P * (1 - P / k1) - b * P * Z
dZ/dt = -d * Z * (1 + Z / k2) + c * P * Z
}
}
sub observation {
P_obs ~ log_normal(log(P), 0.1)
}
}
```

We can look at phase space starting with different populations and see they all converge to the same fixed point.

# Data Generation with Haskell

Since at some point in the future, I plan to produce Haskell versions of the methods given in Andrieu, Doucet, and Holenstein (2010), let’s generate the data using Haskell.

```
> {-# OPTIONS_GHC -Wall #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing #-}
```

```
> module LotkaVolterra (
> solLv
> , solPp
> , h0
> , l0
> , baz
> , logBM
> , eulerEx
> )where
```

```
> import Numeric.GSL.ODE
> import Numeric.LinearAlgebra
```

```
> import Data.Random.Source.PureMT
> import Data.Random hiding ( gamma )
> import Control.Monad.State
```

Here’s the unstable model.

```
> lvOde :: Double ->
> Double ->
> Double ->
> Double ->
> Double ->
> [Double] ->
> [Double]
> lvOde rho1 c1 rho2 c2 _t [h, l] =
> [
> rho1 * h - c1 * h * l
> , c2 * h * l - rho2 * l
> ]
> lvOde _rho1 _c1 _rho2 _c2 _t vars =
> error $ "lvOde called with: " ++ show (length vars) ++ " variable"
```

```
> rho1, c1, rho2, c2 :: Double
> rho1 = 0.5
> c1 = 0.02
> rho2 = 0.4
> c2 = 0.004
```

```
> deltaT :: Double
> deltaT = 0.1
```

```
> solLv :: Matrix Double
> solLv = odeSolve (lvOde rho1 c1 rho2 c2)
> [50.0, 50.0]
> (fromList [0.0, deltaT .. 50])
```

And here’s the stable model.

```
> ppOde :: Double ->
> Double ->
> Double ->
> Double ->
> Double ->
> Double ->
> Double ->
> [Double] ->
> [Double]
> ppOde a k1 b d k2 c _t [p, z] =
> [
> a * p * (1 - p / k1) - b * p * z
> , -d * z * (1 + z / k2) + c * p * z
> ]
> ppOde _a _k1 _b _d _k2 _c _t vars =
> error $ "ppOde called with: " ++ show (length vars) ++ " variable"
```

```
> a, k1, b, d, k2, c :: Double
> a = 0.5
> k1 = 200.0
> b = 0.02
> d = 0.4
> k2 = 50.0
> c = 0.004
```

```
> solPp :: Double -> Double -> Matrix Double
> solPp x y = odeSolve (ppOde a k1 b d k2 c)
> [x, y]
> (fromList [0.0, deltaT .. 50])
```

```
> gamma, alpha, beta :: Double
> gamma = d / a
> alpha = a / (c * k1)
> beta = d / (a * k2)
```

```
> fp :: (Double, Double)
> fp = ((gamma + beta) / (1 + alpha * beta), (1 - gamma * alpha) / (1 + alpha * beta))
```

```
> h0, l0 :: Double
> h0 = a * fst fp / c
> l0 = a * snd fp / b
```

```
> foo, bar :: Matrix R
> foo = matrix 2 [a / k1, b, c, negate d / k2]
> bar = matrix 1 [a, d]
```

```
> baz :: Maybe (Matrix R)
> baz = linearSolve foo bar
```

This gives a stable fixed point of

```
ghci> baz
Just (2><1)
[ 120.00000000000001
, 10.0 ]
```

Here’s an example of convergence to that fixed point in phase space.

## The Stochastic Model

Let us now assume that the Hare growth parameter undergoes Brownian motion so that the further into the future we go, the less certain we are about it. In order to ensure that this parameter remains positive, let’s model the log of it to be Brownian motion.

where the final equation is a stochastic differential equation with being a Wiener process.

By Itô we have

We can use this to generate paths for .

where .

```
> oneStepLogBM :: MonadRandom m => Double -> Double -> Double -> m Double
> oneStepLogBM deltaT sigma rhoPrev = do
> x <- sample $ rvar StdNormal
> return $ rhoPrev * exp(sigma * (sqrt deltaT) * x - 0.5 * sigma * sigma * deltaT)
```

```
> iterateM :: Monad m => (a -> m a) -> m a -> Int -> m [a]
> iterateM f mx n = sequence . take n . iterate (>>= f) $ mx
```

```
> logBMM :: MonadRandom m => Double -> Double -> Int -> Int -> m [Double]
> logBMM initRho sigma n m =
> iterateM (oneStepLogBM (recip $ fromIntegral n) sigma) (return initRho) (n * m)
```

```
> logBM :: Double -> Double -> Int -> Int -> Int -> [Double]
> logBM initRho sigma n m seed =
> evalState (logBMM initRho sigma n m) (pureMT $ fromIntegral seed)
```

We can see the further we go into the future the less certain we are about the value of the parameter.

Using this we can simulate the whole dynamical system which is now a stochastic process.

```
> f1, f2 :: Double -> Double -> Double ->
> Double -> Double ->
> Double
> f1 a k1 b p z = a * p * (1 - p / k1) - b * p * z
> f2 d k2 c p z = -d * z * (1 + z / k2) + c * p * z
```

```
> oneStepEuler :: MonadRandom m =>
> Double ->
> Double ->
> Double -> Double ->
> Double -> Double -> Double ->
> (Double, Double, Double) ->
> m (Double, Double, Double)
> oneStepEuler deltaT sigma k1 b d k2 c (rho1Prev, pPrev, zPrev) = do
> let pNew = pPrev + deltaT * f1 (exp rho1Prev) k1 b pPrev zPrev
> let zNew = zPrev + deltaT * f2 d k2 c pPrev zPrev
> rho1New <- oneStepLogBM deltaT sigma rho1Prev
> return (rho1New, pNew, zNew)
```

```
> euler :: MonadRandom m =>
> (Double, Double, Double) ->
> Double ->
> Double -> Double ->
> Double -> Double -> Double ->
> Int -> Int ->
> m [(Double, Double, Double)]
> euler stateInit sigma k1 b d k2 c n m =
> iterateM (oneStepEuler (recip $ fromIntegral n) sigma k1 b d k2 c)
> (return stateInit)
> (n * m)
```

```
> eulerEx :: (Double, Double, Double) ->
> Double -> Int -> Int -> Int ->
> [(Double, Double, Double)]
> eulerEx stateInit sigma n m seed =
> evalState (euler stateInit sigma k1 b d k2 c n m) (pureMT $ fromIntegral seed)
```

We see that the populations become noisier the further into the future we go.

Notice that the second order effects of the system are now to some extent captured by the fact that the growth rate of Hares can drift. In our simulation, this is demonstrated by our decreasing lack of knowledge the further we look into the future.

# Inference

Now let us infer the growth rate using PMMH. Here’s the model expressed in LibBi.

```
// Infer growth rate for hares
model PP {
const h = 0.1; // time step
const delta_abs = 1.0e-3; // absolute error tolerance
const delta_rel = 1.0e-6; // relative error tolerance
const a = 5.0e-1 // Hare growth rate - superfluous for inference
// but a reminder of what we should expect
const k1 = 2.0e2 // Hare carrying capacity
const b = 2.0e-2 // Hare death rate per lynx
const d = 4.0e-1 // Lynx death rate
const k2 = 2.0e1 // Lynx carrying capacity
const c = 4.0e-3 // Lynx birth rate per hare
state P, Z // Hares and lynxes
state ln_alpha // Hare growth rate - we express it in log form for
// consistency with the inference model
obs P_obs // Observations of hares
param mu, sigma // Mean and standard deviation of hare growth rate
noise w // Noise
sub parameter {
mu ~ uniform(0.0, 1.0)
sigma ~ uniform(0.0, 0.5)
}
sub proposal_parameter {
mu ~ truncated_gaussian(mu, 0.02, 0.0, 1.0);
sigma ~ truncated_gaussian(sigma, 0.01, 0.0, 0.5);
}
sub initial {
P ~ log_normal(log(100.0), 0.2)
Z ~ log_normal(log(50.0), 0.1)
ln_alpha ~ gaussian(log(mu), sigma)
}
sub transition(delta = h) {
w ~ normal(0.0, sqrt(h));
ode(h = h, atoler = delta_abs, rtoler = delta_rel, alg = 'RK4(3)') {
dP/dt = exp(ln_alpha) * P * (1 - P / k1) - b * P * Z
dZ/dt = -d * Z * (1 + Z / k2) + c * P * Z
dln_alpha/dt = -sigma * sigma / 2 - sigma * w / h
}
}
sub observation {
P_obs ~ log_normal(log(P), 0.1)
}
}
```

Let’s look at the posteriors of the hyper-parameters for the Hare growth parameter.

The estimate for is pretty decent. For our generated data, and given our observations are quite noisy maybe the estimate for this is not too bad also.

# Appendix: The R Driving Code

All code including the R below can be downloaded from github but make sure you use the *straight-libbi* branch and *not* master.

```
install.packages("devtools")
library(devtools)
install_github("sbfnk/RBi",ref="master")
install_github("sbfnk/RBi.helpers",ref="master")
rm(list = ls(all.names=TRUE))
unlink(".RData")
library('RBi')
try(detach(package:RBi, unload = TRUE), silent = TRUE)
library(RBi, quietly = TRUE)
library('RBi.helpers')
library('ggplot2', quietly = TRUE)
library('gridExtra', quietly = TRUE)
endTime <- 50
PP <- bi_model("PP.bi")
synthetic_dataset_PP <- bi_generate_dataset(endtime=endTime,
model=PP,
seed="42",
verbose=TRUE,
add_options = list(
noutputs=500))
rdata_PP <- bi_read(synthetic_dataset_PP)
df <- data.frame(rdata_PP$P$nr,
rdata_PP$P$value,
rdata_PP$Z$value,
rdata_PP$P_obs$value)
ggplot(df, aes(rdata_PP$P$nr, y = Population, color = variable), size = 0.1) +
geom_line(aes(y = rdata_PP$P$value, col = "Hare"), size = 0.1) +
geom_line(aes(y = rdata_PP$Z$value, col = "Lynx"), size = 0.1) +
geom_point(aes(y = rdata_PP$P_obs$value, col = "Observations"), size = 0.1) +
theme(legend.position="none") +
ggtitle("Example Data") +
xlab("Days") +
theme(axis.text=element_text(size=4),
axis.title=element_text(size=6,face="bold")) +
theme(plot.title = element_text(size=10))
ggsave(filename="diagrams/LVdata.png",width=4,height=3)
synthetic_dataset_PP1 <- bi_generate_dataset(endtime=endTime,
model=PP,
init = list(P = 100, Z=50),
seed="42",
verbose=TRUE,
add_options = list(
noutputs=500))
rdata_PP1 <- bi_read(synthetic_dataset_PP1)
synthetic_dataset_PP2 <- bi_generate_dataset(endtime=endTime,
model=PP,
init = list(P = 150, Z=25),
seed="42",
verbose=TRUE,
add_options = list(
noutputs=500))
rdata_PP2 <- bi_read(synthetic_dataset_PP2)
df1 <- data.frame(hare = rdata_PP$P$value,
lynx = rdata_PP$Z$value,
hare1 = rdata_PP1$P$value,
lynx1 = rdata_PP1$Z$value,
hare2 = rdata_PP2$P$value,
lynx2 = rdata_PP2$Z$value)
ggplot(df1) +
geom_path(aes(x=df1$hare, y=df1$lynx, col = "0"), size = 0.1) +
geom_path(aes(x=df1$hare1, y=df1$lynx1, col = "1"), size = 0.1) +
geom_path(aes(x=df1$hare2, y=df1$lynx2, col = "2"), size = 0.1) +
theme(legend.position="none") +
ggtitle("Phase Space") +
xlab("Hare") +
ylab("Lynx") +
theme(axis.text=element_text(size=4),
axis.title=element_text(size=6,face="bold")) +
theme(plot.title = element_text(size=10))
ggsave(filename="diagrams/PPviaLibBi.png",width=4,height=3)
PPInfer <- bi_model("PPInfer.bi")
bi_object_PP <- libbi(client="sample", model=PPInfer, obs = synthetic_dataset_PP)
bi_object_PP$run(add_options = list(
"end-time" = endTime,
noutputs = endTime,
nsamples = 4000,
nparticles = 128,
seed=42,
nthreads = 1),
## verbose = TRUE,
stdoutput_file_name = tempfile(pattern="pmmhoutput", fileext=".txt"))
bi_file_summary(bi_object_PP$result$output_file_name)
mu <- bi_read(bi_object_PP, "mu")$value
g1 <- qplot(x = mu[2001:4000], y = ..density.., geom = "histogram") + xlab(expression(mu))
sigma <- bi_read(bi_object_PP, "sigma")$value
g2 <- qplot(x = sigma[2001:4000], y = ..density.., geom = "histogram") + xlab(expression(sigma))
g3 <- grid.arrange(g1, g2)
ggsave(plot=g3,filename="diagrams/LvPosterior.png",width=4,height=3)
df2 <- data.frame(hareActs = rdata_PP$P$value,
hareObs = rdata_PP$P_obs$value)
ggplot(df, aes(rdata_PP$P$nr, y = value, color = variable)) +
geom_line(aes(y = rdata_PP$P$value, col = "Phyto")) +
geom_line(aes(y = rdata_PP$Z$value, col = "Zoo")) +
geom_point(aes(y = rdata_PP$P_obs$value, col = "Phyto Obs"))
ln_alpha <- bi_read(bi_object_PP, "ln_alpha")$value
P <- matrix(bi_read(bi_object_PP, "P")$value,nrow=51,byrow=TRUE)
Z <- matrix(bi_read(bi_object_PP, "Z")$value,nrow=51,byrow=TRUE)
data50 <- bi_generate_dataset(endtime=endTime,
model=PP,
seed="42",
verbose=TRUE,
add_options = list(
noutputs=50))
rdata50 <- bi_read(data50)
df3 <- data.frame(days = c(1:51), hares = rowMeans(P), lynxes = rowMeans(Z),
actHs = rdata50$P$value, actLs = rdata50$Z$value)
ggplot(df3) +
geom_line(aes(x = days, y = hares, col = "Est Phyto")) +
geom_line(aes(x = days, y = lynxes, col = "Est Zoo")) +
geom_line(aes(x = days, y = actHs, col = "Act Phyto")) +
geom_line(aes(x = days, y = actLs, col = "Act Zoo"))
```

# Bibliography

Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein. 2010. “Particle Markov chain Monte Carlo methods.” *Journal of the Royal Statistical Society. Series B: Statistical Methodology* 72 (3): 269–342. doi:10.1111/j.1467-9868.2009.00736.x.

Dureau, Joseph, Konstantinos Kalogeropoulos, and Marc Baguelin. 2013. “Capturing the time-varying drivers of an epidemic using stochastic dynamical systems.” *Biostatistics (Oxford, England)* 14 (3): 541–55. doi:10.1093/biostatistics/kxs052.

Lotka, Alfred J. 1909. “Contribution to the Theory of Periodic Reactions.” *The Journal of Physical Chemistry* 14 (3): 271–74. doi:10.1021/j150111a004.

Murray, Lawrence M. n.d. “Bayesian State-Space Modelling on High-Performance Hardware Using LibBi.”

Volterra, Vito. 1926. “Variazioni e fluttuazioni del numero d’individui in specie animali conviventi.” *Memorie Della R. Accademia Dei Lincei* 6 (2): 31–113. http://www.liberliber.it/biblioteca/v/volterra/variazioni{\_}e{\_}fluttuazioni/pdf/volterra{\_}variazioni{\_}e{\_}fluttuazioni.pdf.