# Introduction

Let us see if we can estimate the population growth parameter using MCMC, in the same example for which we previously used Kalman filtering.

We recall the model $\displaystyle \dot{p} = rp\Big(1 - \frac{p}{k}\Big)$, which has the solution $\displaystyle p = \frac{kp_0\exp rt}{k + p_0(\exp rt - 1)}$.

And we are allowed to sample at regular intervals $\displaystyle \begin{aligned} p_i &= \frac{kp_0\exp r\Delta T i}{k + p_0(\exp r\Delta T i - 1)} \\ y_i &= p_i + \epsilon_i \end{aligned}$

In other words $y_i \sim {\cal{N}}(p_i, \sigma^2)$, where $\sigma$ is known so the likelihood is $\displaystyle p(y\,|\,r) \propto \prod_{i=1}^n \exp{\bigg( -\frac{(y_i - p_i)^2}{2\sigma^2}\bigg)} = \exp{\bigg( -\sum_{i=1}^n \frac{(y_i - p_i)^2}{2\sigma^2}\bigg)}$

Let us assume a prior of $r \sim {\cal{N}}(\mu_0,\sigma_0^2)$; then the posterior becomes $\displaystyle p(r\,|\,y) \propto \exp{\bigg( -\frac{(r - \mu_0)^2}{2\sigma_0^2} \bigg)} \exp{\bigg( -\sum_{i=1}^n \frac{(y_i - p_i)^2}{2\sigma^2}\bigg)}$

## Preamble

> {-# OPTIONS_GHC -Wall                      #-}
> {-# OPTIONS_GHC -fno-warn-name-shadowing   #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults    #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind   #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods  #-}
> {-# OPTIONS_GHC -fno-warn-orphans          #-}

> {-# LANGUAGE NoMonomorphismRestriction     #-}

> module PopGrowthMCMC where
>
> import qualified Data.Vector.Unboxed as V
> import Data.Random.Source.PureMT
> import Data.Random
> import Data.Histogram.Fill
> import Data.Histogram.Generic ( Histogram )
> import Control.Monad
> import Control.Monad.State


# Implementation

We assume most of the parameters are known, with the exception of the growth rate $r$. We also fix $r$ in order to generate test data.

> k, p0 :: Double
> k = 1.0
> p0 = 0.1

> r, deltaT :: Double
> r = 10.0
> deltaT = 0.0005

> nObs :: Int
> nObs = 150


Here’s the implementation of the logistic function:

> logit :: Double -> Double -> Double -> Double
> logit p0 k x = k * p0 * (exp x) / (k + p0 * (exp x - 1))
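As a quick sanity check of this closed-form solution (a standalone snippet, not part of the literate module): at $t = 0$ the population equals $p_0$, and for large $t$ it saturates at the carrying capacity $k$.

```haskell
-- Standalone sanity check of the logistic solution: the same function
-- as in the text, evaluated at its two limits.
logit :: Double -> Double -> Double -> Double
logit p0 k x = k * p0 * exp x / (k + p0 * (exp x - 1))

main :: IO ()
main = do
  print (logit 0.1 1.0 0.0)    -- 0.1: the initial population p0
  print (logit 0.1 1.0 100.0)  -- 1.0: the carrying capacity k
```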


Let us create some noisy data.

> sigma :: Double
> sigma = 1e-2

> samples :: [Double]
> samples = zipWith (+) mus epsilons
>   where
>     mus = map (logit p0 k . (* (r * deltaT))) (map fromIntegral [0..])
>     epsilons = evalState (sample $ replicateM nObs $ rvar (Normal 0.0 sigma)) (pureMT 3)


Let us arbitrarily give the prior a rather vague normal distribution.

> mu0, sigma0 :: Double
> mu0 = 5.0
> sigma0 = 1e1

> prior :: Double -> Double
> prior r = exp (-(r - mu0)**2 / (2 * sigma0**2))

> likelihood :: Double -> [Double] -> Double
> likelihood r ys = exp (-sum (zipWith (\y mu -> (y - mu)**2 / (2 * sigma**2)) ys mus))
>   where
>     mus :: [Double]
>     mus = map (logit p0 k . (* (r * deltaT))) (map fromIntegral [0..])

> posterior :: Double -> [Double] -> Double
> posterior r ys = likelihood r ys * prior r


The Metropolis algorithm tells us that we always jump to a better place but only sometimes jump to a worse place. We count the number of acceptances as we go.

> acceptanceProb' :: Double -> Double -> [Double] -> Double
> acceptanceProb' r r' ys = min 1.0 ((posterior r' ys) / (posterior r ys))
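With a small $\sigma$ and many observations, the products of exponentials in the posterior can become very small; a common variant (a sketch, not part of this post's code; `acceptProbLog` and `logPost` are hypothetical names) works with log posteriors and exponentiates only the difference:

```haskell
-- Acceptance probability computed from log posteriors: exponentiating
-- the difference avoids forming a ratio of two possibly-underflowing
-- numbers. `acceptProbLog` is a hypothetical name for illustration.
acceptProbLog :: (Double -> Double) -> Double -> Double -> Double
acceptProbLog logPost r r' = min 1.0 (exp (logPost r' - logPost r))

main :: IO ()
main = do
  let logPost x = -x * x / 2  -- toy log density: standard normal
  print (acceptProbLog logPost 1.0 0.0)  -- 1.0: moving towards the mode
  print (acceptProbLog logPost 0.0 2.0)  -- exp (-2), about 0.135
```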

> oneStep :: (Double, Int) -> (Double, Double) -> (Double, Int)
> oneStep (r, nAccs) (proposedJump, acceptOrReject) =
>   if acceptOrReject < acceptanceProb' r (r + proposedJump) samples
>   then (r + proposedJump, nAccs + 1)
>   else (r, nAccs)


Here are our proposals.

> normalisedProposals :: Int -> Double -> Int -> [Double]
> normalisedProposals seed sigma nIters =
>   evalState (replicateM nIters (sample (Normal 0.0 sigma)))
>   (pureMT $ fromIntegral seed)

We also need samples from the uniform distribution.

> acceptOrRejects :: Int -> Int -> [Double]
> acceptOrRejects seed nIters =
>   evalState (replicateM nIters (sample stdUniform))
>   (pureMT $ fromIntegral seed)


Now we can actually run our simulation. We set the number of jumps and a burn in but do not do any thinning.

> nIters, burnIn :: Int
> nIters = 100000
> burnIn = nIters `div` 10


Let us start our chain at the mean of the prior. In theory this shouldn’t matter: by the time we have burnt in, we should be sampling in the high density region of the distribution.

> startMu :: Double
> startMu = 5.0


This jump size should allow us to sample the region of high density at a reasonable granularity.

> jumpVar :: Double
> jumpVar = 0.01


Now we can test our MCMC implementation.

> test :: Int -> [(Double, Int)]
> test seed =
>   drop burnIn $
>   scanl oneStep (startMu, 0) $
>   zip (normalisedProposals seed jumpVar nIters)
>       (acceptOrRejects seed nIters)
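Since the second component of each chain state is the cumulative acceptance count, the overall acceptance rate falls out of the last element. A small standalone sketch (`acceptanceRate` is our name, not part of the post):

```haskell
-- Acceptance rate of a chain of (sample, cumulative acceptances)
-- pairs; a hypothetical helper, not part of the literate module.
acceptanceRate :: [(Double, Int)] -> Double
acceptanceRate chain =
  fromIntegral (snd (last chain)) / fromIntegral (length chain)

main :: IO ()
main = print (acceptanceRate [(0.1, 0), (0.2, 1), (0.2, 1), (0.3, 2)])
-- prints 0.5: 2 acceptances over 4 steps
```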


We put the data into a histogram.

> numBins :: Int
> numBins = 400

> hb :: HBuilder Double (Histogram V.Vector BinD Double)
> hb = forceDouble -<< mkSimple (binD lower numBins upper)
>   where
>     lower = r - 0.5 * sigma0
>     upper = r + 0.5 * sigma0
>
> hist :: Int -> Histogram V.Vector BinD Double
> hist seed = fillBuilder hb (map fst $ test seed)
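
Besides the histogram, the post-burn-in samples can be summarised numerically, e.g. by their mean and standard deviation. A standalone sketch with hypothetical helper names:

```haskell
-- Mean and (population) standard deviation of the chain samples;
-- hypothetical helpers, not part of the literate module.
chainMean :: [Double] -> Double
chainMean xs = sum xs / fromIntegral (length xs)

chainStdDev :: [Double] -> Double
chainStdDev xs = sqrt (chainMean (map (\x -> (x - m) ** 2) xs))
  where m = chainMean xs

main :: IO ()
main = do
  print (chainMean [1.0, 2.0, 3.0])    -- 2.0
  print (chainStdDev [1.0, 2.0, 3.0])  -- sqrt (2/3), about 0.816
```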


With 50 observations we don’t seem to be very certain about the growth rate; with 100 observations we do much better; and with 150 observations we do better still.