The Metropolis Algorithm

A Simple Example

In Section 7.1 of Doing Bayesian Data Analysis, John Kruschke gives a simple example of the Metropolis algorithm, in which we generate samples from a distribution without knowing the distribution itself. Of course the example is contrived, as we really do know the distribution. In this particular example, {\cal P}(X = i) = i/k for i = 1…n, where n is some fixed natural number and k = n(n + 1)/2 is the normalising constant. For example, with n = 7 we have k = 28, so {\cal P}(X = 7) = 7/28 = 1/4.

Here’s the algorithm in Haskell. We use the random-fu package and the rvar package for random variables and the random package to supply random numbers.

{-# LANGUAGE ScopedTypeVariables, NoMonomorphismRestriction #-}

import Data.Random
import Data.RVar
import System.Random
import Control.Monad.State
import Data.List

We pick an explicit seed and set n = 7 (in Kruschke’s example this is the number of islands that a politician visits).

seed :: Int
seed = 2
numIslands :: Int
numIslands = 7

And we pick an arbitrary number of jumps to try in the Metropolis algorithm.

n :: Int
n = 11112

We generate proposed jumps which are either one step up or one step down.

proposedJumps :: Int -> [Int]
proposedJumps seed =
  map f $ fst $
  runState (replicateM n $ sampleRVar $ uniform False True) (mkStdGen seed)
    where f False = negate 1
          f True  = 1

And we generate samples from the uniform distribution on [0, 1] which will allow us to determine whether to accept or reject a proposed jump.

acceptOrRejects :: Int -> [Double]
acceptOrRejects seed =
  fst $ runState (replicateM n $ sampleRVar $ uniform 0 1) (mkStdGen seed)

We pretend we only know a measure of how often we visit a given island, but not the normalising constant that would turn this measure into a probability measure.

p :: Int -> Int
p n | n >= 1 && n <= numIslands = n
    | otherwise                 = 0
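
For instance, a quick check in GHCi (this check is not in the original text) shows the measure vanishing off the ends of the island chain:

ghci> map p [0..8]
[0,1,2,3,4,5,6,7,0]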

We define a function which performs one step of the Metropolis algorithm: the proposed jump is accepted with probability min(1, pNew / pCur).

f :: Int -> (Int, Double) -> Int
f currentPosition (proposedJump, acceptOrReject) =
  if acceptOrReject < probAccept
    then
      currentPosition + proposedJump
    else
      currentPosition
  where
    probAccept = min 1 (pNew / pCur)
    pNew = fromIntegral $ p $ currentPosition + proposedJump
    pCur = fromIntegral $ p currentPosition
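
As a quick sanity check (with hand-picked values, not ones from the book), a jump off the end of the island chain is always rejected because p assigns it measure zero, whereas an interior jump is accepted whenever the uniform draw falls below the acceptance probability:

ghci> f 7 (1, 0.99)
7
ghci> f 3 (-1, 0.5)
2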

Let’s try this out, starting at position 3 and discarding a somewhat arbitrary burn-in period (the first tenth of the jumps).

runMC seed = map (/ total) numVisits
  where
    total     = sum numVisits
    numVisits = map (\j -> fromIntegral $ length $ filter (== j) xs)
                    [1 .. numIslands]
    xs        = drop (n `div` 10) $
                scanl f 3 (zip (proposedJumps seed) (acceptOrRejects seed))

We can then compute the root mean square error for this particular sample size.

actual = map (/s) xs
           where
             xs = map fromIntegral [1 .. numIslands]
             s  = sum xs

rmsError seed = sqrt $ (/(fromIntegral numIslands)) $
                sum $ map (^2) $ zipWith (-) actual (runMC seed)
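
Here is a minimal main (not part of the original post) to run the island example, printing the estimated distribution and its root mean square error:

main :: IO ()
main = do
  print $ runMC seed
  print $ rmsError seed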

A Bernoulli Example

We can also code the Metropolis algorithm for the example in which samples are drawn from a Bernoulli distribution. First we define the likelihood of drawing a specific sequence of 0’s and 1’s from a Bernoulli distribution with parameter θ, where z is the number of 1’s and n the length of the sequence. We also define the example sample of data given in Doing Bayesian Data Analysis.

likelihood :: Int -> Int -> Double -> Double
likelihood z n theta = theta^z * (1 - theta)^(n - z)

myData :: [Double]
myData = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

Again we define a function which performs one step of the Metropolis algorithm; this time the unnormalised measure is the likelihood of the data, with θ outside [0, 1] given measure zero.

oneStep currPosition (propJump, acceptOrReject) =
  if acceptOrReject < probAccept
    then
      currPosition + propJump
    else
      currPosition
  where
    probAccept = min 1 (p (currPosition + propJump) / p currPosition)
    p x | x < 0 = 0
        | x > 1 = 0
        | otherwise = pAux myData x
    pAux :: [Double] -> Double -> Double
    pAux xs position = likelihood z n position
      where
        n = length xs
        z = length $ filter (== 1) xs

Finally we need some proposed jumps; following the example we generate these from a normal distribution {\cal N}(0, 0.1), that is, with mean 0 and standard deviation 0.1.

normals :: [Double]
normals =  fst $ runState (replicateM n (sampleRVar (normal 0 0.1)))
                          (mkStdGen seed)

And now we can run the Metropolis algorithm and, for example, find the mean of the posterior.

accepteds = drop (n `div` 10) $
            scanl oneStep 0.5 (zip normals (acceptOrRejects seed))

mean = total / count
         where
           (total, count) = foldl' f (0.0, 0) accepteds
           f (s, c) x = (s+x, c+1)
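
As a sanity check (added here; it is not in the original post): since the target is just the likelihood restricted to [0, 1], i.e. a flat prior, the posterior is {\rm Beta}(z + 1, n - z + 1) = {\rm Beta}(10, 6) with mean 10/16 = 0.625, so the sampled mean should be close to this value.

-- Theoretical posterior mean under the (implicit) flat prior;
-- a check added for this write-up, not part of the original.
postMean :: Double
postMean = (z + 1) / (l + 2)
  where
    z = fromIntegral $ length $ filter (== 1) myData
    l = fromIntegral $ length myData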

Functional R

We can now rewrite the example in R in a more functional way.

## An arbitrary number of jumps to try in Metropolis
## algorithm.
trajLength = 11112

## The data represented as a vector.
myData = c( 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0 )

## An arbitrary seed presumably used in generating random
## values from the various distributions.
set.seed(47405)

## We need some proposed jumps; following the example we
## generate these from a normal distribution.
normals = rep( 0, trajLength )
for (j in 1:trajLength) {
  normals[j] = rnorm( 1, mean = 0, sd = 0.1 )
}

## We generate proposed jumps which are either one step up
## or one step down.
proposedJumps = rep( 0, trajLength )
for (j in 1:trajLength) {
  proposedJumps[j] = if ( runif( 1 ) < 0.5 ) { -1 } else { 1 }
}

## We generate samples from the uniform distribution on
## [0, 1] which will allow us to determine whether to accept
## or reject a proposed jump.
acceptOrRejects = rep( 0, trajLength )
for (j in 1:trajLength) {
  acceptOrRejects[j] = runif( 1 )
}

## The likelihood of drawing z 1's in a sequence of length n
## from a Bernoulli distribution with parameter theta; theta
## outside [0, 1] is given likelihood 0.
likelihood = function( z, n, theta ) {
  x = theta^z * (1 - theta)^(n - z)
  x[ theta > 1 | theta < 0 ] = 0
  return ( x )
}

prior = function( position, xs ) {
  n = length( xs )
  z = sum( xs == 1 )
  return ( likelihood( z, n, position ) )
}

## We define a function which defines one step of the
## Metropolis algorithm.
oneStep = function ( currPosition, propJumpAorR ) {
  proposedJump   = propJumpAorR[1]
  acceptOrReject = propJumpAorR[2]
  probAccept = min( 1, prior( currPosition + proposedJump, myData ) /
                       prior( currPosition, myData ) )
  if ( acceptOrReject < probAccept ) {
    trajectory = currPosition + proposedJump
  } else {
    trajectory = currPosition
  }
  return ( trajectory )
}

## Pair together the proposed jumps and the probability
## whether a given proposal will be accepted or rejected.
nsAorRs <- list()
for (i in 1:trajLength)
  nsAorRs[[i]] <- c( normals[i], acceptOrRejects[i] )

## Fold (well really scanl) over the pairs returning all the
## intermediate results.
trajectory = Reduce( function(a, b) oneStep( a, b ),
                     nsAorRs, accumulate = TRUE, init = 0.5 )
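
Note that Reduce with accumulate = TRUE and an initial value is R's analogue of Haskell's scanl: the result contains the initial position followed by one value per step, so trajectory has trajLength + 1 elements.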

## Drop the values for the burn in period.
burnIn = ceiling( .1 * trajLength )
acceptedTraj = trajectory[ burnIn:length( trajectory ) ]

## Finally return the mean of the posterior.
result = mean ( acceptedTraj )
