Introduction
The equation of motion for a pendulum of unit length subject to Gaussian white noise is
We can discretize this via the usual Euler method
where and
The explanation of the precise form of the covariance matrix will be the subject of another blog post; for the purpose of exposition of forward filtering / backward smoothing, this detail is not important.
Assume that we can only measure the horizontal position of the pendulum and further that this measurement is subject to error so that
where .
Particle Filtering can give us an estimate of where the pendulum is and its velocity using all the observations up to that point in time. But now suppose we have observed the pendulum for a fixed period of time then at times earlier than the time at which we stop our observations we now have observations in the future as well as in the past. If we can somehow take account of these future observations then we should be able to improve our estimate of where the pendulum was at any given point in time (as well as its velocity). Forward Filtering / Backward Smoothing is a technique for doing this.
Haskell Preamble
> {# OPTIONS_GHC Wall #}
> {# OPTIONS_GHC fnowarnnameshadowing #}
> {# OPTIONS_GHC fnowarntypedefaults #}
> {# OPTIONS_GHC fnowarnunuseddobind #}
> {# OPTIONS_GHC fnowarnmissingmethods #}
> {# OPTIONS_GHC fnowarnorphans #}
> {# LANGUAGE MultiParamTypeClasses #}
> {# LANGUAGE TypeFamilies #}
> {# LANGUAGE ScopedTypeVariables #}
> {# LANGUAGE ExplicitForAll #}
> {# LANGUAGE DataKinds #}
> {# LANGUAGE FlexibleInstances #}
> {# LANGUAGE MultiParamTypeClasses #}
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE TypeFamilies #}
> {# LANGUAGE BangPatterns #}
> {# LANGUAGE GeneralizedNewtypeDeriving #}
> {# LANGUAGE TemplateHaskell #}
> {# LANGUAGE DataKinds #}
> {# LANGUAGE DeriveGeneric #}
> module PendulumSamples ( pendulumSamples
> , pendulumSamples'
> , testFiltering
> , testSmoothing
> , testFilteringG
> , testSmoothingG
> ) where
> import Data.Random hiding ( StdNormal, Normal )
> import Data.Random.Source.PureMT ( pureMT )
> import Control.Monad.State ( evalState, replicateM )
> import qualified Control.Monad.Loops as ML
> import Control.Monad.Writer ( tell, WriterT, lift,
> runWriterT
> )
> import Numeric.LinearAlgebra.Static
> ( R, vector, Sym,
> headTail, matrix, sym,
> diag
> )
> import GHC.TypeLits ( KnownNat )
> import MultivariateNormal ( MultivariateNormal(..) )
> import qualified Data.Vector as V
> import Data.Bits ( shiftR )
> import Data.List ( transpose )
> import Control.Parallel.Strategies
> import GHC.Generics (Generic)
Simulation
Let’s first plot some typical trajectories of the pendulum.
> deltaT, g :: Double
> deltaT = 0.01
> g = 9.81
> type PendulumState = R 2
> type PendulumObs = R 1
> pendulumSample :: MonadRandom m =>
> Sym 2 >
> Sym 1 >
> PendulumState >
> m (Maybe ((PendulumState, PendulumObs), PendulumState))
> pendulumSample bigQ bigR xPrev = do
> let x1Prev = fst $ headTail xPrev
> x2Prev = fst $ headTail $ snd $ headTail xPrev
> eta < sample $ rvar (MultivariateNormal 0.0 bigQ)
> let x1= x1Prev + x2Prev * deltaT
> x2 = x2Prev  g * (sin x1Prev) * deltaT
> xNew = vector [x1, x2] + eta
> x1New = fst $ headTail xNew
> epsilon < sample $ rvar (MultivariateNormal 0.0 bigR)
> let yNew = vector [sin x1New] + epsilon
> return $ Just ((xNew, yNew), xNew)
Let’s try plotting some samples when we are in the linear region with which we are familiar from school .
In this case we expect the horizontal displacement to be approximately equal to the angle of displacement and thus the observations to be symmetric about the actuals.
> bigQ :: Sym 2
> bigQ = sym $ matrix bigQl
> qc1 :: Double
> qc1 = 0.0001
> bigQl :: [Double]
> bigQl = [ qc1 * deltaT^3 / 3, qc1 * deltaT^2 / 2,
> qc1 * deltaT^2 / 2, qc1 * deltaT
> ]
> bigR :: Sym 1
> bigR = sym $ matrix [0.0001]
> m0 :: PendulumState
> m0 = vector [0.01, 0]
> pendulumSamples :: [(PendulumState, PendulumObs)]
> pendulumSamples = evalState (ML.unfoldrM (pendulumSample bigQ bigR) m0) (pureMT 17)
But if we work in a region in which linearity breaks down then the observations are no longer symmetrical about the actuals.
> bigQ' :: Sym 2
> bigQ' = sym $ matrix bigQl'
> qc1' :: Double
> qc1' = 0.01
> bigQl' :: [Double]
> bigQl' = [ qc1' * deltaT^3 / 3, qc1' * deltaT^2 / 2,
> qc1' * deltaT^2 / 2, qc1' * deltaT
> ]
> bigR' :: Sym 1
> bigR' = sym $ matrix [0.1]
> m0' :: PendulumState
> m0' = vector [1.6, 0]
> pendulumSamples' :: [(PendulumState, PendulumObs)]
> pendulumSamples' = evalState (ML.unfoldrM (pendulumSample bigQ' bigR') m0') (pureMT 17)
Filtering
We do not give the theory behind particle filtering. The interested reader can either consult Särkkä (2013) or wait for a future blog post on the subject.
> nParticles :: Int
> nParticles = 500
The usual Bayesian update step.
> type Particles a = V.Vector a
> oneFilteringStep ::
> MonadRandom m =>
> (Particles a > m (Particles a)) >
> (Particles a > Particles b) >
> (b > b > Double) >
> Particles a >
> b >
> WriterT [Particles a] m (Particles a)
> oneFilteringStep stateUpdate obsUpdate weight statePrevs obs = do
> tell [statePrevs]
> stateNews < lift $ stateUpdate statePrevs
> let obsNews = obsUpdate stateNews
> let weights = V.map (weight obs) obsNews
> cumSumWeights = V.tail $ V.scanl (+) 0 weights
> totWeight = V.last cumSumWeights
> vs < lift $ V.replicateM nParticles (sample $ uniform 0.0 totWeight)
> let js = indices cumSumWeights vs
> stateTildes = V.map (stateNews V.!) js
> return stateTildes
The system state and observable.
> data SystemState a = SystemState { x1 :: a, x2 :: a }
> deriving (Show, Generic)
> instance NFData a => NFData (SystemState a)
> newtype SystemObs a = SystemObs { y1 :: a }
> deriving Show
To make the system state update a bit more readable, let’s introduce some lifted arithmetic operators.
> (.+), (.*), (.) :: (Num a) => V.Vector a > V.Vector a > V.Vector a
> (.+) = V.zipWith (+)
> (.*) = V.zipWith (*)
> (.) = V.zipWith ()
The state update itself
> stateUpdate :: Particles (SystemState Double) >
> Particles (SystemState Double)
> stateUpdate xPrevs = V.zipWith SystemState x1s x2s
> where
> ix = V.length xPrevs
>
> x1Prevs = V.map x1 xPrevs
> x2Prevs = V.map x2 xPrevs
>
> deltaTs = V.replicate ix deltaT
> gs = V.replicate ix g
> x1s = x1Prevs .+ (x2Prevs .* deltaTs)
> x2s = x2Prevs . (gs .* (V.map sin x1Prevs) .* deltaTs)
and its noisy version.
> stateUpdateNoisy :: MonadRandom m =>
> Sym 2 >
> Particles (SystemState Double) >
> m (Particles (SystemState Double))
> stateUpdateNoisy bigQ xPrevs = do
> let xs = stateUpdate xPrevs
>
> x1s = V.map x1 xs
> x2s = V.map x2 xs
>
> let ix = V.length xPrevs
> etas < replicateM ix $ sample $ rvar (MultivariateNormal 0.0 bigQ)
>
> let eta1s, eta2s :: V.Vector Double
> eta1s = V.fromList $ map (fst . headTail) etas
> eta2s = V.fromList $ map (fst . headTail . snd . headTail) etas
>
> return (V.zipWith SystemState (x1s .+ eta1s) (x2s .+ eta2s))
The function which maps the state to the observable.
> obsUpdate :: Particles (SystemState Double) >
> Particles (SystemObs Double)
> obsUpdate xs = V.map (SystemObs . sin . x1) xs
And finally a function to calculate the weight of each particle given an observation.
> weight :: forall a n . KnownNat n =>
> (a > R n) >
> Sym n >
> a > a > Double
> weight f bigR obs obsNew = pdf (MultivariateNormal (f obsNew) bigR) (f obs)
The variance of the prior.
> bigP :: Sym 2
> bigP = sym $ diag 0.1
Generate our ensemble of particles chosen from the prior,
> initParticles :: MonadRandom m =>
> m (Particles (SystemState Double))
> initParticles = V.replicateM nParticles $ do
> r < sample $ rvar (MultivariateNormal m0' bigP)
> let x1 = fst $ headTail r
> x2 = fst $ headTail $ snd $ headTail r
> return $ SystemState { x1 = x1, x2 = x2}
run the particle filter,
> runFilter :: Int > [Particles (SystemState Double)]
> runFilter nTimeSteps = snd $ evalState action (pureMT 19)
> where
> action = runWriterT $ do
> xs < lift $ initParticles
> V.foldM
> (oneFilteringStep (stateUpdateNoisy bigQ') obsUpdate (weight f bigR'))
> xs
> (V.fromList $ map (SystemObs . fst . headTail . snd)
> (take nTimeSteps pendulumSamples'))
and extract the estimated position from the filter.
> testFiltering :: Int > [Double]
> testFiltering nTimeSteps = map ((/ (fromIntegral nParticles)). sum . V.map x1)
> (runFilter nTimeSteps)
Smoothing
If we could calculate the marginal smoothing distributions then we might be able to sample from them. Using the Markov assumption of our model that is independent of given , we have
We observe that this is a (continuous state space) Markov process with a nonhomogeneous transition function albeit one which goes backwards in time. Apparently for conditionally Gaussian linear statespace models, this is known as RTS, or RauchTungStriebel smoothing (Rauch, Striebel, and Tung (1965)).
According to Cappé (2008),

It appears to be efficient and stable in the long term (although no proof was available at the time the slides were presented).

It is not sequential (in particular, one needs to store all particle positions and weights).

It has numerical complexity proportional where is the number of particles.
We can use this to sample paths starting at time and working backwards. From above we have
where is some normalisation constant (Z for “Zustandssumme” – sum over states).
From particle filtering we know that
Thus
and we can sample from with probability
Recalling that we have resampled the particles in the filtering algorithm so that their weights are all and abstracting the state update and state density function, we can encode this update step in Haskell as
> oneSmoothingStep :: MonadRandom m =>
> (Particles a > V.Vector a) >
> (a > a > Double) >
> a >
> Particles a >
> WriterT (Particles a) m a
> oneSmoothingStep stateUpdate
> stateDensity
> smoothingSampleAtiPlus1
> filterSamplesAti = do it
> where
> it = do
> let mus = stateUpdate filterSamplesAti
> weights = V.map (stateDensity smoothingSampleAtiPlus1) mus
> cumSumWeights = V.tail $ V.scanl (+) 0 weights
> totWeight = V.last cumSumWeights
> v < lift $ sample $ uniform 0.0 totWeight
> let ix = binarySearch cumSumWeights v
> xnNew = filterSamplesAti V.! ix
> tell $ V.singleton xnNew
> return $ xnNew
To sample a complete path we start with a sample from the filtering distribution at at time (which is also the smoothing distribution)
> oneSmoothingPath :: MonadRandom m =>
> (Int > V.Vector (Particles a)) >
> (a > Particles a > WriterT (Particles a) m a) >
> Int > m (a, V.Vector a)
> oneSmoothingPath filterEstss oneSmoothingStep nTimeSteps = do
> let ys = filterEstss nTimeSteps
> ix < sample $ uniform 0 (nParticles  1)
> let xn = (V.head ys) V.! ix
> runWriterT $ V.foldM oneSmoothingStep xn (V.tail ys)
> oneSmoothingPath' :: (MonadRandom m, Show a) =>
> (Int > V.Vector (Particles a)) >
> (a > Particles a > WriterT (Particles a) m a) >
> Int > WriterT (Particles a) m a
> oneSmoothingPath' filterEstss oneSmoothingStep nTimeSteps = do
> let ys = filterEstss nTimeSteps
> ix < lift $ sample $ uniform 0 (nParticles  1)
> let xn = (V.head ys) V.! ix
> V.foldM oneSmoothingStep xn (V.tail ys)
Of course we need to run through the filtering distributions starting at
> filterEstss :: Int > V.Vector (Particles (SystemState Double))
> filterEstss n = V.reverse $ V.fromList $ runFilter n
> testSmoothing :: Int > Int > [Double]
> testSmoothing m n = V.toList $ evalState action (pureMT 23)
> where
> action = do
> xss < V.replicateM n $
> snd <$> (oneSmoothingPath filterEstss (oneSmoothingStep stateUpdate (weight h bigQ')) m)
> let yss = V.fromList $ map V.fromList $
> transpose $
> V.toList $ V.map (V.toList) $
> xss
> return $ V.map (/ (fromIntegral n)) $ V.map V.sum $ V.map (V.map x1) yss
By eye we can see we get a better fit
and calculating the mean square error for filtering gives against the mean square error for smoothing of ; this confirms the fit really is better as one would hope.
Unknown Gravity
Let us continue with the same example but now assume that is unknown and that we wish to estimate it. Let us also assume that our apparatus is not subject to noise.
Again we have
But now when we discretize it we include a third variable
where
Again we assume that we can only measure the horizontal position of the pendulum so that
where .
> type PendulumStateG = R 3
> pendulumSampleG :: MonadRandom m =>
> Sym 3 >
> Sym 1 >
> PendulumStateG >
> m (Maybe ((PendulumStateG, PendulumObs), PendulumStateG))
> pendulumSampleG bigQ bigR xPrev = do
> let x1Prev = fst $ headTail xPrev
> x2Prev = fst $ headTail $ snd $ headTail xPrev
> x3Prev = fst $ headTail $ snd $ headTail $ snd $ headTail xPrev
> eta < sample $ rvar (MultivariateNormal 0.0 bigQ)
> let x1= x1Prev + x2Prev * deltaT
> x2 = x2Prev  g * (sin x1Prev) * deltaT
> x3 = x3Prev
> xNew = vector [x1, x2, x3] + eta
> x1New = fst $ headTail xNew
> epsilon < sample $ rvar (MultivariateNormal 0.0 bigR)
> let yNew = vector [sin x1New] + epsilon
> return $ Just ((xNew, yNew), xNew)
> pendulumSampleGs :: [(PendulumStateG, PendulumObs)]
> pendulumSampleGs = evalState (ML.unfoldrM (pendulumSampleG bigQg bigRg) mG) (pureMT 29)
> data SystemStateG a = SystemStateG { gx1 :: a, gx2 :: a, gx3 :: a }
> deriving Show
The state update itself
> stateUpdateG :: Particles (SystemStateG Double) >
> Particles (SystemStateG Double)
> stateUpdateG xPrevs = V.zipWith3 SystemStateG x1s x2s x3s
> where
> ix = V.length xPrevs
>
> x1Prevs = V.map gx1 xPrevs
> x2Prevs = V.map gx2 xPrevs
> x3Prevs = V.map gx3 xPrevs
>
> deltaTs = V.replicate ix deltaT
> x1s = x1Prevs .+ (x2Prevs .* deltaTs)
> x2s = x2Prevs . (x3Prevs .* (V.map sin x1Prevs) .* deltaTs)
> x3s = x3Prevs
and its noisy version.
> stateUpdateNoisyG :: MonadRandom m =>
> Sym 3 >
> Particles (SystemStateG Double) >
> m (Particles (SystemStateG Double))
> stateUpdateNoisyG bigQ xPrevs = do
> let ix = V.length xPrevs
>
> let xs = stateUpdateG xPrevs
>
> x1s = V.map gx1 xs
> x2s = V.map gx2 xs
> x3s = V.map gx3 xs
>
> etas < replicateM ix $ sample $ rvar (MultivariateNormal 0.0 bigQ)
> let eta1s, eta2s, eta3s :: V.Vector Double
> eta1s = V.fromList $ map (fst . headTail) etas
> eta2s = V.fromList $ map (fst . headTail . snd . headTail) etas
> eta3s = V.fromList $ map (fst . headTail . snd . headTail . snd . headTail) etas
>
> return (V.zipWith3 SystemStateG (x1s .+ eta1s) (x2s .+ eta2s) (x3s .+ eta3s))
The function which maps the state to the observable.
> obsUpdateG :: Particles (SystemStateG Double) >
> Particles (SystemObs Double)
> obsUpdateG xs = V.map (SystemObs . sin . gx1) xs
The mean and variance of the prior.
> mG :: R 3
> mG = vector [1.6, 0.0, 8.00]
> bigPg :: Sym 3
> bigPg = sym $ matrix [
> 1e6, 0.0, 0.0
> , 0.0, 1e6, 0.0
> , 0.0, 0.0, 1e2
> ]
Parameters for the state update; note that the variance is not exactly the same as in the formulation above.
> bigQg :: Sym 3
> bigQg = sym $ matrix bigQgl
> qc1G :: Double
> qc1G = 0.0001
> sigmaG :: Double
> sigmaG = 1.0e2
> bigQgl :: [Double]
> bigQgl = [ qc1G * deltaT^3 / 3, qc1G * deltaT^2 / 2, 0.0,
> qc1G * deltaT^2 / 2, qc1G * deltaT, 0.0,
> 0.0, 0.0, sigmaG
> ]
The noise of the measurement.
> bigRg :: Sym 1
> bigRg = sym $ matrix [0.1]
Generate the ensemble of particles from the prior,
> initParticlesG :: MonadRandom m =>
> m (Particles (SystemStateG Double))
> initParticlesG = V.replicateM nParticles $ do
> r < sample $ rvar (MultivariateNormal mG bigPg)
> let x1 = fst $ headTail r
> x2 = fst $ headTail $ snd $ headTail r
> x3 = fst $ headTail $ snd $ headTail $ snd $ headTail r
> return $ SystemStateG { gx1 = x1, gx2 = x2, gx3 = x3}
run the particle filter,
> runFilterG :: Int > [Particles (SystemStateG Double)]
> runFilterG n = snd $ evalState action (pureMT 19)
> where
> action = runWriterT $ do
> xs < lift $ initParticlesG
> V.foldM
> (oneFilteringStep (stateUpdateNoisyG bigQg) obsUpdateG (weight f bigRg))
> xs
> (V.fromList $ map (SystemObs . fst . headTail . snd) (take n pendulumSampleGs))
and extract the estimated parameter from the filter.
> testFilteringG :: Int > [Double]
> testFilteringG n = map ((/ (fromIntegral nParticles)). sum . V.map gx3) (runFilterG n)
Again we need to run through the filtering distributions starting at
> filterGEstss :: Int > V.Vector (Particles (SystemStateG Double))
> filterGEstss n = V.reverse $ V.fromList $ runFilterG n
> testSmoothingG :: Int > Int > ([Double], [Double], [Double])
> testSmoothingG m n = (\(x, y, z) > (V.toList x, V.toList y, V.toList z)) $
> mkMeans $
> chunks
> where
>
> chunks =
> V.fromList $ map V.fromList $
> transpose $
> V.toList $ V.map (V.toList) $
> chunksOf m $
> snd $ evalState (runWriterT action) (pureMT 23)
>
> mkMeans yss = (
> V.map (/ (fromIntegral n)) $ V.map V.sum $ V.map (V.map gx1) yss,
> V.map (/ (fromIntegral n)) $ V.map V.sum $ V.map (V.map gx2) yss,
> V.map (/ (fromIntegral n)) $ V.map V.sum $ V.map (V.map gx3) yss
> )
>
> action =
> V.replicateM n $
> oneSmoothingPath' filterGEstss
> (oneSmoothingStep stateUpdateG (weight hG bigQg))
> m
Again by eye we get a better fit but note that we are using the samples in which the state update is noisy as well as the observation so we don’t expect to get a very good fit.
Notes
Helpers for Converting Types
> f :: SystemObs Double > R 1
> f = vector . pure . y1
> h :: SystemState Double > R 2
> h u = vector [x1 u , x2 u]
> hG :: SystemStateG Double > R 3
> hG u = vector [gx1 u , gx2 u, gx3 u]
Helpers for the Inverse CDF
That these are helpers for the inverse CDF is delayed to another blog post.
> indices :: V.Vector Double > V.Vector Double > V.Vector Int
> indices bs xs = V.map (binarySearch bs) xs
> binarySearch :: (Ord a) =>
> V.Vector a > a > Int
> binarySearch vec x = loop 0 (V.length vec  1)
> where
> loop !l !u
>  u <= l = l
>  otherwise = let e = vec V.! k in if x <= e then loop l k else loop (k+1) u
> where k = l + (u  l) `shiftR` 1
Vector Helpers
> chunksOf :: Int > V.Vector a > V.Vector (V.Vector a)
> chunksOf n xs = ys
> where
> l = V.length xs
> m = 1 + (l  1) `div` n
> ys = V.unfoldrN m (\us > Just (V.take n us, V.drop n us)) xs
Bibliography
Rauch, H. E., C. T. Striebel, and F. Tung. 1965. “Maximum Likelihood Estimates of Linear Dynamic Systems.” Journal of the American Institute of Aeronautics and Astronautics 3 (8): 1445–50.
Särkkä, Simo. 2013. Bayesian Filtering and Smoothing. New York, NY, USA: Cambridge University Press.