Post available from my new site. Sadly WordPress doesn’t allow me to render the html exported by a Jupyter notebook.
Cartography in Haskell
Introduction
This blog started off life as a blog post on how I use nix but somehow transformed itself into a “how I do data visualisation” blog post. The nix is still here though quietly doing its work in the background.
Suppose you want to analyze your local election results and visualize them using a choropleth but you’d like to use Haskell. You could try using the shapefile package but you will have to do a lot of the heavy lifting yourself (see here for an extended example). But you know folks using R can produce such things in a few lines of code and you also know from experience that unless you are very disciplined large R scripts can quickly become unmaintainable. Can you get the best of both worlds? It seems you can if you use inliner. But now you have a bit of configuration problem: how do you make sure that all the Haskell packages that you need and all the R packages that you need are available and how can you make sure that your results are reproducible? My answer to use nix.
Acknowledgements
I’d like to thank all my colleagues at Tweag who have had the patience to indulge my questions especially zimbatm and also all folks on the #nixos irc channel.
Downloading and Building
The sources are on github.
You’ll have to build my Include
pandoc filter somehow and then
pandoc s HowIUseNix.lhs filter=./Include t markdown+lhs > HowIUseNixExpanded.lhs
BlogLiterately HowIUseNixExpanded.lhs > HowIUseNixExpanded.html
Analysing the Data
Before looking at the .nix file I use, let’s actually analyse the data. First some preamble
> {# OPTIONS_GHC Wall #}
> {# OPTIONS_GHC fnowarnunuseddobind #}
> {# LANGUAGE TemplateHaskell #}
> {# LANGUAGE QuasiQuotes #}
> {# LANGUAGE DataKinds #}
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE TypeOperators #}
> {# LANGUAGE OverloadedStrings #}
> import Prelude as P
> import qualified Language.R as R
> import Language.R.QQ
> import System.Directory
> import System.FilePath
> import Control.Monad
> import Data.List (stripPrefix, nub)
> import qualified Data.Foldable as Foldable
> import Frames hiding ( (:&) )
> import Control.Lens
> import qualified Data.FuzzySet as Fuzzy
> import Data.Text (pack, unpack)
> import qualified Control.Foldl as Foldl
> import Pipes.Prelude (fold)
We can scrape the data for the Kington on Thames local elections and put it in a CSV file. Using the Haskell equivalent of data frames / pandas, Frames, we can “read” in the data. I say “read” in quotes because if this were a big data set (it’s not) then it might not fit in memory and we’d have to process it as a stream. I’ve included the sheet in the github repo so you can try reproducing my results.
> tableTypes "ElectionResults" "data/Election Results 2018 v1.csv"
> loadElectionResults :: IO (Frame ElectionResults)
> loadElectionResults = inCoreAoS (readTable "data/Election Results 2018 v1.csv")
We could analyse the data set in order to find the ward (aka electoral district) names but I happen to know them.
> wardNames :: [Text]
> wardNames = [ "Berrylands"
> , "Norbiton"
> , "Old Malden"
> , "Coombe Hill"
> , "Beverley"
> , "Chessington North & Hook"
> , "Tolworth and Hook Rise"
> , "Coombe Vale"
> , "St James"
> , "Tudor"
> , "Alexandra"
> , "Canbury"
> , "Surbiton Hill"
> , "Chessington South"
> , "Grove"
> , "St Marks"
> ]
We need a query to get both the total number of votes cast in the ward and the number of votes cast for what is now the party which controls the borough. We can create two filters to run in parallel, summing as we go.
> getPartyWard :: Text > Text > Foldl.Fold ElectionResults (Int, Int)
> getPartyWard p w =
> (,) <$> (Foldl.handles (filtered (\x > x ^. ward == w)) .
> Foldl.handles votes) Foldl.sum
> <*> (Foldl.handles (filtered (\x > x ^. ward == w &&
> x ^. party == p)) .
> Foldl.handles votes) Foldl.sum
We can combine all the filters and sums into one big query where they all run in parallel.
> getVotes :: Foldl.Fold ElectionResults [(Int, Int)]
> getVotes = sequenceA $ map (getPartyWard "Liberal Democrat") wardNames
I downloaded the geographical data from here but I imagine there are many other sources.
First we read all the shapefiles in the directory we downloaded.
> root :: FilePath
> root = "data/KingstonuponThames"
> main :: IO ()
> main = do
> ns < runSafeT $ Foldl.purely fold getVotes (readTable "data/Election Results 2018 v1.csv")
> let ps :: [Double]
> ps = map (\(x, y) > 100.0 * fromIntegral y / fromIntegral x) ns
> let percentLD = zip (map unpack wardNames) ps
>
> ds < getDirectoryContents root
> let es = filter (\x > takeExtension x == ".shp") ds
> let fs = case xs of
> Nothing > error "Ward prefix does not match"
> Just ys > ys
> where
> xs = sequence $
> map (fmap dropExtension) $
> map (stripPrefix "KingstonuponThames_") es
Sadly, when we get the ward names from the file names they don’t match the ward names in the electoral data sheet. Fortunately, fuzzy matching resolves this problem.
> rs < loadElectionResults
> let wardSet :: Fuzzy.FuzzySet
> wardSet = Fuzzy.fromList . nub . Foldable.toList . fmap (^. ward) $ rs
>
> let gs = case sequence $ map (Fuzzy.getOne wardSet) $ map pack fs of
> Nothing > error "Ward name and ward filename do not match"
> Just xs > xs
Rather than making the colour of each ward vary continuously depending on the percentage share of the vote, let’s create 4 buckets to make it easier to see at a glance how the election results played out.
> let bucket :: Double > String
> bucket x  x < 40 = "whitesmoke"
>  x < 50 = "yellow1"
>  x < 60 = "yellow2"
>  otherwise = "yellow3"
>
> let cs :: [String]
> cs = case sequence $ map (\w > lookup w percentLD) (map unpack gs) of
> Nothing > error "Wrong name in table"
> Just xs > map bucket xs
Now we move into the world of R, courtesy of inliner. First we load up the required libraries. Judging by urls like this, it is not always easy to install the R binding to GDAL (Geospatial Data Abstraction Library). But with nix, we don’t even have to use install.packages(...)
.
> R.runRegion $ do
> [r library(rgdal) ]
> [r library(ggplot2) ]
> [r library(dplyr) ]
Next we create an empty map (I have no idea what this would look like if it were plotted).
> map0 < [r ggplot() ]
And now we can fold over the shape data for each ward and its colour to build the basic map we want.
> mapN < foldM
> (\m (f, c) > do
> let wward = root </> f
> shapefileN < [r readOGR(wward_hs) ]
> shapefileN_df < [r fortify(shapefileN_hs) ]
> withColours < [r mutate(shapefileN_df_hs, colour=rep(c_hs, nrow(shapefileN_df_hs))) ]
> [r m_hs +
> geom_polygon(data = withColours_hs,
> aes(x = long, y = lat, group = group, fill=colour),
> color = 'gray', size = .2) ]) map0 (zip es cs)
Now we can annotate the map with all the details needed to make it intelligible.
> map_projected < [r mapN_hs + coord_map() +
> ggtitle("Kingston on Thames 2018 Local Election\nLib Dem Percentage Vote by Ward") +
> theme(legend.position="bottom", plot.title = element_text(lineheight=.8, face="bold", hjust=0.5)) +
> scale_fill_manual(values=c("whitesmoke", "yellow1", "yellow2", "yellow3"),
> name="Vote (%)",
> labels=c("0%40% Lib Dem", "40%50% Lib Dem", "50%60% Lib Dem", "60%100% Lib Dem")) ]
And finally save it in a file.
> [r jpeg(filename="diagrams/kingston.jpg") ]
> [r print(map_projected_hs) ]
> [r dev.off() ]
> return ()
How I Really Use nix
I haven’t found a way of interspersing text with nix code so here are some comments.

I use neither all the Haskell packages listed nor all the R packages. I tend to put a shell.nix file in the directory where I am working and then
nixshell shell.nix I nixpkgs=~/nixpkgs
as at the moment I seem to need packages which were recently added to nixpkgs. 
I couldn’t get the tests for
inliner
to run on MACos so I addeddontCheck
. 
Some Haskell packages in nixpkgs have their bounds set too strictly so I have used
doJailbreak
.
{ pkgs ? import {}, compiler ? "ghc822", doBenchmark ? false }:
let
f = { mkDerivation, haskell, base, foldl, Frames, fuzzyset
, inliner, integration, lens
, pandoctypes, plots
, diagramsrasterific
, diagrams
, diagramssvg
, diagramscontrib
, R, random, stdenv
, templatehaskell, temporary }:
mkDerivation {
pname = "mrp";
version = "1.0.0";
src = ./.;
isLibrary = false;
isExecutable = true;
executableHaskellDepends = [
base
diagrams
diagramsrasterific
diagramssvg
diagramscontrib
(haskell.lib.dontCheck inliner)
foldl
Frames
fuzzyset
integration
lens
pandoctypes
plots
random
templatehaskell
temporary
];
executableSystemDepends = [
R
pkgs.rPackages.dplyr
pkgs.rPackages.ggmap
pkgs.rPackages.ggplot2
pkgs.rPackages.knitr
pkgs.rPackages.maptools
pkgs.rPackages.reshape2
pkgs.rPackages.rgeos
pkgs.rPackages.rgdal
pkgs.rPackages.rstan
pkgs.rPackages.zoo];
license = stdenv.lib.licenses.bsd3;
};
haskellPackages = if compiler == "default"
then pkgs.haskellPackages
else pkgs.haskell.packages.${compiler};
myHaskellPackages = haskellPackages.override {
overrides = self: super: with pkgs.haskell.lib; {
diagramsrasterific = doJailbreak super.diagramsrasterific;
diagramssvg = doJailbreak super.diagramssvg;
diagramscontrib = doJailbreak super.diagramscontrib;
diagrams = doJailbreak super.diagrams;
inliner = dontCheck super.inliner;
pipesgroup = doJailbreak super.pipesgroup;
};
};
variant = if doBenchmark then pkgs.haskell.lib.doBenchmark else pkgs.lib.id;
drv = variant (myHaskellPackages.callPackage f {});
in
if pkgs.lib.inNixShell then drv.env else drv
Workshop on Numerical Programming in Functional Languages (NPFL)
I’m the chair this year for the first(!) ACM SIGPLAN International
Workshop on Numerical Programming in Functional Languages (NPFL),
which will be colocated with ICFP this September in St. Louis,
Missouri, USA.
Please consider submitting something! All you have to do is submit
between half a page and a page describing your talk. There will be no
proceedings so all you need do is prepare your slides or demo. We do
plan to video the talks for the benefit of the wider world.
We have Eva Darulova giving the keynote.
The submission deadline is Friday 13th July but I and the Committee
would love to see your proposals earlier.
See here for more information. Please contact me if you have any
questions!
Implicit Runge Kutta and GNU Scientific Library
Introduction
These are some very hasty notes on RungeKutta methods and IRK2 in particular. I make no apologies for missing lots of details. I may try and put these in a more digestible form but not today.
Some Uncomprehensive Theory
In general, an implicit RungeKutta method is given by
where
and
Traditionally this is written as a Butcher tableau:
and even more succintly as
For a GaußLegendre method we set the values of the to the zeros of the shifted Legendre polynomials.
An explicit expression for the shifted Legendre polynomials is given by
The first few shifted Legendre polynomials are:
Setting
then the colocation method gives
For we have and thus and the Butcher tableau is
that is, the implicit RK2 method aka the midpoint method.
For we have and thus and and the Butcher tableau is
that is, the implicit RK4 method.
Implicit RK2
Explicitly
where
Substituting in the values from the tableau, we have
where
and further inlining and substitution gives
which can be recognised as the midpoint method.
Implementation
A common package for solving ODEs is gsl which Haskell interfaces via the hmatrixgsl package. Some of gsl is coded with the conditional compilation flag DEBUG e.g. in msbdf.c but sadly not in the simpler methods maybe because they were made part of the package some years earlier. We can add our own of course; here’s a link for reference.
Let’s see how the Implicit RungeKutta Order 2 method does on the following system taken from the gsl documenation
which can be rewritten as
but replacing gsl_odeiv2_step_rk8pd with gsl_odeiv2_step_rk2imp.
Here’s the first few steps
rk2imp_apply: t=0.00000e+00, h=1.00000e06, y:1.00000e+00 0.00000e+00
 evaluate jacobian
( 2, 2)[0,0]: 0
( 2, 2)[0,1]: 1
( 2, 2)[1,0]: 1
( 2, 2)[1,1]: 0
YZ:1.00000e+00 5.00000e07
 error estimates: 0.00000e+00 8.35739e20
rk2imp_apply: t=1.00000e06, h=5.00000e06, y:1.00000e+00 1.00000e06
 evaluate jacobian
( 2, 2)[0,0]: 0
( 2, 2)[0,1]: 1
( 2, 2)[1,0]: 0.99997999999999998
( 2, 2)[1,1]: 1.00008890058234101e11
YZ:1.00000e+00 3.50000e06
 error estimates: 1.48030e16 1.04162e17
rk2imp_apply: t=6.00000e06, h=2.50000e05, y:1.00000e+00 6.00000e06
 evaluate jacobian
( 2, 2)[0,0]: 0
( 2, 2)[0,1]: 1
( 2, 2)[1,0]: 0.999880000000002878
( 2, 2)[1,1]: 3.59998697518904009e10
YZ:1.00000e+00 1.85000e05
 error estimates: 1.48030e16 1.30208e15
rk2imp_apply: t=3.10000e05, h=1.25000e04, y:1.00000e+00 3.10000e05
 evaluate jacobian
( 2, 2)[0,0]: 0
( 2, 2)[0,1]: 1
( 2, 2)[1,0]: 0.999380000000403723
( 2, 2)[1,1]: 9.6099972424212865e09
YZ:1.00000e+00 9.35000e05
 error estimates: 0.00000e+00 1.62760e13
rk2imp_apply: t=1.56000e04, h=6.25000e04, y:1.00000e+00 1.56000e04
 evaluate jacobian
( 2, 2)[0,0]: 0
( 2, 2)[0,1]: 1
( 2, 2)[1,0]: 0.996880000051409643
( 2, 2)[1,1]: 2.4335999548874554e07
YZ:1.00000e+00 4.68500e04
 error estimates: 1.55431e14 2.03450e11
Let’s see if we can reproduce this in a fast and loose way in Haskell
To make our code easier to write and read let’s lift some arithmetic operators to act on lists (we should really use the Naperian package).
> module RK2Imp where
> instance Num a => Num [a] where
> (+) = zipWith (+)
> (*) = zipWith (*)
The actual algorithm is almost trivial
> rk2Step :: Fractional a => a > a > [a] > [a] > [a]
> rk2Step h t y_n0 y_n1 = y_n0 + (repeat h) * f (t + 0.5 * h) ((repeat 0.5) * (y_n0 + y_n1))
The example Van der Pol oscillator with the same parameter and initial conditions as in the gsl example.
> f :: Fractional b => a > [b] > [b]
> f t [y0, y1] = [y1, y0  mu * y1 * (y0 * y0  1)]
> where
> mu = 10.0
> y_init :: [Double]
> y_init = [1.0, 0.0]
We can use corecursion to find the fixed point of the RungeKutta equation arbitrarily choosing the 10th iteration!
> y_1s, y_2s, y_3s :: [[Double]]
> y_1s = y_init : map (rk2Step 1.0e6 0.0 y_init) y_1s
> nC :: Int
> nC = 10
> y_1, y_2, y_3 :: [Double]
> y_1 = y_1s !! nC
This gives
ghci> y_1
[0.9999999999995,9.999999999997499e7]
which is not too far from the value given by gsl.
y:1.00000e+00 1.00000e06
Getting the next time and step size from the gsl DEBUG information we can continue
> y_2s = y_1 : map (rk2Step 5.0e6 1.0e6 y_1) y_2s
>
> y_2 = y_2s !! nC
ghci> y_2
[0.999999999982,5.999999999953503e6]
gsl gives
y:1.00000e+00 6.00000e06
> y_3s = y_2 : map (rk2Step 2.5e5 6.0e6 y_2) y_3s
> y_3 = y_3s !! nC
One more time
ghci> y_3
[0.9999999995194999,3.099999999372456e5]
And gsl gives
y:1.00000e+00 3.10000e05
Nix
The nixshell which to run this is here and the C code can be built using something like
gcc ode.c lgsl lgslcblas o ode
(in the nix shell).
The Haskell code can be used inside a ghci session.
Reproducibility and Old Faithful
Introduction
For the blog post still being written on variatonal methods, I referred to the still excellent Bishop (2006) who uses as his example data, the data available in R for the geyser in Yellowstone National Park called “Old Faithful”.
While explaining this to another statistician, they started to ask about the dataset. Since I couldn’t immediately answer their questions (when was it collected? over how long? how was it collected?), I started to look at the dataset more closely. The more I dug into where the data came from, the less certain I became that the dataset actually reflected how Old Faithful actually behaves.
Here’s what I found. On the way, I use Haskell, R embedded in Haskell, data frames in Haskell and nix.
The Investigation
First some imports and language extensions.
> {# OPTIONS_GHC Wall #}
> {# OPTIONS_GHC fnowarnunusedtopbinds #}
>
> {# LANGUAGE QuasiQuotes #}
> {# LANGUAGE ScopedTypeVariables #}
> {# LANGUAGE DataKinds #}
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE TemplateHaskell #}
> {# LANGUAGE TypeOperators #}
> {# LANGUAGE TypeSynonymInstances #}
> {# LANGUAGE FlexibleInstances #}
> {# LANGUAGE QuasiQuotes #}
> {# LANGUAGE UndecidableInstances #}
> {# LANGUAGE ExplicitForAll #}
> {# LANGUAGE ScopedTypeVariables #}
> {# LANGUAGE OverloadedStrings #}
> {# LANGUAGE GADTs #}
> {# LANGUAGE TypeApplications #}
>
>
> module Main (main) where
>
> import Prelude as P
>
> import Control.Lens
>
> import Data.Foldable
> import Frames hiding ( (:&) )
>
> import Data.Vinyl (Rec(..))
>
> import qualified Language.R as R
> import Language.R (R)
> import Language.R.QQ
> import Data.Int
>
> import Plots as P
> import qualified Diagrams.Prelude as D
> import Diagrams.Backend.Rasterific
>
> import Data.Time.Clock(UTCTime, NominalDiffTime, diffUTCTime)
> import Data.Time.Clock.POSIX(posixSecondsToUTCTime)
>
> import Data.Attoparsec.Text
> import Control.Applicative
R provides many datasets presumably for ease of use. We can access the “Old Faithful” data set in Haskell using some very simple inline R.
> eRs :: R s [Double]
> eRs = R.dynSEXP <$> [r faithful$eruptions ]
>
> wRs :: R s [Int32]
> wRs = R.dynSEXP <$> [r faithful$waiting ]
And then plot the dataset using
> kSaxis :: [(Double, Double)] > P.Axis B D.V2 Double
> kSaxis xs = P.r2Axis &~ do
> P.scatterPlot' xs
>
> plotOF :: IO ()
> plotOF = do
> es' < R.runRegion $ do eRs
> ws' < R.runRegion $ do wRs
> renderRasterific "diagrams/Test.png"
> (D.dims2D 500.0 500.0)
> (renderAxis $ kSaxis $ P.zip es' (P.map fromIntegral ws'))
In the documentation for this dataset, the source is given as Härdle (2012). In here the data are given as a table in Table 3 on page 201. It contains 270 observations not 272 as the R dataset does! Note that there seems to be a correlation between intereruption time and duration of eruption and eruptions seem to fall into one of two groups.
The documentation also says “See Also geyser in package MASS for the Azzalini–Bowman version”. So let’s look at this and plot it.
> eAltRs :: R s [Double]
> eAltRs = R.dynSEXP <$> [r geyser$duration ]
>
> wAltRs :: R s [Int32]
> wAltRs = R.dynSEXP <$> [r geyser$waiting ]
>
> plotOF' :: IO ()
> plotOF' = do
> es < R.runRegion $ do _ < [r library(MASS) ]
> eAltRs
> ws < R.runRegion $ do wAltRs
> renderRasterific "diagrams/TestR.png"
> (D.dims2D 500.0 500.0)
> (renderAxis $ kSaxis $ P.zip es (P.map fromIntegral ws))
A quick look at the first few entries suggests this is the data in Table 1 of Azzalini and Bowman (1990) but NB I didn’t check each item.
It looks quite different to the faithful dataset. But to answer my fellow statistician’s questions: “This analysis deals with data which were collected continuously from August 1st until August 15th, 1985”.
It turns out there is a website which collects data for a large number of geysers and has historical datasets for them including old faithful. The data is in a slightly awkward format and it is not clear (to me at any rate) what some of the fields mean.
Let’s examine the first 1000 observations in the data set using the Haskell package Frames.
First we ask Frames to automatically type the data and read it into a frame.
> tableTypes "OldFaithFul" "/Users/dom/Dropbox/Tidy/NumMethHaskell/variational/Old_Faithful_eruptions_1000.csv"
>
> loadOldFaithful :: IO (Frame OldFaithFul)
> loadOldFaithful = inCoreAoS (readTable "/Users/dom/Dropbox/Tidy/NumMethHaskell/variational/Old_Faithful_eruptions_1000.csv")
Declare some columns and helper functions to manipulate the times.
> declareColumn "eruptionTimeNew" ''UTCTime
> declareColumn "eruptionTimeOld" ''UTCTime
> declareColumn "eruptionTimeDiff" ''NominalDiffTime
> declareColumn "eruptionTimeRat" ''Rational
> declareColumn "eruptionTimeDoub" ''Double
> declareColumn "parsedDuration" ''Double
>
> epochToUTC :: Integral a => a > UTCTime
> epochToUTC = posixSecondsToUTCTime . fromIntegral
>
> getEruptionTimeNew :: OldFaithFul > Record '[EruptionTimeNew]
> getEruptionTimeNew x = pure (Col ((epochToUTC (x ^. eruptionTimeEpoch)))) :& Nil
>
> getEruptionTimeOld :: OldFaithFul > Record '[EruptionTimeOld]
> getEruptionTimeOld x = pure (Col ((epochToUTC (x ^. eruptionTimeEpoch)))) :& Nil
>
> getEruptionTimeDiff :: Record '[EruptionTimeOld, EruptionTimeNew] > Record '[EruptionTimeDoub]
> getEruptionTimeDiff x = pure (Col ((/60.0) $ fromRational $ toRational $ diffUTCTime (x ^. eruptionTimeNew) (x ^. eruptionTimeOld))) :& Nil
The duration is expressed as a string mainly as e.g. “3.5m” and “3.5min” but occasionally in other formats. Using NaNs to represent data we can’t parse is poor practice but Frames seems to object to types such as Maybe Double.
> getDuration :: OldFaithFul > Record '["duration" :> Text]
> getDuration = (rcast @'[Duration])
>
> getParsedDuration :: Record '[Duration, EruptionTimeDoub] > Record '[ParsedDuration]
> getParsedDuration x = pure (Col ((f $ (parseOnly parseDuration) (x ^. duration)))) :& Nil
> where
> f (Left _) = 0.0 / 0.0
> f (Right y) = y
>
> parseDuration :: Parser Double
> parseDuration =
> do d < double
> _ < char 'm'
> return d
> <>
> do d < double
> _ < string "min"
> return d
To get the times between eruptions we need to zip the frame with its tail so we define this helper function.
> dropFrame :: Int > Frame r > Frame r
> dropFrame n s = Frame ((frameLength s)  1) (\i > frameRow s (i + n))
> main :: IO ()
> main = do
> x < loadOldFaithful
Get the current eruption times and the previous eruption times and put them in a frame.
> let tn = dropFrame 1 $ fmap (\b > (getEruptionTimeNew b)) x
> let tp = fmap (\b > (getEruptionTimeOld b)) x
> let tpns = zipFrames tp tn
Get the durations of the eruptions and exclude any gaps when no observations were taken; arbitrarily if the gap is greate than 1.5 hours.
> let ds = fmap getDuration x
> let tds = zipFrames ds (fmap getEruptionTimeDiff tpns)
> let b = filterFrame (\u > (u ^. eruptionTimeDoub) <= 150) tds
Parse some of the durations and put the durations and the intereruption gap into a single frame.
> let c = filterFrame (\u > not $ isNaN $ u ^. parsedDuration) $
> fmap getParsedDuration b
> let d = zipFrames b c
> let g = fmap (\u > (u ^. parsedDuration, u ^. eruptionTimeDoub)) d
Finally we can yet another data set of observations of old faithful.
> renderRasterific "diagrams/TestH1000.png"
> (D.dims2D 500.0 500.0)
> (renderAxis $ kSaxis $ toList g)
> R.withEmbeddedR R.defaultConfig $ do
> plotOF
> plotOF'
> putStrLn "Finished"
To me this doesn’t look like either of the other two datasets. Perhaps the R faithful data set should be validated, possibly retired and maybe replaced by something with firmer foundations?
Nix
I use nix on MACos for package management both for Haskell and R – no more install.packages problems. Here’s my shell.nix file also available here: here.
{ nixpkgs ? import {}, compiler ? "ghc822", doBenchmark ? false }:
let
inherit (nixpkgs) pkgs;
f = { mkDerivation, array, base, bytestring, cassava, containers
, datasets, diagramslib, diagramsrasterific, foldl, Frames, ghcprim, hmatrix, hmatrixgsl
, inliner, lens, mtl, pipes, plots, randomfu, R, randomsource
, stdenv, templatehaskell, text, typelitswitnesses, vector, vinyl }:
mkDerivation {
pname = "variational";
version = "0.1.0.0";
src = ./.;
isLibrary = false;
isExecutable = true;
executableHaskellDepends = [
array
base
bytestring
cassava
containers
datasets
diagramslib
diagramsrasterific
foldl
Frames
ghcprim
hmatrix
inliner
lens
mtl
pipes
plots
randomfu
randomsource
templatehaskell
text
typelitswitnesses
vector
vinyl
];
executableSystemDepends = [
R
pkgs.rPackages.anytime
pkgs.rPackages.ggplot2
pkgs.rPackages.maptools
pkgs.rPackages.reshape2
pkgs.rPackages.rgeos
pkgs.rPackages.rgdal
pkgs.rPackages.rstan ];
license = stdenv.lib.licenses.bsd3;
};
haskellPackages = if compiler == "default"
then pkgs.haskellPackages
else pkgs.haskell.packages.${compiler};
variant = if doBenchmark then pkgs.haskell.lib.doBenchmark else pkgs.lib.id;
drv = variant (haskellPackages.callPackage f {});
in
if pkgs.lib.inNixShell then drv.env else drv
References
Azzalini, A, and Adrian Bowman. 1990. “Bowman, A.w.: a Look at Some Data on the Old Faithful Geyser. Appl. Stat. 39, 357365” 39 (January).
Bishop, C.M. 2006. Pattern Recognition and Machine Learning. Information Science and Statistics. Springer. https://books.google.co.uk/books?id=kTNoQgAACAAJ.
Härdle, W. 2012. Smoothing Techniques: With Implementation in S. Springer Series in Statistics. Springer New York. https://books.google.co.uk/books?id=sijUBwAAQBAJ.
Haskell for Numerics?
Introduction
Summary
Back in January, a colleague pointed out to me that GHC did not produce very efficient code for performing floating point abs. I have yet to produce a writeup of my notes about hacking on GHC: in summary it wasn’t as difficult as I had feared and the #ghc folks were extremely helpful.
But maybe getting GHC to produce high performance numerical code is “swimming uphill”. Below is a comparison of a “state of the art” numerical language, Julia, and an alternative Haskell approach, using a Haskell domainspecific embedded language (DSEL), accelerate. The problem is a real one albeit quite simple from a programming point of view and should therefore be taken with some grains of salt.
A bit more background
The reason for improving GHC’s (numerical) performance is that I’d like to have a fast way of performing statistical inference either via Hamiltonian Monte Carlo or Sequential Monte Carlo. Stan is my “go to” tool for offline analysis but its inference engine is fixed and according to Fast Forward Labs: “it’s very awkward to productize”. What one would really like is the ability to choose and compose inference methods like can be done in monadbayes. Although I haven’t compared monadbayes and Stan in very much depth, the latter seems faster in my very limited experience.
The view of one of the folks that’s worked hard on making Haskell a great tool for doing numerical applications is that trying to get GHC to produce llvm on which tools like polly can work is “swimming up hill”.
A lot of people who work on numerical problems use matlab but it seems like if you want to put the code into production then you have to rewrite it. Julia is an attempt to improve this situation and also provide higher performance.
A sort of manifesto
To summarise: what we’d like is typesafe blazingly fast numerical code. Typesafe can give us e.g. static guarantees that matrix multiplication is between compatible matrices and assurances that the units are correct. Here’s an example of static typing helping ensure the correctness of Kalman filter usage and here’s a package that I have used successfully on a mediumsized project to ensure all units are correct.
Symplectic Integrators
Let’s take a simple solver, encode it in Haskell and Julia and see how well we can do.
> {# OPTIONS_GHC Wall #}
> {# OPTIONS_GHC fnowarntypedefaults #}
>
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE TypeOperators #}
> module Symplectic (
> runSteps
> , runSteps'
> , reallyRunSteps'
> , inits
> , h
> , bigH1
> , hs
> , nabla1
> , nabla2
> , runStepsH98
> , bigH2BodyH98
> ) where
>
> import Data.Number.Symbolic
> import Numeric.AD
> import Prelude as P
> import Data.Array.Accelerate as A hiding ((^))
> import Data.Array.Accelerate.LLVM.Native as CPU
> import Data.Array.Accelerate.Linear hiding (trace)
> import Data.Array.Accelerate.Control.Lens
> import qualified Linear as L
The StörmerVerlet scheme is an implicit symplectic method of order 2. Being symplectic is important as it preserves the energy of the systems being solved. If, for example, we were to use RK4 to simulate planetary motion then the planets would either crash into the sun or leave the solar system: not a very accurate representation of reality.
Let’s assume that the Hamiltonian is of the form then this becomes an explicit scheme.
Simple Pendulum
Consider the following Hamiltonian for the pendulum:
> bigH1 :: P.Floating a => a > a > a
> bigH1 q p = p^2 / 2  cos q
> ham1 :: P.Floating a => [a] > a
> ham1 [qq1, pp1] = 0.5 * pp1^2  cos qq1
> ham1 _ = error "Hamiltonian defined for 2 generalized coordinates only"
Although it is trivial to find the derivatives in this case, let us check using automatic symbolic differentiation
> q1, q2, p1, p2 :: Sym a
> q1 = var "q1"
> q2 = var "q2"
> p1 = var "p1"
> p2 = var "p2"
> nabla1 :: [[Sym Double]]
> nabla1 = jacobian ((\x > [x]) . ham1) [q1, p1]
ghci> nabla1
[[sin q1,0.5*p1+0.5*p1]]
which after a bit of simplification gives us and or
> nablaQ, nablaP :: P.Floating a => a > a
> nablaQ = sin
> nablaP = id
One step of the StörmerVerlet
> oneStep :: P.Floating a => (a > a) >
> (a > a) >
> a >
> (a, a) >
> (a, a)
> oneStep nabalQQ nablaPP hh (qPrev, pPrev) = (qNew, pNew)
> where
> h2 = hh / 2
> pp2 = pPrev  h2 * nabalQQ qPrev
> qNew = qPrev + hh * nablaPP pp2
> pNew = pp2  h2 * nabalQQ qNew
> h :: Double
> h = 0.01
> hs :: (Double, Double) > [(Double, Double)]
> hs = P.iterate (oneStep nablaQ nablaP h)
We can check that the energy is conserved directly
ghci> P.map (P.uncurry bigH1) $ P.take 5 $ hs (pi/4, 0.0)
[0.7071067811865476,0.7071067816284917,0.7071067829543561,0.7071067851642338,0.7071067882582796]
ghci> P.map (P.uncurry bigH1) $ P.take 5 $ P.drop 1000 $ hs (pi/4, 0.0)
[0.7071069557227465,0.7071069737863691,0.7071069927439544,0.7071070125961187,0.7071070333434369]
Two body problem
Newton’s equations of motions for the two body problem are
And we can rewrite those to use Hamilton’s equations with this Hamiltonian
The advantage of using this small example is that we know the exact solution should we need it.
> ham2 :: P.Floating a => [a] > a
> ham2 [qq1, qq2, pp1, pp2] = 0.5 * (pp1^2 + pp2^2)  recip (sqrt (qq1^2 + qq2^2))
> ham2 _ = error "Hamiltonian defined for 4 generalized coordinates only"
Again we can calculate the derivatives
> nabla2 :: [[Sym Double]]
> nabla2 = jacobian ((\x > [x]) . ham2) [q1, q2, p1, p2]
ghci> (P.mapM_ . P.mapM_) putStrLn . P.map (P.map show) $ nabla2
q1/(2.0*sqrt (q1*q1+q2*q2))/sqrt (q1*q1+q2*q2)/sqrt (q1*q1+q2*q2)+q1/(2.0*sqrt (q1*q1+q2*q2))/sqrt (q1*q1+q2*q2)/sqrt (q1*q1+q2*q2)
q2/(2.0*sqrt (q1*q1+q2*q2))/sqrt (q1*q1+q2*q2)/sqrt (q1*q1+q2*q2)+q2/(2.0*sqrt (q1*q1+q2*q2))/sqrt (q1*q1+q2*q2)/sqrt (q1*q1+q2*q2)
0.5*p1+0.5*p1
0.5*p2+0.5*p2
which after some simplification becomes
Here’s one step of StörmerVerlet using Haskell 98.
> oneStepH98 :: Double > V2 (V2 Double) > V2 (V2 Double)
> oneStepH98 hh prev = V2 qNew pNew
> where
> h2 = hh / 2
> hhs = V2 hh hh
> hh2s = V2 h2 h2
> pp2 = psPrev  hh2s * nablaQ' qsPrev
> qNew = qsPrev + hhs * nablaP' pp2
> pNew = pp2  hh2s * nablaQ' qNew
> qsPrev = prev ^. L._x
> psPrev = prev ^. L._y
> nablaQ' qs = V2 (qq1 / r) (qq2 / r)
> where
> qq1 = qs ^. L._x
> qq2 = qs ^. L._y
> r = (qq1 ^ 2 + qq2 ^ 2) ** (3/2)
> nablaP' ps = ps
And here is the same thing using accelerate.
> oneStep2 :: Double > Exp (V2 (V2 Double)) > Exp (V2 (V2 Double))
> oneStep2 hh prev = lift $ V2 qNew pNew
> where
> h2 = hh / 2
> hhs = lift $ V2 hh hh
> hh2s = lift $ V2 h2 h2
> pp2 = psPrev  hh2s * nablaQ' qsPrev
> qNew = qsPrev + hhs * nablaP' pp2
> pNew = pp2  hh2s * nablaQ' qNew
> qsPrev :: Exp (V2 Double)
> qsPrev = prev ^. _x
> psPrev = prev ^. _y
> nablaQ' :: Exp (V2 Double) > Exp (V2 Double)
> nablaQ' qs = lift (V2 (qq1 / r) (qq2 / r))
> where
> qq1 = qs ^. _x
> qq2 = qs ^. _y
> r = (qq1 ^ 2 + qq2 ^ 2) ** (3/2)
> nablaP' :: Exp (V2 Double) > Exp (V2 Double)
> nablaP' ps = ps
With initial values below the solution is an ellipse with eccentricity .
> e, q10, q20, p10, p20 :: Double
> e = 0.6
> q10 = 1  e
> q20 = 0.0
> p10 = 0.0
> p20 = sqrt ((1 + e) / (1  e))
> initsH98 :: V2 (V2 Double)
> initsH98 = V2 (V2 q10 q20) (V2 p10 p20)
>
> inits :: Exp (V2 (V2 Double))
> inits = lift initsH98
We can either keep all the steps of the simulation using accelerate and the CPU
> nSteps :: Int
> nSteps = 10000
>
> dummyInputs :: Acc (Array DIM1 (V2 (V2 Double)))
> dummyInputs = A.use $ A.fromList (Z :. nSteps) $
> P.replicate nSteps (V2 (V2 0.0 0.0) (V2 0.0 0.0))
> runSteps :: Array DIM1 (V2 (V2 Double))
> runSteps = CPU.run $ A.scanl (\s _x > (oneStep2 h s)) inits dummyInputs
Or we can do the same in plain Haskell
> dummyInputsH98 :: [V2 (V2 Double)]
> dummyInputsH98 = P.replicate nSteps (V2 (V2 0.0 0.0) (V2 0.0 0.0))
> runStepsH98 :: [V2 (V2 Double)]
> runStepsH98= P.scanl (\s _x > (oneStepH98 h s)) initsH98 dummyInputsH98
The fact that the phase diagram for the two objects is periodic is encouraging.
And again we can check directly that the energy is conserved.
> bigH2BodyH98 :: V2 (V2 Double) > Double
> bigH2BodyH98 z = ke + pe
> where
> q = z ^. L._x
> p = z ^. L._y
> pe = let V2 q1' q2' = q in negate $ recip (sqrt (q1'^2 + q2'^2))
> ke = let V2 p1' p2' = p in 0.5 * (p1'^2 + p2'^2)
ghci> P.maximum $ P.map bigH2BodyH98 $ P.take 100 $ P.drop 1000 runStepsH98
0.49964056333352014
ghci> P.minimum $ P.map bigH2BodyH98 $ P.take 100 $ P.drop 1000 runStepsH98
0.49964155784917147
We’d like to measure performance and running the above for many steps might use up all available memory. Let’s confine ourselves to looking at the final result.
> runSteps' :: Int > Exp (V2 (V2 Double)) > Exp (V2 (V2 Double))
> runSteps' nSteps = A.iterate (lift nSteps) (oneStep2 h)
> reallyRunSteps' :: Int > Array DIM1 (V2 (V2 Double))
> reallyRunSteps' nSteps = CPU.run $
> A.scanl (\s _x > runSteps' nSteps s) inits
> (A.use $ A.fromList (Z :. 1) [V2 (V2 0.0 0.0) (V2 0.0 0.0)])
Performance
Accelerate’s LLVM
Let’s see what accelerate generates with
{# OPTIONS_GHC Wall #}
{# OPTIONS_GHC fnowarnnameshadowing #}
{# OPTIONS_GHC fnowarntypedefaults #}
{# OPTIONS_GHC fnowarnunuseddobind #}
{# OPTIONS_GHC fnowarnmissingmethods #}
{# OPTIONS_GHC fnowarnorphans #}
import Symplectic
main :: IO ()
main = do
putStrLn $ show $ reallyRunSteps' 100000000
It’s a bit verbose but we can look at the key “loop”: while5.
[ 0.023] llvm: generated 102 instructions in 10 blocks
[ 0.023] llvm: generated 127 instructions in 15 blocks
[ 0.023] llvm: generated 86 instructions in 7 blocks
[ 0.024] llvm: generated 84 instructions in 7 blocks
[ 0.024] llvm: generated 18 instructions in 4 blocks
[ 0.071] llvm: optimisation did work? True
[ 0.072] ; ModuleID = 'scanS'
target datalayout = "em:oi64:64f80:128n8:16:32:64S128"
target triple = "x86_64appledarwin15.5.0"
@.memset_pattern = private unnamed_addr constant [2 x double] [double 2.000000e+00, double 2.000000e+00], align 16
@.memset_pattern.1 = private unnamed_addr constant [2 x double] [double 4.000000e01, double 4.000000e01], align 16
; Function Attrs: nounwind
define void @scanS(i64 %ix.start, i64 %ix.end, double* noalias nocapture %out.ad3, double* noalias nocapture %out.ad2, double* noalias nocapture %out.ad1, double* noalias nocapture %out.ad0, i64 %out.sh0, double* noalias nocapture readnone %fv0.ad3, double* noalias nocapture readnone %fv0.ad2, double* noalias nocapture readnone %fv0.ad1, double* noalias nocapture readnone %fv0.ad0, i64 %fv0.sh0) local_unnamed_addr #0 {
entry:
%0 = add i64 %fv0.sh0, 1
%1 = icmp slt i64 %ix.start, %ix.end
br i1 %1, label %while1.top.preheader, label %while1.exit
while1.top.preheader: ; preds = %entry
%2 = add i64 %ix.start, 1
%3 = mul i64 %2, %fv0.sh0
br label %while1.top
while1.top: ; preds = %while3.exit, %while1.top.preheader
%indvars.iv = phi i64 [ %3, %while1.top.preheader ], [ %indvars.iv.next, %while3.exit ]
%4 = phi i64 [ %ix.start, %while1.top.preheader ], [ %52, %while3.exit ]
%5 = mul i64 %4, %fv0.sh0
%6 = mul i64 %4, %0
%7 = getelementptr double, double* %out.ad0, i64 %6
store double 2.000000e+00, double* %7, align 8
%8 = getelementptr double, double* %out.ad1, i64 %6
store double 0.000000e+00, double* %8, align 8
%9 = getelementptr double, double* %out.ad2, i64 %6
store double 0.000000e+00, double* %9, align 8
%10 = getelementptr double, double* %out.ad3, i64 %6
store double 4.000000e01, double* %10, align 8
%11 = add i64 %5, %fv0.sh0
%12 = icmp slt i64 %5, %11
br i1 %12, label %while3.top.preheader, label %while3.exit
while3.top.preheader: ; preds = %while1.top
br label %while3.top
while3.top: ; preds = %while3.top.preheader, %while5.exit
%.in = phi i64 [ %44, %while5.exit ], [ %6, %while3.top.preheader ]
%13 = phi i64 [ %51, %while5.exit ], [ %5, %while3.top.preheader ]
%14 = phi <2 x double> [ %43, %while5.exit ], [ <double 2.000000e+00, double 0.000000e+00>, %while3.top.preheader ]
%15 = phi <2 x double> [ %32, %while5.exit ], [ <double 0.000000e+00, double 4.000000e01>, %while3.top.preheader ]
br label %while5.top
while5.top: ; preds = %while5.top, %while3.top
%16 = phi i64 [ 0, %while3.top ], [ %19, %while5.top ]
%17 = phi <2 x double> [ %14, %while3.top ], [ %43, %while5.top ]
%18 = phi <2 x double> [ %15, %while3.top ], [ %32, %while5.top ]
%19 = add nuw nsw i64 %16, 1
%20 = extractelement <2 x double> %18, i32 1
%21 = fmul fast double %20, %20
%22 = extractelement <2 x double> %18, i32 0
%23 = fmul fast double %22, %22
%24 = fadd fast double %21, %23
%25 = tail call double @llvm.pow.f64(double %24, double 1.500000e+00) #2
%26 = insertelement <2 x double> undef, double %25, i32 0
%27 = shufflevector <2 x double> %26, <2 x double> undef, <2 x i32> zeroinitializer
%28 = fdiv fast <2 x double> %18, %27
%29 = fmul fast <2 x double> %28, <double 5.000000e03, double 5.000000e03>
%30 = fsub fast <2 x double> %17, %29
%31 = fmul fast <2 x double> %30, <double 1.000000e02, double 1.000000e02>
%32 = fadd fast <2 x double> %31, %18
%33 = extractelement <2 x double> %32, i32 1
%34 = fmul fast double %33, %33
%35 = extractelement <2 x double> %32, i32 0
%36 = fmul fast double %35, %35
%37 = fadd fast double %34, %36
%38 = tail call double @llvm.pow.f64(double %37, double 1.500000e+00) #2
%39 = insertelement <2 x double> undef, double %38, i32 0
%40 = shufflevector <2 x double> %39, <2 x double> undef, <2 x i32> zeroinitializer
%41 = fdiv fast <2 x double> %32, %40
%42 = fmul fast <2 x double> %41, <double 5.000000e03, double 5.000000e03>
%43 = fsub fast <2 x double> %30, %42
%exitcond = icmp eq i64 %19, 100000000
br i1 %exitcond, label %while5.exit, label %while5.top
while5.exit: ; preds = %while5.top
%44 = add i64 %.in, 1
%45 = getelementptr double, double* %out.ad0, i64 %44
%46 = extractelement <2 x double> %43, i32 0
store double %46, double* %45, align 8
%47 = getelementptr double, double* %out.ad1, i64 %44
%48 = extractelement <2 x double> %43, i32 1
store double %48, double* %47, align 8
%49 = getelementptr double, double* %out.ad2, i64 %44
store double %35, double* %49, align 8
%50 = getelementptr double, double* %out.ad3, i64 %44
store double %33, double* %50, align 8
%51 = add i64 %13, 1
%exitcond7 = icmp eq i64 %51, %indvars.iv
br i1 %exitcond7, label %while3.exit.loopexit, label %while3.top
while3.exit.loopexit: ; preds = %while5.exit
br label %while3.exit
while3.exit: ; preds = %while3.exit.loopexit, %while1.top
%52 = add nsw i64 %4, 1
%indvars.iv.next = add i64 %indvars.iv, %fv0.sh0
%exitcond8 = icmp eq i64 %52, %ix.end
br i1 %exitcond8, label %while1.exit.loopexit, label %while1.top
while1.exit.loopexit: ; preds = %while3.exit
br label %while1.exit
while1.exit: ; preds = %while1.exit.loopexit, %entry
ret void
}
; Function Attrs: nounwind
define void @scanP1(i64 %ix.start, i64 %ix.end, i64 %ix.stride, i64 %ix.steps, double* noalias nocapture %out.ad3, double* noalias nocapture %out.ad2, double* noalias nocapture %out.ad1, double* noalias nocapture %out.ad0, i64 %out.sh0, double* noalias nocapture %tmp.ad3, double* noalias nocapture %tmp.ad2, double* noalias nocapture %tmp.ad1, double* noalias nocapture %tmp.ad0, i64 %tmp.sh0, double* noalias nocapture readonly %fv0.ad3, double* noalias nocapture readonly %fv0.ad2, double* noalias nocapture readonly %fv0.ad1, double* noalias nocapture readonly %fv0.ad0, i64 %fv0.sh0) local_unnamed_addr #0 {
entry:
%0 = mul i64 %ix.stride, %ix.start
%1 = add i64 %0, %ix.stride
%2 = icmp sle i64 %1, %fv0.sh0
%3 = select i1 %2, i64 %1, i64 %fv0.sh0
%4 = icmp eq i64 %ix.start, 0
%5 = add i64 %0, 1
%6 = select i1 %4, i64 %0, i64 %5
br i1 %4, label %if5.exit, label %if5.else
if5.else: ; preds = %entry
%7 = getelementptr double, double* %fv0.ad3, i64 %0
%8 = load double, double* %7, align 8
%9 = getelementptr double, double* %fv0.ad2, i64 %0
%10 = load double, double* %9, align 8
%11 = getelementptr double, double* %fv0.ad1, i64 %0
%12 = load double, double* %11, align 8
%13 = getelementptr double, double* %fv0.ad0, i64 %0
%14 = load double, double* %13, align 8
%15 = insertelement <2 x double> undef, double %14, i32 0
%16 = insertelement <2 x double> %15, double %12, i32 1
%17 = insertelement <2 x double> undef, double %10, i32 0
%18 = insertelement <2 x double> %17, double %8, i32 1
br label %if5.exit
if5.exit: ; preds = %entry, %if5.else
%19 = phi i64 [ %5, %if5.else ], [ %0, %entry ]
%20 = phi <2 x double> [ %16, %if5.else ], [ <double 2.000000e+00, double 0.000000e+00>, %entry ]
%21 = phi <2 x double> [ %18, %if5.else ], [ <double 0.000000e+00, double 4.000000e01>, %entry ]
%22 = getelementptr double, double* %out.ad0, i64 %6
%23 = extractelement <2 x double> %20, i32 0
store double %23, double* %22, align 8
%24 = getelementptr double, double* %out.ad1, i64 %6
%25 = extractelement <2 x double> %20, i32 1
store double %25, double* %24, align 8
%26 = getelementptr double, double* %out.ad2, i64 %6
%27 = extractelement <2 x double> %21, i32 0
store double %27, double* %26, align 8
%28 = getelementptr double, double* %out.ad3, i64 %6
%29 = extractelement <2 x double> %21, i32 1
store double %29, double* %28, align 8
%30 = icmp slt i64 %19, %3
br i1 %30, label %while9.top.preheader, label %while9.exit
while9.top.preheader: ; preds = %if5.exit
br label %while9.top
while9.top: ; preds = %while9.top.preheader, %while11.exit
%.in = phi i64 [ %62, %while11.exit ], [ %6, %while9.top.preheader ]
%31 = phi i64 [ %69, %while11.exit ], [ %19, %while9.top.preheader ]
%32 = phi <2 x double> [ %61, %while11.exit ], [ %20, %while9.top.preheader ]
%33 = phi <2 x double> [ %50, %while11.exit ], [ %21, %while9.top.preheader ]
br label %while11.top
while11.top: ; preds = %while11.top, %while9.top
%34 = phi i64 [ 0, %while9.top ], [ %37, %while11.top ]
%35 = phi <2 x double> [ %32, %while9.top ], [ %61, %while11.top ]
%36 = phi <2 x double> [ %33, %while9.top ], [ %50, %while11.top ]
%37 = add nuw nsw i64 %34, 1
%38 = extractelement <2 x double> %36, i32 1
%39 = fmul fast double %38, %38
%40 = extractelement <2 x double> %36, i32 0
%41 = fmul fast double %40, %40
%42 = fadd fast double %39, %41
%43 = tail call double @llvm.pow.f64(double %42, double 1.500000e+00) #2
%44 = insertelement <2 x double> undef, double %43, i32 0
%45 = shufflevector <2 x double> %44, <2 x double> undef, <2 x i32> zeroinitializer
%46 = fdiv fast <2 x double> %36, %45
%47 = fmul fast <2 x double> %46, <double 5.000000e03, double 5.000000e03>
%48 = fsub fast <2 x double> %35, %47
%49 = fmul fast <2 x double> %48, <double 1.000000e02, double 1.000000e02>
%50 = fadd fast <2 x double> %49, %36
%51 = extractelement <2 x double> %50, i32 1
%52 = fmul fast double %51, %51
%53 = extractelement <2 x double> %50, i32 0
%54 = fmul fast double %53, %53
%55 = fadd fast double %52, %54
%56 = tail call double @llvm.pow.f64(double %55, double 1.500000e+00) #2
%57 = insertelement <2 x double> undef, double %56, i32 0
%58 = shufflevector <2 x double> %57, <2 x double> undef, <2 x i32> zeroinitializer
%59 = fdiv fast <2 x double> %50, %58
%60 = fmul fast <2 x double> %59, <double 5.000000e03, double 5.000000e03>
%61 = fsub fast <2 x double> %48, %60
%exitcond = icmp eq i64 %37, 100000000
br i1 %exitcond, label %while11.exit, label %while11.top
while11.exit: ; preds = %while11.top
%62 = add i64 %.in, 1
%63 = getelementptr double, double* %out.ad0, i64 %62
%64 = extractelement <2 x double> %61, i32 0
store double %64, double* %63, align 8
%65 = getelementptr double, double* %out.ad1, i64 %62
%66 = extractelement <2 x double> %61, i32 1
store double %66, double* %65, align 8
%67 = getelementptr double, double* %out.ad2, i64 %62
store double %53, double* %67, align 8
%68 = getelementptr double, double* %out.ad3, i64 %62
store double %51, double* %68, align 8
%69 = add nsw i64 %31, 1
%70 = icmp slt i64 %69, %3
br i1 %70, label %while9.top, label %while9.exit.loopexit
while9.exit.loopexit: ; preds = %while11.exit
br label %while9.exit
while9.exit: ; preds = %while9.exit.loopexit, %if5.exit
%71 = phi <2 x double> [ %20, %if5.exit ], [ %61, %while9.exit.loopexit ]
%72 = phi <2 x double> [ %21, %if5.exit ], [ %50, %while9.exit.loopexit ]
%73 = getelementptr double, double* %tmp.ad0, i64 %ix.start
%74 = extractelement <2 x double> %71, i32 0
store double %74, double* %73, align 8
%75 = getelementptr double, double* %tmp.ad1, i64 %ix.start
%76 = extractelement <2 x double> %71, i32 1
store double %76, double* %75, align 8
%77 = getelementptr double, double* %tmp.ad2, i64 %ix.start
%78 = extractelement <2 x double> %72, i32 0
store double %78, double* %77, align 8
%79 = getelementptr double, double* %tmp.ad3, i64 %ix.start
%80 = extractelement <2 x double> %72, i32 1
store double %80, double* %79, align 8
ret void
}
; Function Attrs: nounwind
define void @scanP2(i64 %ix.start, i64 %ix.end, double* noalias nocapture %tmp.ad3, double* noalias nocapture %tmp.ad2, double* noalias nocapture %tmp.ad1, double* noalias nocapture %tmp.ad0, i64 %tmp.sh0, double* noalias nocapture readnone %fv0.ad3, double* noalias nocapture readnone %fv0.ad2, double* noalias nocapture readnone %fv0.ad1, double* noalias nocapture readnone %fv0.ad0, i64 %fv0.sh0) local_unnamed_addr #0 {
entry:
%0 = add i64 %ix.start, 1
%1 = icmp slt i64 %0, %ix.end
br i1 %1, label %while1.top.preheader, label %while1.exit
while1.top.preheader: ; preds = %entry
%2 = getelementptr double, double* %tmp.ad3, i64 %ix.start
%3 = load double, double* %2, align 8
%4 = getelementptr double, double* %tmp.ad2, i64 %ix.start
%5 = load double, double* %4, align 8
%6 = getelementptr double, double* %tmp.ad1, i64 %ix.start
%7 = load double, double* %6, align 8
%8 = getelementptr double, double* %tmp.ad0, i64 %ix.start
%9 = load double, double* %8, align 8
%10 = insertelement <2 x double> undef, double %5, i32 0
%11 = insertelement <2 x double> %10, double %3, i32 1
%12 = insertelement <2 x double> undef, double %9, i32 0
%13 = insertelement <2 x double> %12, double %7, i32 1
br label %while1.top
while1.top: ; preds = %while3.exit, %while1.top.preheader
%14 = phi i64 [ %45, %while3.exit ], [ %0, %while1.top.preheader ]
%15 = phi <2 x double> [ %44, %while3.exit ], [ %13, %while1.top.preheader ]
%16 = phi <2 x double> [ %33, %while3.exit ], [ %11, %while1.top.preheader ]
br label %while3.top
while3.top: ; preds = %while3.top, %while1.top
%17 = phi i64 [ 0, %while1.top ], [ %20, %while3.top ]
%18 = phi <2 x double> [ %15, %while1.top ], [ %44, %while3.top ]
%19 = phi <2 x double> [ %16, %while1.top ], [ %33, %while3.top ]
%20 = add nuw nsw i64 %17, 1
%21 = extractelement <2 x double> %19, i32 1
%22 = fmul fast double %21, %21
%23 = extractelement <2 x double> %19, i32 0
%24 = fmul fast double %23, %23
%25 = fadd fast double %22, %24
%26 = tail call double @llvm.pow.f64(double %25, double 1.500000e+00) #2
%27 = insertelement <2 x double> undef, double %26, i32 0
%28 = shufflevector <2 x double> %27, <2 x double> undef, <2 x i32> zeroinitializer
%29 = fdiv fast <2 x double> %19, %28
%30 = fmul fast <2 x double> %29, <double 5.000000e03, double 5.000000e03>
%31 = fsub fast <2 x double> %18, %30
%32 = fmul fast <2 x double> %31, <double 1.000000e02, double 1.000000e02>
%33 = fadd fast <2 x double> %32, %19
%34 = extractelement <2 x double> %33, i32 1
%35 = fmul fast double %34, %34
%36 = extractelement <2 x double> %33, i32 0
%37 = fmul fast double %36, %36
%38 = fadd fast double %35, %37
%39 = tail call double @llvm.pow.f64(double %38, double 1.500000e+00) #2
%40 = insertelement <2 x double> undef, double %39, i32 0
%41 = shufflevector <2 x double> %40, <2 x double> undef, <2 x i32> zeroinitializer
%42 = fdiv fast <2 x double> %33, %41
%43 = fmul fast <2 x double> %42, <double 5.000000e03, double 5.000000e03>
%44 = fsub fast <2 x double> %31, %43
%exitcond = icmp eq i64 %20, 100000000
br i1 %exitcond, label %while3.exit, label %while3.top
while3.exit: ; preds = %while3.top
%45 = add i64 %14, 1
%46 = getelementptr double, double* %tmp.ad0, i64 %14
%47 = extractelement <2 x double> %44, i32 0
store double %47, double* %46, align 8
%48 = getelementptr double, double* %tmp.ad1, i64 %14
%49 = extractelement <2 x double> %44, i32 1
store double %49, double* %48, align 8
%50 = getelementptr double, double* %tmp.ad2, i64 %14
store double %36, double* %50, align 8
%51 = getelementptr double, double* %tmp.ad3, i64 %14
store double %34, double* %51, align 8
%exitcond7 = icmp eq i64 %45, %ix.end
br i1 %exitcond7, label %while1.exit.loopexit, label %while1.top
while1.exit.loopexit: ; preds = %while3.exit
br label %while1.exit
while1.exit: ; preds = %while1.exit.loopexit, %entry
ret void
}
; Function Attrs: nounwind
define void @scanP3(i64 %ix.start, i64 %ix.end, i64 %ix.stride, double* noalias nocapture %out.ad3, double* noalias nocapture %out.ad2, double* noalias nocapture %out.ad1, double* noalias nocapture %out.ad0, i64 %out.sh0, double* noalias nocapture readonly %tmp.ad3, double* noalias nocapture readonly %tmp.ad2, double* noalias nocapture readonly %tmp.ad1, double* noalias nocapture readonly %tmp.ad0, i64 %tmp.sh0, double* noalias nocapture readnone %fv0.ad3, double* noalias nocapture readnone %fv0.ad2, double* noalias nocapture readnone %fv0.ad1, double* noalias nocapture readnone %fv0.ad0, i64 %fv0.sh0) local_unnamed_addr #0 {
entry:
%0 = add i64 %ix.start, 1
%1 = mul i64 %0, %ix.stride
%2 = add i64 %1, %ix.stride
%3 = icmp sle i64 %2, %out.sh0
%4 = select i1 %3, i64 %2, i64 %out.sh0
%5 = add i64 %1, 1
%6 = add i64 %4, 1
%7 = icmp slt i64 %5, %6
br i1 %7, label %while1.top.preheader, label %while1.exit
while1.top.preheader: ; preds = %entry
%8 = getelementptr double, double* %tmp.ad0, i64 %ix.start
%9 = load double, double* %8, align 8
%10 = getelementptr double, double* %tmp.ad1, i64 %ix.start
%11 = load double, double* %10, align 8
%12 = getelementptr double, double* %tmp.ad2, i64 %ix.start
%13 = load double, double* %12, align 8
%14 = getelementptr double, double* %tmp.ad3, i64 %ix.start
%15 = load double, double* %14, align 8
%16 = insertelement <2 x double> undef, double %9, i32 0
%17 = insertelement <2 x double> %16, double %11, i32 1
%18 = insertelement <2 x double> undef, double %13, i32 0
%19 = insertelement <2 x double> %18, double %15, i32 1
br label %while1.top
while1.top: ; preds = %while1.top.preheader, %while3.exit
%20 = phi i64 [ %55, %while3.exit ], [ %5, %while1.top.preheader ]
br label %while3.top
while3.top: ; preds = %while3.top, %while1.top
%21 = phi i64 [ 0, %while1.top ], [ %24, %while3.top ]
%22 = phi <2 x double> [ %17, %while1.top ], [ %48, %while3.top ]
%23 = phi <2 x double> [ %19, %while1.top ], [ %37, %while3.top ]
%24 = add nuw nsw i64 %21, 1
%25 = extractelement <2 x double> %23, i32 1
%26 = fmul fast double %25, %25
%27 = extractelement <2 x double> %23, i32 0
%28 = fmul fast double %27, %27
%29 = fadd fast double %26, %28
%30 = tail call double @llvm.pow.f64(double %29, double 1.500000e+00) #2
%31 = insertelement <2 x double> undef, double %30, i32 0
%32 = shufflevector <2 x double> %31, <2 x double> undef, <2 x i32> zeroinitializer
%33 = fdiv fast <2 x double> %23, %32
%34 = fmul fast <2 x double> %33, <double 5.000000e03, double 5.000000e03>
%35 = fsub fast <2 x double> %22, %34
%36 = fmul fast <2 x double> %35, <double 1.000000e02, double 1.000000e02>
%37 = fadd fast <2 x double> %36, %23
%38 = extractelement <2 x double> %37, i32 1
%39 = fmul fast double %38, %38
%40 = extractelement <2 x double> %37, i32 0
%41 = fmul fast double %40, %40
%42 = fadd fast double %39, %41
%43 = tail call double @llvm.pow.f64(double %42, double 1.500000e+00) #2
%44 = insertelement <2 x double> undef, double %43, i32 0
%45 = shufflevector <2 x double> %44, <2 x double> undef, <2 x i32> zeroinitializer
%46 = fdiv fast <2 x double> %37, %45
%47 = fmul fast <2 x double> %46, <double 5.000000e03, double 5.000000e03>
%48 = fsub fast <2 x double> %35, %47
%exitcond = icmp eq i64 %24, 100000000
br i1 %exitcond, label %while3.exit, label %while3.top
while3.exit: ; preds = %while3.top
%49 = getelementptr double, double* %out.ad0, i64 %20
%50 = extractelement <2 x double> %48, i32 0
store double %50, double* %49, align 8
%51 = getelementptr double, double* %out.ad1, i64 %20
%52 = extractelement <2 x double> %48, i32 1
store double %52, double* %51, align 8
%53 = getelementptr double, double* %out.ad2, i64 %20
store double %40, double* %53, align 8
%54 = getelementptr double, double* %out.ad3, i64 %20
store double %38, double* %54, align 8
%55 = add i64 %20, 1
%56 = icmp slt i64 %55, %6
br i1 %56, label %while1.top, label %while1.exit.loopexit
while1.exit.loopexit: ; preds = %while3.exit
br label %while1.exit
while1.exit: ; preds = %while1.exit.loopexit, %entry
ret void
}
; Function Attrs: norecurse nounwind
define void @generate(i64 %ix.start, i64 %ix.end, double* noalias nocapture %out.ad3, double* noalias nocapture %out.ad2, double* noalias nocapture %out.ad1, double* noalias nocapture %out.ad0, i64 %out.sh0, double* noalias nocapture readnone %fv0.ad3, double* noalias nocapture readnone %fv0.ad2, double* noalias nocapture readnone %fv0.ad1, double* noalias nocapture readnone %fv0.ad0, i64 %fv0.sh0) local_unnamed_addr #1 {
entry:
%0 = icmp sgt i64 %ix.end, %ix.start
br i1 %0, label %while1.top.preheader, label %while1.exit
while1.top.preheader: ; preds = %entry
%scevgep = getelementptr double, double* %out.ad1, i64 %ix.start
%scevgep1 = bitcast double* %scevgep to i8*
%1 = sub i64 %ix.end, %ix.start
%2 = shl i64 %1, 3
call void @llvm.memset.p0i8.i64(i8* %scevgep1, i8 0, i64 %2, i32 8, i1 false)
%scevgep2 = getelementptr double, double* %out.ad2, i64 %ix.start
%scevgep23 = bitcast double* %scevgep2 to i8*
call void @llvm.memset.p0i8.i64(i8* %scevgep23, i8 0, i64 %2, i32 8, i1 false)
%scevgep4 = getelementptr double, double* %out.ad0, i64 %ix.start
%scevgep45 = bitcast double* %scevgep4 to i8*
call void @memset_pattern16(i8* %scevgep45, i8* bitcast ([2 x double]* @.memset_pattern to i8*), i64 %2) #0
%scevgep6 = getelementptr double, double* %out.ad3, i64 %ix.start
%scevgep67 = bitcast double* %scevgep6 to i8*
call void @memset_pattern16(i8* %scevgep67, i8* bitcast ([2 x double]* @.memset_pattern.1 to i8*), i64 %2) #0
br label %while1.exit
while1.exit: ; preds = %while1.top.preheader, %entry
ret void
}
; Function Attrs: nounwind readonly
declare double @llvm.pow.f64(double, double) #2
; Function Attrs: argmemonly nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i32, i1) #3
; Function Attrs: argmemonly
declare void @memset_pattern16(i8* nocapture, i8* nocapture readonly, i64) #4
attributes #0 = { nounwind }
attributes #1 = { norecurse nounwind }
attributes #2 = { nounwind readonly }
attributes #3 = { argmemonly nounwind }
attributes #4 = { argmemonly }
And then we can run it for steps and see how long it takes.
9,031,008 bytes allocated in the heap
1,101,152 bytes copied during GC
156,640 bytes maximum residency (2 sample(s))
28,088 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 16 colls, 0 par 0.002s 0.006s 0.0004s 0.0041s
Gen 1 2 colls, 0 par 0.002s 0.099s 0.0496s 0.0990s
INIT time 0.000s ( 0.002s elapsed)
MUT time 10.195s ( 10.501s elapsed)
GC time 0.004s ( 0.105s elapsed)
EXIT time 0.000s ( 0.000s elapsed)
Total time 10.206s ( 10.608s elapsed)
%GC time 0.0% (1.0% elapsed)
Alloc rate 885,824 bytes per MUT second
Productivity 100.0% of total user, 99.0% of total elapsed
Julia’s LLVM
Let’s try the same problem in Julia.
using StaticArrays
e = 0.6
q10 = 1  e
q20 = 0.0
p10 = 0.0
p20 = sqrt((1 + e) / (1  e))
h = 0.01
x1 = SVector{2,Float64}(q10, q20)
x2 = SVector{2,Float64}(p10, p20)
x3 = SVector{2,SVector{2,Float64}}(x1,x2)
@inline function oneStep(h, prev)
h2 = h / 2
@inbounds qsPrev = prev[1]
@inbounds psPrev = prev[2]
@inline function nablaQQ(qs)
@inbounds q1 = qs[1]
@inbounds q2 = qs[2]
r = (q1^2 + q2^2) ^ (3/2)
return SVector{2,Float64}(q1 / r, q2 / r)
end
@inline function nablaPP(ps)
return ps
end
p2 = psPrev  h2 * nablaQQ(qsPrev)
qNew = qsPrev + h * nablaPP(p2)
pNew = p2  h2 * nablaQQ(qNew)
return SVector{2,SVector{2,Float64}}(qNew, pNew)
end
function manyStepsFinal(n,h,prev)
for i in 1:n
prev = oneStep(h,prev)
end
return prev
end
final = manyStepsFinal(100000000,h,x3)
print(final)
Again we can see how long it takes
22.20 real 22.18 user 0.32 sys
Surprisingly it takes longer but I am Julia novice so it could be some rookie error. Two things occur to me:

Let’s look at the llvm and see if we can we can find an explanation.

Let’s analyse execution time versus number of steps to see what the code generation cost is and the execution cost. It may be that Julia takes longer to generate code but has better execution times.
define void @julia_oneStep_71735(%SVector.65* noalias sret, double, %SVector.65*) #0 {
top:
%3 = fmul double %1, 5.000000e01
%4 = getelementptr inbounds %SVector.65, %SVector.65* %2, i64 0, i32 0, i64 0, i32 0, i64 0
%5 = load double, double* %4, align 1
%6 = getelementptr inbounds %SVector.65, %SVector.65* %2, i64 0, i32 0, i64 0, i32 0, i64 1
%7 = load double, double* %6, align 1
%8 = getelementptr inbounds %SVector.65, %SVector.65* %2, i64 0, i32 0, i64 1, i32 0, i64 0
%9 = load double, double* %8, align 1
%10 = getelementptr inbounds %SVector.65, %SVector.65* %2, i64 0, i32 0, i64 1, i32 0, i64 1
%11 = load double, double* %10, align 1
%12 = fmul double %5, %5
%13 = fmul double %7, %7
%14 = fadd double %12, %13
%15 = call double @"julia_^_71741"(double %14, double 1.500000e+00) #0
%16 = fdiv double %5, %15
%17 = fdiv double %7, %15
%18 = fmul double %3, %16
%19 = fmul double %3, %17
%20 = fsub double %9, %18
%21 = fsub double %11, %19
%22 = fmul double %20, %1
%23 = fmul double %21, %1
%24 = fadd double %5, %22
%25 = fadd double %7, %23
%26 = fmul double %24, %24
%27 = fmul double %25, %25
%28 = fadd double %26, %27
%29 = call double @"julia_^_71741"(double %28, double 1.500000e+00) #0
%30 = fdiv double %24, %29
%31 = fdiv double %25, %29
%32 = fmul double %3, %30
%33 = fmul double %3, %31
%34 = fsub double %20, %32
%35 = fsub double %21, %33
%36 = getelementptr inbounds %SVector.65, %SVector.65* %0, i64 0, i32 0, i64 0, i32 0, i64 0
store double %24, double* %36, align 8
%37 = getelementptr inbounds %SVector.65, %SVector.65* %0, i64 0, i32 0, i64 0, i32 0, i64 1
store double %25, double* %37, align 8
%38 = getelementptr inbounds %SVector.65, %SVector.65* %0, i64 0, i32 0, i64 1, i32 0, i64 0
store double %34, double* %38, align 8
%39 = getelementptr inbounds %SVector.65, %SVector.65* %0, i64 0, i32 0, i64 1, i32 0, i64 1
store double %35, double* %39, align 8
ret void
}
nothing
We can see two things:

Julia doesn’t use SIMD by default. We can change this by using
O3
. In the event (I don’t reproduce it here), this makes very little difference to performance. 
Julia generates
%15 = call double @"julia_^_71741"(double %14, double 1.500000e+00) #0
whereas GHC generates
%26 = tail call double @llvm.pow.f64(double %25, double 1.500000e+00) #2
Now it is entirely possible that this results in usage of different libMs, the Julia calling openlibm and GHC calling the system libm which on my machine is the one that comes with MAC OS X and is apparently quite a lot faster. We could try compiling the actual llvm and replacing the Julia calls with
pow
but maybe that is the subject for another blog post.
Just in case, “onthefly” compilation is obscuring runtime performance let’s try running both the Haskell and Julia for 20m, 40m, 80m and 100m steps.
Haskell
linreg([20,40,80,100],[2.0,4.0,7.9,10.1])
(0.03000000000000025,0.1005)
Julia
linreg([20,40,80,100],[5.7,9.8,18.1,22.2])
(1.5600000000000005,0.2065)
Cleary the negative compilation time for Haskell is wrong but I think it’s fair to infer that Julia has a higher start up cost and Haskell is 2 times quicker but as noted above this may be because of different math libraries.
Colophon
I used shake to build some of the material “on the fly”. There are still some manual steps to producing the blog post. The code can be downloaded from github.
Trouble with Tribbles
Introduction
Tribbles originate from the planet Iota Geminorum IV and, according to Dr. McCoy, are born pregnant. No further details are given but we can follow Gurtin and MacCamy (1974) and perhaps recover some of what happens on the Enterprise.
Of course, agedependent population models are of more than fictional use and can be applied, for example, to modelling the progression of Malaria in infected hosts. We roughly follow some of J. J. Thibodeaux and Schlittenhardt (2011) who themselves reference Belair, Mackey, and Mahaffy (1995).
Of interest to Haskellers are:

The use of the hmatrix package which now contains functions to solve tridiagonal systems used in this post. You will need to use HEAD until a new hackage / stackage release is made. My future plan is to use CUDA via accelerate and compare.

The use of dimensions in a mediumsized example. It would have been nice to have tried the units package but it seemed harder work to use and, as ever, “Time’s wingèd chariot” was the enemy.
The source for this post can be downloaded from github.
AgeDependent Populations
McKendrick / von Foerster
McKendrick and von Foerster independently derived a model of agedependent population growth.
Let be the density of females of age at time . The number of females between ages and are thus . Assuming individuals are born at age , we have
where is the death rate density and denotes the rate of entry to the cohort of age . Dividing by we obtain
which in the limit becomes
We can further assume that the rate of entry to a cohort is proportional to the density of individuals times a velocity of aging .
Occasionally there is some reason to assume that aging one year is different to experiencing one year but we further assume .
We thus obtain
Gurtin / MacCamy
To solve any PDE we need boundary and initial conditions. The number of births at time is
where is the natality aka birthmodulus and
and we further assume that the initial condition
for some given .
Gurtin and MacCamy (1974) focus on the situation where
and we can also assume that the birth rate of Tribbles decreases exponentially with age and further that Tribbles can live forever. Gurtin and MacCamy (1974) then transform the PDE to obtain a pair of linked ODEs which can then be solved numerically.
Of course, we know what happens in the Enterprise and rather than continue with this example, let us turn our attention to the more serious subject of Malaria.
Malaria
I realise now that I went a bit overboard with references. Hopefully they don’t interrupt the flow too much.
The World Health Organisation (WHO) estimated that in 2015 there were 214 million new cases of malaria resulting in 438,000 deaths (source: Wikipedia).
The lifecycle of the plasmodium parasite that causes malaria is extremely ingenious. J. J. Thibodeaux and Schlittenhardt (2011) model the human segment of the plasmodium lifecycle and further propose a way of determing an optimal treatment for an infected individual. Hall et al. (2013) also model the effect of an antimalarial. Let us content ourselves with reproducing part of the paper by J. J. Thibodeaux and Schlittenhardt (2011).
At one part of its sojourn in humans, plasmodium infects erythrocytes aka red blood cells. These latter contain haemoglobin (aka hemoglobin). The process by which red blood cells are produced, Erythropoiesis, is modulated in a feedback loop by erythropoietin. The plasmodium parasite severely disrupts this process. Presumably the resulting loss of haemoglobin is one reason that an infected individual feels ill.
As can be seen in the overview by Torbett and Friedman (2009), the full feedback loop is complex. So as not to lose ourselves in the details and following J. J. Thibodeaux and Schlittenhardt (2011) and Belair, Mackey, and Mahaffy (1995), we consider a model with two compartments.

Precursors: prototype erythrocytes developing in the bone marrow with being the density of such cells of age at time .

Erythrocytes: mature red blood cells circulating in the blood with being the density of such cells of age at time .
where is the birth rate of precursors and is the death rate of erythrocytes, is the maturation rate of precursors and where
As boundary conditions, we have that the number of precursors maturing must equal the production of number of erythrocytes
and the production of the of the number of precursors depends on the level of erythropoietin
where is some proportionality function.
As initial conditions, we have
We can further model the erythropoietin dynamics as
where is the feedback function from the kidneys and the decay rate, depends on the total precursor population (Sawyer, Krantz, and Goldwasser (1987)) although this often is taken to be a constant and I would feel more comfortable with a more recent citation and where
As initial condition we have
A Finite Difference Attempt
Let us try solving the above model using a finite difference scheme observing that we currently have no basis for whether it has a solution and whether the finite difference scheme approximates such a solution! We follow J. J. Thibodeaux and Schlittenhardt (2011) who give a proof of convergence presumably with some conditions; any failure of the scheme is entirely mine.
Divide up the age and time ranges, , and into equal subintervals, , and where
where , and .
Denoting and similarly we obtain
and
Rearranging we get
Writing
We can express the above in matrix form
Finally we can write
A Haskell Implementation
> {# OPTIONS_GHC Wall #}
> {# LANGUAGE TypeFamilies #}
> {# LANGUAGE NoImplicitPrelude #}
> {# LANGUAGE FlexibleContexts #}
> {# LANGUAGE DataKinds #}
> {# LANGUAGE TypeOperators #}
> module Tribbles where
> import qualified Prelude as P
> import Numeric.Units.Dimensional.Prelude hiding (Unit)
> import Numeric.Units.Dimensional
> import Numeric.LinearAlgebra
> import Numeric.Integration.TanhSinh
> import Control.Monad.Writer
> import Control.Monad.Loops
Substances like erythropoietin (EPO) are measured in International Units and these cannot be converted to Moles (see Jelkmann (2009) for much more detail) so we have to pretend it really is measured in Moles as there seems to be no easy way to define what the dimensional package calls a base dimension. A typical amount for a person is 15 milliIU / milllitre but can reach much higher levels after loss of blood.
> muPerMl :: (Fractional a, Num a) => Unit 'NonMetric DConcentration a
> muPerMl = (milli mole) / (milli litre)
> bigE'0 :: Concentration Double
> bigE'0 = 15.0 *~ muPerMl
Let’s set up our grid. We take these from Ackleh et al. (2006) but note that J. J. Thibodeaux and Schlittenhardt (2011) seem to have .
> deltaT, deltaMu, deltaNu :: Time Double
> deltaT = 0.05 *~ day
> deltaMu = 0.01 *~ day
> deltaNu = 0.05 *~ day
> bigT :: Time Double
> bigT = 100.0 *~ day
> muF, nuF :: Time Double
> muF = 5.9 *~ day
> nuF = 120.0 *~ day
> bigK :: Int
> bigK = floor (bigT / deltaT /~ one)
> n1 :: Int
> n1 = floor (muF / deltaMu /~ one)
> n2 :: Int
> n2 = floor (nuF / deltaNu /~ one)
> ts :: [Time Double]
> ts = take bigK $ 0.0 *~ day : (map (+ deltaT) ts)
The birth rate for precursors
> betaThibodeaux :: Time Double >
> Frequency Double
> betaThibodeaux mu
>  mu < (0 *~ day) = error "betaThibodeaux: negative age"
>  mu < (3 *~ day) = (2.773 *~ (one / day))
>  otherwise = (0.0 *~ (one /day))
> alphaThibodeaux :: Concentration Double >
> Frequency Double
> alphaThibodeaux e = (0.5 *~ (muPerMl / day)) / ((1 *~ muPerMl) + e)
> sigmaThibodeaux :: Time Double >
> Time Double >
> Concentration Double >
> Frequency Double
> sigmaThibodeaux mu _t e = gThibodeaux e * (betaThibodeaux mu  alphaThibodeaux e)
and an alternative birth rate
> betaAckleh :: Time Double > Frequency Double
> betaAckleh mu
>  mu < (0 *~ day) = error "betaAckleh: negative age"
>  mu < (3 *~ day) = 2.773 *~ (one / day)
>  otherwise = 0.000 *~ (one / day)
> sigmaAckleh :: Time Double >
> Time Double >
> Concentration Double >
> Frequency Double
> sigmaAckleh mu _t e = betaAckleh mu * gAckleh e
J. J. Thibodeaux and Schlittenhardt (2011) give the maturation rate of precursors as
> gThibodeaux :: Concentration Double > Dimensionless Double
> gThibodeaux e = d / n
> where
> n = ((3.02 *~ one) * e + (0.31 *~ muPerMl))
> d = (30.61 *~ muPerMl) + e
and Ackleh et al. (2006) give this as
> gAckleh :: Concentration Double > Dimensionless Double
> gAckleh _e = 1.0 *~ one
As in J. J. Thibodeaux and Schlittenhardt (2011) we give quantities in terms of cells per kilogram of body weight. Note that these really are moles on this occasion.
> type CellDensity = Quantity (DAmountOfSubstance / DTime / DMass)
Let’s set the initial conditions.
> p'0 :: Time Double > CellDensity Double
> p'0 mu' = (1e11 *~ one) * pAux mu'
> where
> pAux mu
>  mu < (0 *~ day) = error "p'0: negative age"
>  mu < (3 *~ day) = 8.55e6 *~ (mole / day / kilo gram) *
> exp ((2.7519 *~ (one / day)) * mu)
>  otherwise = 8.55e6 *~ (mole / day / kilo gram) *
> exp (8.319 *~ one  (0.0211 *~ (one / day)) * mu)
> m_0 :: Time Double > CellDensity Double
> m_0 nu' = (1e11 *~ one) * mAux nu'
> where
> mAux nu
>  nu < (0 *~ day) = error "m_0: age less than zero"
>  otherwise = 0.039827 *~ (mole / day / kilo gram) *
> exp (((0.0083) *~ (one / day)) * nu)
And check that these give plausible results.
> m_0Untyped :: Double > Double
> m_0Untyped nu = m_0 (nu *~ day) /~ (mole / day / kilo gram)
> p'0Untyped :: Double > Double
> p'0Untyped mu = p'0 (mu *~ day) /~ (mole / day / kilo gram)
ghci> import Numeric.Integration.TanhSinh
ghci> result $ relative 1e6 $ parTrap m_0Untyped 0.001 (nuF /~ day)
3.0260736659043414e11
ghci> result $ relative 1e6 $ parTrap p'0Untyped 0.001 (muF /~ day)
1.0453999900927126e10
We can now create the components for the first matrix equation.
> g'0 :: Dimensionless Double
> g'0 = gThibodeaux bigE'0
> d_1'0 :: Int > Dimensionless Double
> d_1'0 i = (1 *~ one) + (g'0 * deltaT / deltaMu)
>  deltaT * sigmaThibodeaux ((fromIntegral i *~ one) * deltaMu) undefined bigE'0
> lowers :: [Dimensionless Double]
> lowers = replicate n1 (negate $ g'0 * deltaT / deltaMu)
> diags :: [Dimensionless Double]
> diags = g'0 : map d_1'0 [1..n1]
> uppers :: [Dimensionless Double]
> uppers = replicate n1 (0.0 *~ one)
J. J. Thibodeaux and Schlittenhardt (2011) does not give a definition for so we use the equivalent from Ackleh et al. (2006) which references Banks et al. (2003): “ erythrocytes/kg body weight mL plasma/mU Epo/day”
> s_0 :: Time Double >
> Quantity (DAmountOfSubstance / DTime / DMass / DConcentration) Double
> s_0 = const ((1e11 *~ one) * (4.45e7 *~ (mole / day / kilo gram / muPerMl)))
> b'0 :: [CellDensity Double]
> b'0 = (s_0 (0.0 *~ day) * bigE'0) : (take n1 $ map p'0 (iterate (+ deltaMu) deltaMu))
With these components in place we can now solve the implicit scheme and get the age distribution of precursors after one time step.
> p'1 :: Matrix Double
> p'1 = triDiagSolve (fromList (map (/~ one) lowers))
> (fromList (map (/~ one) diags))
> (fromList (map (/~ one) uppers))
> (((n1 P.+1 )><1) (map (/~ (mole / second / kilo gram)) b'0))
In order to create the components for the second matrix equation, we need the death rates of mature erythrocytes
> gammaThibodeaux :: Time Double >
> Time Double >
> Quantity (DAmountOfSubstance / DMass) Double >
> Frequency Double
> gammaThibodeaux _nu _t _bigM = 0.0083 *~ (one / day)
We note an alternative for the death rate
> gammaAckleh :: Time Double >
> Time Double >
> Quantity (DAmountOfSubstance / DMass) Double >
> Frequency Double
> gammaAckleh _nu _t bigM = (0.01 *~ (kilo gram / mole / day)) * bigM + 0.0001 *~ (one / day)
For the intial mature erythrocyte population we can either use the integral of the initial distribution
> bigM'0 :: Quantity (DAmountOfSubstance / DMass) Double
> bigM'0 = r *~ (mole / kilo gram)
> where
> r = result $ relative 1e6 $ parTrap m_0Untyped 0.001 (nuF /~ day)
ghci> bigM'0
3.0260736659043414e11 kg^1 mol
or we can use the sum of the values used in the finite difference approximation
> bigM'0' :: Quantity (DAmountOfSubstance / DMass) Double
> bigM'0' = (* deltaNu) $ sum $ map m_0 $ take n2 $ iterate (+ deltaNu) (0.0 *~ day)
ghci> bigM'0'
3.026741454719976e11 kg^1 mol
Finally we can create the components
> d_2'0 :: Int > Dimensionless Double
> d_2'0 i = (1 *~ one) + (g'0 * deltaT / deltaNu)
> + deltaT * gammaThibodeaux ((fromIntegral i *~ one) * deltaNu) undefined bigM'0
> lowers2 :: [Dimensionless Double]
> lowers2 = replicate n2 (negate $ deltaT / deltaNu)
> diags2 :: [Dimensionless Double]
> diags2 = (1.0 *~ one) : map d_2'0 [1..n2]
> uppers2 :: [Dimensionless Double]
> uppers2 = replicate n2 (0.0 *~ one)
> b_2'0 :: [CellDensity Double]
> b_2'0 = (g'0 * ((p'1 `atIndex` (n1,0)) *~ (mole / second / kilo gram))) :
> (take n2 $ map m_0 (iterate (+ deltaNu) deltaNu))
and then solve the implicit scheme to get the distribution of mature erythrocytes one time step ahead
> m'1 :: Matrix Double
> m'1 = triDiagSolve (fromList (map (/~ one) lowers2))
> (fromList (map (/~ one) diags2))
> (fromList (map (/~ one) uppers2))
> (((n2 P.+ 1)><1) (map (/~ (mole / second / kilo gram)) b_2'0))
We need to complete the homeostatic loop by implmenting the feedback from the kidneys to the bone marrow. Ackleh and Thibodeaux (2013) and Ackleh et al. (2006) give as
> fAckleh :: Time Double >
> Quantity (DAmountOfSubstance / DMass) Double >
> Quantity (DConcentration / DTime) Double
> fAckleh _t bigM = a / ((1.0 *~ one) + k * (bigM' ** r))
> where
> a = 15600 *~ (muPerMl / day)
> k = 0.0382 *~ one
> r = 6.96 *~ one
> bigM' = ((bigM /~ (mole / kilo gram)) *~ one) * (1e11 *~ one)
The much older Belair, Mackey, and Mahaffy (1995) gives as
> fBelair :: Time Double >
> Quantity (DAmountOfSubstance / DMass) Double >
> Quantity (DConcentration / DTime) Double
> fBelair _t bigM = a / ((1.0 *~ one) + k * (bigM' ** r))
> where
> a = 6570 *~ (muPerMl / day)
> k = 0.0382 *~ one
> r = 6.96 *~ one
> bigM' = ((bigM /~ (mole / kilo gram)) *~ one) * (1e11 *~ one)
For the intial precursor population we can either use the integral of the initial distribution
result $ relative 1e6 $ parTrap p'0Untyped 0.001 (muF /~ day)
> bigP'0 :: Quantity (DAmountOfSubstance / DMass) Double
> bigP'0 = r *~ (mole / kilo gram)
> where
> r = result $ relative 1e6 $ parTrap p'0Untyped 0.001 (muF /~ day)
ghci> bigP'0
1.0453999900927126e10 kg^1 mol
or we can use the sum of the values used in the finite difference approximation
> bigP'0' :: Quantity (DAmountOfSubstance / DMass) Double
> bigP'0' = (* deltaMu) $ sum $ map p'0 $ take n1 $ iterate (+ deltaMu) (0.0 *~ day)
ghci> bigP'0'
1.0438999930030743e10 kg^1 mol
J. J. Thibodeaux and Schlittenhardt (2011) give the following for
> a_E :: Quantity (DAmountOfSubstance / DMass) Double > Frequency Double
> a_E bigP = ((n / d) /~ one) *~ (one / day)
> where
> n :: Dimensionless Double
> n = bigP * (13.8 *~ (kilo gram / mole)) + 0.04 *~ one
> d :: Dimensionless Double
> d = (bigP /~ (mole / kilo gram)) *~ one + 0.08 *~ one
but from Ackleh et al. (2006)
The only biological basis for the latter is that the decay rate of erythropoietin should be an increasing function of the precursor population and this function remains in the range 0.50–6.65
and, given this is at variance with their given function, it may be safer to use their alternative of
> a_E' :: Quantity (DAmountOfSubstance / DMass) Double > Frequency Double
> a_E' _bigP = 6.65 *~ (one / day)
We now further calculate the concentration of EPO one time step ahead.
> f'0 :: Quantity (DConcentration / DTime) Double
> f'0 = fAckleh undefined bigM'0
> bigE'1 :: Concentration Double
> bigE'1 = (bigE'0 + deltaT * f'0) / (1.0 *~ one + deltaT * a_E' bigP'0)
Having done this for one time step starting at , it’s easy to generalize this to an arbitrary time step.
> d_1 :: Dimensionless Double >
> Concentration Double >
> Int >
> Dimensionless Double
> d_1 g e i = (1 *~ one) + (g * deltaT / deltaMu)
>  deltaT * sigmaThibodeaux ((fromIntegral i *~ one) * deltaMu) undefined e
> d_2 :: Quantity (DAmountOfSubstance / DMass) Double >
> Int >
> Dimensionless Double
> d_2 bigM i = (1 *~ one) + deltaT / deltaNu
> + deltaT * gammaThibodeaux ((fromIntegral i *~ one) * deltaNu) undefined bigM
> oneStepM :: (Matrix Double, Matrix Double, Concentration Double, Time Double) >
> Writer [(Quantity (DAmountOfSubstance / DMass) Double,
> Quantity (DAmountOfSubstance / DMass) Double,
> Concentration Double)]
> (Matrix Double, Matrix Double, Concentration Double, Time Double)
> oneStepM (psPrev, msPrev, ePrev, tPrev) = do
> let
> g = gThibodeaux ePrev
> ls = replicate n1 (negate $ g * deltaT / deltaMu)
> ds = g : map (d_1 g ePrev) [1..n1]
> us = replicate n1 (0.0 *~ one)
> b1'0 = (s_0 tPrev * ePrev) /~ (mole / second / kilo gram)
> b1 = asColumn $ vjoin [scalar b1'0, subVector 1 n1 $ flatten psPrev]
> psNew :: Matrix Double
> psNew = triDiagSolve (fromList (map (/~ one) ls))
> (fromList (map (/~ one) ds))
> (fromList (map (/~ one) us))
> b1
> ls2 = replicate n2 (negate $ deltaT / deltaNu)
> bigM :: Quantity (DAmountOfSubstance / DMass) Double
> bigM = (* deltaNu) $ ((sumElements msPrev) *~ (mole / kilo gram / second))
> ds2 = (1.0 *~ one) : map (d_2 bigM) [1..n2]
> us2 = replicate n2 (0.0 *~ one)
> b2'0 = (g * (psNew `atIndex` (n1, 0) *~ (mole / second / kilo gram))) /~
> (mole / second / kilo gram)
> b2 = asColumn $ vjoin [scalar b2'0, subVector 1 n2 $ flatten msPrev]
> msNew :: Matrix Double
> msNew = triDiagSolve (fromList (map (/~ one) ls2))
> (fromList (map (/~ one) ds2))
> (fromList (map (/~ one) us2))
> b2
> bigP :: Quantity (DAmountOfSubstance / DMass) Double
> bigP = (* deltaMu) $ sumElements psPrev *~ (mole / kilo gram / second)
> f :: Quantity (DConcentration / DTime) Double
> f = fAckleh undefined bigM
> eNew :: Concentration Double
> eNew = (ePrev + deltaT * f) / (1.0 *~ one + deltaT * a_E' bigP)
> tNew = tPrev + deltaT
> tell [(bigP, bigM, ePrev)]
> return (psNew, msNew, eNew, tNew)
We can now run the model for 100 days.
> ys :: [(Quantity (DAmountOfSubstance / DMass) Double,
> Quantity (DAmountOfSubstance / DMass) Double,
> Concentration Double)]
> ys = take 2000 $
> snd $
> runWriter $
> iterateM_ oneStepM ((((n1 P.+1 )><1) (map (/~ (mole / second / kilo gram)) b'0)),
> (((n2 P.+ 1)><1) $ (map (/~ (mole / second / kilo gram)) b_2'0)),
> bigE'0,
> (0.0 *~ day))
And now we can plot what happens for a period of 100 days.
References
Ackleh, Azmy S., and Jeremy J. Thibodeaux. 2013. “A secondorder finite difference approximation for a mathematical model of erythropoiesis.” Numerical Methods for Partial Differential Equations, no. November: n/a–n/a. doi:10.1002/num.21778.
Ackleh, Azmy S., Keng Deng, Kazufumi Ito, and Jeremy Thibodeaux. 2006. “A Structured Erythropoiesis Model with Nonlinear Cell Maturation Velocity and Hormone Decay Rate.” Mathematical Biosciences 204 (1): 21–48. doi:http://dx.doi.org/10.1016/j.mbs.2006.08.004.
Banks, H T, Cammey E Cole, Paul M Schlosser, and Hien T Tran. 2003. “Modeling and Optimal Regulation of Erythropoiesis Subject to Benzene Intoxication.” https://www.ncsu.edu/crsc/reports/ftp/pdf/crsctr0349.pdf.
Belair, Jacques, Michael C. Mackey, and Joseph M. Mahaffy. 1995. “AgeStructured and TwoDelay Models for Erythropoiesis.” Mathematical Biosciences 128 (1): 317–46. doi:http://dx.doi.org/10.1016/00255564(94)00078E.
Gurtin, Morton E, and Richard C MacCamy. 1974. “NonLinear AgeDependent Population Dynamics.” Archive for Rational Mechanics and Analysis 54 (3). Springer: 281–300.
Hall, Adam J, Michael J Chappell, John AD Aston, and Stephen A Ward. 2013. “Pharmacokinetic Modelling of the AntiMalarial Drug Artesunate and Its Active Metabolite Dihydroartemisinin.” Computer Methods and Programs in Biomedicine 112 (1). Elsevier: 1–15.
Jelkmann, Wolfgang. 2009. “Efficacy of Recombinant Erythropoietins: Is There Unity of International Units?” Nephrology Dialysis Transplantation 24 (5): 1366. doi:10.1093/ndt/gfp058.
Sawyer, Stephen T, SB Krantz, and E Goldwasser. 1987. “Binding and ReceptorMediated Endocytosis of Erythropoietin in Friend VirusInfected Erythroid Cells.” Journal of Biological Chemistry 262 (12). ASBMB: 5554–62.
Thibodeaux, Jeremy J., and Timothy P. Schlittenhardt. 2011. “Optimal Treatment Strategies for Malaria Infection.” Bulletin of Mathematical Biology 73 (11): 2791–2808. doi:10.1007/s1153801196508.
Torbett, Bruce E., and Jeffrey S. Friedman. 2009. “Erythropoiesis: An Overview.” In Erythropoietins, Erythropoietic Factors, and Erythropoiesis: Molecular, Cellular, Preclinical, and Clinical Biology, edited by Steven G. Elliott, Mary Ann Foote, and Graham Molineux, 3–18. Basel: Birkhäuser Basel. doi:10.1007/9783764386986_1.
Warming up for NUTS (No UTurn)
I have been thinking about writing a blog on why the no uturn sampler (NUTS) works rather than describing the actual algorithm. This led me to look at Jared Tobin’s Haskell implementation. His example tries to explore the Himmelblau function but only finds one local minima. This is not unexpected; as the excellent Stan manual notes
Being able to carry out such invariant inferences in practice is an altogether different matter. It is almost always intractable to find even a single posterior mode, much less balance the exploration of the neighborhoods of multiple local maxima according to the probability masses.
and
For HMC and NUTS, the problem is that the sampler gets stuck in one of the two "bowls" around the modes and cannot gather enough energy from random momentum assignment to move from one mode to another.
rm(list = ls(all.names=TRUE))
unlink(".RData")
rstan::stan_version()
## [1] "2.12.0"
rstan_options(auto_write = TRUE)
On the Rosenbrock function it fares much better.
knitr::include_graphics("RosenbrockA.png")
We can’t (at least I don’t know how to) try Stan out on Rosenbrock as its not a distribution but we can try it out on another nasty problem: the funnel. Some of this is taken directly from the Stan manual.
Here’s the Stan:
parameters {
real y;
vector[9] x;
}
model {
y ~ normal(0,3);
x ~ normal(0,exp(y/2));
}
which we can run with the following R:
funnel_fit < stan(file='funnel.stan', cores=4, iter=10000)
## Warning: There were 92 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mcstan.org/misc/warnings.html#divergenttransitionsafterwarmup
## Warning: Examine the pairs() plot to diagnose sampling problems
funnel_samples < extract(funnel_fit,permuted=TRUE,inc_warmup=FALSE);
funnel_df < data.frame(x1=funnel_samples$x[,1],y=funnel_samples$y[])
Plotting the data requires some unpleasantness but shows the neck of the funnel does not get explored. So even HMC and NUTS do not perform well.
midpoints < function(x, dp=2){
lower < as.numeric(gsub(",.*","",gsub("\\(\\[\\)\\]","", x)))
upper < as.numeric(gsub(".*,","",gsub("\\(\\[\\)\\]","", x)))
return(round(lower+(upperlower)/2, dp))
}
df < funnel_df[funnel_df$x1 < 20 & funnel_df$x1 > 20 & funnel_df$y < 9 & funnel_df$y > 9,]
x_c < cut(df$x1, 20)
y_c < cut(df$y, 20)
z < table(x_c, y_c)
z_df < as.data.frame(z)
a_df < data.frame(x=midpoints(z_df$x_c),y=midpoints(z_df$y_c),f=z_df$Freq)
m = as.matrix(dcast(a_df,x ~ y))
## Using f as value column: use value.var to override.
hist3D(x=m[,"x"],y=as.double(colnames(m)[2:21]),z=(as.matrix(dcast(a_df,x ~ y)))[,2:21], border="black",ticktype = "detailed",theta=5,phi=20)
## Using f as value column: use value.var to override.
Since the analytic form of the distribution is known, one can apply a trick to correct this problem and then one is sampling from unit normals!
parameters {
real y_raw;
vector[9] x_raw;
}
transformed parameters {
real y;
vector[9] x;
y < 3.0 * y_raw;
x < exp(y/2) * x_raw;
}
model {
y_raw ~ normal(0,1);
x_raw ~ normal(0,1);
}
And now the neck of the funnel is explored.
funnel_fit < stan(file='funnel_reparam.stan', cores=4, iter=10000)
funnel_samples < extract(funnel_fit,permuted=TRUE,inc_warmup=FALSE);
funnel_df < data.frame(x1=funnel_samples$x[,1],y=funnel_samples$y[])
df < funnel_df[funnel_df$x1 < 20 & funnel_df$x1 > 20 & funnel_df$y < 9 & funnel_df$y > 9,]
x_c < cut(df$x1, 20)
y_c < cut(df$y, 20)
z < table(x_c, y_c)
z_df < as.data.frame(z)
a_df < data.frame(x=midpoints(z_df$x_c),y=midpoints(z_df$y_c),f=z_df$Freq)
m = as.matrix(dcast(a_df,x ~ y))
## Using f as value column: use value.var to override.
hist3D(x=m[,"x"],y=as.double(colnames(m)[2:21]),z=(as.matrix(dcast(a_df,x ~ y)))[,2:21], border="black",ticktype = "detailed",theta=5,phi=20)
## Using f as value column: use value.var to override.
We’d expect the Haskell implementation to also fail to explore the neck. Maybe I will return to this after the article on why NUTS works.
Calling Haskell from C
As part of improving the random number generation story for Haskell, I want to be able to use the testu01 library with the minimal amount of Haskell wrapping. testu01 assumes that there is a C function which returns the random number. The ghc manual gives an example but does not give all the specifics. These are my notes on how to get the example working under OS X (El Capitain 10.11.5 to be precise).
The Haskell:
{# OPTIONS_GHC Wall #}
{# LANGUAGE ForeignFunctionInterface #}
module Foo where
foreign export ccall foo :: Int > IO Int
foo :: Int > IO Int
foo n = return (length (f n))
f :: Int > [Int]
f 0 = []
f n = n:(f (n1))
The .cabal:
name: testviac
version: 0.1.0.0
homepage: TBD
license: MIT
author: Dominic Steinitz
maintainer: idontgetoutmuch@gmail.com
category: System
buildtype: Simple
cabalversion: >=1.10
executable Foo.dylib
mainis: Foo.hs
otherextensions: ForeignFunctionInterface
builddepends: base >=4.7 && =0.6 && <0.7
hssourcedirs: src
defaultlanguage: Haskell2010
includedirs: src
ghcoptions: O2 shared fPIC dynamic
extralibraries: HSrtsghc8.0.1
On my computer running
cabal install
places the library in
~/Library/Haskell/ghc8.0.1/lib/testviac0.1.0.0/bin
The C:
#include
#include "HsFFI.h"
#include "../dist/build/Foo.dylib/Foo.dylibtmp/Foo_stub.h"
int main(int argc, char *argv[])
{
int i;
hs_init(&argc, &argv);
for (i = 0; i < 5; i++) {
printf("%d\n", foo(2500));
}
hs_exit();
return 0;
}
On my computer this can be compiled with
gcc6 Bar.c
~/Library/Haskell/ghc8.0.1/lib/testviac0.1.0.0/bin/Foo.dylib
I/Library/Frameworks/GHC.framework/Versions/8.0.1x86_64/usr/lib/ghc8.0.1/include
L/Library/Frameworks/GHC.framework/Versions/8.0.1x86_64/usr/lib/ghc8.0.1/rts
lHSrtsghc8.0.1
and can be run with
DYLD_LIBRARY_PATH=
~/Library/Haskell/ghc8.0.1/lib/testviac0.1.0.0/bin:
/Library/Frameworks/GHC.framework/Versions/8.0.1x86_64/usr/lib/ghc8.0.1/rts
N.B. setting DYLD_LIBRARY_PATH
like this is not recommended as it is a good way of breaking things. I have tried setting DYLD_FALLBACK_LIBRARY_PATH
but only to get an error message. Hopefully, at some point I will be able to post a robust way of getting the executable to pick up the required dynamic libraries.
UK / South Korea Trade: A Bayesian Analysis
Introduction
I was intrigued by a tweet by the UK Chancellor of the Exchequer stating "exports [to South Korea] have doubled over the last year. Now worth nearly £11bn” and a tweet by a Member of the UK Parliament stating South Korea "our second fastest growing trading partner". Although I have never paid much attention to trade statistics, both these statements seemed surprising. But these days it’s easy enough to verify such statements. It’s also an opportunity to use the techniques I believe data scientists in (computer) game companies use to determine how much impact a new feature has on the game’s consumers.
One has to be slightly careful with trade statistics as they come in many different forms, e.g., just goods or goods and services etc. When I provide software and analyses to US organisations, I am included in the services exports from the UK to the US.
Let’s analyse goods first before moving on to goods and services.
Getting the Data
First let’s get hold of the quarterly data from the UK Office of National Statistics.
ukstats < "https://www.ons.gov.uk"
bop < "economy/nationalaccounts/balanceofpayments"
ds < "datasets/tradeingoodsmretsallbopeu2013timeseriesspreadsheet/current/mret.csv"
mycsv < read.csv(paste(ukstats,"file?uri=",bop,ds,sep="/"),stringsAsFactors=FALSE)
Now we can find the columns that refer to Korea.
ns < which(grepl("Korea", names(mycsv)))
length(ns)
## [1] 3
names(mycsv[ns[1]])
## [1] "BoP.consistent..South.Korea..Exports..SA................................"
names(mycsv[ns[2]])
## [1] "BoP.consistent..South.Korea..Imports..SA................................"
names(mycsv[ns[3]])
## [1] "BoP.consistent..South.Korea..Balance..SA................................"
Now we can pull out the relevant information and create a data frame of it.
korean < mycsv[grepl("Korea", names(mycsv))]
imports < korean[grepl("Imports", names(korean))]
exports < korean[grepl("Exports", names(korean))]
balance < korean[grepl("Balance", names(korean))]
df < data.frame(mycsv[grepl("Title", names(mycsv))],
imports,
exports,
balance)
colnames(df) < c("Title", "Imports", "Exports", "Balance")
startQ < which(grepl("1998 Q1",df$Title))
endQ < which(grepl("2016 Q3",df$Title))
dfQ < df[startQ:endQ,]
We can now plot the data.
tab < data.frame(kr=as.numeric(dfQ$Exports),
krLabs=as.numeric(as.Date(as.yearqtr(dfQ$Title,format='%Y Q%q'))))
ggplot(tab, aes(x=as.Date(tab$krLabs), y=tab$kr)) + geom_line() +
theme(legend.position="bottom") +
ggtitle("Goods Exports UK / South Korea (Quarterly)") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab("Date") +
ylab("Value (£m)")
For good measure let’s plot the annual data.
startY < grep("^1998$",df$Title)
endY < grep("^2015$",df$Title)
dfYear < df[startY:endY,]
tabY < data.frame(kr=as.numeric(dfYear$Exports),
krLabs=as.numeric(dfYear$Title))
ggplot(tabY, aes(x=tabY$krLabs, y=tabY$kr)) + geom_line() +
theme(legend.position="bottom") +
ggtitle("Goods Exports UK / South Korea (Annual)") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab("Date") +
ylab("Value (£m)")
And the monthly data.
startM < grep("1998 JAN",df$Title)
endM < grep("2016 OCT",df$Title)
dfMonth < df[startM:endM,]
tabM < data.frame(kr=as.numeric(dfMonth$Exports),
krLabs=as.numeric(as.Date(as.yearmon(dfMonth$Title,format='%Y %B'))))
ggplot(tabM, aes(x=as.Date(tabM$krLabs), y=tabM$kr)) + geom_line() +
theme(legend.position="bottom") +
ggtitle("Goods Exports UK / South Korea (Monthly)") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab("Date") +
ylab("Value (£m)")
It looks like some change took place in 2011 but nothing to suggest either that "export have doubled over the last year" or that South Korea is "our second fastest growing partner". That some sort of change did happen is further supported by the fact a Free Trade Agreement between the EU and Korea was put in place in 2011.
But was there really a change? And what sort of change was it? Sometimes it’s easy to imagine patterns where there are none.
With this warning in mind let us see if we can get a better feel from the numbers as to what happened.
The Model
Let us assume that the data for exports are approximated by a linear function of time but that there is a change in the slope and the offset at some point during observation.
Since we are going to use stan to infer the parameters for this model and stan cannot handle discrete parameters, we need to marginalize out this (discrete) parameter. I hope to do the same analysis with LibBi which seems more suited to time series analysis and which I believe will not require such a step.
Setting D = {y_{i}}_{i = 1}^{N} we can calculate the likelihood
stan operates on the log scale and thus requires the log likelihood
where
and where the log sum of exponents function is defined by
The log sum of exponents function allows the model to be coded directly in Stan using the builtin function , which provides both arithmetic stability and efficiency for mixture model calculations.
Stan
Here’s the model in stan. Sadly I haven’t found a good way of divvying up .stan
files in a .Rmd
file so that it still compiles.
data {
int<lower=1> N;
real x[N];
real y[N];
}
parameters {
real mu1;
real mu2;
real gamma1;
real gamma2;
real<lower=0> sigma1;
real<lower=0> sigma2;
}
transformed parameters {
vector[N] log_p;
real mu;
real sigma;
log_p = rep_vector(log(N), N);
for (tau in 1:N)
for (i in 1:N) {
mu = i < tau ? (mu1 * x[i] + gamma1) : (mu2 * x[i] + gamma2);
sigma = i < tau ? sigma1 : sigma2;
log_p[tau] = log_p[tau] + normal_lpdf(y[i]  mu, sigma);
}
}
model {
mu1 ~ normal(0, 10);
mu2 ~ normal(0, 10);
gamma1 ~ normal(0, 10);
gamma2 ~ normal(0, 10);
sigma1 ~ normal(0, 10);
sigma2 ~ normal(0, 10);
target += log_sum_exp(log_p);
}
generated quantities {
int<lower=1,upper=N> tau;
tau = categorical_rng(softmax(log_p));
}
The above, although mimicking our mathematical model, has quadratic complexity and we can use the trick in the stan manual to make it linear albeit with less clarity.
data {
int<lower=1> N;
real x[N];
real y[N];
}
parameters {
real mu1;
real mu2;
real gamma1;
real gamma2;
real<lower=0> sigma1;
real<lower=0> sigma2;
}
transformed parameters {
vector[N] log_p;
{
vector[N+1] log_p_e;
vector[N+1] log_p_l;
log_p_e[1] = 0;
log_p_l[1] = 0;
for (i in 1:N) {
log_p_e[i + 1] = log_p_e[i] + normal_lpdf(y[i]  mu1 * x[i] + gamma1, sigma1);
log_p_l[i + 1] = log_p_l[i] + normal_lpdf(y[i]  mu2 * x[i] + gamma2, sigma2);
}
log_p = rep_vector(log(N) + log_p_l[N + 1], N) + head(log_p_e, N)  head(log_p_l, N);
}
}
model {
mu1 ~ normal(0, 10);
mu2 ~ normal(0, 10);
gamma1 ~ normal(0, 10);
gamma2 ~ normal(0, 10);
sigma1 ~ normal(0, 10);
sigma2 ~ normal(0, 10);
target += log_sum_exp(log_p);
}
generated quantities {
int<lower=1,upper=N> tau;
tau = categorical_rng(softmax(log_p));
}
Let’s run this model with the monthly data.
NM < nrow(tabM)
KM < ncol(tabM)
yM < tabM$kr
XM < data.frame(tabM,rep(1,NM))[,2:3]
fitM < stan(
file = "lrchangepointng.stan",
data = list(x = XM$krLabs, y = yM, N = length(yM)),
chains = 4,
warmup = 1000,
iter = 10000,
cores = 4,
refresh = 500,
seed=42
)
## Warning: There were 662 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
## http://mcstan.org/misc/warnings.html#divergenttransitionsafterwarmup
## Warning: Examine the pairs() plot to diagnose sampling problems
Looking at the results below we see a multimodal distribution so a mean is not of much use.
histData < hist(extract(fitM)$tau,plot=FALSE,breaks=c(seq(1,length(yM),1)))
histData$counts
## [1] 18000 0 0 0 0 0 0 0 0 0 0
## [12] 0 0 0 0 0 0 0 0 0 0 0
## [23] 0 0 0 0 0 0 0 0 0 0 0
## [34] 0 0 0 0 0 0 0 0 0 0 0
## [45] 0 0 0 0 0 0 0 0 0 0 0
## [56] 0 0 0 0 0 0 0 0 0 0 0
## [67] 0 0 0 0 0 0 0 0 0 0 0
## [78] 0 0 0 0 0 0 0 0 0 0 0
## [89] 0 0 0 0 0 0 0 0 0 0 0
## [100] 0 0 0 0 0 0 0 0 0 0 0
## [111] 0 0 0 0 0 0 0 1 4 12 16
## [122] 16 107 712 8132 0 0 0 0 0 0 0
## [133] 0 0 0 0 0 0 0 0 0 0 0
## [144] 0 0 0 0 0 0 0 0 0 0 0
## [155] 0 0 0 0 0 0 0 0 0 0 25
## [166] 171 2812 0 0 0 0 0 0 0 0 0
## [177] 0 0 0 0 0 0 0 0 0 0 0
## [188] 0 0 0 0 0 0 0 0 0 0 0
## [199] 0 0 0 0 0 0 0 0 0 0 0
## [210] 0 0 0 0 0 0 0 0 0 0 0
## [221] 0 0 0 0 5992
We can get a pictorial representation of the maxima so that the multimodality is even clearer.
min_indexes = which(diff( sign(diff( c(0,histData$counts,0)))) == 2)
max_indexes = which(diff( sign(diff( c(0,histData$counts,0)))) == 2)
modeData = data.frame(x=1:length(histData$counts),y=histData$counts)
min_locs = modeData[min_indexes,]
max_locs = modeData[max_indexes,]
plot(modeData$y, type="l")
points( min_locs, col="red", pch=19, cex=1 )
points( max_locs, col="green", pch=19, cex=1 )
My interpretation is that the evidence (data) says there is probably no changepoint (a change at the beginning or end is no change) but there might be a change at intermediate data points.
We can see something strange (maybe a large single export?) happened at index 125 which translates to 2008MAY.
The mode at index 167 which translates to 2011NOV corresponds roughly to the EU / Korea trade agreement.
Let us assume that there really was a material difference in trade at this latter point. We can fit a linear regression before this point and one after this point.
Here’s the stan
data {
int<lower=1> N;
int<lower=1> K;
matrix[N,K] X;
vector[N] y;
}
parameters {
vector[K] beta;
real<lower=0> sigma;
}
model {
y ~ normal(X * beta, sigma);
}
And here’s the R to fit the before and after data. We fit the model, pull out the parameters for the regression and pull out the covariates
N < length(yM)
M < max_locs$x[3]
fite < stan(file = 'LR.stan',
data = list(N = M, K = ncol(XM), y = yM[1:M], X = XM[1:M,]),
pars=c("beta", "sigma"),
chains=3,
cores=3,
iter=3000,
warmup=1000,
refresh=1)
se < extract(fite, pars = c("beta", "sigma"), permuted=TRUE)
estCovParamsE < colMeans(se$beta)
fitl < stan(file = 'LR.stan',
data = list(N = NM, K = ncol(XM), y = yM[(M+1):N], X = XM[(M+1):N,]),
pars=c("beta", "sigma"),
chains=3,
cores=3,
iter=3000,
warmup=1000,
refresh=1)
sl < extract(fitl, pars = c("beta", "sigma"), permuted=TRUE)
estCovParamsL < colMeans(sl$beta)
Make predictions
linRegPredsE < data.matrix(XM) %*% estCovParamsE
linRegPredsL < data.matrix(XM) %*% estCovParamsL
ggplot(tabM, aes(x=as.Date(tabM$krLabs), y=tabM$kr)) +
geom_line(aes(x = as.Date(tabM$krLabs), y = tabM$kr, col = "Actual")) +
geom_line(data=tabM[1:M,], aes(x = as.Date(tabM$krLabs[1:M]), y = linRegPredsE[(1:M),1], col = "Fit (Before FTA)")) +
geom_line(data=tabM[(M+1):N,], aes(x = as.Date(tabM$krLabs[(M+1):N]), y = linRegPredsL[((M+1):N),1], col = "Fit (After FTA)")) +
theme(legend.position="bottom") +
ggtitle("Goods Exports UK / South Korea (Monthly)") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab("Date") +
ylab("Value (£m)")
An Intermediate Conclusion and Goods and Services (Pink Book)
So we didn’t manage to substantiate either the Chancellor’s claim or the Member of Parliament’s claim.
But it may be that we can if we look at Goods and Services then we might be able to see the numbers resulting in the claims.
pb < "datasets/pinkbook/current/pb.csv"
pbcsv < read.csv(paste(ukstats,"file?uri=",bop,pb,sep="/"),stringsAsFactors=FALSE)
This has a lot more information albeit only annually.
pbns < grep("Korea", names(pbcsv))
length(pbns)
## [1] 21
lapply(pbns,function(x) names(pbcsv[x]))
## [[1]]
## [1] "BoP..Current.Account..Goods...Services..Imports..South.Korea............"
##
## [[2]]
## [1] "BoP..Current.Account..Current.Transfer..Balance..South.Korea............"
##
## [[3]]
## [1] "BoP..Current.Account..Goods...Services..Balance..South.Korea............"
##
## [[4]]
## [1] "IIP..Assets..Total.South.Korea.........................................."
##
## [[5]]
## [1] "Trade.in.Services.replaces.1.A.B....Exports.Credits...South.Korea...nsa."
##
## [[6]]
## [1] "IIP...Liabilities...Total...South.Korea................................."
##
## [[7]]
## [1] "BoP..Total.income..Balance..South.Korea................................."
##
## [[8]]
## [1] "BoP..Total.income..Debits..South.Korea.................................."
##
## [[9]]
## [1] "BoP..Total.income..Credits..South.Korea................................."
##
## [[10]]
## [1] "BoP..Current.account..Balance..South.Korea.............................."
##
## [[11]]
## [1] "BoP..Current.account..Debits..South.Korea..............................."
##
## [[12]]
## [1] "BoP..Current.account..Credits..South.Korea.............................."
##
## [[13]]
## [1] "IIP...Net...Total....South.Korea........................................"
##
## [[14]]
## [1] "Trade.in.Services.replaces.1.A.B....Imports.Debits...South.Korea...nsa.."
##
## [[15]]
## [1] "BoP..Current.Account..Services..Total.Balance..South.Korea.............."
##
## [[16]]
## [1] "Bop.consistent..Balance..NSA..South.Korea..............................."
##
## [[17]]
## [1] "Bop.consistent..Im..NSA..South.Korea...................................."
##
## [[18]]
## [1] "Bop.consistent..Ex..NSA..South.Korea...................................."
##
## [[19]]
## [1] "Current.Transfers...Exports.Credits...South.Korea...nsa................."
##
## [[20]]
## [1] "Current.Transfers...Imports.Debits...South.Korea...nsa.................."
##
## [[21]]
## [1] "BoP..Current.Account..Goods...Services..Exports..South.Korea............"
Let’s just look at exports.
koreanpb < pbcsv[grepl("Korea", names(pbcsv))]
exportspb < koreanpb[grepl("Exports", names(koreanpb))]
names(exportspb)
## [1] "Trade.in.Services.replaces.1.A.B....Exports.Credits...South.Korea...nsa."
## [2] "Current.Transfers...Exports.Credits...South.Korea...nsa................."
## [3] "BoP..Current.Account..Goods...Services..Exports..South.Korea............"
The last column gives exports of Goods and Services so let’s draw a chart of it.
pb < data.frame(pbcsv[grepl("Title", names(pbcsv))],
exportspb[3])
colnames(pb) < c("Title", "Exports")
startpbY < which(grepl("1999",pb$Title))
endpbY < which(grepl("2015",pb$Title))
pbY < pb[startpbY:endpbY,]
tabpbY < data.frame(kr=as.numeric(pbY$Exports),
krLabs=as.numeric(pbY$Title))
ggplot(tabpbY, aes(x=tabpbY$krLabs, y=tabpbY$kr)) + geom_line() +
theme(legend.position="bottom") +
ggtitle("Goods and Services Exports UK / South Korea (Annual)") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab("Date") +
ylab("Value (£m)")
No joy here either to any of the claims. Still it’s been an interesting exercise.