# Introduction

Suppose you want to sample from the truncated normal distribution. One way to do this is rejection sampling, but a naïve implementation runs into performance problems. The excellent Devroye (1986), who references Marsaglia (1964), gives an efficient rejection sampling scheme based on the Rayleigh distribution; the random-fu package uses the exponential distribution instead.
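To make the tuned scheme concrete, here is a standalone sketch of the exponential-proposal tail sampler along the lines Devroye describes: to draw from the standard normal conditioned on x ≥ a (for a > 0), propose x = a + e₁/a with e₁ ~ Exp(1) and accept when e₁² ≤ 2a²e₂ for an independent e₂ ~ Exp(1). The helper names and the tiny LCG below are mine, chosen only to keep the example dependency-free; random-fu's `normalTail` uses its own source of randomness.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64)

-- One step of a 64-bit LCG, mapped to a uniform in [0, 1).
-- Illustrative only; any decent uniform generator will do.
nextU :: Word64 -> (Double, Word64)
nextU s =
  let s' = 6364136223846793005 * s + 1442695040888963407
      u  = fromIntegral (s' `shiftR` 11) / 9007199254740992
  in (u, s')

-- Sample the standard normal conditioned on x >= a (a > 0),
-- using an Exp(a) proposal shifted to start at a: accept
-- x = a + e1/a when e1^2 <= 2 * a^2 * e2, with e1, e2 ~ Exp(1).
tailNormal :: Double -> Word64 -> (Double, Word64)
tailNormal a s0 =
  let (u1, s1) = nextU s0
      (u2, s2) = nextU s1
      e1 = negate (log (1 - u1))   -- Exp(1); 1 - u1 > 0, so log is finite
      e2 = negate (log (1 - u2))
  in if e1 * e1 <= 2 * a * a * e2
       then (a + e1 / a, s2)       -- accept
       else tailNormal a s2        -- reject and retry

-- n draws from the tail beyond a.
tailSamples :: Int -> Double -> Word64 -> [Double]
tailSamples n a s
  | n <= 0    = []
  | otherwise = let (x, s') = tailNormal a s
                in x : tailSamples (n - 1) a s'
```

For a = 2 the mean of the accepted draws should approach φ(2)/Q(2) ≈ 2.3732, the mean of the truncated distribution.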

# Performance

``````> {-# OPTIONS_GHC -Wall                     #-}
> {-# OPTIONS_GHC -fno-warn-type-defaults   #-}
> {-# OPTIONS_GHC -fno-warn-unused-do-bind  #-}
> {-# OPTIONS_GHC -fno-warn-missing-methods #-}
> {-# OPTIONS_GHC -fno-warn-orphans         #-}
``````
``````> {-# LANGUAGE FlexibleContexts             #-}
``````
``````> import Control.Monad
> import Control.Monad.State
> import Data.Random
> import qualified Data.Random.Distribution.Normal as N
``````
``````> import Data.Random.Source.PureMT
``````

Here’s the naïve implementation.

``````> naiveReject :: Double -> RVar Double
> naiveReject x = doit
>   where
>     doit = do
>       y <- N.stdNormal
>       if y < x
>         then doit
>         else return y
``````

And here’s an implementation using random-fu.

``````> expReject :: Double -> RVar Double
> expReject x = N.normalTail x
``````

Let’s try running both of them.

``````> n :: Int
> n = 10000000
``````
``````> lower :: Double
> lower = 2.0
``````
``````> testExp :: [Double]
> testExp = evalState (replicateM n $ sample (expReject lower)) (pureMT 3)
``````
``````> testNaive :: [Double]
> testNaive = evalState (replicateM n $ sample (naiveReject lower)) (pureMT 3)
``````
``````> main :: IO ()
> main = do
> print $ sum testExp
> print $ sum testNaive
``````

Let’s try building and running both the naïve and better tuned versions.

``ghc -O2 -rtsopts CompareRejects.hs && ./CompareRejects +RTS -s``

As the runtime statistics below show, the naïve version takes 59.98s against 4.28s for the tuned version, a compelling reason to use the latter. The gap only widens the further into the tail we sample: the naïve scheme’s acceptance probability is the tail mass itself, which shrinks rapidly as the truncation point moves right.
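To put a number on this, the naïve sampler needs on average 1 / P(Z ≥ a) standard-normal draws per accepted sample. A quick estimate, using the Abramowitz and Stegun 26.2.17 rational approximation to the normal tail probability (the helpers below are illustrative, not part of this post’s compiled code):

```haskell
-- Q(x) = P(Z >= x) for standard normal Z, via the
-- Abramowitz & Stegun 26.2.17 approximation (|error| < 7.5e-8).
qNormal :: Double -> Double
qNormal x = phi * t * (b1 + t * (b2 + t * (b3 + t * (b4 + t * b5))))
  where
    phi = exp (negate x * x / 2) / sqrt (2 * pi)
    t   = 1 / (1 + 0.2316419 * x)
    (b1, b2, b3, b4, b5) =
      (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)

-- Expected number of N(0,1) draws per accepted tail sample
-- under naive rejection.
expectedTrials :: Double -> Double
expectedTrials a = 1 / qNormal a
```

This gives roughly 44 draws per accepted sample at a = 2 and over 30,000 at a = 4, whereas the exponential-proposal sampler’s acceptance probability actually improves as a grows.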

## Tuned

``````2.3731610476911187e7
11,934,195,432 bytes allocated in the heap
5,257,328 bytes copied during GC
44,312 bytes maximum residency (2 sample(s))
21,224 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)

Tot time (elapsed)  Avg pause  Max pause
Gen  0     23145 colls,     0 par    0.09s    0.11s     0.0000s    0.0001s
Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0002s

INIT    time    0.00s  (  0.00s elapsed)
MUT     time    4.19s  (  4.26s elapsed)
GC      time    0.09s  (  0.11s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time    4.28s  (  4.37s elapsed)

%GC     time       2.2%  (2.6% elapsed)

Alloc rate    2,851,397,967 bytes per MUT second

Productivity  97.8% of total user, 95.7% of total elapsed``````

## Naïve

``````2.3732073159369867e7
260,450,762,656 bytes allocated in the heap
111,891,960 bytes copied during GC
85,536 bytes maximum residency (2 sample(s))
76,112 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)

Tot time (elapsed)  Avg pause  Max pause
Gen  0     512768 colls,     0 par    1.86s    2.24s     0.0000s    0.0008s
Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0002s

INIT    time    0.00s  (  0.00s elapsed)
MUT     time   58.12s  ( 58.99s elapsed)
GC      time    1.86s  (  2.24s elapsed)
EXIT    time    0.00s  (  0.00s elapsed)
Total   time    59.98s  ( 61.23s elapsed)

%GC     time       3.1%  (3.7% elapsed)

Alloc rate    4,481,408,869 bytes per MUT second

Productivity  96.9% of total user, 94.9% of total elapsed``````

# Bibliography

Devroye, L. 1986. Non-Uniform Random Variate Generation. Springer-Verlag. http://books.google.co.uk/books?id=mEw_AQAAIAAJ.

Marsaglia, G. 1964. “Generating a Variable from the Tail of the Normal Distribution.” Technometrics 6 (1): 101–102.