A Triple Birthday Paradox

A few weeks ago, two of my colleagues and I had a birthday in the same week. What are the odds of that? There’s about 25 people in the department in which I work.

We consider every outcome where pairs of people share birthdays but no 3 or more people share birthdays. Then we can use the birthday paradox trick and subtract the proportion of these from 1.

To be absolutely precise we ought to consider outcomes where no one shares a birthday but this is sufficiently small in comparison in the case we are interested in that we can ignore it.

We begin with some imports and a dumb function to calculate multinomial coefficients.


import qualified Data.Map as Map
import System.Random
import Prelude hiding (replicate, zipWith, length)

binomial n 0 = 1
binomial 0 k = 0
binomial n k = n * binomial (n - 1) (k - 1) `div` k

multinomial xs =
  (factorial (sum xs)) `div` (product [ factorial x | x<-xs ])
    where
      factorial n = product [1..n]

Let’s take a smaller example. Suppose we have 9 people in a department and we have divided the year up into 11 periods.

We want the cases where there are exactly two birthdays in a period. For example, 2 people might have a birthday in period 1, 7 people might have birthdays in periods 2–8 and no-one has a birthday in periods 9–11. This would give us \binom{11}{1}\binom{10}{7} ways of choosing people. We also need to consider that the 2 people sharing the birthday might share them in period 2, 3, …, 11. There are \binom{9}{2,1,1,1,1,1,1,1,0} ways of this happening. Letting x stand for the total number of ways that 2 people might share exactly 2 birthdays in a given period, we have:


\begin{aligned} x& = \binom{11}{1}\binom{10}{7}\binom{9}{2,1,1,1,1,1,1,1,0} \\ & + \binom{11}{2}\binom{9}{5} \binom{9}{2,2,1,1,1,1,1,0,0} \\ & + \binom{11}{3}\binom{8}{3} \binom{9}{2,2,2,1,1,1,0,0,0} \\ & + \binom{11}{4}\binom{7}{1} \binom{9}{2,2,2,2,1,0,0,0,0}\end{aligned}

In Haskell, we can write this as:

nPeople :: Integer
nPeople = 9
nPeriods :: Integer
nPeriods = 11

x = binomial 11 1 * binomial 10 7 * multinomial [2,1,1,1,1,1,1,1,0] +
    binomial 11 2 * binomial 9  5 * multinomial [2,2,1,1,1,1,1,0,0] +
    binomial 11 3 * binomial 8  3 * multinomial [2,2,2,1,1,1,0,0,0] +
    binomial 11 4 * binomial 7  1 * multinomial [2,2,2,2,1,0,0,0,0]

And therefore we expect this proportion of exactly 2 shared birthdays

y = (fromInteger x) / (fromInteger nPeriods^nPeople)

Generalizing, we first need to be able to generate the desired multinomial coefficient.

coeff x' y' = take y (ns 2) ++ take (x - 2*y) (ns 1) ++ take y (ns 0)
  where x    = fromIntegral x'
        y    = fromIntegral y'
        ns n = n:(ns n)

For example:

*Main> coeff 11 1
[2,1,1,1,1,1,1,1,1,1,0]

Now we can calculate the number of ways of getting exactly 2 birthdays in any single period

terms x y = sum $ map (oneTerm x y) [1..(x `div` 2)]
  where
    oneTerm x y n = binomial y n *
                    binomial (y - n) (x - (2 * n)) *
                    multinomial (coeff x n)

And finally we can calculate the probability of getting exactly 2 birthdays in any single period.

atMostTwoShared x y = fromInteger (terms x y) / fromInteger (y^x)

Because it’s easy to double count or forget a particulare case, it’s best to check via Monte Carlo:

nSims = 10000

monteCarlo =
  runs >>= return . Map.map (/(fromInteger nSims)) . counts where
    counts rs      = foldr (\b -> Map.insertWith (+) b 1) Map.empty rs
    runs           = sequence $
                     take (fromIntegral nSims) $
                     repeat (bDays >>= return . maxOnOneDay)
    bDays          = sequence $
                     take (fromIntegral nPeople) $
                     repeat $ pickBirthday
    maxOnOneDay bs = maximum $
                     Map.elems $
                     foldr (\b -> Map.insertWith (+) b 1) Map.empty bs
    pickBirthday   = getStdRandom (randomR (1,nPeriods))

main = do
  ps <- monteCarlo
  putStrLn $ show ps
  putStrLn $ show $ atMostTwoShared nPeople nPeriods

Surprisingly the answer for 3 people sharing a birthday in a period where there are 9 people and 11 periods is 43%.

The answer we are after is for 25 people and 52 periods (weeks in the year) and this is 50%.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s