A few weeks ago, two of my colleagues and I had a birthday in the same week. What are the odds of that? There’s about 25 people in the department in which I work.
We consider every outcome where pairs of people share birthdays but no 3 or more people share birthdays. Then we can use the birthday paradox trick and subtract the proportion of these from 1.
To be absolutely precise we ought to consider outcomes where no one shares a birthday but this is sufficiently small in comparison in the case we are interested in that we can ignore it.
We begin with some imports and a dumb function to calculate multinomial coefficients.
import qualified Data.Map as Map import System.Random import Prelude hiding (replicate, zipWith, length) binomial n 0 = 1 binomial 0 k = 0 binomial n k = n * binomial (n - 1) (k - 1) `div` k multinomial xs = (factorial (sum xs)) `div` (product [ factorial x | x<-xs ]) where factorial n = product [1..n]
Let’s take a smaller example. Suppose we have 9 people in a department and we have divided the year up into 11 periods.
We want the cases where there are exactly two birthdays in a period. For example, 2 people might have a birthday in period 1, 7 people might have birthdays in periods 2–8 and no-one has a birthday in periods 9–11. This would give us ways of choosing people. We also need to consider that the 2 people sharing the birthday might share them in period 2, 3, …, 11. There are ways of this happening. Letting x stand for the total number of ways that 2 people might share exactly 2 birthdays in a given period, we have:
In Haskell, we can write this as:
nPeople :: Integer nPeople = 9 nPeriods :: Integer nPeriods = 11 x = binomial 11 1 * binomial 10 7 * multinomial [2,1,1,1,1,1,1,1,0] + binomial 11 2 * binomial 9 5 * multinomial [2,2,1,1,1,1,1,0,0] + binomial 11 3 * binomial 8 3 * multinomial [2,2,2,1,1,1,0,0,0] + binomial 11 4 * binomial 7 1 * multinomial [2,2,2,2,1,0,0,0,0]
And therefore we expect this proportion of exactly 2 shared birthdays
y = (fromInteger x) / (fromInteger nPeriods^nPeople)
Generalizing, we first need to be able to generate the desired multinomial coefficient.
coeff x' y' = take y (ns 2) ++ take (x - 2*y) (ns 1) ++ take y (ns 0) where x = fromIntegral x' y = fromIntegral y' ns n = n:(ns n)
*Main> coeff 11 1 [2,1,1,1,1,1,1,1,1,1,0]
Now we can calculate the number of ways of getting exactly 2 birthdays in any single period
terms x y = sum $ map (oneTerm x y) [1..(x `div` 2)] where oneTerm x y n = binomial y n * binomial (y - n) (x - (2 * n)) * multinomial (coeff x n)
And finally we can calculate the probability of getting exactly 2 birthdays in any single period.
atMostTwoShared x y = fromInteger (terms x y) / fromInteger (y^x)
Because it’s easy to double count or forget a particulare case, it’s best to check via Monte Carlo:
nSims = 10000 monteCarlo = runs >>= return . Map.map (/(fromInteger nSims)) . counts where counts rs = foldr (\b -> Map.insertWith (+) b 1) Map.empty rs runs = sequence $ take (fromIntegral nSims) $ repeat (bDays >>= return . maxOnOneDay) bDays = sequence $ take (fromIntegral nPeople) $ repeat $ pickBirthday maxOnOneDay bs = maximum $ Map.elems $ foldr (\b -> Map.insertWith (+) b 1) Map.empty bs pickBirthday = getStdRandom (randomR (1,nPeriods)) main = do ps <- monteCarlo putStrLn $ show ps putStrLn $ show $ atMostTwoShared nPeople nPeriods
Surprisingly the answer for 3 people sharing a birthday in a period where there are 9 people and 11 periods is 43%.
The answer we are after is for 25 people and 52 periods (weeks in the year) and this is 50%.