The Birthday Problem: A Piece of Mathematical Entertainment

THE BIRTHDAY PROBLEM: A PIECE OF MATHEMATICAL ENTERTAINMENT

So here I am, ready to talk about MATHS! *dodges*. Yes, that's right, even though I'm not a real master of the topic (the real one would probably be Carrie), I'm a sciencey person, and I love mathematics. I'm going to discuss a so-called paradox (it's not, really) that will lead us to a crazy, mythical journey to the lands of Combinatorics, Set Theory, and Probability. Don't worry, it won't be difficult (if it was, it wouldn't be me telling you about it). So here we go!

1. What is this about?

You're in a room with a lot of people. Let's say 50. What do you think are the odds of two people sharing a birthday? Oh, easy peasy, you'll say. There are 365 days in a year, more or less, and 50 people here. So I would bet my butt the answer is "not very likely". Well, oh boy, you'd be absolutely wrong. The answer is actually a whopping 97% chance that would happen.

If your reaction is absolute surprise, yay! You'll probably keep reading. If it's just a shrug, you're everything that's wrong with the world. Either that or you knew already. In that case, good for you! Maybe this proof will be fun anyway.

FUN!

2. Functions: those thingies

Before we do the proof, it'd be good to introduce the big concept that I'm going to use here, since it'd be messy to be constantly stopping the flow of the demonstration to say "oh, by the way, this is that". So, functions:

Let's say we have two sets of dots, A and B.

Look at my mad MS Paint skillz

Also, let's call the elements of A = {a1, a2, a3, a4, a5}, and the elements of B = {b1, b2, b3, b4}. We say that b (an element of B) is the image of a (an element of A) if there's a correspondence between a and b, and we'll represent it with an arrow that goes from a to b. We'll write that as f(a) = b. For example, here's the representation of f(a3) = b2.

Boom, maths.

A function is, basically, a possible distribution of those arrows between both sets, but with one condition: every "origin dot" (in this case, an element of A) has one and only one arrow coming out of it. I could say that more sciencely but who cares. For example, this is a function:

Yes, there are two arrows that end up in b1, but that's a "destiny dot", so it's okay.

This is not a function:

You should be ashamed.

This is most definitely not a function:

It's an axolotl.

There are also different types of functions. We only care about one: injective functions. Injective functions are those where no "destiny dot" can be reached by more than one arrow. As you can see, the function I represented earlier (you know, the one that was correct and wasn't an axolotl) wasn't injective: b1 had arrows coming from both a2 and a5. Since the number of elements of A is bigger that the number of elements of B, you can see we actually can't construct an injective function, since every element of A has to have an image. So let's get the number of elements of A down to three. Remember that though we have to make sure every element of A has one arrow, not every element of B has to receive one (that would be a surjective function). Here's an injective function:

Look at those glorious arrows.

That's pretty much all the "complicated" mathy stuff we need to know, since I'd say the combinatorics part is pretty intuitive.

3. What does this have to do with anything?

Well, we can turn our problem into a function:

There are n people in the room. I know using actual numbers is easier, but letters are better because they are generic. We can make n a number when we're done, don't worry. Also there are 365 days in a year (yes, okay, 366, but that extra day happens once every 1500 days, so shut up). Let's call the set of days in a year Y, and the set of people N. We can build a function F: N -> Y, that is, one that establishes a correspondence between the people and the days, so that the arrows come out of the people and end in the days. We know the cardinal (the number of elements) of Y is 365, and the cardinal of N is n. N is probably smaller than 365, otherwise this would all be trivial, but that doesn't matter a lot.

Relevant.

What are we trying to find here? Let's recall: we want to see if between those n people, there are at least two with a shared birthday. Function-wise, this means: how many functions can we construct here that aren't injective (that is, there is a day of the year receiving two arrows or more)?

First of all, let's see how many functions are there, injective or not. Since each of the n people has 365 possibilities for their birthday, and every birthday is independent from the others (that is, n1 having a birthday on y1 doesn't affect the birthdays of anyone else) it's not difficult to see that the number of possible functions is 365*365*365...*365, n times. That is...

...a lot.

Every possible drawing of arrows that represents a function is included there: if it's not, it isn't a function (most likely it's an axolotl). This number will serve as our "possible cases" component when we calculate the probability.

Now we need the number of not-injective functions. That is tricky. But we have a way out! We can easily calculate how many injective functions there are. That is, the cases where every person out of the n people has a different birthday.

This is pretty intuitive too (hopefully). The first person has 365 possibilities, the second one 364 (they can't share a birthday with the first person, but every other day is fair game), and so on... so the n-th person has 365-n+1 possible birthdays. As before, the total number we get is obtained multiplying all of this: 365*364*363*...*(365-n+1). The final result isn't as pretty, but we can fancy it up using factorials (the factorial of an integer is the result of multiplying every integer smaller or equal, down to one, and it's written q!, q being the integer. For example, 3! is 3*2*1 = 6. You get the idea). The result this time turns out to be...

So beautiful.

As you probably know, the probability of something happening is the number of favorable cases, when it happens, divided by the number of all cases possible. And it turns out we have both those things! We must divide the second number we obtained (those cases where every person has a different birthday) by the first (every possible case, ever). So this is the probability of that:

Seriously having a bit of Stendhal syndrome right now.

But wait! This isn't what we were trying to do! We wanted to know when there were people with shared birthdays, you despicable goon! This is the complete opposite of that! Right, I know. That's easily solvable.

I hate you so much.

The probability of any case whatsoever is favorable cases (any case whatsoever) divided by possible cases (any case whatsoever), that is, 1. Since the cases we need are the ones that aren't counted on our formula, we can just subtract the formula from the probability of any case, and get exactly what we want. I'll give you a couple of seconds to swallow all that.

4. The anti-climax

So now we can change n to any number that we want. What are the odds that two or more people will share a birthday out of a set of 10 people? 20 people? 50 people? Just put that number instead of n and you're set.

Or you would be if those numbers weren't so big you probably don't have a calculator powerful enough to handle it. So you'll have to do it step by step, don't just put 365! in your calculator because it will explode. But believe me, it works.

So you're going to take the leap of faith whether you read all of that or not.

Banner image attributed to Frgonzg