Bayes’ Theorem can get a little more complex. Let’s take a look at a small example, and what we’ll do here is mess a bit with the prior probability. So again, we have Alex and Brenda in the office, and we saw someone pass by quickly and we don’t know who the person is. Let’s say we look more carefully at their schedules and we realize that Alex actually works from the office most of the time. He comes by three days a week. Brenda travels a lot for work, so she actually comes to the office only one day a week. So initially, without knowing anything about the red sweater, all we know is that it’s three times as likely to see Alex as to see Brenda. Therefore our prior probabilities are 0.75 for Alex and 0.25 for Brenda.

And let’s say that we have this happening throughout all the weeks, but now we use our extra knowledge, which is that the person we saw had a red sweater. The rule is the same as before: Alex wears red twice a week and Brenda wears red three times a week. So, intuitively, we would think that the real probabilities are not exactly 0.75 and 0.25, because Brenda wears a red sweater more often than Alex, so they should be a little closer to each other. Let’s calculate them. We’ll do the following: let’s think of the columns as weeks instead, three columns for Alex and one for Brenda. Now, for each five-day work week, Alex wears red twice and Brenda three times, so we color the days they wore red. Since we know the person wore red, we forget about the times that they didn’t. That leaves nine times someone wore red. Six of them are Alex, two for each of his three weeks, and three of them are Brenda. Therefore, among the nine times we saw someone wearing red, two-thirds of the time it was Alex and one-third of the time it was Brenda. Thus, our posterior probabilities are two-thirds, or 0.67, for Alex and one-third, or 0.33, for Brenda. So it looks like we did a little bit of magic. Let’s do this again in a more mathematical way.
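If it helps to see the counting argument written down, here is a minimal sketch in Python. The variable names are made up for illustration, and it assumes the layout described above: three week-columns for Alex, one for Brenda, with each person wearing red on a fixed number of days per week.

```python
from fractions import Fraction

# Assumed layout: one column per office day, each column treated as a five-day week.
weeks_present = {"Alex": 3, "Brenda": 1}      # columns per person (3:1 prior)
red_days_per_week = {"Alex": 2, "Brenda": 3}  # red days in each column

# Count only the colored (red) cells for each person.
red_counts = {
    name: weeks_present[name] * red_days_per_week[name]
    for name in weeks_present
}
total_red = sum(red_counts.values())

# Share of the red cells that belongs to each person.
posterior_by_counting = {
    name: Fraction(count, total_red) for name, count in red_counts.items()
}

print(red_counts)             # {'Alex': 6, 'Brenda': 3}
print(posterior_by_counting)  # Alex: 2/3, Brenda: 1/3
```

Using exact fractions here keeps the two-thirds and one-third visible instead of rounding them to 0.67 and 0.33.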
We saw a person, and initially all we know is that it’s Alex with a 75% probability and Brenda with a 25% probability, since Alex comes to the office three times a week and Brenda once a week. But now new information comes to light, which is that the person is wearing a red sweater, and the data says that Alex wears red two times a week. So now we look at Alex. What is the probability that he’s wearing red? Since a work week has five days, the probability of him wearing red is two-fifths, or 0.4, and the probability of him not wearing red is the complement, so 0.6. Same thing with Brenda: since she wears red three times a week, the probability of her wearing red today is 0.6 and the probability of her not wearing red is 0.4.

Now, by the formula of conditional probability, the probability that two events both happen is the product of the two probabilities, P of Alex times P of red given Alex. Therefore, the probability that the person we saw is Alex and that he’s wearing red is precisely 0.75 times 0.4. We multiply them and put the result here. We calculate the other probabilities in the same way. The probability that the person we saw is Alex and that he’s not wearing red is 0.75 times 0.6. The probability that the person we saw is Brenda and that she’s wearing red is again the product of these probabilities, which is 0.25 times 0.6. And finally, the probability that the person we saw is Brenda and that she’s not wearing red is 0.25 times 0.4. And now here’s where the Bayesian magic happens. Are you ready? We have four possible scenarios, and you can check that these four probabilities add to one. But we know one thing: the person we saw was wearing red. Therefore, out of these four scenarios, only two are plausible, the two where the person is wearing red. So we forget about the other two. Now, since our new universe consists of only these two scenarios, their probabilities should be higher, but their ratio with respect to each other should still be the same.
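To make the four scenarios concrete, here is a small Python sketch of the products just described. The dictionary names are only illustrative; the numbers are the priors and red-sweater probabilities from the example.

```python
import math

prior = {"Alex": 0.75, "Brenda": 0.25}       # P(person)
p_red_given = {"Alex": 0.4, "Brenda": 0.6}   # P(red | person)

# Joint probability of each (person, sweater) scenario:
# P(person and red) = P(person) * P(red | person), and likewise for not red.
joint = {}
for person in prior:
    joint[(person, "red")] = prior[person] * p_red_given[person]
    joint[(person, "not red")] = prior[person] * (1 - p_red_given[person])

for scenario, p in joint.items():
    print(scenario, round(p, 2))

# The four scenarios cover every possibility, so they add to one.
assert math.isclose(sum(joint.values()), 1.0)

# Knowing the person wore red, keep only the two "red" scenarios.
red_only = {person: joint[(person, "red")] for person in prior}
```

The two surviving values, roughly 0.3 for Alex and 0.15 for Brenda, no longer add to one, but they still sit in a two-to-one ratio.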
This means we need to normalize them or, equivalently, divide them by something so that they now add to one. The thing we should divide them by is the sum of the two. So our new probability of the person being Alex is the top one, namely 0.75 times 0.4, divided by the sum of the two, namely 0.75 times 0.4 plus 0.25 times 0.6. That is 0.3 divided by 0.45, which is precisely two-thirds, or 0.67. And now we can see that the complement is the probability that the person is Brenda, which is one-third, or 0.33. If we take Brenda’s probability and divide it by the sum of both probabilities, we can see that we get one-third, as desired. And that’s it, that is Bayes’ Theorem at its full potential.
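The normalization step can be sketched as a short Python function. The name `posterior` and its arguments are hypothetical, chosen just for this illustration; it assumes the hypotheses (here Alex and Brenda) are the only possibilities.

```python
def posterior(prior, likelihood):
    """Return P(hypothesis | evidence) for each hypothesis.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    # Unnormalized values: P(hypothesis) * P(evidence | hypothesis)
    joint = {h: prior[h] * likelihood[h] for h in prior}
    # Divide by their sum so the results add to one.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


result = posterior(
    prior={"Alex": 0.75, "Brenda": 0.25},
    likelihood={"Alex": 0.4, "Brenda": 0.6},
)
print({h: round(p, 2) for h, p in result.items()})  # {'Alex': 0.67, 'Brenda': 0.33}
```

Writing it this way makes it easy to try other priors or other sweater habits and watch how the posterior moves.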