Now, here’s where the word naive in Naive Bayes comes in. We’re going to make a pretty naive assumption here. Let’s look at the probability of two events happening together, P of A and B. We can also read this as P of A intersection B. And we’re going to say that this is the product of P of A and P of B. Now, this only holds when the two events are independent. If they’re not, then it is not true. For example, if A is the event of it being hot outside and B is the event of it being cold outside, then each of them has a positive probability. But what’s the probability of both events happening at the same time? It’s zero, since it can’t be hot and cold at the same time. So the formula doesn’t hold here, because the events of being hot and being cold are dependent on each other. But in Naive Bayes, we will assume that our events are independent. This, as we said, is a false and naive assumption, but in practice it works very well and it makes our algorithm very fast.
Another formula I will use is the formula for conditional probability. P of A intersection B can be written in two ways: as P of A given B times P of B, or as P of B given A times P of A. This is the basis for Bayes’ theorem. But the trick we’ll use here is to forget about P of B. Now the two sides are no longer equal, but we have P of A given B proportional to P of B given A times P of A. This works very well because, in practice, P of B will cancel out, so the fact that these two are proportional is very useful.
And now, here’s what we want. We have an email that contains the words easy and money, and we want to know if it is spam. So we want the probability of the email being spam given that it contains the words easy and money. We’ll start by using the conditional probability rule that we just reviewed to write it as proportional to a product: the probability that the email contains the words easy and money given that it is spam, times the probability of the email being spam.
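Written out in symbols, the rules described so far look roughly like this. This is only a sketch of what is said in words: the labels spam, easy, and money stand for the events of the email being spam and of it containing each word.

```latex
\begin{align*}
P(A \cap B) &= P(A)\,P(B) && \text{(only if $A$ and $B$ are independent)} \\
P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
P(A \mid B) &\propto P(B \mid A)\,P(A) \\
P(\text{spam} \mid \text{easy}, \text{money}) &\propto P(\text{easy}, \text{money} \mid \text{spam})\,P(\text{spam})
\end{align*}
```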
In this formula, A represents being spam and B represents containing the words easy and money. Now, we are ready to use our naive assumption. The first factor is the probability of the email containing the words easy and money given that it is spam. We can write it as the probability of the email containing the word easy given that it is spam, times the probability of the email containing the word money given that it is spam. Again, this is a hugely naive assumption, as these events may be dependent. It could be that containing the word easy actually makes it more likely that the email contains the word money. But that’s okay. In many cases, this assumption won’t affect the results and it will make our calculations much easier. And this is the heart of the Naive Bayes algorithm.
And now, we do the same thing for ham emails. We have both probabilities written as a product of factors. But what are these factors? Well, we’ve calculated them before based on our data. The first one, P of containing the word easy given that it is spam, is one-third, since there are three spam emails and one of them contains the word easy. P of containing money given spam is two-thirds, since there are three spam emails and two of them contain the word money. And P of spam is very simple. It’s three over eight, since there are eight emails and only three of them are spam. We do a similar calculation for ham and get one-fifth, one-fifth, and five over eight. Now we multiply them, and we get that the probability of spam given that the email contains the words easy and money is proportional to one over 12. And for ham, it’s proportional to one over 40.
Now remember that these values are not the actual probabilities; they are proportional to the actual probabilities. So what do we do to get the actual probabilities? Here’s the magic. We know that an email has to be either spam or ham, so these two should add to one.
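The multiplication just described can be checked with a few lines of Python. This is a minimal sketch, not the course’s own code: the names counts and naive_score are made up for illustration, and the ham counts (five ham emails, one containing each word) are inferred from the fractions one-fifth, one-fifth, and five over eight.

```python
from fractions import Fraction

# Spam counts are as stated: 8 emails, 3 spam, 1 with "easy", 2 with "money".
# Ham counts are an assumption inferred from the fractions 1/5, 1/5 and 5/8.
counts = {
    "spam": {"emails": 3, "easy": 1, "money": 2},
    "ham": {"emails": 5, "easy": 1, "money": 1},
}
total_emails = 8


def naive_score(label, words):
    """Unnormalized score: P(label) times the product of P(word | label)."""
    n = counts[label]["emails"]
    score = Fraction(n, total_emails)  # the prior, e.g. 3/8 for spam
    for word in words:
        # Naive assumption: treat each word as independent given the label.
        score *= Fraction(counts[label][word], n)
    return score


spam_score = naive_score("spam", ["easy", "money"])
ham_score = naive_score("ham", ["easy", "money"])
print(spam_score, ham_score)  # prints: 1/12 1/40
```

Exact fractions are used instead of floats so the results match the values one over 12 and one over 40 exactly.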
So we need to normalize them, namely, multiply them both by the same factor so that they are still in the same proportion as one over 12 and one over 40, but they add to one. Let’s try that in a quiz. Can you find two numbers that add to one and that are in the same proportion to each other as one over 12 and one over 40? Enter your answer below.
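The normalization step itself can be sketched as a small Python function. The name normalize and the labels a and b are made up for illustration, and the example uses one-half and one-sixth rather than the quiz values.

```python
from fractions import Fraction


def normalize(scores):
    """Divide every score by the total so the results keep their ratio and add to one."""
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Illustrative values only: 1/2 and 1/6 are in a 3-to-1 ratio.
example = normalize({"a": Fraction(1, 2), "b": Fraction(1, 6)})
print(example["a"], example["b"])  # prints: 3/4 1/4
```

Dividing by the total is the same as multiplying both numbers by one common factor, which is why the proportion between them is preserved.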