So, let’s move on to probability distributions. Let’s say we have a coin, and we toss it twice, and we get one heads and one tails. What would we think about this coin? Well, it could be a fair coin, right? It could also be slightly biased towards either heads or tails. We don’t have enough data to be sure. So, let’s say our verdict is that we think it’s fair, but without much confidence. Now, let’s think of the probability p that this coin lands on heads. With a little bit of confidence, p is one half, but it could be a lot of other values. So, the probability distribution for p is something like this graph: higher at one half, fairly spread out over the entire interval, and very low at the endpoints zero and one.
Now, let’s say we toss the coin 20 times and we get 10 heads and 10 tails. Now we have a bit more confidence that the coin is fair. So, the distribution for p is something more like this graph, with a higher peak at 0.5.
Now, let’s say we toss a coin four times, and we get heads three times and tails once. The probability distribution for p is now centered around 0.75, since we got three heads out of four tries, but again without much confidence. So, it’s a graph like this. But if we toss it 400 times, and it lands on heads 300 times, then we’re much more confident that the value of p is close to 0.75. Therefore, the probability distribution for p is something like this, with a huge peak at 0.75, and almost flat everywhere else.
This is called a Beta distribution, and it works for any positive values of A and B. If we get heads A times and tails B times, the graph looks like this, centered around A over A plus B. The formula is this: Gamma of A plus B, divided by Gamma of A times Gamma of B, times X to the A minus one, times the quantity one minus X, raised to the B minus one. If you haven’t seen the Gamma function, think of it as a continuous version of the factorial function. In this case, if A is a positive integer, then Gamma of A is A minus one factorial.
For example, Gamma of five is four factorial, which is 24, and Gamma of six is five factorial, which is 120. But the interesting thing about the Gamma function is that it also takes values in between. So, we can define things like Gamma of 5.5, and it will be something between 24 and 120. The actual formula for the Gamma function is in the instructor comments, although it’s not something we’ll use in this class.
So, what happens if we have values that are not integers, like, say, 0.1 for heads and 0.4 for tails? This makes no sense as a count, since we can’t get 0.1 heads and 0.4 tails. But that’s okay for the Beta distribution. We just need to plug them into the same formula using the Gamma function. For values smaller than one, like 0.1 and 0.4, we get this graph. That means p is much more likely to be either close to zero or close to one than to be in the middle. This makes a bit of sense, right? If p is close to zero or to one, then we’re likely to get either zero heads or zero tails, which at least gets us close to one of the values 0.1 or 0.4.
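If you’d like to check these shapes yourself, here is a minimal Python sketch. It is not from the lecture. The names `beta_pdf` and `grid_peak` are made up for illustration, and the density is written with X to the A minus one times one minus X to the B minus one. It goes through the log of the Gamma function (`math.lgamma`) only as a numerical convenience, since Gamma of 400 is too large to store as a float.

```python
import math


def beta_pdf(x, a, b):
    """Beta density at x, for 0 < x < 1 and shape parameters a, b > 0.

    Computes Gamma(a+b) / (Gamma(a) * Gamma(b)) * x^(a-1) * (1-x)^(b-1)
    in log space so that large a and b do not overflow.
    """
    log_coeff = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_density = log_coeff + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
    return math.exp(log_density)


def grid_peak(a, b, steps=1000):
    """Illustrative helper: the grid point in (0, 1) where the density is largest."""
    xs = [i / steps for i in range(1, steps)]
    return max(xs, key=lambda x: beta_pdf(x, a, b))


if __name__ == "__main__":
    # Gamma as a continuous factorial: Gamma(5) = 4!, Gamma(6) = 5!.
    print(math.gamma(5), math.gamma(5.5), math.gamma(6))

    # More tosses with the same proportions give a taller, narrower peak.
    print(beta_pdf(0.5, 10, 10), grid_peak(10, 10))
    print(beta_pdf(0.75, 300, 100), grid_peak(300, 100))

    # Parameters below one: the density is larger near 0 and 1 than in the middle.
    for x in (0.01, 0.5, 0.99):
        print(x, beta_pdf(x, 0.1, 0.4))
```

Running it should show the peak for 300 heads and 100 tails sitting very close to 0.75, and the density for 0.1 and 0.4 being larger near the two ends than at one half.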