So now, here is how the Hidden Markov Model generates a sentence. We start at the start state, and with some probability we walk around the hidden states. Let's say we walk into the N for noun. Once we're at N, we have some options for generating an observation. Say that with some probability we generate the word "Jane", so we record this word. Then we walk around the hidden states some more. Let's say we land on the M for modal, and it generates the word "will". We record this, then walk to the V for verb and generate the word "spot". We record that, then walk to the N for noun and generate the word "Will". Finally, we walk to the end state and generate the end of the sentence, a period. When we reach this, we're done. It is clear that many sentences can be generated this way.

Now, what we'll do is record the probability of the sentence being generated. We start again at the starting state. As we remember, we walked to the N, and the transition probability from the start state to N is three-quarters, so we record this three-quarters. At N we generate the word "Jane". This happens with emission probability two-ninths, so we record that. Next we move to M with transition probability one-third and generate "will" with emission probability three-quarters. Then we move to V with transition probability three-quarters and generate "spot" with emission probability one-quarter. Then we move back to N with transition probability one and generate "Will" with emission probability one-ninth. Finally, we move to the end state with transition probability four-ninths.

Since the transition moves are all independent of each other, and each emission is conditional only on the hidden state we're located at, the probability that this Hidden Markov Model generates the sentence is the product of these probabilities. Multiplying them, we get that the total probability of the sentence being emitted is 0.0003858.
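The arithmetic above can be checked directly. Here is a minimal sketch in Python, multiplying exactly the transition and emission probabilities quoted in the walkthrough (using exact fractions to avoid rounding along the way):

```python
from fractions import Fraction as F

# Probabilities recorded along the path start -> N -> M -> V -> N -> end,
# as quoted in the walkthrough above.
transitions = [F(3, 4),   # start -> N
               F(1, 3),   # N -> M
               F(3, 4),   # M -> V
               F(1, 1),   # V -> N
               F(4, 9)]   # N -> end

emissions = [F(2, 9),     # N emits "Jane"
             F(3, 4),     # M emits "will"
             F(1, 4),     # V emits "spot"
             F(1, 9)]     # N emits "Will"

prob = F(1)
for p in transitions + emissions:
    prob *= p

print(prob, "=", float(prob))  # 1/2592, which is approximately 0.0003858
```

The exact value comes out to 1/2592, matching the rounded 0.0003858 in the lecture.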
It looks small, but it's actually large considering the huge number of sentences of many lengths that we could generate. So let's recap this path for "Jane will spot Will". We have the parts of speech noun, modal, verb, noun, and the probabilities, and their product is 0.0003858. Let's see if another path can generate this sentence. What about the path noun, noun, noun, noun? We hope this is smaller, since it's a bit of a nonsensical pattern of parts of speech. And indeed, it is very small: it is the product of these numbers, which is 0.0000002788. So, between these two, we pick the one on top, because it generates the sentence with a higher probability. And in general, what we'll do is, from all the possible combinations of parts of speech, pick the one that generates the sentence "Jane will spot Will" with the highest probability. This is called maximum likelihood, and it's at the core of many algorithms in artificial intelligence. So that's simple: all we have to do is go over all the possible chains of parts of speech that could generate the sentence. So, let's have a small quiz. How many of these chains are there?
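The brute-force search just described can be sketched in a few lines. One caveat: the walkthrough only quotes some of the model's probabilities, so the tables below contain just those values, and any transition or emission not listed is treated as zero. That is a simplification of the lecture's full model (where, for example, the noun-noun-noun-noun path still gets a tiny nonzero probability), but it is enough to show the maximum-likelihood search over all chains of tags:

```python
from fractions import Fraction as F
from itertools import product

tags = ["N", "M", "V"]

# Only the probabilities quoted in the walkthrough; anything missing
# is assumed to be 0 for this illustration.
trans = {("<s>", "N"): F(3, 4),
         ("N", "M"): F(1, 3),
         ("M", "V"): F(3, 4),
         ("V", "N"): F(1, 1),
         ("N", "</s>"): F(4, 9)}

emit = {("N", "Jane"): F(2, 9),
        ("M", "will"): F(3, 4),
        ("V", "spot"): F(1, 4),
        ("N", "Will"): F(1, 9)}

def path_probability(words, path):
    """Probability that the HMM follows `path` and emits `words`."""
    states = ["<s>"] + list(path) + ["</s>"]
    prob = F(1)
    for prev, cur in zip(states, states[1:]):
        prob *= trans.get((prev, cur), F(0))
    for tag, word in zip(path, words):
        prob *= emit.get((tag, word), F(0))
    return prob

def best_path(words):
    """Try every chain of tags and keep the most likely one."""
    return max(product(tags, repeat=len(words)),
               key=lambda path: path_probability(words, path))

words = ["Jane", "will", "spot", "Will"]
best = best_path(words)
print(best, float(path_probability(words, best)))
# ('N', 'M', 'V', 'N') with probability 1/2592, approximately 0.0003858
```

This exhaustive enumeration grows exponentially with sentence length, which is exactly why the quiz at the end matters: counting the chains shows how quickly brute force becomes impractical.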