5 – NLPND POS 04 When Bigrams Wont Work V1

Okay, so now let’s complicate the problem a little bit. We still have Mary, Jane and Will, and let’s say there’s a new member of the gang called Spot. Now our data is formed by the following four sentences: Mary Jane can see Will. Spot will see Mary. Will Jane spot Mary? And Mary will pat Spot. These are tagged in some way, which we’ll show you later. We want to tag the sentence, Jane will spot Will. So let’s try a lookup table. I won’t fill it in, but let’s say we can fill it in based on the data. What’s a potential problem here? Well, when we look at the first pair, Jane will, we notice that it’s not in the data. This pair of words never appears, but somehow it’s a pair that makes sense. So a problem with bigrams, or n-grams in general, is that sometimes a pair or an n-gram we need doesn’t appear in the data, and then we have no way to fill in its entry in the table. So for this, we’ll introduce hidden Markov models.
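To make the gap concrete, here is a minimal sketch that collects every word pair from the four sentences and checks which pairs of the target sentence were never seen. The tags are left out because they have not been shown yet, and the lowercasing, the whitespace tokenization, and the names `bigrams`, `corpus`, `seen` and `missing` are assumptions made for illustration, not part of the lesson.

```python
def bigrams(tokens):
    """Return the adjacent word pairs of a token list, in order."""
    return list(zip(tokens, tokens[1:]))

# The four training sentences, lowercased with punctuation dropped (an assumption).
corpus = [
    "mary jane can see will",
    "spot will see mary",
    "will jane spot mary",
    "mary will pat spot",
]

# Every word pair that occurs somewhere in the data.
seen = set()
for sentence in corpus:
    seen.update(bigrams(sentence.split()))

# The sentence we want to tag.
target = "jane will spot will".split()

# Pairs a bigram lookup table would have no entry for.
missing = [pair for pair in bigrams(target) if pair not in seen]
print(missing)  # [('jane', 'will'), ('will', 'spot')]
```

Under these assumptions, the pair Jane will is missing, as noted above, and so is will spot. Only spot will appears in the data, in the sentence Spot will see Mary.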
