9 – Segmentally Boosted HMMs

In your past work on gesture recognition, how many dimensions have you used for your output probabilities? >> Up to hundreds. At one point, we were creating appearance models of the hand, using as features for the HMM a similarity metric of how closely the current hand image matched different visual models of the hand. …

8 – State Tying

Another trick is State Tying. >> You mean combining training across states when the states within the models are close? >> Yup. Let’s look at our models for I and we again. In the case where we are recognizing isolated signs, the initial movement of the right hand going to the chest is very similar …
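As a rough sketch of the idea (with made-up delta-y samples and hypothetical state labels), tying two states means they share one output distribution, so their training data is pooled:

```python
# Minimal sketch of state tying: two HMM states that model the same
# motion (e.g. the hand moving to the chest in both I and WE) share one
# Gaussian, so their training samples are pooled. All values are made up.

samples_I_state0 = [0.9, 1.1, 1.0]    # delta-y frames assigned to I's first state
samples_WE_state0 = [1.0, 1.2, 0.8]   # delta-y frames assigned to WE's first state

def fit_gaussian(samples):
    """Fit mean and variance of a 1-D Gaussian to the samples."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, var

# Tied: one Gaussian trained on the union of both states' samples.
tied = fit_gaussian(samples_I_state0 + samples_WE_state0)

# Both models now point at the same shared parameters.
output_params = {("I", 0): tied, ("WE", 0): tied}
```

With six pooled samples instead of three per state, the shared Gaussian is estimated from twice as much data.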

7 – Statistical Grammar

Statistical grammars can help us even more. In our example up to this point, we used a simple grammar: a pronoun, a verb, and a noun. And that placed a strong limit on where we started and ended in our Viterbi trellis. >> But in real life, language is not so well segmented. Instead, we can record …
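The move from a fixed pronoun-verb-noun grammar to one estimated from recordings can be sketched as a bigram model; the phrase corpus below is hypothetical:

```python
from collections import Counter

# Sketch: estimate a bigram grammar from recorded phrases instead of
# hard-coding a pronoun-verb-noun structure. The corpus is made up.
phrases = [
    ["I", "NEED", "CAT"],
    ["WE", "NEED", "CAT"],
    ["I", "NEED", "DOG"],
]

bigrams = Counter()
unigrams = Counter()
for phrase in phrases:
    words = ["<s>"] + phrase          # sentence-start token
    for prev, cur in zip(words, words[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def p(cur, prev):
    """Maximum-likelihood estimate of P(cur | prev)."""
    return bigrams[(prev, cur)] / unigrams[prev]
```

These conditional probabilities can then weight transitions between word models in the trellis instead of allowing only one fixed sentence pattern.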

6 – Context Training

OK. Now let’s talk about another trick. When we moved from recognizing isolated signs to recognizing phrases of signs, the combination of movements looks very different. >> For example, when Thad signed NEED in isolation, his hands started from a rest position and finished in the rest position. When he signs NEED in the context …

5 – Stochastic Beam Search

Stochastic beam search, did we see that before? >> Not in detail. So far we’ve been doing something like a breadth-first search, expanding each possible path in each time step. But now we want to prune some of those paths. >> Well, some of those paths are going to get a low probability pretty …
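The pruning step might be sketched like this, with made-up log-probabilities and an illustrative beam width:

```python
# Sketch of beam pruning in a Viterbi-style trellis: after each time
# step, drop any path whose log-probability falls more than `beam`
# below the best path's. Numbers are illustrative.

def prune(paths, beam=5.0):
    """paths: dict mapping state -> log-probability of best path ending there."""
    best = max(paths.values())
    return {s: lp for s, lp in paths.items() if lp >= best - beam}

paths = {"s0": -2.0, "s1": -3.5, "s2": -9.0}
survivors = prune(paths)   # "s2" is more than 5 below the best, so it is dropped
```

Only the surviving states are expanded at the next time step, which keeps the search tractable as the vocabulary grows.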

4 – Phrase Level Recognition

Now that we have topologies for our six signs, let’s talk about phrase level sign language recognition. We have eight phrases we want to recognize. >> Actually, you mean 7 signs and 12 phrases. >> Since we have two variants of cat we are recognizing, expanding all the possibilities leads to 12 phrases. >> Good …
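One way the variant expansion could be sketched (the templates and variant names below are hypothetical, not the lecture’s actual vocabulary) is:

```python
from itertools import product

# Sketch: if one sign has two variants, each phrase template containing
# that sign expands into one phrase per variant. Vocabulary is made up.
variants = {"CAT": ["CAT-1", "CAT-2"]}

templates = [
    ["I", "NEED", "CAT"], ["WE", "NEED", "CAT"],
    ["I", "WANT", "CAT"], ["WE", "WANT", "CAT"],
    ["I", "NEED", "DOG"], ["WE", "NEED", "DOG"],
    ["I", "WANT", "DOG"], ["WE", "WANT", "DOG"],
]

def expand(template):
    """Replace each sign by its list of variants and take all combinations."""
    options = [variants.get(sign, [sign]) for sign in template]
    return [list(p) for p in product(*options)]

phrases = [p for t in templates for p in expand(t)]
```

With this made-up vocabulary, the 4 templates containing CAT each expand to 2 phrases, so 8 templates become 12 phrases over 7 distinct signs, matching the counting in the dialogue.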

3 – HMM Topologies

Next, let’s talk about increasing the size of our vocabulary. >> Okay, I’ve selected some signs we can use to start making phrases. But we’re going to have to choose topologies for each of them. >> Well, we’ve already chosen topologies for I and we. What sign is next on your list? >> Well, let’s …

2 – Using a Mixture of Gaussians

What if our output probabilities aren’t Gaussian? >> Well according to the central limit theorem, we should get Gaussians if enough factors are affecting the data. >> But in practice sometimes the output probabilities really are not Gaussian. It is not hard for them to be bimodal. >> You mean like this. >> Yep. >> …
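A minimal sketch of a mixture-of-Gaussians output density shows how two weighted components capture a bimodal distribution that a single Gaussian cannot (the parameters are illustrative):

```python
import math

# Sketch: a two-component mixture of Gaussians as an output probability.
# Component weights, means, and variances are illustrative.

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_pdf(x, components):
    """components: list of (weight, mean, var) tuples; weights sum to 1."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)

# Two equal-weight components centered at -2 and +2: a bimodal density
# with peaks near the two means and a dip between them.
bimodal = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
```

The density at either mode exceeds the density at the midpoint, which is exactly the shape a single Gaussian cannot represent.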

10 – Using HMMs to Generate Data

One last thing I’d like to cover is why the normal HMM formulation is not good for generating data. In the early days of speech recognition, there was the hope that we could use the same HMMs we use to recognize speech, to also generate it. It turned out not to be such a good …
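One commonly cited reason (not necessarily the one the lecture goes on to give) can be sketched: a standard HMM state’s dwell time follows a geometric distribution, so the single most likely duration is one frame, which makes naively sampled output unnatural even when recognition works well. The self-loop probability below is illustrative:

```python
import random

# Sketch: sampling dwell times from an HMM state with self-transition
# probability a. The duration distribution is geometric: mean 1/(1 - a),
# but the single most likely duration is always one frame.

def sample_duration(self_loop_prob, rng):
    """Stay in the state while the self-transition fires; count the frames."""
    frames = 1
    while rng.random() < self_loop_prob:
        frames += 1
    return frames

rng = random.Random(0)                                 # seeded for reproducibility
durations = [sample_duration(0.8, rng) for _ in range(10000)]
mean_duration = sum(durations) / len(durations)        # expected 1 / (1 - 0.8) = 5
```

Even though the mean dwell time is five frames, one-frame visits are the most frequent outcome, so sampled state sequences jitter in a way real speech and gesture do not.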

1 – Multidimensional Output Probabilities

Now that we’ve shown how HMMs work, let’s provide some more tips on how to improve them. >> Okay, in our example of using HMMs to distinguish between the signs I versus we, we used delta y as a feature. But in reality delta x would be a better feature. >> That’s true, but for …

9 – I vs We Quiz Solution

Here’s the answer. It could be the probability distributions in the middle states, as well as the likely time spent in the middle states.

8 – I vs We Quiz

What property of the observed sequences of delta_ys can help tell the difference between the two gestures? Probability distributions with respect to starting states, probability distributions in middle states, likely time spent in middle states, or none of the above? Select all answers that could apply.

7 – HMM: We

Great, now here’s the HMM I created for the gesture we. >> Hold on. I would have used four states here. Why did you only use three? >> Well, it was mostly to simplify the problem for our purposes. Note that the middle section varies a little bit more in delta y than with …

6 – HMM: I

Okay, I’ve made an HMM for the sign language word I. >> Great, how did you pick those states? >> Well, the gesture seemed like it had three separate motions. So I made each of those their own state and chose the transition probabilities based on the timing. >> We can take a look at the …
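Choosing transition probabilities from timing can be sketched with the geometric-duration relation: a state expected to last d frames gets self-loop probability 1 − 1/d. The 5-frame duration below is illustrative, not the lecture’s number:

```python
# Sketch: in an HMM, a state with self-transition probability a has
# geometric dwell time with mean 1 / (1 - a). Inverting that relation
# gives a self-loop probability matching an expected duration.

def self_transition_for(expected_frames):
    """Self-loop probability for a state meant to last `expected_frames` frames."""
    return 1.0 - 1.0 / expected_frames

a = self_transition_for(5.0)   # a motion expected to last ~5 frames
```

A motion that tends to last five frames thus gets a self-loop probability of 0.8, and the remaining 0.2 goes to the outgoing transition.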

4 – Delta-y Quiz

Here are several plots of y versus t. Given these plots, match each of the y versus t plots with their derivative plots, delta y versus t.

31 – Baum-Welch

So what’s next? >> A process called Baum-Welch re-estimation. >> That’s like expectation-maximization again, right? >> Correct. >> But how does it differ from what we just did? >> It’s very similar, but with Baum-Welch, every sample of the data contributes to every state proportionally to the probability of that frame of data …
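The soft-assignment idea can be sketched as a single re-estimation of two states’ means. The frame values and gamma weights below are made up; in the real algorithm the gammas come from the forward-backward procedure:

```python
# Sketch of the soft assignment behind Baum-Welch: instead of assigning
# each frame to exactly one state, every frame updates every state's
# mean, weighted by gamma = P(state | frame). Values here are made up.

frames = [0.1, 0.2, 1.9, 2.1]
gammas = [            # gammas[t][s] = P(state s | frame t)
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.05, 0.95],
]

def reestimate_means(frames, gammas, n_states=2):
    """Weighted-average update of each state's output mean."""
    means = []
    for s in range(n_states):
        num = sum(g[s] * x for g, x in zip(gammas, frames))
        den = sum(g[s] for g in gammas)
        means.append(num / den)
    return means

means = reestimate_means(frames, gammas)
```

State 0’s mean lands near the low frames and state 1’s near the high frames, yet every frame contributed a little to both, unlike the hard assignment used before.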

30 – HMM Training

When we started this lesson, we created our models by inspection; however, most of the time we want to train using the data itself. When using HMMs for gesture recognition, I like to have at least 12 examples for each gesture I’m trying to recognize, five examples at a minimum. >> For illustration purposes, let’s …

3 – Sign Language Recognition

We will use sign language recognition as our first application of HMMs. For example, let’s consider the signs I and we and create HMMs for each of them. Here’s I. We is a little different. Let’s focus on the I gesture. We’ll use delta y as our first feature here. >> Wait a …
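Computing delta y from a sequence of y positions is just a frame-to-frame difference, a discrete derivative of the hand’s height over time, sketched here with made-up positions:

```python
# Sketch: delta y is the frame-to-frame difference of the hand's y
# position, i.e. a discrete derivative. Position values are made up.

def delta(ys):
    """Frame-to-frame differences: delta[t] = y[t+1] - y[t]."""
    return [b - a for a, b in zip(ys, ys[1:])]

rising = delta([0, 1, 2, 3])   # steady upward motion -> constant positive delta y
flat = delta([2, 2, 2, 2])     # no motion -> delta y of zero
```

Using the derivative rather than the raw position makes the feature independent of where the gesture happens in the frame.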

29 – New Observation Sequence for _We_ Solution

Here’s the resulting probability for We: 2.91 × 10⁻⁵. Note that this answer is higher than what we got for the model of I, indicating that this observation sequence probably came from a We gesture. This is a different result from what we saw previously, showing how the additional time spent in …
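The comparison amounts to picking the model with the higher score. The We value is the one given here; the I score below is a hypothetical smaller value standing in for the earlier result:

```python
# Sketch: recognition scores the observation sequence under each model
# and classifies it as the higher-scoring one. The WE score is from the
# solution above; the I score is a hypothetical smaller stand-in.

scores = {"I": 1.0e-5, "WE": 2.91e-5}
best = max(scores, key=scores.get)   # model with the larger probability wins
```

In practice these products of small probabilities are usually compared in the log domain to avoid numerical underflow on longer sequences.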