9 – Segmentally Boosted HMMs

In your past work on gesture recognition, how many dimensions have you used for your output probabilities? >> Up to hundreds. At one point, we were creating appearance models of the hand, using a similarity metric of how closely the current hand resembled different visual models of the hand as features for the HMM. …

8 – State Tying

Another trick is State Tying. >> You mean combining the training for states when the states within the models are close? >> Yup. Let’s look at our models for I and we again. In the case where we are recognizing isolated signs, the initial movement of the right hand going to the chest is very similar …
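A minimal sketch of the idea, with made-up sign names and feature values: tied states point at the same entry in a shared parameter pool, so frames aligned to either model's tied state are pooled when re-estimating that state's Gaussian.

```python
import numpy as np

pool = {}  # shared Gaussian parameters, keyed by tie-name

def tie(name, frames):
    """Accumulate frames for a (possibly shared) state and re-estimate."""
    data = pool.setdefault(name, [])
    data.extend(frames)
    arr = np.array(data)
    return arr.mean(), arr.var()

# Both I and WE start with the right hand moving to the chest,
# so the initial states of both models are tied to one pool entry.
model_I  = ["to_chest", "I_hold"]    # state -> tie-name (hypothetical)
model_WE = ["to_chest", "WE_sweep"]

mean, var = tie("to_chest", [0.9, 1.1])  # frames aligned to I's first state
mean, var = tie("to_chest", [1.0, 1.2])  # frames aligned to WE's first state
print(len(pool["to_chest"]))             # 4 pooled frames -> sturdier estimate
```

The payoff is exactly the pooling: each tied state is estimated from both signs' data instead of splitting the training set in half.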

7 – Statistical Grammar

Statistical grammars can help us even more. In our example up to this point, we used a simple grammar: a pronoun, verb, and noun. And that placed a strong limit on where we started and ended in our Viterbi trellis. >> But in real life, language is not so well segmented. Instead, we can record …
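One common statistical grammar is a bigram model estimated from recorded sequences. A hedged sketch with hypothetical phrases (the sign names here are illustrative, not the lesson's data):

```python
from collections import Counter

# Hypothetical training phrases for a bigram sign grammar.
phrases = [["I", "NEED", "CAT"], ["WE", "NEED", "CAT"], ["I", "NEED", "FOOD"]]

bigrams, unigrams = Counter(), Counter()
for p in phrases:
    seq = ["<s>"] + p + ["</s>"]        # sentence start/end markers
    for a, b in zip(seq, seq[1:]):
        bigrams[(a, b)] += 1
        unigrams[a] += 1

def P(b, a):
    """Maximum-likelihood P(next sign = b | previous sign = a)."""
    return bigrams[(a, b)] / unigrams[a]

print(P("NEED", "I"))    # 1.0: in this toy data, I is always followed by NEED
print(P("CAT", "NEED"))  # 2/3
```

These transition probabilities then weight the jumps between word models in the trellis, rather than hard-coding a single pronoun-verb-noun path.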

6 – Context Training

OK. Now let’s talk about another trick. When we moved from recognizing isolated signs to recognizing phrases of signs, the combinations of movements looked very different. >> For example, when Thad signed NEED in isolation, his hands started from a rest position and finished in the rest position. When he signs NEED in the context …
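One way to realize context training is to keep separate models for a sign in different contexts and back off to the context-independent model when a context was never seen in training. A minimal sketch, with a hypothetical sign set:

```python
# Context-dependent models keyed by (left context, sign, right context);
# (None, sign, None) is the context-independent fallback. The model
# "contents" are placeholder strings standing in for trained HMMs.
models = {
    ("I", "NEED", "CAT"): "NEED trained between I and CAT",
    (None, "NEED", None): "NEED in isolation (rest -> rest)",
}

def lookup(left, sign, right):
    """Prefer the context-dependent model; back off if the context is unseen."""
    return models.get((left, sign, right), models[(None, sign, None)])

print(lookup("I", "NEED", "CAT"))    # uses the context-dependent model
print(lookup("WE", "NEED", "FOOD"))  # backs off to the isolated model
```

The backoff matters because the number of possible contexts grows quickly, and many of them never occur in the training data.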

5 – Stochastic Beam Search

Stochastic beam search, did we see that before? >> Not in detail. So far we’ve been doing something like a breadth-first search, expanding each possible path in each time step. But now we want to prune some of those paths. >> Well, some of those paths are going to get a low probability pretty …
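A minimal sketch of the pruning idea inside a Viterbi-style search, on a toy three-state model with made-up probabilities. This shows plain (deterministic) beam pruning, dropping any path more than a fixed factor below the current best; the stochastic variant would instead drop weak paths probabilistically.

```python
import numpy as np

trans = np.array([[0.6, 0.4, 0.0],       # toy left-to-right transitions
                  [0.0, 0.7, 0.3],
                  [0.0, 0.0, 1.0]])
obs_like = np.array([[0.9, 0.1, 0.1],    # per-timestep state likelihoods
                     [0.2, 0.8, 0.1],
                     [0.1, 0.2, 0.9]])

beam = 0.1                               # keep paths within 10x of the best
scores = np.array([1.0, 0.0, 0.0]) * obs_like[0]
for t in range(1, len(obs_like)):
    # Viterbi step: best predecessor score into each state, times likelihood.
    scores = (scores[:, None] * trans).max(axis=0) * obs_like[t]
    scores[scores < beam * scores.max()] = 0.0   # prune weak paths
print(scores.argmax())  # most likely final state: 2
```

Pruning keeps the per-timestep work bounded even when the vocabulary, and hence the number of live paths, grows large.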

4 – Phrase Level Recognition

Now that we have topologies for our six signs, let’s talk about phrase-level sign language recognition. We have eight phrases we want to recognize. >> Actually, you mean seven signs and 12 phrases. >> Since we have two variants of CAT we are recognizing, expanding all the possibilities leads to 12 phrases. >> Good …
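The counting works out because every phrase containing CAT doubles when CAT has two variants. A sketch with hypothetical phrase templates (the lesson's actual phrases are not shown here; only the counts, eight templates expanding to twelve phrases, come from the dialogue):

```python
from itertools import product

# Eight hypothetical phrase templates, four of which contain CAT.
templates = [["I", "NEED", "CAT"],  ["WE", "NEED", "CAT"],
             ["I", "WANT", "CAT"],  ["WE", "WANT", "CAT"],
             ["I", "NEED", "FOOD"], ["WE", "NEED", "FOOD"],
             ["I", "WANT", "FOOD"], ["WE", "WANT", "FOOD"]]
variants = {"CAT": ["CAT_v1", "CAT_v2"]}   # two signed variants of CAT

expanded = []
for t in templates:
    options = [variants.get(s, [s]) for s in t]
    expanded.extend(product(*options))     # expand every variant choice
print(len(expanded))  # 4 templates x 2 variants + 4 plain = 12 phrases
```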

3 – HMM Topologies

Next, let’s talk about increasing the size of our vocabulary. >> Okay, I’ve selected some signs we can use to start making phrases. But we’re going to have to choose topologies for each of them. >> Well, we’ve already chosen topologies for I and we. What sign is next on your list? >> Well, let’s …
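A common choice for gesture and speech models is a left-to-right topology, where each state can self-loop, step forward, or skip one state ahead; the exact probabilities below are illustrative, not the lesson's values.

```python
import numpy as np

# Build a 4-state left-to-right transition matrix with skip transitions.
n = 4
A = np.zeros((n, n))
for i in range(n):
    A[i, i] = 0.5                       # self-loop: linger in the state
    if i + 1 < n: A[i, i + 1] = 0.3     # advance to the next state
    if i + 2 < n: A[i, i + 2] = 0.2     # skip a state (e.g. fast signing)
A /= A.sum(axis=1, keepdims=True)       # renormalize rows near the end
print(np.allclose(A.sum(axis=1), 1.0))  # True: each row is a distribution
```

Because every transition moves forward (the matrix is upper-triangular), the model enforces the temporal order of the sign's sub-movements.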

2 – Using a Mixture of Gaussians

What if our output probabilities aren’t Gaussian? >> Well, according to the central limit theorem, we should get Gaussians if enough factors are affecting the data. >> But in practice, sometimes the output probabilities really are not Gaussian. It is not hard for them to be bimodal. >> You mean like this? >> Yep. >> …
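A bimodal output distribution can be modeled as a weighted sum of Gaussians. A minimal sketch with made-up mixture parameters:

```python
import numpy as np

def gauss(x, mu, var):
    """Density of a 1-D Gaussian N(mu, var) at x."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

weights = [0.5, 0.5]                   # mixture weights, sum to 1
means, vars_ = [-2.0, 2.0], [1.0, 1.0] # two well-separated components

def b(x):
    """Output probability: weighted sum of the component Gaussians."""
    return sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, vars_))

print(b(-2.0) > b(0.0) < b(2.0))  # True: two peaks with a dip between them
```

A single Gaussian fit to the same data would put its peak in the dip at 0, exactly where the data is least likely.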

10 – Using HMMs to Generate Data

One last thing I’d like to cover is why the normal HMM formulation is not good for generating data. In the early days of speech recognition, there was the hope that we could use the same HMMs we use to recognize speech, to also generate it. It turned out not to be such a good …
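One concrete reason, which the following toy sketch illustrates: a standard HMM emits each frame independently given the state, so a sampled trajectory jitters around the state means instead of moving smoothly the way real speech or gesture does. The three-state model and its parameters here are made up.

```python
import random

random.seed(0)
means = [0.0, 1.0, 2.0]   # toy left-to-right model, one mean per state
state, out = 0, []
for _ in range(30):
    # Frames are drawn i.i.d. from the current state's output distribution:
    # consecutive samples are uncorrelated given the state.
    out.append(random.gauss(means[state], 0.3))
    if state < 2 and random.random() < 0.2:   # occasionally advance a state
        state += 1
# Adjacent frames can jump by far more than a smooth trajectory would allow.
print(max(abs(a - b) for a, b in zip(out, out[1:])))
```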

1 – Multidimensional Output Probabilities

Now that we’ve shown how HMMs work, let’s provide some more tips on how to improve them. >> Okay, in our example of using HMMs to distinguish between the signs I versus we, we used delta y as a feature. But in reality, delta x would be a better feature. >> That’s true, but for …
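Using both delta x and delta y means each state's output probability becomes a density over a feature vector. A common, cheap choice is a diagonal-covariance multivariate Gaussian; the parameters below are illustrative, not trained values from the lesson.

```python
import numpy as np

mu = np.array([0.5, 0.1])    # mean (delta x, delta y) motion for this state
var = np.array([0.2, 0.2])   # per-dimension variances (diagonal covariance)

def b(x):
    """Density of feature vector x under N(mu, diag(var))."""
    z = (x - mu) ** 2 / var                   # per-dimension Mahalanobis terms
    norm = np.prod(np.sqrt(2 * np.pi * var))  # product of 1-D normalizers
    return np.exp(-0.5 * z.sum()) / norm

print(b(mu) > b(np.array([2.0, 2.0])))  # True: density peaks at the mean
```

A diagonal covariance treats the dimensions as independent within a state, which keeps the parameter count linear in the number of feature dimensions.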