>> In your past work on gesture recognition, how many dimensions have you used for your output probabilities?

>> Up to hundreds. At one point, we were creating appearance models of the hand, using as features for the HMM a similarity metric of how closely the current hand image matched different visual models of the hand. However, the problem was that there was a lot of noise: the models that didn't match well gave relatively random results, so many of the dimensions were meaningless unless the hand at that particular time matched well.

>> So all those dimensions were actually hurting you because they were very noisy most of the time?

>> Correct. However, we can use boosting to weight the feature vector to combat this problem. This technique is called segmentally boosted HMMs. One of my students did his PhD dissertation on the idea, and his results were often 20% better than normal HMMs. For some datasets, we got up to a 70% improvement.

>> How does it work?

>> First, we align and train the HMMs as normal. Next, we use that training to align the data that belongs to each state as best we can. We examine each state in each model iteratively: we boost by asking which features help us most to differentiate the data for our chosen state from the data for all the other states, and we then weight the dimensions appropriately in that HMM. This trick combines some of the advantages of discriminative models with generative methods.

>> Is it in a toolkit somewhere?

>> Not yet, but I'm thinking about adding it to HTK and our Georgia Tech Gesture Toolkit. Personally, I think it can be pretty powerful, but it's not widely known yet.
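The per-state feature weighting described above can be sketched in a few lines. This is a toy illustration of the idea only, not the published segmentally boosted HMM implementation: it assumes frames have already been aligned to states (here via a made-up `state_labels` array), and it stands in for the boosting step with a simple Fisher-style discriminability score — features that separate one state's frames from all other frames get larger weights.

```python
import numpy as np

def state_feature_weights(features, state_labels, n_states):
    """For each state, weight each feature dimension by how well it
    separates that state's frames from the frames of all other states.
    (Illustrative one-vs-rest score standing in for the boosting step.)"""
    weights = np.zeros((n_states, features.shape[1]))
    for s in range(n_states):
        in_state = features[state_labels == s]
        rest = features[state_labels != s]
        # Dimensions whose means differ strongly between this state and
        # the rest, relative to their spread, are the discriminative ones.
        score = np.abs(in_state.mean(axis=0) - rest.mean(axis=0)) / (
            in_state.std(axis=0) + rest.std(axis=0) + 1e-9)
        weights[s] = score / score.sum()  # normalize per state
    return weights

# Synthetic example: feature 0 separates the two states, feature 1 is
# the kind of noisy, meaningless dimension discussed above.
rng = np.random.default_rng(0)
state_labels = np.repeat([0, 1], 100)
f0 = np.where(state_labels == 0, 0.0, 5.0) + rng.normal(0, 0.5, 200)
f1 = rng.normal(0.0, 1.0, 200)  # pure noise
X = np.column_stack([f0, f1])
w = state_feature_weights(X, state_labels, n_states=2)
# The informative dimension should dominate the weights for both states.
```

In the full method, the weighting would come from running a boosting algorithm per state and would then rescale the observation dimensions used by that state's output distribution in the HMM.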