3 – Making Batches

Welcome back, everyone. Now you are going to start turning the data into batches to pass into the network. Remember, with the skip-gram architecture, we take one word and use the words around it in the text as targets, that is, as the context for whatever input word we have. So for every word, we grab a window of words around it, and that window has some size C.

What Mikolov and his co-authors did is choose a random range within this window. The reason is that distant words are usually less related to the input word than the ones close to it, so you want to de-emphasize words that are further away compared to words that are closer. If your window size C is, say, five, you select a random number R in the range one to C. You are randomly choosing a smaller window from your large window, and with that smaller window, defined by R, you grab R words from the past and R words from the future and use them as the correct labels. This way you essentially always get the words right next to your current word, but you are less likely to get words that are further away. What you are really doing is training more often on words that are closer to your current word.

What you'll be doing here is actually implementing this: getting the target words. You pass in a sequence of words and an index, which is your current word, and you grab a window around the index and return all the target words in that window.
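Here is one way the windowing described above could look in code. This is only a sketch, not necessarily the solution from the notebook: the parameter names `words`, `idx` and `window_size`, and the choice to return a plain list without removing duplicates, are assumptions made for illustration.

```python
import random


def get_target(words, idx, window_size=5):
    """Return the words around words[idx], using a randomly shrunk window."""
    # Pick R uniformly from 1..window_size (both ends inclusive).
    r = random.randint(1, window_size)
    # Clamp the left edge so we never slice with a negative index.
    start = max(idx - r, 0)
    stop = idx + r
    # Up to R words from the past plus up to R words from the future,
    # leaving out the input word itself.
    return words[start:idx] + words[idx + 1:stop + 1]
```

Because R is at least one, the immediate neighbours are always returned when they exist, while a word C positions away is only returned when R happens to equal C.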
Keep in mind that you also need to pick this random range R and only return the words that are in that smaller window defined by R. Once you have that written, there is a function that returns our batches, the data we are going to pass into the network. It grabs some words from our big text list, or rather list of integers, and for each of those words it gets the targets, the target words that show up in the window, which is the part you implement here. I haven't found a good way to make this random number of target words work with the TensorFlow graph, so what I did here is make each input-target pair one row in the batch. If you have some current word and there are four target words, I make those into four different rows in the batch: four input-target pairs.

Okay, so go ahead and try to implement this get_target function. As always, you can check out my solution in the notebook or in my video. Cheers.
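The row-per-pair idea can be sketched like this. It is a rough illustration under stated assumptions, not the notebook's actual code: the name `get_batches`, its parameters, dropping the leftover words that do not fill a batch, and taking the window only within each batch are all choices made here for the example. A compact version of the windowing helper is included so the block runs on its own.

```python
import random


def get_target(words, idx, window_size=5):
    """Compact windowing helper (assumed form): random R, then R words each side."""
    r = random.randint(1, window_size)
    return words[max(idx - r, 0):idx] + words[idx + 1:idx + r + 1]


def get_batches(words, batch_size, window_size=5):
    """Yield (inputs, targets) lists where each position is one input-target pair."""
    # Drop trailing words that do not fill a complete batch (an assumption).
    n_batches = len(words) // batch_size
    words = words[:n_batches * batch_size]
    for start in range(0, len(words), batch_size):
        batch = words[start:start + batch_size]
        inputs, targets = [], []
        for i in range(len(batch)):
            batch_targets = get_target(batch, i, window_size)
            # Repeat the input word once per target so every row is one pair.
            inputs.extend([batch[i]] * len(batch_targets))
            targets.extend(batch_targets)
        yield inputs, targets
```

Note that the number of rows per yielded batch varies, because each input word contributes as many rows as it has target words.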
