6 – Negative Sampling

Welcome back. This is the last part you'll be implementing yourself. In our architecture, we have a softmax layer on the output, and since we're working with tens of thousands of words, the softmax layer is going to have tens of thousands of units. But with any one input, we only have one true label. That means we'd be making very small changes to millions of weights even though we only have one true example, so only a few of the weights actually get updated in a meaningful way.

Instead, we approximate the loss from the softmax layer by sampling just a small subset of all the weights. We update the weights for the correct label, but only for a small sample of the incorrect labels, usually around 100 or so. This is called negative sampling. If you want to read more about it, check out this link. TensorFlow provides a function that does this for us: tf.nn.sampled_softmax_loss. Go ahead and read the documentation here to see how it works.

Below, you're going to create the weights and biases for the softmax layer, then use them, along with the targets, to calculate the loss with sampled softmax loss. Then pass this to cost, and the AdamOptimizer will minimize the cost like normal. Okay, go ahead and try to implement this. As usual, check out my solution if you get stuck or just want to see how I did it. Cheers.
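To see what the sampled loss is actually computing, here's a minimal NumPy sketch of the idea. This is not TensorFlow's implementation: the vocabulary size, embedding size, and the uniform negative sampler here are illustrative assumptions (tf.nn.sampled_softmax_loss uses a log-uniform candidate sampler by default). The point is just that the cross-entropy is computed over the true class plus ~100 sampled negatives instead of the full vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10_000  # hypothetical vocabulary size
embed_dim = 128      # hypothetical hidden/embedding size
num_sampled = 100    # number of negative classes, around 100 as mentioned above
batch_size = 4

# Softmax layer parameters — the weights and biases you'd create below.
softmax_w = rng.normal(scale=0.1, size=(vocab_size, embed_dim))
softmax_b = np.zeros(vocab_size)

# A batch of hidden-layer activations and their true target words.
hidden = rng.normal(size=(batch_size, embed_dim))
targets = rng.integers(0, vocab_size, size=batch_size)

def sampled_softmax_loss(hidden, targets):
    """Softmax cross-entropy over the true class plus a small random
    sample of negative classes, instead of the full vocabulary."""
    losses = []
    for h, t in zip(hidden, targets):
        # Draw negative classes uniformly; drop the true class if sampled.
        negatives = rng.choice(vocab_size, size=num_sampled, replace=False)
        negatives = negatives[negatives != t]
        classes = np.concatenate(([t], negatives))
        # Logits for ~101 sampled classes rather than all 10,000.
        logits = softmax_w[classes] @ h + softmax_b[classes]
        # Numerically stable log-softmax; index 0 is the true class.
        lse = logits.max() + np.log(np.exp(logits - logits.max()).sum())
        losses.append(lse - logits[0])
    return float(np.mean(losses))

loss = sampled_softmax_loss(hidden, targets)
```

In the notebook itself you'd call the TensorFlow function instead, passing the softmax weights and biases along with the targets, the hidden-layer outputs, the number of sampled classes, and the vocabulary size; check the tf.nn.sampled_softmax_loss documentation for the exact argument order in your TensorFlow version.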
