Hello again. So, here I'm going to show you how to actually build this network, going through the inputs, the embedding layer, and the softmax.

As usual, we have our TensorFlow placeholders. We're using integers for both of these, so tf.int32, and the labels have an arbitrary batch size and an arbitrary second dimension. The deal with inputs and labels is that for both of them we're just passing in integers. When I first built this, I thought I could have them both as flat vectors, but later, when I was implementing the rest of the network, I kept having issues with the labels needing a second dimension. It's just one of those things: you build it thinking you can do one thing, then as you go through it you start hitting bugs, and you have to go back and change the earlier part of your network to get it to work with the later part.

Okay, so now the embedding. I set the embedding dimension to 200, and for our embedding weights I create a variable using a random uniform distribution. The size of this weight tensor is the size of our vocabulary by the size of the embedding, and I initialize it from -1 to 1. Then to do the embedding lookup, it's just tf.nn.embedding_lookup with the embedding matrix and the inputs. This gives you your embed tensor, which is basically the values of the hidden layer. In general, you'll follow the same process any time you're using embeddings: create your embedding matrix, pass your inputs into it to get the embed layer, and then pass that on to the rest of your network.

Okay, and finally the negative sampling. Again, we create weights and biases for the softmax layer, initializing the weights with a truncated normal distribution.
The weights need to be the size of the vocabulary by the size of the embedding, with a standard deviation of 0.1, and the biases are initialized as all zeros with the size of our vocabulary. Now, to actually calculate the loss with negative sampling, we use tf.nn.sampled_softmax_loss, passing in our weights, our biases, the labels, and the embedding layer, and then n_sampled, which is the number of negative values we're going to sample, and finally the size of our vocabulary. That runs the negative sampling and calculates the loss. Then we take the cost and minimize it with the Adam optimizer.