15 – 05 Batching Data V1

So, this is my complete get_batches code that generates mini-batches of data. So, the first thing I wanted to do here is get the total number of complete batches that we can make in batches. To do that, I first calculated how many characters were in a complete mini-batch. So, in one mini-batch, there’s going to be batch size times sequence length number of characters. Then, the number of complete batches that we can make is just the length of the array divided by the total number of characters in a mini-batch. This double slash is an integer division which we’ll just round down any decimal leftover from this division. With that, we have the number of completely full batches that we can make. Then, we get our array and we take all the characters in the array up to n_batches times this total character size for a mini-batch. So here, we’re making sure that we’re keeping only enough characters to make full batches, and we may lose some characters here. But in general, you’re going to have enough data that getting rid of a last unfold batch is not really going to matter. Next, with reshaping, we can take our array and we can make the number of rows equal to our batch size, and that’s just how many sequences we want to include in a mini-batch. So, we just say we want the number of rows to be batch size and then we put this negative one. Negative one here is kind of a dimension placeholder, and it’ll just automatically fill up the second dimension to whatever size it needs to be to accommodate all the data. Then finally, I’m iterating over my batch data using a window of length sequence length. So here, I’m taking in a reshaped complete array and then looking at all our rows, all our batches, and the columns are in a range from n to n plus sequence length, which makes our sequence length window. This completes x our input mini-batch. Then, what I did here for the target y is I just initialized an array of all zeros that’s the same shape as x, and I just kind of fill it up with values from our x array shifted by one. From the start to the end, I just shifted over x by one. Then in the case of reaching the very end of our array, I’m going to make the last element of y equal to the first element in our array. I’m not super sure why most people do it this way, wrapping our array around so that the last element of y is the first element of x. But I’ve seen this many times, and so I did it in the cyclical way and it seems like the network trains perfectly fine doing this. So, it does not seem to be a problem. The main thing is we want x and y to be the same size. So, if you did this right and you want to test your implementation, you should have gotten batches that looks something like this. Right here, we have a batch size of eight, so you have eight rows here, and then we’re just printing out the first 10 items in a sequence. So, you should see 10 items here. The important thing to note here is that you want to make sure that the elements in x, like the actual encoded values, are shifted over by one in y. So, we have 51 as the first item here and as the zeroth item here, then 23 and 23. Likewise, 55 is here and 55 is here in y. I basically want to make sure that everything is shifted over correctly, and this looks good. So now that we have our batch data, the next step we’ll talk about is actually building the network.

%d 블로거가 이것을 좋아합니다: