12 – PyTorch V2 Part 4 Solution V1

Again. So, in the last video, I had you try out building your own neural network to classify the Fashion-MNIST dataset. Here is my solution and how I decided to build it. So first, building our network. Here, I'm going to import our normal modules from PyTorch, nn and optim. nn is going to allow us to build our network, and optim is going to give us our optimizers. I'm also going to import the functional module, so we can use functions like ReLU and log softmax. I decided to define my network architecture using a class, subclassing from nn.Module, and it's called Classifier. Then I created four different linear transformations. So, in this case, it's three hidden layers and then one output layer. Our first hidden layer has 256 units, the second hidden layer has 128, the one after that has 64, and then our output has 10 units.

So, in the forward pass, I did something a little different. I made sure here that the input tensor is actually flattened. So now, you don't have to flatten your input tensors in the training loop, it'll just do it in the forward pass itself. To do this, I call x.view, which is going to change our shape. x.shape[0] is going to give us our batch size. Then the negative one here is going to basically fill out the second dimension with as many elements as it needs to keep the same total number of elements. So, what this does is it basically gives us another tensor that is the flattened version of our input tensor. Then I pass this through our linear transformations and ReLU activation functions. Finally, we use a log softmax with the dimension set to one as our output, and return that from our forward function.

With the model defined, I can do model = Classifier(). So, this actually creates our model. Then we define our criterion with the negative log likelihood loss. Since I'm using log softmax as the output of my model, I want to use NLLLoss as the criterion. Then, here I'm using the Adam optimizer. This is basically the same as stochastic gradient descent, but it has some nice properties: it uses momentum, which speeds up the actual fitting process, and it also adjusts the learning rate for each of the individual parameters in your model.

Here, I wrote my training loop. Again, I'm using five epochs. So, for e in range(epochs), this is going to basically loop through our dataset five times. I'm tracking the loss with running_loss, which I just instantiated here. Then, getting our images and labels from the train loader, I get our log probabilities by passing the images into the model. One thing to note, you can do a little shortcut here: if you just pass the images into model as if it were a function, it will run the forward method. So, this is just a shorter way to run the forward pass through your model. Then with the log probabilities and the labels, I can calculate the loss. Then here, I am zeroing the gradients. Next, I call loss.backward() to calculate our gradients, and then with the gradients, I can do our optimizer step. If we try it, we can see, at least for these first five epochs, that our loss actually drops.
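To make this concrete, here is a minimal sketch of the network and training loop described above. The data-loading code and the learning rate (0.003) are assumptions carried over from the earlier parts of the lesson rather than things shown explicitly in this video, and the names (Classifier, trainloader) are my own choices for illustration.

```python
import torch
from torch import nn, optim
import torch.nn.functional as F
from torchvision import datasets, transforms

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Three hidden layers (256, 128, 64 units) and a 10-unit output layer
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

    def forward(self, x):
        # Flatten the input inside the forward pass: keep the batch dimension,
        # let -1 fill in the rest (28*28 = 784), so the training loop stays clean
        x = x.view(x.shape[0], -1)

        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.log_softmax(self.fc4(x), dim=1)
        return x

# Assumed Fashion-MNIST loader, matching the earlier videos in this lesson
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
trainset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True,
                                 train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

model = Classifier()
criterion = nn.NLLLoss()                          # pairs with log_softmax output
optimizer = optim.Adam(model.parameters(), lr=0.003)  # lr is an assumed value

epochs = 5
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        log_ps = model(images)        # calling the model runs the forward method
        loss = criterion(log_ps, labels)

        optimizer.zero_grad()         # zero gradients left over from the last step
        loss.backward()               # compute gradients
        optimizer.step()              # update the weights

        running_loss += loss.item()
    print(f"Training loss: {running_loss/len(trainloader)}")
```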
Now that the network is trained, we can actually test it out. So, we pass data into our model to calculate the probabilities. Here, we do the forward pass through the model to get our log probabilities, and with the log probabilities, you can take the exponential to get the actual probabilities. Then with that, we can pass it into this nice little view_classify function that I wrote, and it shows us that if we pass in an image of a shirt, it tells us that it's a shirt. So, our network seems to have learned fairly well what this dataset is showing us.
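Here is a short sketch of that testing step, continuing from the training code above. I'm assuming the course's helper module (the one that provides view_classify) is importable as helper and accepts a 'Fashion' version argument, as in the earlier notebooks; those details are assumptions, not shown in this video.

```python
import torch
import helper  # course-provided helper with view_classify (assumed available)

# Grab one image from the loader to test on
images, labels = next(iter(trainloader))
img = images[0].view(1, 784)

# Turn off gradients, since we're only doing a forward pass here
with torch.no_grad():
    log_ps = model(img)

# The model outputs log-probabilities; exponentiate to get actual probabilities
ps = torch.exp(log_ps)

# Visualize the image alongside the predicted class probabilities
helper.view_classify(img.view(1, 28, 28), ps, version='Fashion')
```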
