2 – PyTorch V2 Part 1 Solution V1

So, this is my solution for the exercise on calculating the output of this small, simple neural network. Remember that what we want to do is multiply our features by our weights, so features times weights. These tensors work basically the same as NumPy arrays, if you’ve used NumPy before. When you multiply features times weights, it takes the first element from each one and multiplies them together, takes the second elements and multiplies them together, and so on, and gives you back a new tensor holding the element-by-element products. From that, we can use torch.sum to sum it all up into one value, add our bias term, pass it through the activation function, and then we get y. We can also do features times weights again, which creates another tensor, but tensors have a method, .sum, where you take a tensor, call .sum() on it, and it sums up all the values in that tensor. So, we can either use torch.sum, or we can use the sum method of the tensor and sum up the values that way. Again, we pass the result through our activation function. Here, we’re doing the element-wise multiplication and taking the sum in two separate operations. But you can actually do this in a single operation using matrix multiplication. In general, you’re going to want to use matrix multiplication most of the time, since it’s more efficient and these linear algebra operations have been accelerated by modern libraries such as CUDA that run on GPUs. To do matrix multiplication in PyTorch with our two tensors, features and weights, we can use one of two functions: torch.mm or torch.matmul. torch.mm, for matrix multiplication, is simpler and stricter about the tensors that you pass in. torch.matmul, on the other hand, supports broadcasting.
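As a rough sketch of the two ways of summing described here, the snippet below builds its own stand-in tensors. The sigmoid activation, the random seed, and the names y_function and y_method are assumptions made for illustration; the transcript does not show the exercise's actual setup.

```python
import torch


def activation(x):
    """Sigmoid activation (assumed here; the exercise may define its own)."""
    return 1 / (1 + torch.exp(-x))


torch.manual_seed(7)                   # arbitrary seed, only for repeatability
features = torch.randn((1, 5))         # one row of five input features
weights = torch.randn_like(features)   # same shape as features: (1, 5)
bias = torch.randn((1, 1))             # a single bias term

# Option 1: element-wise multiply, then sum with the torch.sum function.
y_function = activation(torch.sum(features * weights) + bias)

# Option 2: element-wise multiply, then call the tensor's own .sum() method.
y_method = activation((features * weights).sum() + bias)

print(y_function, y_method)
```

Both options should give the same value, since they only differ in how the sum is spelled.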
So, if you pass torch.matmul tensors that have weird sizes, weird shapes, then you could get an output that you’re not expecting. So, I tend to use torch.mm more often, so that it does what I expect, and if I get something wrong it’s going to throw an error instead of just doing it and continuing the calculations. However, if we actually try to use torch.mm with features and weights, we’ll get an error. Here we see RuntimeError, size mismatch. What this means is that we passed our two tensors to torch.mm, but there’s a mismatch in the sizes, so it can’t actually do the matrix multiplication, and it lists out the sizes here. The first tensor, m1, is one by five, and the second tensor is one by five also. If you remember from your linear algebra classes, or if you studied it recently, when you’re doing matrix multiplication, the first matrix has to have a number of columns that’s equal to the number of rows in the second matrix. So, really what we need is for our weights tensor, our weights matrix, to be five by one instead of one by five. To check out the shape of tensors as you’re building your networks, you want to use tensor.shape. This is something you’re going to be using all the time in PyTorch, but also in TensorFlow and in other deep learning frameworks. Most of the errors you’re going to see when you’re building networks, and a lot of the difficulty when it comes to designing the architecture of neural networks, come down to getting the shapes of your tensors to work together. What that means is that for a large part of debugging, you’re going to be looking at the shapes of your tensors as they go through your network. So, remember this: tensor.shape. For reshaping tensors, there are, in general, three different options to choose from. We have these methods: reshape, resize_, and view.
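Here is a minimal sketch of the size mismatch and the tensor.shape check, using stand-in tensors of the same one-by-five shape. The variable name mismatch_message is made up for this example, and the exact wording of the error text varies between PyTorch versions, so the code only checks that a RuntimeError is raised.

```python
import torch

features = torch.randn((1, 5))
weights = torch.randn_like(features)

# Inspect the shapes before multiplying.
print(features.shape)  # torch.Size([1, 5])
print(weights.shape)   # torch.Size([1, 5])

# (1 x 5) times (1 x 5) is not a valid matrix product: the first matrix has
# five columns but the second has only one row, so torch.mm refuses.
mismatch_message = None
try:
    torch.mm(features, weights)
except RuntimeError as err:
    mismatch_message = str(err)
    print("torch.mm failed:", mismatch_message)
```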
The way these all work, in general, is that you take your tensor, say weights.reshape, and then pass in the new shape that you want. In this case, we want to change our weights to be a five by one matrix, so we’d say .reshape and then five comma one. What reshape will do is return a new tensor with the same data as weights, so the same data that’s sitting at those addresses in memory. It’s basically just going to create a new tensor that has the shape you requested, but the actual data in memory isn’t being changed. But that’s only sometimes. Sometimes it returns a clone, and what that means is that it actually copies the data to another part of memory and then returns you a tensor on top of that part of memory. As you can imagine, when it’s copying the data, that’s less efficient than if you had just changed the shape of your tensor without cloning the data. To avoid the copy, we can use resize_, with an underscore at the end. The underscore means that this method is an in-place operation. When it’s in-place, that basically means you’re not copying the data at all, and all you’re doing is changing the tensor that’s sitting on top of that data in memory. The problem with the resize_ method is that if you request a shape that has more or fewer elements than the original tensor, then you can actually cut off, you can actually lose some of the data that you had, or you can create spurious data from uninitialized memory. So instead, what you typically want is a method that’s going to return an error if you change the shape from the original number of elements to a different number of elements. We can actually do that with .view. So, .view is the one that I use the most, and basically what it does is return a new tensor with the same data in memory as weights.
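The reshape and resize_ behaviour can be sketched roughly as follows. The tensors grid and truncated are made up for illustration, and comparing data_ptr() values is just one way, assumed here, to see whether two tensors start at the same memory address.

```python
import torch

weights = torch.randn((1, 5))

# For a contiguous tensor like this one, reshape can reuse the same memory.
reshaped = weights.reshape(5, 1)
shares_memory = reshaped.data_ptr() == weights.data_ptr()

# A transposed tensor is not contiguous, so flattening it with reshape
# has to copy the data into a new block of memory.
grid = torch.arange(6).reshape(2, 3)
flat_copy = grid.t().reshape(6)
copied = flat_copy.data_ptr() != grid.data_ptr()

# resize_ works in place and does not complain when the element count
# changes: shrinking from five elements to three silently drops two.
truncated = torch.arange(5.0)
truncated.resize_(3)

print(shares_memory, copied, truncated)
```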
With .view, this is all the time, 100 percent of the time: all it’s going to do is return a new tensor without messing with any of the data in memory. If you try to get a new size, a new shape for your tensor with a different number of elements, it’ll return an error. So, by using .view, you’re ensuring that you will always get the same number of elements when you change the shape of your weights. This is what I typically use when I’m reshaping tensors. So, with all that out of the way, if you want to reshape weights to have five rows and one column, then you can use something like weights.view(5, 1). Now that you have seen how you can change the shape of a tensor and also do matrix multiplication, this time I want you to calculate the output of this little neural network using matrix multiplication.
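To see the .view guarantees on their own, here is a small sketch with a stand-in weights tensor. The names column, same_memory, and view_error are invented for this example, and it stops short of the matrix multiplication itself, which is left as the exercise.

```python
import torch

weights = torch.randn((1, 5))

# view returns a tensor with the new shape on top of the same memory.
column = weights.view(5, 1)
print(column.shape)  # torch.Size([5, 1])
same_memory = column.data_ptr() == weights.data_ptr()

# Because the memory is shared, writing through the view changes weights too.
column[0, 0] = 0.0

# Asking for a shape with a different number of elements (six instead of
# five) raises an error instead of silently dropping or inventing data.
view_error = None
try:
    weights.view(3, 2)
except RuntimeError as err:
    view_error = str(err)
    print("view failed:", view_error)
```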