17 – PyTorch – Part 7

In this video, I’ll be showing you how to load image data. This is really useful for what you’ll be doing in real projects. Previously, we used MNIST and Fashion-MNIST, which were just toy datasets for testing your networks. In your actual deep learning projects, you’ll be using full-size images like you’d get from smartphone cameras. So here, we’ll be using a dataset of cat and dog photos, super cute, that comes from Kaggle. If you want to learn more about it, you can just click on this link. You can see our images are now much larger, much higher resolution, and they come in different shapes and sizes than what we saw with MNIST and Fashion-MNIST. So, the first step to using these is to actually load them in with PyTorch. Then once you have them in, you can train a network on them.
The easiest way to load in our image data is with datasets.ImageFolder, which is from torchvision’s datasets module. Basically, you pass ImageFolder a path to your dataset, so the folder where your data is sitting, and give it some transforms, which we talked about before. I’ll go into some more detail about transforms next. ImageFolder expects your files and directories to look like this, where you have some root directory that holds all your data. Then each of the different classes has its own folder. In this case, we have two classes, dog and cat, so we have these two folders, dog and cat. If you had more classes, like MNIST with its ten classes, there would be one folder for each of the different digits, right? Those are our classes or labels. Then within each of the class folders, you have the images that belong to that class. So, your dog folder holds all of your dog pictures and your cat folder holds all of your cat pictures.
So, if you’re working in a workspace, then the data should already be there, but if you’re working on your local computer, you can get the data by clicking here. I’ve also already split this into a training set and a test set for you.
When you create the ImageFolder, you need to define some transforms. What I mean by this is you can resize the images, you can crop them, you can do a lot of things. Typically you’ll also want to convert each image to a PyTorch tensor, because it’s loaded in as a Pillow image. Then you combine these transforms into a pipeline using transforms.Compose. So, if you want to resize your image so that its shorter side is 255 pixels, you say transforms.Resize(255), and then you take just the center portion, cropping it out at 224 by 224, with transforms.CenterCrop(224). Then you can convert it to a tensor with transforms.ToTensor(). These are the transforms you’ll use, and you pass them into ImageFolder to define the transforms you’re performing on your images.
Once you have your dataset from ImageFolder with your transforms defined, you pass it to a DataLoader. Here you can define your batch size, which is the number of images you get per batch, so per loop through this dataloader, and you can also do things like set shuffle to True. What shuffle does is randomly shuffle your data every time you start a new epoch. This is useful because when you’re training your network, we’d prefer that the second time through it sees your images in a different order, and the third time through in yet another order, rather than learning from them in the same order every time, because that could introduce weird artifacts in how your network learns from your data. The thing to remember is that the dataloader object you get from the DataLoader class is an iterable that works like a generator, handing out one batch at a time.
So, this means to get data out of it you actually have to loop through it, like in a for loop, or you need to call iter on it to turn it into an iterator and then call next to get the data out. Really, what’s happening in this for loop, this for images, labels in dataloader, is that Python turns the dataloader into an iterator and calls next every time you go through the loop. So basically, this for loop is an automatic way of doing this. Okay. So, what I’m going to leave up to you is to define some transforms, create your ImageFolder, and then pass that ImageFolder to create a dataloader. If you do everything right, you should see an image that looks like this.
So, that’s the basic way of loading in your data. You can also do what’s called data augmentation, where you introduce randomness into the data itself. You can imagine if you have images, you can translate where a cat shows up, you can rotate the cat, you can scale the cat, you can crop different parts, you can mirror it horizontally and vertically. This helps your network generalize because it has seen these images at different scales, at different orientations, and so on. This really helps your network train and will eventually lead to better validation accuracy. Here, I’ll let you define some transforms for the training data. So here, you want to do the data augmentation thing, where you’re randomly cropping, resizing, and rotating your images, and also define transforms for the test dataset. One thing to remember is that for testing, when you’re doing your validation, you don’t want to do any of this data augmentation. You just want to do a resize and center crop of your images. This is because you want your validation to be similar to the eventual end state of your model. Once you’ve trained your model, you’re going to be sending in pictures of cats and dogs.
So, you want your validation set to look pretty much exactly like what your eventual input images will look like. If you do all that correctly, you should see training examples like this, and you can see how these are rotated. Then your testing examples should look like this, where they’re scaled proportionally and not rotated.
Once you’ve loaded this data, you should try to build a network, based on what you’ve already learned, that can classify cats and dogs from this dataset. I should warn you this is actually a pretty tough challenge and it probably won’t work, so don’t try too hard at it. Before, you used MNIST and Fashion-MNIST. Those are very simple images, right? They’re 28 by 28 and they’re only grayscale. But these cat and dog images are much larger, and they’re in color, so you have those three channels. In general, it’s going to be very difficult to build a classifier that can do this using just a fully connected network. In the next part, I’ll show you how to use a pre-trained network to build a model that can actually classify these cat and dog images. Cheers.