7 – Images as Grids of Pixels

One of your first tasks will be to classify a binary set of data: images taken during the day or at night. But before you can complete this task, you first have to learn how images are seen by machines. Let’s take this image of a car. This is actually a self-driving car on the road. Now look at how a computer understands it. We’ll be working with a grayscale image like this first, because color adds another layer of complexity, but the same general principles will apply, as we’ll see soon. So, when I show you this image, you might say, “Oh, it’s a picture of a car.” And it is, but it’s also a 2D grid of values, also known as an array, with some width and height. Let me show you what I mean. This image, like all digital images, is made of a grid of pixels, which are very small units of a single color or intensity. And if we zoom in on the image of the car, like this area around the wheel, we can get a better look at these pixels. Now you can see it’s really starting to look more like a grid. Each pixel in this grid has a corresponding numerical value. For grayscale images like this, the value of each pixel ranges from zero to 255. Zero is black, 255 is white, and grey is anywhere in between. So, a value of around 120 is a medium grey, in between black and white. And a value of around 20 will be a very, very dark grey, close to black. And each of these pixels, in addition to having a color value, also has a location (x, y) in this image grid. These axes are a lot like the axes of a graph, only for digital images the origin, the point x equals zero, y equals zero, is at the top left corner, so x increases to the right and y increases downward. Now, our image of a car is 427 pixels in height and 640 in width. And the pixel locations are on a grid that starts at index zero: columns run from zero to 639, and rows run from zero to 426. As an example, at the location x equals 190 and y equals 375, we have a pixel on this wheel at the bottom left of the image. The pixel value is 28, a very dark grey.
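To make the grid idea concrete, here is a minimal sketch in Python with NumPy. It does not load the car image; it assumes a hypothetical stand-in array of the same size (427 rows by 640 columns) filled with a medium grey, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the grayscale car image:
# 427 rows (height) by 640 columns (width), every pixel a medium grey (120).
height, width = 427, 640
gray = np.full((height, width), 120, dtype=np.uint8)

# Indices start at zero, so the last valid row and column are size - 1.
last_row = gray.shape[0] - 1
last_col = gray.shape[1] - 1

# The origin (x = 0, y = 0) is the top-left pixel; arrays are indexed [row, column].
gray[0, 0] = 0                    # top-left pixel set to black
gray[last_row, last_col] = 255    # bottom-right pixel set to white

print(gray.shape)                 # (427, 640)
print(last_row, last_col)         # 426 639
```

Note that the shape is reported as (height, width), so the row count comes first even though we usually say "x, y" when talking about a location.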
And you might ask how I know this. Well, in code we can actually find any single pixel value by location. So let’s do that. We’ll read in our image of a car, but first, I’ll import the libraries I need. This includes matplotlib.image, which lets us read in any image. You’ll also see cv2, which is a computer vision library, and you’ll learn more about that soon. I’ll also be using matplotlib’s qt backend; qt makes our image pop up in an interactive window when I display it. So I’ll read in the image of a car using matplotlib’s imread function, and I’ll pass in the name of our image file. I have the image of a car in an images directory in the same location as this notebook. Next, I’ll print out some information about this image. I want to print out its dimensions by referencing image.shape. Now we can see its height and width in pixels. And we see another value, three, which corresponds to the number of color channels this image has. We’ll learn more about this value soon. For now, we’ll convert our image to grayscale using cv2, the computer vision library. For now, know that it has built-in color conversion code, like changing an image from red-green-blue color to grayscale. Then I’ll display this grayscale image. This opens our interactive window. And as I pass over this image with my mouse, you can see the xy location displayed on the bottom left of the screen, as well as the corresponding pixel value. Down here by the wheel, we have dark pixel values, around 28 or 29. And up here in the sky, we have some light pixel values. You can see around 220 or even higher. And if we go back to our notebook, we can print out the value of a single pixel by accessing it by location. I’ll say x equals 190 and y equals 375. And I can access that pixel value by looking at that location in our gray image, indexed as y, x: the row comes first, then the column. Finally, I’ll print out that value and we can see that it’s 28. Every pixel in an image is just a numerical value.
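The steps in the notebook can be sketched roughly as follows. This is not the notebook's code: so that it runs without the image file, it builds a synthetic RGB array in place of the `mpimg.imread` call, and it converts to grayscale with a plain weighted sum in NumPy rather than `cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)`. The weights 0.299, 0.587, 0.114 are the standard luminance weights and are an assumption here, as are the helper name `to_gray` and the pixel values planted in the array.

```python
import numpy as np

# Synthetic stand-in for the RGB car image (the notebook reads a real file instead).
# Shape is (height, width, channels); every pixel starts as a light, sky-like grey.
rgb = np.full((427, 640, 3), 220, dtype=np.uint8)

# Plant a dark pixel where the lecture finds one on the wheel.
x, y = 190, 375
rgb[y, x] = (28, 28, 28)

def to_gray(image_rgb):
    """Convert an RGB image to grayscale with a weighted sum of the channels."""
    weights = np.array([0.299, 0.587, 0.114])  # assumed luminance weights
    gray = image_rgb.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)

gray = to_gray(rgb)

print(rgb.shape)    # height, width, and 3 color channels
print(gray.shape)   # height and width only

# Arrays are indexed [row, column], so the pixel at (x, y) is gray[y, x].
print(gray[y, x])
```

The main thing to notice is the index order: the location is spoken as "x, y", but the array lookup is `gray[y, x]`.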
And we can also change these pixel values. We can multiply every single one by a scalar to change how bright the image is. We can shift each pixel to the right or left, and perform many more operations. Treating images as grids of numbers is the basis for many image processing techniques. Most color and shape transformations are done just by mathematically operating on an image and changing it pixel by pixel.
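Here is a small sketch of those two operations on a tiny, made-up grayscale array. The helper name `scale_brightness` is hypothetical, and reading "shift" as moving pixels sideways with `np.roll` (which wraps around at the edge) is one interpretation, not necessarily what the lecture has in mind.

```python
import numpy as np

# A tiny made-up grayscale image: 2 rows by 3 columns.
gray = np.array([[20, 120, 200],
                 [28, 128, 255]], dtype=np.uint8)

def scale_brightness(image, factor):
    """Multiply every pixel by a scalar, keeping results within 0..255."""
    scaled = image.astype(np.float64) * factor   # avoid uint8 overflow
    return np.clip(scaled, 0, 255).astype(np.uint8)

brighter = scale_brightness(gray, 1.5)
print(brighter)
# [[ 30 180 255]
#  [ 42 192 255]]

# Move every pixel one column to the right; np.roll wraps the last column
# around to the first, which a real translation would usually fill instead.
shifted = np.roll(gray, shift=1, axis=1)
print(shifted)
# [[200  20 120]
#  [255  28 128]]
```

The conversion to floating point before multiplying matters: an 8-bit array would otherwise wrap values above 255 back around instead of saturating at white.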
