So how does a computer actually see image data? Images are just 2D representations of 3D world scenes. For example, if you take a picture of an apple, which we know is a 3D object, you’ll get a 2D image that represents that apple. The image contains detail about the color and shape of the apple. It also has shading that varies with lighting conditions, and an apparent size that depends on how close or far away the picture was taken. For example, the apple will appear bigger the closer the camera is to it. When a camera forms an image like this, it’s looking at the world much as our eyes do: by focusing the light that’s reflected off of objects in the world.

Let’s see an example. Here is a simple model of a camera called the pinhole camera model. In this case, the camera focuses the light that’s reflected off an apple through a small pinhole and forms a 2D image at the back of the camera, where a sensor or some film would be placed. In fact, the image formed here will be upside down and reversed, because rays of light that enter from the top of an object continue on that angled path through the pinhole and end up at the bottom of the formed image. Similarly, light that reflects off the right side of an object will travel to the left of the formed image. A digital camera records this image and flips it to give us a familiar 2D image of an apple or any other object.

This is the start of how a computer sees an image. Next, we’ll see how a digital image can be broken down into a grid of small units of color and intensity called pixels. This grid is fundamental to how we can programmatically process and interpret images.
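The upside-down, reversed image a pinhole forms is equivalent to rotating the pixel grid 180 degrees, and "flipping it back" undoes that rotation. Here is a minimal sketch of that idea using NumPy, with a tiny made-up 4×4 grid of intensity values standing in for a real image (the specific pixel values are hypothetical, chosen only for illustration):

```python
import numpy as np

# Toy 4x4 "image": a grid of pixel intensities (0-255).
# The values here are arbitrary, purely for demonstration.
scene = np.array([
    [255, 200, 150, 100],
    [ 90,  80,  70,  60],
    [ 50,  40,  30,  20],
    [ 10,   5,   2,   0],
], dtype=np.uint8)

# A pinhole forms an image that is upside down AND left-right
# reversed -- equivalent to rotating the grid 180 degrees.
pinhole_image = np.flip(scene, axis=(0, 1))

# The top-left pixel of the formed image is the scene's
# bottom-right pixel, matching the ray geometry described above.
print(pinhole_image[0, 0] == scene[-1, -1])  # True

# A digital camera applies the same flip again, restoring
# the familiar upright orientation.
restored = np.flip(pinhole_image, axis=(0, 1))
print(np.array_equal(restored, scene))  # True
```

The same flip applied twice is the identity, which is why the camera can recover the upright image exactly.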