9 – Histogram of Oriented Gradients

In computer vision, many algorithms are designed to extract spatial features and identify objects using information about image gradients. One illustrative technique is HOG, or Histogram of Oriented Gradients. The name may sound a little intimidating, so let’s go through what these terms actually mean.
A histogram is a graphical representation of the distribution of data. It looks a bit like a bar graph with bars of different heights. Each bar represents a group of data that falls in a certain range of values, called a bin, and taller bars indicate that more data falls into that bin.
For example, say you took a grayscale image, such as this image of pancakes, and wanted to display a histogram of its intensity data. We know that all of these pixel values range from 0 to 255, and we can create bins to partition these values into ranges. I’ll create 32 bins, each holding a range of 8 pixel values: 0 to 7, 8 to 15, and so on up to 248 to 255. Then, to create a histogram, we look at every pixel value in the image and put each one in its correct bin.
This image has a lot of bright values in the pancakes but a very dark background, so we get a histogram that looks like this. It shows that there’s a distinct dark grouping of pixels, the background pixels, that fall in these low ranges. And there’s another, bright grouping of pixels around a grayscale value of 200, which must be the majority of the pancake pixel values. So now we know what a histogram of grayscale values looks like.
The next words we see are oriented gradients. Oriented just refers to the direction, or orientation, of an image gradient, and we’ve already discussed how both the magnitude and the direction of a gradient can be calculated using Sobel operators. So let’s put this all together: HOG should produce a histogram of gradient directions in an image.
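
The 32-bin grayscale histogram described above can be sketched in a few lines of NumPy. This is only an illustration: the function name `intensity_histogram` and the synthetic image (a dark background with a bright square standing in for the pancakes) are assumptions, not part of the original lesson, and the code assumes the number of bins divides 256 evenly.

```python
import numpy as np

def intensity_histogram(gray, n_bins=32):
    """Count how many pixels of an 8-bit grayscale image fall into each bin.

    Hypothetical helper for illustration; assumes n_bins divides 256 evenly.
    """
    gray = np.asarray(gray, dtype=np.uint8)
    bin_width = 256 // n_bins                    # 8 intensity values per bin for 32 bins
    bin_index = (gray // bin_width).astype(np.intp)  # 0..7 -> bin 0, 8..15 -> bin 1, ...
    return np.bincount(bin_index.ravel(), minlength=n_bins)

# Synthetic stand-in for the pancake image: dark background, bright centre.
image = np.full((64, 64), 20, dtype=np.uint8)
image[16:48, 16:48] = 200

hist = intensity_histogram(image)
dark_bin = 20 // 8      # bin holding the background value 20
bright_bin = 200 // 8   # bin holding the bright value 200
print(hist[dark_bin], hist[bright_bin])
```

With this toy image, every pixel lands in one of two bins, which mirrors the two groupings (dark background, bright pancakes) in the histogram discussed above.
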
As a first step, HOG takes in an image, like this pancake image, and calculates the magnitude and direction of the gradient at each pixel. This can be a lot of information, so it groups these pixels into larger square cells, typically eight by eight, or smaller grids for smaller pictures. In the eight by eight case, each cell holds 64 gradient values. Then, for each of these cells, it counts up how many of these gradients point in a certain direction and sums the magnitudes of these gradients so that the strength of the gradients is accounted for. Then HOG places all that directional data into a histogram. Here I’m showing you nine bins, or ranges of values, but you can choose to use more bins to further divide your data. And this is the histogram of oriented gradients that the pancake edge turns into, and HOG does this for every cell in the image.
This histogram of oriented gradients is actually a feature vector. The next step will be to use these HOG features to train a classifier. The idea is that among images of the same object at different scales and orientations, the same pattern of HOG features can be used to detect the object wherever and however it appears.
But first let’s see how to implement HOG in code. This will be a multi-step process, so in this and the next few examples, I’ll explain the algorithm in a video and then you can expect a code and text explanation to follow for further learning.
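
As a rough illustration of the per-cell step described above, the sketch below computes a gradient at every pixel, groups pixels into eight by eight cells, and builds a nine-bin histogram of directions weighted by gradient magnitude. Several choices here are assumptions rather than anything stated in the lesson: `np.gradient` (central differences) stands in for the Sobel operators, directions are treated as unsigned angles in the range 0 to 180 degrees, each gradient goes into exactly one bin with no interpolation, and the name `hog_cell_histograms` is hypothetical. A full HOG implementation typically adds further steps, such as block normalization, that are not shown.

```python
import numpy as np

def hog_cell_histograms(image, cell_size=8, n_bins=9):
    """Magnitude-weighted histograms of gradient direction, one per cell.

    Illustrative sketch only: central differences replace Sobel filtering,
    and orientations are assumed unsigned (0 to 180 degrees).
    """
    image = np.asarray(image, dtype=np.float64)

    # Gradient at each pixel: np.gradient returns d/drow first, then d/dcol.
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0

    rows = image.shape[0] // cell_size
    cols = image.shape[1] // cell_size
    hists = np.zeros((rows, cols, n_bins))

    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell_size, (r + 1) * cell_size),
                      slice(c * cell_size, (c + 1) * cell_size))
            # Each pixel votes for its direction bin with its gradient magnitude.
            hists[r, c], _ = np.histogram(angle[window], bins=n_bins,
                                          range=(0.0, 180.0),
                                          weights=magnitude[window])
    return hists

# Toy input: intensity increases left to right by 10 per pixel,
# so every gradient points along the x-axis (0 degrees).
ramp = np.tile(np.arange(16.0) * 10.0, (16, 1))
cell_hists = hog_cell_histograms(ramp)
print(cell_hists.shape)
print(cell_hists[0, 0])
```

For this ramp, all of the gradient strength in each cell falls into the first bin, since every pixel has the same direction. Flattening `cell_hists` would give one long feature vector of the kind the text describes passing to a classifier.
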
