9 – 09 NonMaximal Suppression V1

For a test image broken down into a grid, how can we handle the case in which our grid CNN produces multiple grid cell vectors and multiple bounding boxes for the same object? To account for this, we use a technique called non-maximal suppression. This uses the IOU between two predicted bounding boxes to select the best bounding box.

Let’s see an example. In this example, these three grid cells have a non-zero PC value, meaning each of them has detected an object and predicted a bounding box for it. If we plot the bounding boxes predicted by each grid cell along with the value of PC, we get this: three bounding boxes for the same object. But the PC value associated with each box is different. We see that the first bounding box has a PC value of 0.8, the second has a value of 0.9, and the last has a value of 0.7. PC is a measure of confidence that an object has been detected, and so a high PC means a high confidence of object detection.

Now non-maximal suppression selects only the bounding box with the highest PC value. In this case, that is the red bounding box with a value of 0.9. It then removes all of the bounding boxes that have a high IOU value when compared to this best bounding box that I just selected. In this way it gets rid of the boxes that are too similar to the red bounding box. Then you’re left with just one bounding box, which should correspond to the best prediction. This is called non-maximal suppression, because you suppress overlapping bounding boxes that do not have the maximum probability for object detection.

So far we’ve been going through an example with only one object in the image. When there are multiple objects in an image, possibly of different classes, we have to apply non-maximal suppression to each class independently. So if we have three classes, we have to apply non-maximal suppression three times. Next, we’ll see another technique that YOLO uses to detect objects even if they overlap.
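
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the exact code YOLO uses. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates and that "a high IOU" means at or above a threshold of 0.5. The box format, the threshold value, and the function names are all assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) form (assumed format)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that survive, highest PC first.

    iou_threshold=0.5 is an assumed value for "too similar".
    """
    # Only boxes with a non-zero PC are candidates; sort them best first.
    order = sorted(
        (i for i in range(len(boxes)) if scores[i] > 0),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    while order:
        best = order.pop(0)  # box with the highest remaining PC
        keep.append(best)
        # Suppress every remaining box that overlaps the best box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


def nms_per_class(boxes, scores, classes, iou_threshold=0.5):
    """Apply non-maximal suppression to each class independently."""
    keep = []
    for c in sorted(set(classes)):
        idx = [i for i, k in enumerate(classes) if k == c]
        kept = non_max_suppression(
            [boxes[i] for i in idx], [scores[i] for i in idx], iou_threshold
        )
        keep.extend(idx[j] for j in kept)
    return keep


# Three heavily overlapping boxes with the PC values from the example.
# The coordinates are made up for illustration.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (8, 14, 48, 54)]
scores = [0.8, 0.9, 0.7]
print(non_max_suppression(boxes, scores))  # [1]
```

Only the box with PC 0.9 (index 1) survives, because the other two overlap it with an IOU above the assumed threshold. With `nms_per_class`, a box of a different class is kept even if it overlaps that winner, since each class is suppressed separately.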