1. Introduction
Object detection is the task of localizing and classifying objects in a given image. Applications such as assisted and autonomous driving [9], navigation aids for the visually impaired [21], and robotics [13] use an object detector module as an integral step for motion planning, landmark detection, etc. The efficacy of these detectors on a set of images is typically measured using conventional metrics such as precision and recall. For evaluating performance on video input, True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN), determined using the Jaccard index criterion, are simply accumulated over all frames of the video. The Jaccard index, more commonly known as Intersection over Union (IoU), is the ratio of the area of intersection to the area of union of the bounding box predicted by the object detector under consideration and the corresponding ground-truth box. A detection is considered positive if its Jaccard index is greater than a threshold. Precision-Recall curves are then plotted for different values of the detector's threshold, and Mean Average Precision (mAP) is often used as a measure of how well the detector performs.
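The conventional protocol described above can be illustrated with a minimal sketch. It is not a reference implementation of any benchmark: the box format (x1, y1, x2, y2), the greedy one-to-one matching of predictions to ground-truth boxes, the default threshold of 0.5, and the function names iou, count_frame and evaluate_video are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Jaccard index (intersection over union) of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_frame(preds: List[Box], gts: List[Box], thr: float = 0.5):
    """Return (TP, FP, FN) for one frame.

    Assumption: each prediction is greedily matched to the remaining
    ground-truth box with the highest IoU; a match counts as a true
    positive only if that IoU is greater than `thr`.
    """
    unmatched = list(gts)
    tp = fp = 0
    for p in preds:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) > thr:
            tp += 1
            unmatched.remove(best)
        else:
            fp += 1
    # Ground-truth boxes left unmatched are missed detections.
    return tp, fp, len(unmatched)


def evaluate_video(frames, thr: float = 0.5):
    """Accumulate counts over all frames and return (precision, recall).

    `frames` is an iterable of (predictions, ground_truths) pairs.
    """
    tp = fp = fn = 0
    for preds, gts in frames:
        t, f, n = count_frame(preds, gts, thr)
        tp, fp, fn = tp + t, fp + f, fn + n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Because the counts are pooled before precision and recall are computed, every frame contributes equally regardless of where it falls in the video. Only TP, FP and FN are tallied in this sketch; sweeping the detector's threshold and recomputing the pair would yield the points of a Precision-Recall curve.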
Several factors affect the performance of detectors under different constraints. We discuss some of these constraints and situations, and propose an evaluation criterion that captures this information and brings the evaluation of object detectors closer to application performance.