1. Introduction
The main challenge in tracking multiple objects in a video sequence stems from the open-world nature of the problem. Specifically, the number of objects present is unknown, and uncertainties and ambiguities in assigning detections to tracks (data association) make it difficult to decide whether a detection is a false positive or the start of a new track, and when a track should be established or terminated. Tracking objects in an online setting is even more challenging, because future frames are not available and a globally optimal solution therefore cannot be achieved. Locality in time means that an error, once made, can hardly be corrected later. For example, mistaking a false positive detection on a background structure for a new object can create a false track, which is then repeatedly confirmed by the same background structure.