I. Introduction
Tracking in wide-area motion imagery is a challenging research domain that is attracting considerable interest. Visualizing the hundreds to thousands of tracks produced by automated and manual tracking of objects poses new challenges in visualization and visual analytics. Even with standard video sequences, also known as full motion video (FMV), meaningful comparison of tracking algorithm behavior using quantitative performance metrics has been difficult due to the paucity of standard video datasets with associated manually labeled ground truth. However, recent work has led to the creation of extensive, open repositories of non-wide-area video datasets, tools, and ground truth. Notable examples include open source tools such as the Video Performance Evaluation Resource (ViPER-GT), a collection of scripts and Java programs [4] that supports metadata viewing and editing, ground truth generation, and annotation of video, and includes a frame-accurate MPEG-1 decoder; ViPER-PE, a scriptable, command-line performance evaluation tool; VirtualDub, for frame-accurate capture, playback, and filtering of video in AVI format; the Video Surveillance Online Repository (ViSOR) project [24], a web-based, dynamic, shareable, open repository of surveillance video sequences and annotations organized by an event ontology; and the Scoring, Truthing And Registration Toolkit (START), for semi-automated ground truth generation using a keyframe approach [20]. Several workshops, such as the PETS and VSSN series, and national-level projects, such as i-LIDS [7] and ETISEO [11], use the ViPER-XML annotation format in their video databases. Another important example is the ground truth motion database developed along with the layer segmentation and motion annotation tools at MIT CSAIL [9].
Since manual ground truth creation from real video is time-consuming and error-prone, an alternative approach has also been studied: using computer graphics tools to automatically generate precise ground truth from realistic synthetic video of virtual worlds observed by simulated cameras [12], [18], [19]. The NGA-coordinated Motion Imagery Standards Board (MISB) has been developing a metadata architecture and standards-based software interfaces for Video Moving Target Indicator (VMTI) systems to analyze and share activity-based GeoINT and tracking results, characterizing actions and interactions across a wide range of motion imagery, including FMV and Large Volume Streaming Data (LVSD) [21].