Proposed Retrieval Framework
The proposed concept retrieval framework is shown in Figure 1. The training and testing data for each concept are prepared using existing feature extraction methods. The shot-based feature set consists of 28 features: 16 audio features (including volume-based features, energy-based features, and zero-crossing rate), 11 visual features (including pixel-based, histogram-based, and background-based features), and one meta-feature (the shot length). A set of 20 keyframe-based visual features (including color, texture, and edge features) is merged with the shot-based feature set to form the final set of 48 features. After data preprocessing, the extracted features for all video segments are split into two sets: a training data set (two thirds of the data) and a testing data set (the remaining one third). The training data set is then given as input to the filtering component, and the ranking component is trained on the filtered data. Finally, the testing video segments are ranked by the trained ranking component and retrieved in ranked order.
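The feature merging and the two-thirds/one-third split described above can be illustrated with a minimal sketch. The function names (merge_features, split_two_thirds), the synthetic data, and the use of a seeded random shuffle are assumptions made for illustration only; the text does not state how segments are assigned to the two sets, and the filtering and ranking components are not modeled here.

```python
import numpy as np

# Feature counts taken from the text.
N_AUDIO, N_VISUAL, N_META, N_KEYFRAME = 16, 11, 1, 20


def merge_features(shot_features, keyframe_features):
    """Concatenate shot-based and keyframe-based features per video segment."""
    if shot_features.shape[0] != keyframe_features.shape[0]:
        raise ValueError("both feature sets must describe the same segments")
    return np.hstack([shot_features, keyframe_features])


def split_two_thirds(features, labels, seed=0):
    """Split into 2/3 training and 1/3 testing data.

    Assumption: segments are shuffled at random with a fixed seed;
    the original split strategy is not specified.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_train = (2 * len(features)) // 3
    train_idx, test_idx = order[:n_train], order[n_train:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])


# Synthetic stand-in data (hypothetical, not from the paper).
rng = np.random.default_rng(42)
n_segments = 300
shot = rng.normal(size=(n_segments, N_AUDIO + N_VISUAL + N_META))
keyframe = rng.normal(size=(n_segments, N_KEYFRAME))
labels = rng.integers(0, 2, size=n_segments)  # 1 = concept present

merged = merge_features(shot, keyframe)
X_train, y_train, X_test, y_test = split_two_thirds(merged, labels)
print(merged.shape, X_train.shape, X_test.shape)
# prints (300, 48) (200, 48) (100, 48)
```

In this sketch the training arrays would be passed to the filtering component and the test arrays to the trained ranking component; a stratified or chronological split could be substituted if the concept labels are highly imbalanced or the segments are temporally ordered.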