1. Introduction
We propose an approach for determining whether and when an action is performed in a large video database. We focus on actions of short duration (typically a few seconds), such as sitting down or opening a door, and on databases containing several hours of real-world video (e.g. movies). Our approach is based on decomposing actions into sequences of key atomic action units, which we refer to as actoms.