I. Introduction
Recent years have witnessed an expansion in the installation of video monitoring equipment in public and private spaces. Within indoor intelligent environments where privacy is not a limitation, there is a growing need for linguistic summarization tools capable of summarizing, in layman's terms, the information of interest within the long video sequences recorded in such spaces. Such summarization can be used to automatically detect serious events that need immediate attention, such as attempted burglaries or serious injuries. Linguistic summarization can also provide valuable contextual information from the video that cannot be extracted by other sensors. For example, an important application in elderly care in intelligent environments is ensuring that the user drinks enough water throughout the day to avoid dehydration.