1 Introduction
Many domains now try to gain insight into complex phenomena by logging their behavior. Telecom companies, for instance, analyze their communication networks for signs of fraud, hospitals analyze patient treatments to discover bottlenecks in the process, and companies study their workflows to improve customer satisfaction. What these domains have in common is an interest in analyzing the sequences in their systems (e.g., phone calls, treatments, workflows) by recording events. Without loss of generality, we define a sequence (a.k.a. trace, record, session, case, or conversation) as a series of events that share the same sequence_id. Besides their type and temporal information, events often carry additional attributes (e.g., status code, source, length), depending on the domain. In addition, the number of events in real-world data is typically on the order of millions or more.
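
To make the definition concrete, the following is a minimal Python sketch of how a flat event log could be grouped into sequences by sequence_id. Only sequence_id comes from the definition above; the field names event_type, timestamp, and attributes, the helper group_into_sequences, and the sample telecom events are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Event:
    sequence_id: str   # shared by all events of one sequence
    event_type: str    # assumed name for the event's type
    timestamp: float   # assumed representation of temporal information
    attributes: Dict[str, Any] = field(default_factory=dict)  # domain-specific extras


def group_into_sequences(events: List[Event]) -> Dict[str, List[Event]]:
    """Group events by sequence_id and order each group by timestamp."""
    sequences: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        sequences[event.sequence_id].append(event)
    for seq in sequences.values():
        seq.sort(key=lambda e: e.timestamp)
    return dict(sequences)


# Illustrative log (made-up data): two interleaved phone calls.
log = [
    Event("call-1", "dial", 0.0, {"source": "A"}),
    Event("call-2", "dial", 1.0, {"source": "B"}),
    Event("call-1", "connect", 2.5, {"status_code": 200}),
    Event("call-1", "hangup", 60.0, {"length": 57.5}),
]
sequences = group_into_sequences(log)
```

Here sequences maps each sequence_id to its time-ordered list of events. An in-memory dictionary is only one possible representation; with millions of events, a streaming or database-backed grouping would likely be needed.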