I. Introduction
To make use of the wealth of unstructured text captured in electronic health records (EHRs) in clinical data intelligence applications, the text must first be semantically annotated. The automatic detection of medical information extraction (IE) classes is crucial for integrated decision support. Currently, enormous manual effort goes into acquiring the structured data needed for clinical data intelligence or as input for quality assurance processes. To automate this step, we introduce a semantically enhanced IE pipeline that extracts the relevant information from the EHRs of mamma carcinoma patients. We extract seven medical IE classes, namely the type of operation conducted, tumour size, tumour grading, lymph node status, hormone receptor state, HER2 state, and lymphatic spread. These IE classes, together with the patient's age, which is available in structured format, are considered the main indicators influencing the choice of therapeutic measure.