I. Introduction
Object recognition in real-life digital images is a challenging task. Several difficulties arise from non-controlled acquisition environments, such as different illumination conditions, different poses of the subject, and cluttered scenes [1]. State of the art solutions typically address these problems through deep learning techniques, which take advantage from large quantities of training data to build robust and well-generalizing classifiers [2].