I. Introduction
Detection of malicious binaries remains one of the major challenges in computer security [22]. To counter their growing number, sophistication, and variability, anti-malware companies are increasingly adopting machine learning-based solutions [13]. Although past research on binary malware detection has explored the use of traditional learning algorithms on n-gram-based, system-call-based, or behavior-based features [1], [19], [21], [26], more recent work has considered applying deep-learning algorithms directly to raw bytes as an effective way to improve accuracy on a wide range of samples [18]. The rationale is that such algorithms should automatically learn the relationships among the various sections of the executable file, thus extracting features that correctly represent the role of specific byte groups in specific sections (e.g., whether a byte belongs to the code section or merely to a section pointer).