I. Introduction
Software vulnerabilities, once disclosed, are documented in security databases such as NVD [1], IBM X-Force [2], and ExploitDB [3]. The key characteristics of a vulnerability are usually described in natural language, as in the examples shown in Fig. 1. These characteristics typically include the vulnerable product and its versions, the product vendor, the root cause, the attack vector, and the impact of the vulnerability. Although these databases provide rich information about known vulnerabilities, security analysts must manually identify and extract the key information of interest from textual vulnerability descriptions (TVDs). Automatic information extraction is therefore highly desirable to expedite vulnerability analysis and security research, for example, finding all vulnerabilities of a product that have a certain impact, establishing traceability links between related vulnerabilities in different databases, or detecting discrepancies between reports that different people created for the same vulnerability [4]–[11].