1 Introduction
Developers often hold natural-language discussions on social platforms (such as Stack Overflow and CodeProject) to share and acquire programming knowledge. Consequently, many techniques based on natural language processing (NLP) have been proposed and developed to mine programming knowledge from such informal discussions. The mined knowledge can assist developers in many software engineering tasks, such as searching for documents [1], [2], categorizing software technologies [3], extracting API mentions and usage insights [4], [5], recovering traceability among informal discussions (e.g., duplicate questions [6]) or between code and informal discussions [7], linking domain-specific entities in informal discussions to official documents [8], and mining technology landscapes [9], [10]. To make effective use of NLP techniques in these tasks, a consistently used vocabulary of software-specific terms is essential, since NLP techniques assume that the same words are used whenever a particular concept is mentioned.