I. Introduction
A power grid infrastructure project typically exhibits several characteristics [1], e.g., large scale, long duration, complex technology, and cascaded stages, to name a few. Throughout the design, construction, and acceptance-check processes, a vast amount of multi-format power transmission project data is generated from different sources. This data can be broadly divided into two categories [2]. (1) Structured Data: This kind of data is collected from design drawings, equipment nameplates, and closeout drawings. It consists of multi-class environment attribute data and multi-dimensional geographic information data at different scales. Under unified design standards, this data is used for the digital loading and visual representation of the physical characteristics and functional properties of the power transmission project. (2) Semi-structured and Unstructured Data: This kind of data is usually acquired from various design specifications, equipment test reports, equipment lists, etc. Stored in EXCEL, WORD, PDF, and other formats, this text data contains useful information relevant to the power grid topology, assets, and equipment. It also exists in the inspection, dispatching, and finance systems, where it covers items such as manufacturer and project cost. Compared with structured data, this kind of data lacks unified design standards or formats, which makes it difficult to store and manage in computer systems. In practice, it is frequently used by service personnel in their daily work.