Power Text Data Preprocessing of Power Grid Infrastructure Project based on Skip-gram Model

Abstract:

Power grid infrastructure projects are large in scale, long in duration, and often involve many parties. They produce a large amount of data, which serves as an important original data source for the operation, maintenance, and asset management systems of power supply enterprises. However, manual analysis cannot cope with unstructured natural-language text or nonstandard semi-structured tabular data. To address this issue, the different forms of data arising in power grid infrastructure projects are first analyzed in depth. Then, a data cleaning technique is used to eliminate noise in the original low-quality data. Finally, a skip-gram model is built to convert the text data into word embedding vectors. The preprocessed data retains contextual semantic information, which makes it more suitable for data mining. Extensive simulation experiments demonstrate the effectiveness of the proposed method.

Published in: 2023 IEEE 6th International Electrical and Energy Conference (CIEEC)

Date of Conference: 12-14 May 2023

Date Added to IEEE Xplore: 10 July 2023

DOI: 10.1109/CIEEC58067.2023.10165862

Conference Location: Hefei, China

I. Introduction

Power grid infrastructure projects typically exhibit several characteristic features [1], e.g., large scale, long duration, complex technology, and cascaded stages. Throughout the design, construction, and acceptance-check processes, a vast amount of multi-format power transmission project data is generated from different sources. This data falls broadly into two categories [2]. (1) Structured data: This kind of data is collected from design drawings, equipment nameplates, and closeout drawings. It consists of multi-class environmental attribute data and multi-dimensional geographic information data at different scales. Under unified design standards, this data is used for the digital loading and visual expression of the physical characteristics and functional properties of the power transmission project. (2) Semi-structured and unstructured data: This kind of data is usually acquired from design specifications, equipment test reports, equipment lists, etc. Stored in Excel, Word, PDF, and other formats, the text contains useful information relevant to the power grid topology, assets, and equipment. It also exists in the inspection, dispatching, and finance systems, e.g., as manufacturer and project cost records. Compared with structured data, this kind of data lacks unified design standards or formats, so it is hard to store in computer systems. In practice, it is frequently used by service personnel in their work.
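As a rough illustration of the preprocessing outlined in the abstract (cleaning noisy text, then preparing it for a skip-gram model), the sketch below normalizes a free-text record and generates the (center word, context word) pairs a skip-gram model is trained on. It is not the authors' implementation: the function names `clean_text` and `skipgram_pairs`, the sample record, the regular-expression cleaning rule, and the window size are all assumptions made for this example. It also assumes whitespace-separable text; Chinese-language records would need a word segmenter first.

```python
import re
from typing import Iterator, List, Tuple


def clean_text(raw: str) -> List[str]:
    """Hypothetical cleaning step: lowercase the text, replace every run of
    non-alphanumeric characters with a space, and split into tokens."""
    normalized = re.sub(r"[^a-z0-9]+", " ", raw.lower())
    return normalized.split()


def skipgram_pairs(tokens: List[str], window: int = 2) -> Iterator[Tuple[str, str]]:
    """Yield (center, context) pairs: each token is paired with every other
    token that lies within `window` positions of it."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


# Made-up sample record, standing in for a line from an equipment list.
record = "Main transformer: rated voltage 110 kV!!"
tokens = clean_text(record)
pairs = list(skipgram_pairs(tokens, window=1))

print(tokens)
print(pairs[:4])
```

A skip-gram model would then learn one vector per word by training to predict the context word from the center word in each pair; that training step is omitted here.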

References

[1] X. Peng, D. Deng, S. Cheng et al., "Key technologies of electric power big data and its application prospects in smart grid", Proceedings of the CSEE, vol. 35, no. 3, pp. 503-511, 2015.

[2] M. Fang and L. Hu, "High-efficiency large-scale power grid engineering data analysis based on preprocessing iterative method", International Electronic Elements, vol. 30, no. 08, pp. 171-175, 2022.

[3] D. He, N. Kumar, S. Zeadally et al., "Efficient and privacy-preserving data aggregation scheme for smart grid against internal adversaries", IEEE Transactions on Smart Grid, vol. 8, no. 5, pp. 2411-2419, 2017.

[4] S. Liu, S. Zhang, L. Zheng et al., "Fine analysis of power network state estimation based on big data", Power Systems and Big Data, vol. 23, no. 07, pp. 9-15, 2020.

[5] B. Xu, J. Su, X. Zhang et al., "Exploration of early warning and decision-making of power grid infrastructure projects based on big data analysis", Management innovation practice of China's power enterprises, pp. 466-468, 2020.

[6] Y. Zeng, X. N. Li and X. Z. Liu, "Research on parallelization clustering algorithm for power communication big data", Application of Electronic Technique, vol. 44, no. 05, pp. 1-4+24, 2018.

[7] S. P. Liang, "Design and implementation of power equipment operation data analysis system based on big data", North China Electric Power University, 2018.

[8] A. Lazaridou, N. T. Pham and M. Baroni, "Combining language and vision with a multimodal skip-gram model", arXiv preprint, 2015.

[9] S. Bauskar, V. Badole, P. Jain et al., "Natural language processing based hybrid model for detecting fake news using content-based features and social features", International Journal of Information Engineering and Electronic Business, vol. 10, no. 4, 2019.

[10] T. A. Patel, M. Puppala, R. O. Ogunti et al., "Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods", Cancer, vol. 123, no. 1, pp. 114-121, 2017.

[11] P. Wang, J. Xu, B. Xu et al., "Semantic clustering and convolutional neural network for short text categorization", Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, vol. 2, pp. 352-357, 2015.

[12] M. Goudjil, M. Koudil, M. Bedda et al., "A novel active learning method using SVM for text classification", International Journal of Automation and Computing, vol. 15, no. 3, pp. 290-298, 2018.
