
Research frontiers of pre-training mathematical models based on BERT



Abstract:

Natural language processing (NLP) has become a popular technology with the rise of big data and machine learning in recent years. With the development of deep learning, the field of natural language processing has undergone a landmark transformation, including the emergence of BERT, a large-scale pre-trained language model. BERT has brought a qualitative leap to text mining, meets more practical needs, and addresses the problem of feature vectorization for unstructured data. This article reviews the underlying ideas, task applications, and main optimization and improvement methods of the BERT pre-training model released by Google, providing a reference for subsequent BERT-based research and development.
Date of Conference: 25-27 February 2022
Date Added to IEEE Xplore: 06 April 2022
Conference Location: Changchun, China

I. Overview of Pre-Trained Language Models

Pre-trained language models are currently the most effective approach to natural language modeling. In early 2018, Peters et al. proposed the ELMo model, which introduced a pre-training process based on the idea of a bidirectional language model. Shortly afterwards, Radford et al. proposed the GPT model, which adopted the Transformer as its basic architecture and was pre-trained on ultra-large-scale unsupervised text data, achieving significant performance gains on natural language generation tasks. Yu Tongrui and Jin Ran (2020) describe pre-training technology as first designing a network structure and then feeding encoded data into that structure for training, so as to improve the generalization ability of the model. Pre-training was originally proposed for problems in the image domain (e.g., ResNet, VGG); because of its good results, related techniques were later applied to NLP. Viewed in historical order, the development of pre-training technology falls into two stages: the traditional pre-training stage based on probability and statistics, and the pre-training stage based on deep learning. Chen Deguang, Ma Jinlin, et al. (2021) point out that both neural-network pre-training and traditional pre-training require the corpus to be pre-processed. Specifically, pre-processing consists of cleaning the original corpus (including removing blanks and invalid tags, removing symbols and stop words, segmenting documents, basic error correction, and encoding conversion), word segmentation (needed only for languages such as Chinese whose words are not explicitly delimited), and normalization, so as to transform the corpus into a form the machine can recognize.
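To make the pre-processing steps listed above concrete, the following is a minimal Python sketch (not taken from the paper) of such a pipeline: cleaning, symbol and stop-word removal, word segmentation, and normalization. The stop-word list and regular expressions are illustrative placeholders; a production pipeline would use language-specific resources, and Chinese text would be segmented with a dedicated tool such as jieba rather than a whitespace split.

```python
import re
import unicodedata

# Illustrative stop-word list; a real pipeline would load a full
# language-specific list (these entries are assumptions, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and"}


def preprocess(raw_text: str) -> list[str]:
    """Sketch of the pre-processing steps described above: cleaning,
    symbol/stop-word removal, word segmentation, and normalization."""
    # Normalization: unify full-/half-width forms and case (encoding conversion).
    text = unicodedata.normalize("NFKC", raw_text).lower()
    # Cleaning: strip markup-like tags and collapse runs of blanks.
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    # Symbol removal: keep word characters and CJK characters only.
    text = re.sub(r"[^\w\u4e00-\u9fff ]+", " ", text)
    # Word segmentation: a whitespace split suffices for English; Chinese
    # text would instead be segmented with a tool such as jieba.lcut(text).
    tokens = text.split()
    # Stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("<p>The BERT model of   Google!</p>"))
    # -> ['bert', 'model', 'google']
```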

References
1. M. Peters, M. Neumann, M. Iyyer et al., "Deep contextualized word representations", Proc. of the 2018 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 2227-2237, 2018.
2. A. Radford, K. Narasimhan, T. Salimans et al., "Improving language understanding by generative pre-training", 2019, [online] Available: https://s3-uswest-2.amazonaws.com/openaiassets/researchcovers/languageunsupervised/languageunderstandingpaper.pdf.
3. Tongrui Yu, Ran Jin, Xiaozhen Han, Jiahui Li and Ting Yu, "Research review of natural language processing pre-training models", Computer Engineering and Applications, no. 23, pp. 12-22, 2020.
4. Deguang Chen, Jinlin Ma, Ziping Ma and Jie Zhou, "A review of natural language processing pre-training techniques", Computer Science and Exploration, no. 08, 2021.
5. Ruiheng Liu, Xia Ye and Yingyue Zeng, "Overview of pre-training models for natural language processing tasks", Computer Applications, vol. 41, no. 5, pp. 1236-1246, 2021.
6. Zhiwei Feng and Ying Li, "Pre-training paradigm in natural language processing", no. 01, 2021.
7. Liwei Qiu, Weili Guan and Wujin Zhang, "Inquiry into the BERT language model", Computer Programming Skills and Maintenance, no. 01, 2021.
8. Zu, "BERT-based Chinese text vectorization representation", Technology and Innovation, no. 21, pp. 107-108, 2021.
9. Wei Mingfei, Pan Ji, Chen Zhimin and Mei Xiaohua, "Shi Huipeng pre-training method under the space intelligence entity identification model", Journal of Huaqiao University (Natural Science Edition), no. 06, pp. 834, 2021.
10. Duan Ruixue, Chao Wenyu and Zhang Yangsen, "Application of the pre-trained language model BERT to downstream tasks", Journal of Beijing University of Information Technology, no. 06, pp. 78, 2020.
11. Liu Huan, Zhang Zhixiong and Wang Yufei, "Review of the main optimization and improvement methods of the BERT model", Data Analysis and Knowledge Discovery, no. 01, 2021.
12. Feng Zhiwei and Li Ying, "Pre-training paradigm in natural language processing", no. 01, pp. 13, 2021.
13. Duan Ruixue, Chao Wenyu and Zhang Yangsen, "Application of the pre-trained language model BERT in downstream tasks", Journal of Beijing University of Information Technology, no. 06, pp. 79-82, 2020.
14. Liu Chuang, Application of BERT in Text Analysis, dissertation, 2020.
15. Liu Huan, Zhang Zhixiong and Wang Yufei, "Review of the main optimization and improvement methods of the BERT model", Data Analysis and Knowledge Discovery, vol. 5, no. 1, pp. 4, 2021.
16. Dou Yuchen and Hu Yong, "BERT", Information Security Research, no. 03, pp. 242-249, 2021.
17. Liu Wenxiu, Li Yanmei, Luo Jian, Li Wei and Fu Shunbing, "Chinese analysis based on BERT and BiLSTM", Journal of Taiyuan Normal University (Natural Science Edition), no. 04, pp. 52-57, 2020.
18. Fang Ziqing and Chen Yifei, "Short-text similarity discriminant model based on BERT", Computer Knowledge and Technology, no. 05, 2021.
19. Zhang Jingyi, He Guanghui, Dai Zhou and Liu Yadong, "BERT", Journal of Shanghai Jiao Tong University, no. 02, pp. 117-123, 2021.
20. Lu Wei, Li Pengcheng, Zhang Guobiao and Cheng Qikai, "Academic text vocabulary function recognition: research on automatic classification of keywords based on BERT vectorized representation", Intelligence Journal, no. 12, pp. 1320-1329, 2020.
21. Yue Yifeng, Huang Wei and Ren Xianghui, "Based on BERT", Computers and Modernization, no. 01, pp. 63-64, 2020.
