
Research frontiers of pre-training mathematical models based on BERT



Abstract:

Natural language processing (NLP) has become a popular technology with the rise of big data and machine learning in recent years. With the development of deep learning, the field of natural language processing has undergone a landmark transformation, including the emergence of BERT, a large-scale pre-trained language model. BERT has brought a qualitative leap to text mining, meets more practical needs, and addresses the problem of feature vectorization for unstructured data. This article reviews the underlying ideas, task applications, and main optimization and improvement methods of the BERT pre-training model released by Google, providing a reference for subsequent BERT-based research and development.
Date of Conference: 25-27 February 2022
Date Added to IEEE Xplore: 06 April 2022
Conference Location: Changchun, China

I. Overview of Pre-Trained Language Models

Pre-trained language models are currently the most effective approach to natural language modeling. In early 2018, Peters et al. proposed the ELMo model, which introduced a pre-training process based on the idea of a bidirectional language model. Shortly afterwards, Radford et al. proposed the GPT model, which adopted the Transformer as its basic architecture and was pre-trained on ultra-large-scale unsupervised text data, achieving significant performance gains on natural language generation tasks. Yu Tongrui and Jin Ran (2020) describe pre-training technology as first designing a network structure and then feeding encoded data into that structure for training, so as to improve the generalization ability of the model. Pre-training was originally proposed for problems in the image domain (e.g., ResNet, VGG); because of its good results, related techniques were later applied to NLP. Viewed in historical order, the development of pre-training technology falls into two stages: the traditional pre-training stage based on probability and statistics, and the pre-training stage based on deep learning. Chen Deguang, Ma Jinlin, et al. (2021) point out that both neural-network pre-training and traditional pre-training require the corpus to be pre-processed. Specifically, pre-processing consists of cleaning the original corpus (including removing blanks and invalid tags, removing symbols and stop words, segmenting documents, basic error correction, and encoding conversion), word segmentation (needed only for languages such as Chinese whose words are not explicitly delimited), and normalization, so as to transform the corpus into a form the machine can recognize.
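To make the pre-processing steps listed above concrete, the following is a minimal Python sketch (not taken from the paper) of such a pipeline: cleaning, symbol and stop-word removal, word segmentation, and normalization. The stop-word list and regular expressions are illustrative placeholders; a production pipeline would use language-specific resources, and Chinese text would be segmented with a dedicated tool such as jieba rather than a whitespace split.

```python
import re
import unicodedata

# Illustrative stop-word list; a real pipeline would load a full
# language-specific list (these entries are assumptions, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and"}


def preprocess(raw_text: str) -> list[str]:
    """Sketch of the pre-processing steps described above: cleaning,
    symbol/stop-word removal, word segmentation, and normalization."""
    # Normalization: unify full-/half-width forms and case (encoding conversion).
    text = unicodedata.normalize("NFKC", raw_text).lower()
    # Cleaning: strip markup-like tags and collapse runs of blanks.
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    # Symbol removal: keep word characters and CJK characters only.
    text = re.sub(r"[^\w\u4e00-\u9fff ]+", " ", text)
    # Word segmentation: a whitespace split suffices for English; Chinese
    # text would instead be segmented with a tool such as jieba.lcut(text).
    tokens = text.split()
    # Stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("<p>The BERT model of   Google!</p>"))
    # -> ['bert', 'model', 'google']
```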

References
1. M. Peters, M. Neumann, M. Iyyer et al., "Deep contextualized word representations", Proc. of the 2018 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 2227-2237, 2018.
2. A. Radford, K. Narasimhan, T. Salimans et al., "Improving language understanding by generative pre-training", 2019, [online] Available: https://s3-uswest-2.amazonaws.com/openaiassets/researchcovers/languageunsupervised/languageunderstandingpaper.pdf.
3. Tongrui Yu, Ran Jin, Xiaozhen Han, Jiahui Li and Ting Yu, "Research review of natural language processing pre-training models", Computer Engineering and Applications, no. 23, pp. 12-22, 2020.
4. Deguang Chen, Jinlin Ma, Ziping Ma and Jie Zhou, "A review of natural language processing pre-training techniques", Computer Science and Exploration, no. 08, 2021.
5. Ruiheng Liu, Xia Ye and Yingyue Zeng, "Overview of pre-training models for natural language processing tasks", Computer Applications, vol. 41, no. 5, pp. 1236-1246, 2021.
6. Zhiwei Feng and Ying Li, "Pre-training paradigm in natural language processing", no. 01, 2021.
7. Liwei Qiu, Weili Guan and Wujin Zhang, "Inquiry into the BERT language model", Computer Programming Skills and Maintenance, no. 01, 2021.
8. Zu, "BERT-based Chinese text vectorization representation", Technology and Innovation, no. 21, pp. 107-108, 2021.
9. Wei Mingfei, Pan Ji, Chen Zhimin and Mei Xiaohua, "Shi Huipeng pre-training method under the space intelligence entity identification model", Journal of Huaqiao University (Natural Science Edition), no. 06, pp. 834, 2021.
10. Duan Ruixue, Chao Wenyu and Zhang Yangsen, "Application of the pre-trained language model BERT to downstream tasks", Journal of Beijing University of Information Technology, no. 06, pp. 78, 2020.
11. Liu Huan, Zhang Zhixiong and Wang Yufei, "Review of the main optimization and improvement methods of the BERT model", Data Analysis and Knowledge Discovery, no. 01, 2021.
12. Feng Zhiwei and Li Ying, "Pre-training paradigm in natural language processing", no. 01, pp. 13, 2021.
13. Duan Ruixue, Chao Wenyu and Zhang Yangsen, "Application of the pre-trained language model BERT in downstream tasks", Journal of Beijing University of Information Technology, no. 06, pp. 79-82, 2020.
14. Liu Chuang, Application of BERT in Text Analysis, dissertation, 2020.
15. Liu Huan, Zhang Zhixiong and Wang Yufei, "Review of the main optimization and improvement methods of the BERT model", Data Analysis and Knowledge Discovery, vol. 5, no. 1, pp. 4, 2021.
16. Dou Yuchen and Hu Yong, "BERT", Information Security Research, no. 03, pp. 242-249, 2021.
17. Liu Wenxiu, Li Yanmei, Luo Jian, Li Wei and Fu Shunbing, "Chinese analysis based on BERT and BiLSTM", Journal of Taiyuan Normal University (Natural Science Edition), no. 04, pp. 52-57, 2020.
18. Fang Ziqing and Chen Yifei, "Short-text similarity discriminant model based on BERT", Computer Knowledge and Technology, no. 05, 2021.
19. Zhang Jingyi, He Guanghui, Dai Zhou and Liu Yadong, "BERT", Journal of Shanghai Jiao Tong University, no. 02, pp. 117-123, 2021.
20. Lu Wei, Li Pengcheng, Zhang Guobiao and Cheng Qikai, "Academic text vocabulary function recognition: research on automatic classification of keywords based on BERT vectorized representation", Intelligence Journal, no. 12, pp. 1320-1329, 2020.
21. Yue Yifeng, Huang Wei and Ren Xianghui, "Based on BERT", Computers and Modernization, no. 01, pp. 63-64, 2020.
