I. Introduction
Transfer learning (TL) has emerged as a cornerstone of natural language processing (NLP), enabling the development and deployment of new kinds of language models. Traditionally, a strong NLP model had to be trained on large amounts of task-specific data, which is both time consuming and resource intensive. Transfer learning addresses this problem by allowing pre-trained models to be fine-tuned on new tasks with only a small amount of labeled data, which in turn reduces the time and computational cost of the training stage (Salehi et al., 2023). This has enabled state-of-the-art performance even in domains where labeled data are scarce.