
Towards Unified Multi-Domain Machine Translation With Mixture of Domain Experts


Abstract:

Multi-domain machine translation (MDMT) aims to construct models from mixed-domain training corpora that can switch translation between different domains. Previous studies either assume that the domain information is given and leverage this domain knowledge to guide the translation process, or suppose that the domain information is unknown and let the model recognize it automatically. In practical scenarios, however, the cases are mixed: some sentences are labeled with domain information while others are unlabeled, which is beyond the capacity of previous methods. In this article, we propose a unified MDMT model with a mixture of sub-networks (experts) to address the cases with or without domain labels. The mixture of sub-networks in our MDMT model includes a shared expert and multiple domain-specific experts. For inputs with domain labels, our MDMT model goes through the shared expert and the corresponding domain-specific expert. For unlabeled inputs, our MDMT model activates all the experts, each of which makes a dynamic contribution. Experimental results on multiple diverse domains in De→En, Fr→En, and En→Ro demonstrate that our method outperforms strong baselines in both scenarios, with or without domain labels. Further analyses show that our model has good generalization ability when transferring to new domains.
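
The routing described above can be pictured with a minimal sketch (an illustration only, not the authors' released implementation): the class name, the feed-forward form of each expert, the additive combination with the shared expert, and the softmax gating are all assumptions made for the example.

from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfDomainExperts(nn.Module):
    """Illustrative sketch: one shared expert plus per-domain experts with gating."""

    def __init__(self, d_model: int, num_domains: int, d_ff: int = 2048):
        super().__init__()

        def make_expert() -> nn.Module:
            # Each expert is modeled here as a simple position-wise feed-forward block.
            return nn.Sequential(
                nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
            )

        self.shared_expert = make_expert()
        self.domain_experts = nn.ModuleList([make_expert() for _ in range(num_domains)])
        # Gating network that scores the domain experts for unlabeled inputs.
        self.gate = nn.Linear(d_model, num_domains)

    def forward(self, x: torch.Tensor, domain_id: Optional[int] = None) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        shared_out = self.shared_expert(x)
        if domain_id is not None:
            # Labeled input: route through the shared and the matching domain expert.
            return shared_out + self.domain_experts[domain_id](x)
        # Unlabeled input: activate every domain expert, each weighted dynamically.
        weights = F.softmax(self.gate(x), dim=-1)                  # (batch, seq, num_domains)
        expert_outs = torch.stack([e(x) for e in self.domain_experts], dim=-1)
        mixed = (expert_outs * weights.unsqueeze(-2)).sum(dim=-1)  # (batch, seq, d_model)
        return shared_out + mixed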
Page(s): 3488 - 3498
Date of Publication: 19 September 2023



I. Introduction

Machine translation (MT) quality has significantly improved with the development of neural machine translation (NMT) [1], [2], which typically relies on large amounts of high-quality parallel sentences. However, in some low-resource domains, parallel sentences are too limited to support training a good NMT model. Moreover, exhausting all the potential domains and training a separate NMT model for each is expensive and impractical. To address this problem, multi-domain (MD) MT [3], [4] has been proposed to construct models from mixed-domain training corpora that switch translation between different domains. MDMT has multiple advantages: 1) when faced with inputs that potentially come from multiple domains, MDMT is effective and cheap to deploy [5]; 2) MDMT allows domains to share information and improves performance on related domains, echoing the findings in multilingual translation [6]; 3) MDMT generalizes well and benefits low-resource domains.
