Machine translation is an important and challenging task that aims at automatically translating natural language sentences from one language into another. Recently, Transformer-based neural machine translation (NMT) has achieved great breakthroughs and become the new mainstream approach in both methodology and applications. This article, from the research team of Prof. Zong Chengqing at the Institute of Automation, Chinese Academy of Sciences, presents an overview of Transformer-based NMT and its extension to other tasks. Specifically, it first introduces the Transformer framework, discusses the main challenges in NMT and lists representative methods for each challenge. It then surveys the public resources and toolkits for NMT. Meanwhile, the extensions of Transformer to other tasks, including other natural language processing tasks, computer vision tasks, audio tasks and multi-modal tasks, are briefly presented. Finally, possible future research directions are suggested.
From Springer
Machine translation (MT) aims at automatically translating natural language sentences from one language into another using computers. Since the first MT system was proposed, it has been one of the most important and challenging tasks in natural language processing (NLP) and even in the wider artificial intelligence community. With the efforts of many researchers, MT has achieved remarkable progress in both methodology and applications.
Fig. 1 Encoder-decoder framework, where encoder transforms the source sentence into hidden states and decoder generates target translation from the hidden states
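As a rough illustration of this encoder-decoder idea (a minimal sketch, not code from the article; the module choices, token ids and dimensions below are purely illustrative), the encoder reads the source sentence into hidden states and the decoder then emits target tokens one at a time, conditioned on those states:

# Minimal encoder-decoder sketch: the encoder maps source token embeddings to
# hidden states, and the decoder generates target tokens step by step.
import torch
import torch.nn as nn

src_vocab, tgt_vocab, dim = 1000, 1000, 256   # illustrative sizes

embed_src = nn.Embedding(src_vocab, dim)
encoder = nn.GRU(dim, dim, batch_first=True)

embed_tgt = nn.Embedding(tgt_vocab, dim)
decoder = nn.GRU(dim, dim, batch_first=True)
project = nn.Linear(dim, tgt_vocab)

src = torch.randint(0, src_vocab, (1, 7))          # a toy source sentence
enc_states, enc_last = encoder(embed_src(src))     # hidden states of the source

# Greedy decoding: feed the previously generated token back in at each step.
token = torch.zeros(1, 1, dtype=torch.long)        # assume id 0 is <bos>
hidden = enc_last
for _ in range(10):
    out, hidden = decoder(embed_tgt(token), hidden)
    token = project(out).argmax(-1)                 # pick the next target token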
With the rapid development of machine learning and the availability of large-scale parallel corpora, statistical machine translation (SMT) approaches appeared in the 1990s and drew much attention. Instead of relying on manually designed translation rules, SMT learns the language model and word or phrase mappings automatically from parallel corpora. However, SMT represents source and target sentences as discrete symbolic tokens, which makes it difficult to capture semantic similarities between words and phrases, so its performance is far from satisfactory.
With the breakthrough of deep learning, many studies have incorporated deep neural networks into MT. Early studies were still based on the SMT framework, where deep neural networks were utilized to design new features or extract more accurate semantic representations. In 2013 and 2014, end-to-end neural machine translation (NMT) emerged as a new paradigm and quickly replaced SMT as the mainstream approach. NMT adopts distributed representations of sentences and uses a single neural network to learn the mapping from source sentences to target sentences. In only a few years of development, the translation quality of NMT has improved significantly and exceeded that of SMT. In practice, many companies (such as Google, Microsoft and Baidu) have deployed their own online translation systems and provide users with increasingly high-quality translation services.
From the perspective of NMT architectures, the early models are recurrent neural network (RNN) based and convolutional neural network (CNN) based NMT, which use RNNs and CNNs to compute representations of the source sentence and predict the target sentence. In 2017, a new framework, self-attention based NMT (Transformer), was proposed and sharply advanced the field of NMT. At present, Transformer has become the dominant architecture for machine translation, surpassing CNN- and RNN-based NMT in both translation quality and training speed. Meanwhile, Transformer goes far beyond NMT and has been extended to other tasks, such as other natural language processing tasks, computer vision tasks, audio tasks and multimodal tasks.
Fig. 3 Model structure of Transformer
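At the core of the Transformer structure shown in Fig. 3 is scaled dot-product self-attention, in which every position of a sentence attends to every other position. A minimal single-head sketch (omitting multi-head splitting, masking and residual/normalization layers; the names and sizes are illustrative, not the article's code) looks as follows:

# Scaled dot-product self-attention over the hidden states of one sentence.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    """x: (batch, length, dim) hidden states; w_q/w_k/w_v: (dim, dim) projections."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # pairwise similarities
    weights = torch.softmax(scores, dim=-1)                    # attention distribution
    return weights @ v                                          # weighted sum of values

dim = 256
x = torch.randn(1, 7, dim)                                      # a toy source sentence
w_q, w_k, w_v = (torch.randn(dim, dim) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)                          # (1, 7, dim)

Stacking such attention layers with position-wise feed-forward layers in both the encoder and the decoder yields the full Transformer architecture.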
This article attempts to give a survey of Transformer-based NMT, including the frameworks, the main challenges, the representative methods for each challenge and the available data and toolkits in NMT. It also briefly presents the extensions of Transformer to other NLP tasks, including pre-trained language models, text summarization, dialogue and knowledge graphs. Finally, possible future research directions are discussed.
The remainder of this survey is organized as follows. Section 2 introduces the encoder-decoder framework and Transformer. Section 3 lists the main challenges of NMT. Section 4 presents the representative approaches for each challenge. Section 5 shows the resources and toolkits in NMT. Section 6 briefly presents the applications of Transformer in other tasks. Section 7 introduces the current status of NMT. Section 8 suggests some potential research directions.
Download full text:
Transformer: A General Framework from Machine Translation to Others
Yang Zhao, Jiajun Zhang, Chengqing Zong
https://www.mi-research.net/en/article/doi/10.1007/s11633-022-1393-5
https://link.springer.com/article/10.1007/s11633-022-1393-5
BibTeX:
@Article{MIR-2022-09-288,
author = {Yang Zhao and Jiajun Zhang and Chengqing Zong},
journal = {Machine Intelligence Research},
title = {Transformer: A General Framework from Machine Translation to Others},
year = {2023},
volume = {20},
number = {4},
pages = {514-538},
doi = {10.1007/s11633-022-1393-5}
}