Multilingual Neural Machine Translation for Asian Language Treebank

Authors

Corresponding author's email:

nhblong@fit.hcmus.edu.vn

DOI:

https://doi.org/10.54644/jte.2026.2047

Keywords:

Multilingual, Neural Machine Translation, Asian Languages, Low-Resource, Asian Language Treebank

Abstract

This study examines multilingual neural machine translation (MNMT) for a diverse group of low-resource Asian languages (Bengali, Filipino, Indonesian, Japanese, Khmer, Malay, and Vietnamese) that differ substantially in language family, writing system, and typology. The paper evaluates state-of-the-art MNMT systems and introduces a Compact & Language-Sensitive MNMT model designed to improve translation quality while reducing computational cost. The proposed approach shares parameters through a compact multilingual representation and enhances language discrimination using language-sensitive embeddings, a language-sensitive discriminator, and an adaptive cross-attention mechanism that selects attention parameters according to the language pair. Combined with a multi-stage fine-tuning strategy, the model strengthens cross-lingual transfer while maintaining robust language-specific representations. Experiments on the ALT multi-parallel corpus and the KFTT English–Japanese dataset demonstrate that multilingual models significantly outperform single-language NMT baselines. Despite its smaller size, the proposed Compact & Language-Sensitive MNMT achieves competitive or superior BLEU scores compared with Google’s MNMT, confirming the effectiveness of guided parameter sharing and language-sensitive training. These results highlight the value of compact multilingual architectures and multi-parallel datasets for advancing low-resource Asian machine translation.
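The pair-conditioned parameter selection mentioned in the abstract can be sketched in a few lines. This is a minimal, hypothetical illustration of the idea (choose attention parameters per language pair, fall back to a shared set when no pair-specific one exists), not the authors' actual architecture; all names (`AdaptiveCrossAttention`, `pair_scale`, `attend`) and the use of a single scalar scale as the "parameter" are simplifying assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class AdaptiveCrossAttention:
    """Toy scaled-dot-product attention with per-language-pair parameters."""

    def __init__(self, dim):
        # Shared (language-agnostic) scaling factor, used as the fallback.
        self.shared_scale = 1.0 / math.sqrt(dim)
        # Pair-specific attention parameters, keyed by (src_lang, tgt_lang).
        self.pair_scale = {}

    def add_pair(self, src, tgt, scale):
        self.pair_scale[(src, tgt)] = scale

    def attend(self, query, keys, values, src, tgt):
        # Select the parameter set for this language pair; fall back to shared.
        scale = self.pair_scale.get((src, tgt), self.shared_scale)
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
        weights = softmax(scores)
        # Return the attention-weighted sum of the value vectors.
        return [sum(w * v[i] for w, v in zip(weights, values))
                for i in range(len(values[0]))]

att = AdaptiveCrossAttention(dim=4)
att.add_pair("en", "ja", scale=2.0)  # hypothetical pair-specific parameter
query = [1.0, 0.0, 0.0, 0.0]
keys = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
shared_out = att.attend(query, keys, values, "en", "vi")  # shared fallback
pair_out = att.attend(query, keys, values, "en", "ja")    # en-ja parameters
```

The larger pair-specific scale sharpens the attention distribution toward the matching key, so `pair_out` leans more heavily on the first value vector than `shared_out` does; in the full model the selected parameters would be learned projection matrices rather than a scalar.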


Author Biographies

Hong Buu Long Nguyen, University of Science, VNU-HCM, Vietnam

Hong Buu Long Nguyen received the B.S. degree (Honors Program) in Information Technology from the University of Science, Vietnam National University, Ho Chi Minh City (VNU-HCM), Vietnam, in 2010, the M.S. degree in Computer Science from the same university in 2015, and the Ph.D. degree in Computer Science in 2023. He is currently a lecturer and researcher at the University of Science, Vietnam National University, Ho Chi Minh City. He has published numerous articles in reputable journals and serves as a reviewer for several A*, A, and B-ranked conferences as well as SCIE-indexed journals. His research interests include machine translation, question answering, and language modelling.

Email: nhblong@fit.hcmus.edu.vn. ORCID: https://orcid.org/0000-0002-0884-1635

Thanh Tung Vu, University of Science, VNU-HCM, Vietnam

Thanh Tung Vu is a graduate student in the Faculty of Information Technology, University of Science, VNU-HCM, Vietnam. His research interests are natural language processing, deep learning, and machine translation.

Email: thanhtungvu727@gmail.com. ORCID: https://orcid.org/0009-0000-2837-3288

References

R. Dabre, C. Chu, and A. Kunchukuttan, “A survey of multilingual neural machine translation,” ACM Comput. Surv., vol. 53, no. 5, pp. 1–38, 2020.

D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in Proc. Int. Conf. Learn. Representations (ICLR), San Diego, CA, USA, 2015.

M. Ü. Uyar, “RNNs with attention for machine translation,” in Machine Learning and AI with Simple Python and Matlab Scripts: Courseware for Non-computing Majors. Hoboken, NJ, USA: IEEE-Wiley, 2025, pp. 209–223, doi: 10.1002/9781394294985.ch13.

I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 27, 2014, pp. 3104–3112.

K. Cho, B. van Merriënboer, D. Bahdanau, and Y. Bengio, “On the properties of neural machine translation: Encoder–decoder approaches,” arXiv preprint arXiv:1409.1259, 2014.

A. Li, “Application of convolution neural network algorithm in English translation,” in Proc. Int. Conf. Integrated Intelligence and Communication Systems (ICIICS), Kalaburagi, India, 2023, pp. 1–8, doi: 10.1109/ICIICS59993.2023.10421570.

A. Vaswani et al., “Attention is all you need,” in Adv. Neural Inf. Process. Syst. (NeurIPS), 2017, pp. 5998–6008.

Y. Chen, Y. Liu, Y. Cheng, and V. O. K. Li, “A teacher–student framework for zero-resource neural machine translation,” in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics (ACL), Vancouver, BC, Canada, 2017, pp. 1925–1935, doi: 10.18653/v1/P17-1176.

Y. Chen, Y. Liu, and V. O. K. Li, “Zero-resource neural machine translation with multi-agent communication game,” in Proc. AAAI Conf. Artif. Intell., 2018, pp. 5086–5093.

Y. Cheng, Q. Yang, Y. Liu, M. Sun, and W. Xu, “Joint training for pivot-based neural machine translation,” in Proc. Int. Joint Conf. Artif. Intell. (IJCAI), Melbourne, VIC, Australia, 2017, pp. 3974–3980, doi: 10.24963/ijcai.2017/555.

O. Firat, K. Cho, and Y. Bengio, “Multi-way, multilingual neural machine translation with a shared attention mechanism,” in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol. (NAACL-HLT), San Diego, CA, USA, 2016, pp. 866–875, doi: 10.18653/v1/N16-1101.

M. Johnson et al., “Google’s multilingual neural machine translation system: Enabling zero-shot translation,” Trans. Assoc. Comput. Linguistics, vol. 5, pp. 339–351, 2017.

Y. Wang et al., “A compact and language-sensitive multilingual translation method,” in Proc. 57th Annu. Meeting Assoc. Comput. Linguistics (ACL), Florence, Italy, 2019, pp. 1213–1223.

B. Zoph and K. Knight, “Multi-source neural translation,” in Proc. NAACL-HLT, San Diego, CA, USA, 2016, pp. 30–34, doi: 10.18653/v1/N16-1004.

R. Dabre and A. Fujita, “Exploiting multilingualism through multistage fine-tuning for low-resource neural machine translation,” in Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, Nov. 2019, pp. 1410–1416.

G. Neubig, “The Kyoto free translation task,” 2011. [Online]. Available: http://www.phontron.com/kftt

H. Riza et al., “Introduction of the Asian language treebank,” in Proc. Oriental COCOSDA, 2016.

D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Representations (ICLR), 2015.

K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “BLEU: A method for automatic evaluation of machine translation,” in Proc. 40th Annu. Meeting Assoc. Comput. Linguistics (ACL), Philadelphia, PA, USA, 2002, pp. 311–318.

P. Koehn, “Statistical significance tests for machine translation evaluation,” in Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP), Barcelona, Spain, Jul. 2004.

H. Nguyen et al., “Language-oriented sentiment analysis based on grammar structure and improved self-attention network,” in Proc. 15th Int. Conf. Evaluation of Novel Approaches to Software Engineering (ENASE), Prague, Czech Republic, May 2020, pp. 339–346.

Published

28-02-2026

How to Cite

[1] H. B. L. Nguyen and T. T. Vu, “Multilingual Neural Machine Translation for Asian Language Treebank”, JTE, vol. 21, no. 01, pp. 99–109, Feb. 2026.

Section

Research Article
