Journal of Beijing International Studies University, 2021, Vol. 43, Issue (5): 66-82. DOI: 10.12002/j.bisu.355

• Translation Studies (Special Column on Machine Translation Post-editing; Host: Wang Huashu)

A Study of the Assessment of Translations and Post-editing in Neural Machine Translation

Guo Wanghao1, Hu Fumao2

  1. PLA Strategic Support Force Information Engineering University, Luoyang 471003, China
  2. Luoyang Institute of Science and Technology, Luoyang 471003, China
  • Received: 2020-06-01 Online: 2021-10-31 Published: 2021-12-16
  • About the authors: Guo Wanghao, PLA Strategic Support Force Information Engineering University, 471003; research interest: machine translation. Email: herald@yeah.net
    Hu Fumao, Luoyang Institute of Science and Technology, 471003; research interests: corpus-based translation studies and translation technology. Email: hfm7921@163.com
  • Funding:
    Henan Province Philosophy and Social Science Planning Project "Research on the Extraction and Application of Chinese-English Chunks Based on a Large-Scale Corpus" (2020BYY011)

Abstract:

Although neural machine translation (NMT) has made great progress and the industry is accelerating efforts to make NMT systems practical and commercially viable, their performance in vertical domains remains unsatisfactory. This study examines English-to-Chinese translations of military texts produced by mainstream NMT systems from China and abroad. On a self-built test set of 1 000 military-domain translations, the five systems evaluated (Google, Baidu, Tencent, NetEase Youdao, and Sogou) achieve a mean BLEU score of only 20.854, which is 6.62 points lower than their BLEU score on general-domain text. The error analysis identifies 5 050 errors of 15 types in four major categories (spelling, vocabulary, syntax, and semantics). Mistranslated military terms account for the largest share (42.83%), followed by mistranslated common words and hierarchical-structure errors. These results indicate that existing NMT systems cannot yet produce high-quality translations of military texts or meet practical needs, and that research on post-editing is urgently needed to improve the accuracy of military text translation.

Keywords: neural machine translation; assessment of translations; post-editing; military texts; types of translation errors
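
The headline figures in the abstract imply two quantities that are not stated directly: the mean BLEU score on general-domain text and the approximate number of military-term errors. The following is a minimal Python sketch of that arithmetic. It assumes that the 6.62 gap is an absolute difference in BLEU points and that the 42.83% share is taken over all 5 050 errors. The function names are illustrative, and the derived values are back-of-the-envelope estimates rather than figures reported by the authors.

```python
# Figures quoted in the abstract.
MEAN_BLEU_MILITARY = 20.854  # mean BLEU of the five systems on the military test set
BLEU_GAP = 6.62              # reported shortfall versus general-domain text
TOTAL_ERRORS = 5050          # errors counted across all categories
TERM_ERROR_SHARE = 0.4283    # share attributed to military-term errors


def implied_general_bleu(military_bleu: float, gap: float) -> float:
    """General-domain mean BLEU implied by the gap (derived, not reported)."""
    return round(military_bleu + gap, 3)


def implied_error_count(total: int, share: float) -> int:
    """Nearest whole number of errors matching a reported share."""
    return round(total * share)


if __name__ == "__main__":
    print(implied_general_bleu(MEAN_BLEU_MILITARY, BLEU_GAP))   # 27.474
    print(implied_error_count(TOTAL_ERRORS, TERM_ERROR_SHARE))  # 2163
```

Under these assumptions, roughly 2 163 of the 5 050 errors would be military-term errors, and the implied general-domain mean BLEU would be about 27.47.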

CLC Number: