Graduate Thesis 109423002: Detailed Record




Author: Yu-Cyuan Lin (林育全)    Department: Information Management
Thesis Title: A Triple Unified Transformer for Context-Aware Dialogue Translation
Related Theses
★ An Empirical Study of Multi-label Text Classification: Comparing Word Embeddings with Traditional Techniques
★ Network Protocol Correlation Analysis Based on Graph Neural Networks
★ Learning Shared Representations Across and Within Modalities
★ Hierarchical Classification and Regression with Feature Selection
★ Sentiment Analysis of Patient-Written Diaries Using Symptom Information
★ An Attention-Based Open-Domain Dialogue System
★ Applying Commonsense-Based BERT Models to Domain-Specific Tasks
★ Analyzing Text Sentiment Intensity Based on Differences in Social Media Users' Hardware Devices
★ On the Effectiveness of Machine Learning and Feature Engineering for Monitoring Anomalous Cryptocurrency Transactions
★ Applying Long Short-Term Memory Networks and Machine Learning to Optimal Maintenance Reminders for Metro Switch Machines
★ Network Traffic Classification Based on Semi-Supervised Learning
★ ERP Log Analysis: A Case Study of Company A
★ Enterprise Information Security: An Exploratory Study of Network Packet Collection, Analysis, and Network Behavior
★ Applying Data Mining Techniques to Customer Relationship Management: A Case Study of Bank C's Digital Deposits
★ On the Usability and Efficiency of Face Image Generation and Augmentation
★ Synthetic Text Data Augmentation for Imbalanced Text Classification
Files: Full text available in the repository after 2027-08-26.
Abstract (Chinese): The dialogue translation task aims to translate conversational text between speakers of different languages. Most current machine translation systems, however, use sentence-level translation models, an architecture that cannot effectively capture contextual relationships, so meaning that spans sentences is not conveyed correctly. Document-level translation architectures can address this problem.
Although document-level translation models can be applied to chat translation, how to select useful context for the sentence currently being translated remains an open question. In addition, few studies have examined how the chat history affects translation quality, even though this information can effectively help translate the current sentence, so we believe it should be taken into account. We therefore propose a new context-selection approach and a model adapted to chat translation, called the United method and the Triple-Unified-Transformer, respectively. Our model learns cross-sentence relationships better, yielding better translations of the current sentence.
In Experiment 1, we measure the effectiveness of the proposed United method and Triple-Unified-Transformer on three chat translation datasets using automatic evaluation metrics. The results show that adding the United method to the Flat-Transformer improves translation quality. The Triple-Unified-Transformer also achieves good results on specific datasets (BLEU4, BLONDE). In Experiment 2, we test whether the United method and the Triple-Unified-Transformer generalize to different chat scenarios. The results show that the Triple-Unified-Transformer performs best on a specific dataset, indicating that the proposed model can adapt to different chat scenarios, and baseline models augmented with the United method also perform better.
Abstract (English): Dialogue translation aims to translate the text of conversations between people who speak different languages. Most machine translation systems, however, use sentence-level translation models, and such architectures cannot effectively capture contextual relationships, so cross-sentence semantics cannot be expressed correctly. A document-level translation architecture can address this problem.
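To make the contrast concrete, the sketch below feeds a translation model the same utterance once in isolation and once with the previous turn prepended, the simplest document-level baseline. This is a generic illustration, not the thesis's architecture; the `transformers` library and the public Helsinki-NLP/opus-mt-en-de checkpoint are assumptions for the example.

```python
# Sentence-level vs. naive document-level (concatenation) input -- an
# illustrative baseline, not the thesis's Triple-Unified-Transformer.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"  # assumed public English-German checkpoint
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

context = "Did you watch the match yesterday?"
current = "Yes, it was close."

# The first input drops the context; the second lets the encoder see the
# previous turn, which can disambiguate pronouns, ellipsis, and register.
inputs = tokenizer([current, context + " " + current],
                   return_tensors="pt", padding=True)
outputs = model.generate(**inputs)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```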
Document-level translation models can be used for dialogue translation, but how to choose useful context for the current sentence being translated is still an open issue. In addition, few studies focus on the influence of historical chat records on translation quality. This historical information can also effectively help translate the current sentence, so we argue it should be taken into account. We therefore propose a new context-selection method and a model adapted to dialogue translation, called the United method and the Triple-Unified-Transformer, respectively. Our model better learns the relationships between sentences, so the current sentence is translated more accurately.
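This record does not spell out how the United method scores candidate context, so the following is only a hypothetical sketch of similarity-based selection from the chat history; the `sentence-transformers` package, the all-MiniLM-L6-v2 checkpoint, and the top-k budget are all assumptions for illustration (Sentence-BERT does appear in the reference list below).

```python
# Hypothetical context selection: rank past turns by cosine similarity to the
# current sentence and keep the k best. Not the thesis's actual United method.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

history = [
    "Hi, I ordered a laptop last week.",
    "It still has not shipped.",
    "By the way, do you also sell keyboards?",
]
current = "When will my laptop arrive?"

hist_emb = encoder.encode(history, convert_to_tensor=True)
cur_emb = encoder.encode(current, convert_to_tensor=True)
scores = util.cos_sim(cur_emb, hist_emb)[0]   # one similarity score per turn

k = 2                                         # assumed context budget
top = scores.topk(k).indices.tolist()
selected = [history[i] for i in sorted(top)]  # keep chronological order
print(selected)                               # turns passed on as context
```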
In Experiment 1, we use three different chat translation datasets and automatic evaluation metrics to measure the effectiveness of the proposed United method and Triple-Unified-Transformer. The results show that adding the United method to one of the baseline models improves translation quality. The Triple-Unified-Transformer also achieves good results on specific datasets (BLEU, BLONDE). Furthermore, in Experiment 2, we test whether the United method and the Triple-Unified-Transformer are general and can adapt to different chat situations. The results show that the Triple-Unified-Transformer performs best on a specific dataset, which means our model can be applied in different chat situations, and the baseline models with the United method also perform better.
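Of the metrics mentioned, corpus-level BLEU is straightforward to reproduce with the standard sacrebleu package; the snippet below is a generic example rather than the thesis's exact evaluation pipeline, and BLONDE additionally requires hypotheses and references grouped by dialogue.

```python
# Generic corpus-level BLEU (Papineni et al., 2002) via sacrebleu; not the
# thesis's exact evaluation setup.
import sacrebleu

hypotheses = ["The cat sat on the mat.", "I will be there soon."]
# One reference stream: references[0][i] is the reference for hypotheses[i].
references = [["The cat sat on the mat.", "I'll be there soon."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
# BLONDE (Jiang et al., 2022) is document-level: it expects outputs grouped by
# dialogue and is computed with the metric authors' released toolkit.
```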
Keywords (Chinese) ★ chat translation (聊天翻譯)
★ machine translation (機器翻譯)
★ deep learning (深度學習)
Keywords (English) ★ dialogue translation
★ machine translation
★ deep learning
Table of Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1. Introduction
1.1. Overview
1.2. Motivation
1.3. Objectives
1.4. Thesis Organization
2. Related Works
2.1. Statistical Machine Translation (SMT)
2.2. Neural Machine Translation (NMT)
2.2.1. Sentence-level Translation
2.2.2. Document-level Translation
2.3. Dialogue Translation
2.4. Evaluation Metrics
2.4.1. BLEU
2.4.2. METEOR
2.4.3. BLONDE
2.5. Discussion
3. Methodology
3.1. Model Overview
3.2. Model Architecture
3.2.1. Reference Sentence
3.2.2. Segment Embedding
3.2.3. Triple-Unified-Transformer
3.3. Flow Chart
3.4. Dataset
3.5. Data Preprocessing
3.6. Experiment Design
3.6.1. Experiment 1 – Are our proposed architectures and United method helpful?
3.6.2. Experiment 2 – The effect of model generalization
4. Experiment Results
4.1.1. Experiment 1 Results
4.1.2. Case Study of Experiment 1
4.1.3. Summary of Experiment 1
4.2. Experiment 2 Results
5. Conclusion
5.1. Overall Summary
5.2. Contributions
5.3. Study Limitations
5.4. Future Work
References
References: Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L., 2018. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering. arXiv:1707.07998 [cs].
Bahdanau, D., Cho, K.H., Bengio, Y., 2015. Neural Machine Translation by Jointly Learning to Align and Translate, in: 3rd International Conference on Learning Representations (ICLR 2015).
Banerjee, S., Lavie, A., 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments, in: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. Association for Computational Linguistics, Ann Arbor, Michigan, pp. 65–72.
Brown, P.F., Cocke, J., Della Pietra, S.A., Della Pietra, V.J., Jelinek, F., Lafferty, J.D., Mercer, R.L., Roossin, P.S., 1990. A Statistical Approach to Machine Translation. Comput. Linguist. 16, 79–85.
Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, pp. 1724–1734. https://doi.org/10.3115/v1/D14-1179
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K., 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp. 4171–4186. https://doi.org/10.18653/v1/N19-1423
Dong, L., Yang, N., Wang, W., Wei, F., Liu, X., Wang, Y., Gao, J., Zhou, M., Hon, H.-W., 2019. Unified Language Model Pre-training for Natural Language Understanding and Generation, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Farajian, M.A., Lopes, A.V., Martins, A.F.T., Maruf, S., Haffari, G., 2020. Findings of the WMT 2020 Shared Task on Chat Translation, in: Proceedings of the Fifth Conference on Machine Translation. Association for Computational Linguistics, Online, pp. 65–75.
Jiang, Y.E., Liu, T., Ma, S., Zhang, D., Yang, J., Huang, H., Sennrich, R., Cotterell, R., Sachan, M., Zhou, M., 2022. BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation.
Li, Y., Su, H., Shen, X., Li, W., Cao, Z., Niu, S., 2017. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset, in: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, Taipei, Taiwan, pp. 986–995.
Liang, Y., Meng, F., Chen, Y., Xu, J., Zhou, J., 2021a. Modeling Bilingual Conversational Characteristics for Neural Chat Translation, in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, pp. 5711–5724. https://doi.org/10.18653/v1/2021.acl-long.444
Liang, Y., Zhou, C., Meng, F., Xu, J., Chen, Y., Su, J., Zhou, J., 2021b. Towards Making the Most of Dialogue Characteristics for Neural Chat Translation, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, pp. 67–79. https://doi.org/10.18653/v1/2021.emnlp-main.6
Lison, P., Tiedemann, J., Kouylekov, M., 2018. OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora, in: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan.
Liu, S., Sun, Y., Wang, L., 2021. Recent Advances in Dialogue Machine Translation. Information 12, 484. https://doi.org/10.3390/info12110484
Lu, J., Xiong, C., Parikh, D., Socher, R., 2017. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning, in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 3242–3250. https://doi.org/10.1109/CVPR.2017.345
Ma, S., Zhang, D., Zhou, M., 2020. A Simple and Effective Unified Encoder for Document-Level Machine Translation, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp. 3505–3511. https://doi.org/10.18653/v1/2020.acl-main.321
Maruf, S., Haffari, G., 2018. Document Context Neural Machine Translation with Memory Networks, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Melbourne, Australia, pp. 1275–1284. https://doi.org/10.18653/v1/P18-1118
Maruf, S., Martins, A.F.T., Haffari, G., 2019. Selective Attention for Context-aware Neural Machine Translation, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp. 3092–3102. https://doi.org/10.18653/v1/N19-1313
Miculicich, L., Ram, D., Pappas, N., Henderson, J., 2018. Document-Level Neural Machine Translation with Hierarchical Attention Networks, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, pp. 2947–2954. https://doi.org/10.18653/v1/D18-1325
Miller, G.A., 1995. WordNet: a lexical database for English. Commun. ACM 38, 39–41. https://doi.org/10.1145/219717.219748
Moghe, N., Hardmeier, C., Bawden, R., 2020. The University of Edinburgh-Uppsala University’s Submission to the WMT 2020 Chat Translation Task, in: Proceedings of the Fifth Conference on Machine Translation. Association for Computational Linguistics, Online, pp. 473–478.
Papineni, K., Roukos, S., Ward, T., Zhu, W.-J., 2002. Bleu: a Method for Automatic Evaluation of Machine Translation, in: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, pp. 311–318. https://doi.org/10.3115/1073083.1073135
Pouliquen, B., 2017. WIPO Translate: Patent Neural Machine Translation publicly available in 10 languages.
Reimers, N., Gurevych, I., 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp. 3980–3990. https://doi.org/10.18653/v1/D19-1410
Snover, M., Dorr, B., Schwartz, R., Micciulla, L., Makhoul, J., 2006. A Study of Translation Edit Rate with Targeted Human Annotation, in: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers. Association for Machine Translation in the Americas, Cambridge, Massachusetts, USA, pp. 223–231.
Sohn, K., Lee, H., Yan, X., 2015. Learning Structured Output Representation using Deep Conditional Generative Models, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Song, K., Tan, X., Qin, T., Lu, J., Liu, T.-Y., 2019. MASS: Masked Sequence to Sequence Pre-training for Language Generation. arXiv:1905.02450 [cs].
Sutskever, I., Vinyals, O., Le, Q.V., 2014. Sequence to Sequence Learning with Neural Networks, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Tomita, M., Tomabechi, H., Saito, H., 1990. SpeechTrans: An Experimental Real-Time Speech-to-Speech Translation System.
Tu, Z., Liu, Y., Shi, S., Zhang, T., 2018. Learning to Remember Translation History with a Continuous Cache. Trans. Assoc. Comput. Linguist. 6, 407–420. https://doi.org/10.1162/tacl_a_00029
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., 2017. Attention is All you Need, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Voita, E., Sennrich, R., Titov, I., 2019a. Context-Aware Monolingual Repair for Neural Machine Translation, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp. 877–886. https://doi.org/10.18653/v1/D19-1081
Voita, E., Sennrich, R., Titov, I., 2019b. When a Good Translation is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp. 1198–1212. https://doi.org/10.18653/v1/P19-1116
Voita, E., Serdyukov, P., Sennrich, R., Titov, I., 2018. Context-Aware Neural Machine Translation Learns Anaphora Resolution, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Melbourne, Australia, pp. 1264–1274. https://doi.org/10.18653/v1/P18-1117
Wang, L., Tu, Z., Shi, S., Zhang, T., Graham, Y., Liu, Q., 2018. Translating Pro-Drop Languages With Reconstruction Models. Proc. AAAI Conf. Artif. Intell. 32.
Wang, L., Tu, Z., Way, A., Liu, Q., 2017. Exploiting Cross-Sentence Context for Neural Machine Translation, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, Denmark, pp. 2826–2831. https://doi.org/10.18653/v1/D17-1301
Wang, L., Tu, Z., Zhang, X., Li, H., Way, A., Liu, Q., 2016. A Novel Approach to Dropped Pronoun Translation, in: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, San Diego, California, pp. 983–993. https://doi.org/10.18653/v1/N16-1113
Wang, T., Zhao, C., Wang, M., Li, L., Xiong, D., 2021. Autocorrect in the Process of Translation — Multi-task Learning Improves Dialogue Machine Translation, in: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers. Association for Computational Linguistics, Online, pp. 105–112. https://doi.org/10.18653/v1/2021.naacl-industry.14
Wu, B., Li, M., Wang, Z., Chen, Y., Wong, D.F., Feng, Q., Huang, J., Wang, B., 2020. Guiding Variational Response Generator to Exploit Persona, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp. 53–65. https://doi.org/10.18653/v1/2020.acl-main.7
Wu, J., Wang, X., Wang, W.Y., 2019. Self-Supervised Dialogue Learning, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Florence, Italy, pp. 3857–3867. https://doi.org/10.18653/v1/P19-1375
Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., Stevens, K., Kurian, G., Patil, N., Wang, W., Young, C., Smith, J., Riesa, J., Rudnick, A., Vinyals, O., Corrado, G., Hughes, M., Dean, J., 2016. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144 [cs].
Xu, K., Ba, J.L., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., Zemel, R.S., Bengio, Y., 2015. Show, attend and tell: neural image caption generation with visual attention, in: Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ICML’15. JMLR.org, Lille, France, pp. 2048–2057.
Yang, Zhilin, Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V., 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding, in: Advances in Neural Information Processing Systems. Curran Associates, Inc.
Yang, Zhengxin, Zhang, J., Meng, F., Gu, S., Feng, Y., Zhou, J., 2019. Enhancing Context Modeling with a Query-Guided Capsule Network for Document-level Translation, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp. 1527–1537. https://doi.org/10.18653/v1/D19-1164
Yun, H., Hwang, Y., Jung, K., 2020. Improving Context-Aware Neural Machine Translation Using Self-Attentive Sentence Embedding. Proc. AAAI Conf. Artif. Intell. 34, 9498–9506. https://doi.org/10.1609/aaai.v34i05.6494
Zhang, H., Lan, Y., Pang, L., Chen, H., Ding, Z., Yin, D., 2020. Modeling Topical Relevance for Multi-Turn Dialogue Generation. https://doi.org/10.48550/arXiv.2009.12735
Zhang, J., Luan, H., Sun, M., Zhai, F., Xu, J., Zhang, M., Liu, Y., 2018. Improving the Transformer Translation Model with Document-Level Context, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, pp. 533–542. https://doi.org/10.18653/v1/D18-1049
Advisor: Shih-Wen Ke (柯士文)    Date of Approval: 2022-08-26