Master's/Doctoral Thesis 110423068 — Detailed Record




Author: Yu-Chi Lin (林佑錡)   Department: Information Management
Thesis Title: History Aware Multi-Stage Prompting for Neural Chat Translation
Related Theses:
★ An Empirical Study on Multi-Label Text Classification: Comparing Word Embeddings with Traditional Techniques
★ Network Protocol Correlation Analysis Based on Graph Neural Networks
★ Learning Shared Representations Across and Within Modalities
★ Hierarchical Classification and Regression with Feature Selection
★ Applying Symptoms to Sentiment Analysis of Patient-Authored Diaries
★ An Attention-Based Open-Domain Dialogue System
★ Applying Commonsense-Based BERT Models to Domain-Specific Tasks
★ Analyzing Text Sentiment Intensity Based on Differences in Social Media Users' Hardware Devices
★ On the Effectiveness of Machine Learning and Feature Engineering for Monitoring Anomalous Cryptocurrency Transactions
★ Applying LSTM Networks and Machine Learning to Metro Switch Machines for Optimal Maintenance-Time Reminders
★ Network Traffic Classification Based on Semi-Supervised Learning
★ ERP Log Analysis: A Case Study of Company A
★ Enterprise Information Security Protection: An Exploratory Study of Network Packet Collection, Analysis, and Network Behavior
★ Applying Data Mining Techniques to Customer Relationship Management: A Case Study of Digital Deposits at Bank C
★ On the Usability and Efficiency of Face Image Generation and Augmentation
★ Data Augmentation with Synthetic Text for Imbalanced Text Classification
Files: Full text viewable in the repository system (available after 2025-08-01)
Abstract (Chinese): Neural Chat Translation (NCT) is a task that has recently emerged in the field of machine translation. Unlike Neural Machine Translation (NMT), NCT additionally involves multi-turn dialogue, making it a challenging two-in-one task. Although prior studies have addressed it with context-aware models augmented by various auxiliary tasks, they often incur high training costs.
As the cost of fine-tuning pre-trained language models keeps rising, prompt tuning has begun to gain traction; it is parameter-efficient and its performance is comparable to fine-tuning. The method has recently been applied to machine translation, but only at the sentence level, so it cannot effectively account for the conversational content that NCT emphasizes. In this study, we therefore propose a new prompt tuning method for this task, called History Aware Multi-Stage Prompting (HAMSP), which incorporates chat-history information into the prompts to guide a pre-trained language model toward translations consistent with the dialogue context.
Our experiments show that the proposed HAMSP outperforms the baseline methods and holds its own against fine-tuning. Further intrinsic evaluation shows that our method is more robust, effectively improves the dialogue coherence of translations, raises training efficiency, and lowers hardware costs, giving it the potential for broad application to diverse real-world chat systems.
Abstract (English): Neural Chat Translation (NCT) is an emerging task in the field of machine translation. Unlike Neural Machine Translation (NMT), NCT involves multi-turn conversations, making it a challenging two-in-one task. Previous research has explored context-aware models and auxiliary tasks to address this task, but often at a high training cost.
As the cost of fine-tuning pre-trained language models continues to rise, prompt tuning has emerged as a promising alternative: it is parameter-efficient while achieving performance comparable to fine-tuning. Prompt tuning has recently been applied to machine translation, but only for sentence-level translation, without incorporating the conversational content that is crucial in neural chat translation. In this study, we therefore present a new prompt tuning method called History Aware Multi-Stage Prompting (HAMSP). By incorporating information from the chat history into the prompts, we guide the pre-trained language model to generate translations that are consistent with the conversational context.
Our experimental results demonstrate that the proposed HAMSP outperforms the baseline methods and is competitive with fine-tuning. Further intrinsic evaluation illustrates the robustness of our method and its ability to enhance the dialogue coherence of translations. Our method also improves training efficiency and reduces hardware costs, making it suitable for a variety of real-world chat systems.
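To make the approach concrete, below is a minimal PyTorch sketch of a history-aware prompt generator in the spirit of the abstract's description. It is an illustrative sketch, not the thesis's implementation: the class name HistoryAwarePromptGenerator, the mean-pooling of history states, and the per-stage linear projections are all assumptions.

```python
# Illustrative sketch only (assumed design, not the thesis's code):
# pooled chat-history representations are projected into a continuous
# prompt for each prompting stage of a frozen pre-trained LM.
import torch
import torch.nn as nn

class HistoryAwarePromptGenerator(nn.Module):
    """Maps pooled chat-history states to one continuous prompt per stage."""

    def __init__(self, d_model: int, prompt_len: int, n_stages: int):
        super().__init__()
        self.d_model = d_model
        self.prompt_len = prompt_len
        # One projection per stage: history vector -> prompt_len * d_model.
        self.proj = nn.ModuleList(
            nn.Linear(d_model, prompt_len * d_model) for _ in range(n_stages)
        )

    def forward(self, history_states: torch.Tensor) -> list[torch.Tensor]:
        # history_states: (batch, hist_len, d_model), e.g. hidden states of
        # the previous dialogue turns produced by the frozen backbone.
        pooled = history_states.mean(dim=1)  # (batch, d_model)
        return [
            p(pooled).view(-1, self.prompt_len, self.d_model)  # per-stage prompt
            for p in self.proj
        ]

# Toy usage: 2 dialogues, 40 history tokens, hidden size 768, 3 stages.
gen = HistoryAwarePromptGenerator(d_model=768, prompt_len=16, n_stages=3)
prompts = gen(torch.randn(2, 40, 768))
print([tuple(p.shape) for p in prompts])  # [(2, 16, 768)] * 3
```

In such a setup, only the generator's parameters would receive gradients while the backbone language model stays frozen, which is where the parameter efficiency described above would come from.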
Keywords: ★ neural chat translation
★ machine translation
★ prompt tuning
★ deep learning
Table of Contents:
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1. Overview
1.2. Motivation
1.3. Objectives
1.4. Thesis Organization
2. Related Works
2.1. Neural Machine Translation
2.1.1. Sentence-level NMT
2.1.2. Document-level NMT
2.1.3. Neural Chat Translation
2.2. Prompt Tuning
2.2.1. Manual Prompt
2.2.2. Discrete Prompt
2.2.3. Continuous Prompt
2.3. Multilingual Pre-trained Language Models
2.3.1. mBART
2.3.2. mT5
2.3.3. mGPT
2.4. Discussion
3. Methodology
3.1. Model Overview
3.2. Model Architecture
3.2.1. Prompt Generator
3.2.2. Multi-Stage
3.3. Training Phase
3.4. Datasets
3.5. Experiment Settings
3.5.1. Data Preprocessing and Postprocessing
3.5.2. Model Settings
3.6. Flow Chart
3.7. Experiment Design
3.7.1. Experiment: The Effectiveness of the Proposed Prompting Method Applied to NCT Tasks
3.7.2. Evaluation Metrics
4. Experiment Results
4.1. Experiment: The Effectiveness of the Proposed Prompting Method Applied to NCT Tasks
4.1.1. Experiment Results
4.1.2. Intrinsic Evaluation
5. Conclusion
5.1. Overall Summary
5.2. Contributions
5.3. Study Limitations
5.4. Future Work
References
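The intrinsic evaluation listed in Section 4.1.2 covers, per the abstract, the dialogue coherence of translations. As a rough, hypothetical illustration only (the thesis's actual metric may differ), the sketch below scores a translated dialogue by the average TF-IDF cosine similarity between adjacent turns; the function name and the TF-IDF proxy are assumptions.

```python
# Hypothetical coherence proxy, not the thesis's metric: rate each
# translated turn by its cosine similarity to the preceding turn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def dialogue_coherence(turns: list[str]) -> float:
    """Mean cosine similarity between each turn and the turn before it."""
    vec = TfidfVectorizer().fit(turns)
    X = vec.transform(turns)
    sims = [
        cosine_similarity(X[i], X[i - 1])[0, 0]  # turn i vs. previous turn
        for i in range(1, len(turns))
    ]
    return sum(sims) / len(sims)

# Toy usage on a short translated dialogue.
chat = [
    "Did you book the hotel for Friday?",
    "Yes, I booked the hotel for Friday night.",
    "Great, then I will reserve the train tickets.",
]
print(f"coherence ~ {dialogue_coherence(chat):.3f}")
```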
Advisor: Shin-Wen Ke (柯士文)   Review Date: 2023-07-20
