Electronic Theses and Dissertations — Thesis 107221006, Detailed Record




Author: 廖翊均 (Yi-Chun Liao)    Department: Mathematics
Thesis Title: Dialogue Generation in Traditional Chinese Utilizing the Algorithm Based on mBART50 and DialoGPT
(藉由mBART50和DialoGPT所生成之繁體中文對話)
Related Theses
★ A Study of Inviscid Standing Waves of Gas Flow Through Discontinuous Ducts
★ An Iteration Method for the Riemann Problem of Some Degenerate Hyperbolic Balance Laws
★ Application of Image Blurring Methods in a Neural Network for Butterfly Recognition
★ Existence of Generalized Solutions to the Riemann Problem for a Single Nonlinear Balance Law
★ Existence and Uniqueness of Solutions to Two-Point Boundary Value Problems for Systems of Nonlinear Second-Order Ordinary Differential Equations
★ Constructing Approximate Interval Solutions to the Cauchy Problem for the Compressible Euler Equations with Near-Sonic Flow
★ Properties of Solutions to Some Degenerate Quasilinear Wave Equations
★ Global Lipschitz Continuous Solutions to Piecewise-Linear Initial Value Problems for Quasilinear Wave Equations
★ Advection-Diffusion-Reaction Equations for Equilibrium Models in Hydrogeology
★ Classical Solutions to Perturbed Riemann Problems for Nonlinear Conservation Laws
★ Periodicity of Solutions to Initial-Boundary Value Problems for the BBM and KdV Equations
★ Classical Solutions to Perturbed Riemann Problems for Resonant Conservation Laws
★ Behavior of Shock-Wave Solutions to the Slightly Viscous Euler Equations in Compressible Flow
★ Existence of Global Weak Solutions to Initial-Boundary Value Problems for Nonhomogeneous Systems of Hyperbolic Conservation Laws
★ Global Weak Solutions to the Cauchy Problem for Nonlinear Balance Laws
★ Some Lemmas on the Global Existence of Entropy Solutions to the Cauchy Problem for a Single Hyperbolic Conservation Law
1. Access permission for this electronic thesis: the author consents to immediate open access.
2. The open-access electronic full text is licensed to users only for personal, non-commercial retrieval, reading, and printing for the purpose of academic research.
3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.

Abstract (Chinese) In natural language processing, most of the currently known models support translation into Chinese or Chinese dialogue generation. However, written Chinese comes in two forms, Simplified and Traditional. The Chinese supported by these models is mostly Simplified; although both are Chinese, their vocabulary and usage differ.

Because Traditional Chinese translation and dialogue data are scarce, this thesis combines translation with dialogue. That is, we perform two-way translation between Traditional Chinese and English and carry out the dialogue itself in English. The translation training data come from the following news and online-course websites: The China Post, VoiceTube, and Hope English, and the dialogue data come from DailyDialog; Hi Tutor and the TOCFL are then used for the final test. We generate Traditional Chinese dialogue by combining the mBART50 and DialoGPT models and fine-tuning them.

Our fine-tuned translation models all outperform the original models, especially with a beam size of 7. After fine-tuning the dialogue model, the small model generates the most fluent dialogue. In the final experiment, we searched over the parameters beam size, top-k, and top-p and found that the values 7, 10, and 0.95, respectively, produce the best results. Our best model scores 2.85 on the final test. Finally, using the best fine-tuned model, we generate a Traditional Chinese dialogue from an English dialogue.
Abstract (English) In Natural Language Processing (NLP), many of the currently available models support translation into Chinese or dialogue generation in Chinese. However, written Chinese is divided into Simplified and Traditional, and the Chinese supported by these models is mostly Simplified. Although both are Chinese, their characters and usage are not the same.

Due to the lack of Traditional Chinese translation and dialogue data, this thesis combines translation with dialogue, using English as the pivot. In other words, we perform two-way translation between Traditional Chinese and English and conduct the pivot dialogue in English. To train the translation model, we use data collected from the following news sources and online courses: The China Post, VoiceTube, and Hope English. We use DailyDialog to train the English dialogue model. For the final test, we adopt Traditional Chinese dialogues from Hi Tutor and the TOCFL. We fine-tune mBART50 and DialoGPT and combine them to generate Traditional Chinese dialogue.
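As a rough illustration of how such a pivot pipeline can be wired together, the sketch below uses the public Hugging Face checkpoints facebook/mbart-large-50-many-to-many-mmt and microsoft/DialoGPT-small in place of the thesis's fine-tuned weights (which are not part of this record); the language codes and decoding settings shown are assumptions based on the abstract, not the author's exact configuration.

# Hedged sketch of the translate -> respond -> translate-back pivot loop.
# Public base checkpoints stand in for the thesis's fine-tuned models.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          MBart50TokenizerFast, MBartForConditionalGeneration)

mt_model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
mt_tok = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
dlg_tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
dlg_model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

def translate(text, src, tgt, num_beams=7):
    """Translate one sentence with mBART50 using beam search (beam size 7 per the abstract)."""
    mt_tok.src_lang = src
    batch = mt_tok(text, return_tensors="pt")
    out = mt_model.generate(**batch,
                            forced_bos_token_id=mt_tok.lang_code_to_id[tgt],
                            num_beams=num_beams)
    return mt_tok.batch_decode(out, skip_special_tokens=True)[0]

def reply_zh(user_zh):
    # mBART50 only ships a "zh_CN" language code; the thesis fine-tunes on Traditional Chinese data.
    user_en = translate(user_zh, "zh_CN", "en_XX")
    ids = dlg_tok.encode(user_en + dlg_tok.eos_token, return_tensors="pt")
    reply_ids = dlg_model.generate(ids, max_length=200, do_sample=True,
                                   top_k=10, top_p=0.95,
                                   pad_token_id=dlg_tok.eos_token_id)
    reply_en = dlg_tok.decode(reply_ids[0, ids.shape[-1]:], skip_special_tokens=True)
    return translate(reply_en, "en_XX", "zh_CN")

print(reply_zh("你今天過得如何?"))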

The fine-tuned models outperform the original models without fine-tuning, especially in translation with a beam size of 7. After fine-tuning the dialogue model, the dialogue generated by the small model is the most fluent. In the final experiment, we tune the parameters beam size, top-k, and top-p, and the best results are obtained with values of 7, 10, and 0.95, respectively. Our best model achieves a BLEU score of 2.85 on the final test. Finally, using the best model, we build a Traditional Chinese dialogue from English conversations used as the pivot.
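For reference, a BLEU score such as the 2.85 reported above can be computed with the sacrebleu package cited in the references. The snippet below is only a minimal sketch with placeholder sentences; the thesis's actual reference set and tokenization settings are assumptions, and sacrebleu's built-in "zh" tokenizer is used here for Chinese text.

import sacrebleu

# Placeholder data: system outputs and one reference stream (not the thesis's test set).
hypotheses = ["我很好,謝謝你。"]
references = [["我很好,謝謝關心。"]]

# tokenize="zh" applies sacrebleu's Chinese tokenizer before computing corpus-level BLEU.
bleu = sacrebleu.corpus_bleu(hypotheses, references, tokenize="zh")
print(f"BLEU = {bleu.score:.2f}")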
Keywords
★ Machine Translation
★ Dialogue Generation
★ mBART50
★ DialoGPT
Table of Contents
Chinese Abstract i
English Abstract ii
Acknowledgements iii
Table of Contents iv
List of Figures vi
List of Tables vii
1 Introduction 1
2 Theory 4
2.1 Transformer 4
2.1.1 Encoder 4
2.1.2 Decoder 5
2.1.3 Positional Encoding 6
2.1.4 Add & Norm 7
2.1.5 Feed Forward 7
2.2 Attention 7
2.2.1 Self-attention 9
2.3 mBART 10
2.4 Beam Search 11
2.5 Sacrebleu 14
2.6 DialoGPT 14
2.7 Top k sampling 15
2.8 Top p sampling 16
2.9 Perplexity 16
3 Experiments & Results 17
3.1 Dataset 17
3.2 Scores 18
3.3 Results 20
4 Conclusion 23
References 24
References
[1] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align
and translate, arXiv preprint arXiv:1409.0473 (2016).
[2] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin,
Attention is all you need, arXiv preprint arXiv:1706.03762 (2017).
[3] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional
transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[4] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving language understanding
by generative pre-training (2018).
[5] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language models are
unsupervised multitask learners (2019).
[6] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan,
P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan,
R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler,
M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever,
D. Amodei, Language models are few-shot learners, 2020.
[7] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov,
L. Zettlemoyer, Bart: Denoising sequence-to-sequence pre-training for natural language
generation, translation, and comprehension, arXiv preprint arXiv:1910.13461 (2019).
[8] Y. Liu, J. Gu, N. Goyal, X. Li, S. Edunov, M. Ghazvininejad, M. Lewis, L. Zettlemoyer,
Multilingual denoising pre-training for neural machine translation, arXiv preprint
arXiv:2001.08210 (2020).
[9] Y. Tang, C. Tran, X. Li, P.-J. Chen, N. Goyal, V. Chaudhary, J. Gu, A. Fan, Multilingual
translation with extensible multilingual pretraining and finetuning, arXiv preprint
arXiv:2008.00401 (2020).
[10] Y. Zhang, S. Sun, M. Galley, Y.-C. Chen, C. Brockett, X. Gao, J. Gao, J. Liu, B. Dolan,
Dialogpt: Large-scale generative pre-training for conversational response generation, in:
ACL, system demonstration, 2020.
[11] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzman, E. Grave,
M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning
at scale, arXiv preprint arXiv:1911.02116 (2020).
[12] Z. Zhang, X. Han, H. Zhou, P. Ke, Y. Gu, D. Ye, Y. Qin, Y. Su, H. Ji, J. Guan, F. Qi,
X. Wang, Y. Zheng, G. Zeng, H. Cao, S. Chen, D. Li, Z. Sun, Z. Liu, M. Huang, W. Han,
J. Tang, J. Li, X. Zhu, M. Sun, CPM: A large-scale generative Chinese pre-trained language
model, arXiv preprint arXiv:2012.00413 (2020).
[13] L. Shang, Z. Lu, H. Li, Neural responding machine for short-text conversation, arXiv
preprint arXiv:1503.02364 (2015).
[14] Y. Wang, P. Ke, Y. Zheng, K. Huang, Y. Jiang, X. Zhu, M. Huang, A large-scale Chinese
short-text conversation dataset, arXiv preprint arXiv:2008.03946 (2020).
[15] M. Freitag, Y. Al-Onaizan, Beam search strategies for neural machine translation, arXiv
preprint arXiv:1702.01806 (2017).
[16] M. Post, A call for clarity in reporting bleu scores, arXiv preprint arXiv:1804.08771
(2018).
[17] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, Bleu: A method for automatic evaluation
of machine translation, in: Proceedings of the 40th Annual Meeting on Association for
Computational Linguistics, ACL ’02, Association for Computational Linguistics,
USA, 2002, p. 311–318. URL: https://doi.org/10.3115/1073083.1073135.
doi:10.3115/1073083.1073135.
[18] H. Hassan, A. Aue, C. Chen, V. Chowdhary, J. Clark, C. Federmann, X. Huang,
M. Junczys-Dowmunt, W. Lewis, M. Li, S. Liu, T.-Y. Liu, R. Luo, A. Menezes, T. Qin,
F. Seide, X. Tan, F. Tian, L. Wu, S. Wu, Y. Xia, D. Zhang, Z. Zhang, M. Zhou,
Achieving human parity on automatic chinese to english news translation, arXiv preprint
arXiv:1803.05567 (2018).
[19] Y. Li, H. Su, X. Shen, W. Li, Z. Cao, S. Niu, Dailydialog: A manually labelled multi-turn
dialogue dataset, in: Proceedings of The 8th International Joint Conference on Natural
Language Processing (IJCNLP 2017), 2017.
[20] P. H. Martins, Z. Marinho, A. F. Martins, Sparse text generation, in: Proc. EMNLP, 2020.
Advisor: 洪盟凱    Date of Approval: 2022-01-14
