References
[1] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align
and translate, arXiv preprint arXiv:1409.0473 (2016).
[2] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin,
Attention is all you need, arXiv preprint arXiv:1706.03762 (2017).
[3] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional
transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
[4] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever, Improving language understanding
by generative pre-training (2018).
[5] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language models are
unsupervised multitask learners (2019).
[6] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan,
P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan,
R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler,
M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever,
D. Amodei, Language models are few-shot learners (2020).
[7] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov,
L. Zettlemoyer, BART: Denoising sequence-to-sequence pre-training for natural language
generation, translation, and comprehension, arXiv preprint arXiv:1910.13461 (2019).
[8] Y. Liu, J. Gu, N. Goyal, X. Li, S. Edunov, M. Ghazvininejad, M. Lewis, L. Zettlemoyer,
Multilingual denoising pre-training for neural machine translation, arXiv preprint
arXiv:2001.08210 (2020).
[9] Y. Tang, C. Tran, X. Li, P.-J. Chen, N. Goyal, V. Chaudhary, J. Gu, A. Fan, Multilingual
translation with extensible multilingual pretraining and finetuning, arXiv preprint
arXiv:2008.00401 (2020).
[10] Y. Zhang, S. Sun, M. Galley, Y.-C. Chen, C. Brockett, X. Gao, J. Gao, J. Liu, B. Dolan,
DialoGPT: Large-scale generative pre-training for conversational response generation, in:
ACL, system demonstration, 2020.
[11] A. Conneau, K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave,
M. Ott, L. Zettlemoyer, V. Stoyanov, Unsupervised cross-lingual representation learning
at scale, arXiv preprint arXiv:1911.02116 (2020).
[12] Z. Zhang, X. Han, H. Zhou, P. Ke, Y. Gu, D. Ye, Y. Qin, Y. Su, H. Ji, J. Guan, F. Qi,
X. Wang, Y. Zheng, G. Zeng, H. Cao, S. Chen, D. Li, Z. Sun, Z. Liu, M. Huang, W. Han,
J. Tang, J. Li, X. Zhu, M. Sun, CPM: A large-scale generative Chinese pre-trained language
model, arXiv preprint arXiv:2012.00413 (2020).
[13] L. Shang, Z. Lu, H. Li, Neural responding machine for short-text conversation, arXiv
preprint arXiv:1503.02364 (2015).
[14] Y. Wang, P. Ke, Y. Zheng, K. Huang, Y. Jiang, X. Zhu, M. Huang, You impress me:
Dialogue generation via mutual persona perception, arXiv preprint arXiv:2008.03946 (2020).
[15] M. Freitag, Y. Al-Onaizan, Beam search strategies for neural machine translation, arXiv
preprint arXiv:1702.01806 (2017).
[16] M. Post, A call for clarity in reporting BLEU scores, arXiv preprint arXiv:1804.08771
(2018).
[17] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, BLEU: A method for automatic evaluation
of machine translation, in: Proceedings of the 40th Annual Meeting on Association for
Computational Linguistics, ACL ’02, Association for Computational Linguistics,
USA, 2002, pp. 311–318. URL: https://doi.org/10.3115/1073083.1073135.
doi:10.3115/1073083.1073135.
[18] H. Hassan, A. Aue, C. Chen, V. Chowdhary, J. Clark, C. Federmann, X. Huang,
M. Junczys-Dowmunt, W. Lewis, M. Li, S. Liu, T.-Y. Liu, R. Luo, A. Menezes, T. Qin,
F. Seide, X. Tan, F. Tian, L. Wu, S. Wu, Y. Xia, D. Zhang, Z. Zhang, M. Zhou,
Achieving human parity on automatic Chinese to English news translation, arXiv preprint
arXiv:1803.05567 (2018).
[19] Y. Li, H. Su, X. Shen, W. Li, Z. Cao, S. Niu, DailyDialog: A manually labelled multi-turn
dialogue dataset, in: Proceedings of the 8th International Joint Conference on Natural
Language Processing (IJCNLP 2017), 2017.
[20] P. H. Martins, Z. Marinho, A. F. Martins, Sparse text generation, in: Proc. EMNLP, 2020.