Master's/Doctoral Thesis 109423015: Detailed Record




Name Kai-Chih Chuang (莊凱智)   Graduate Department Information Management
Thesis Title
GAN²: Fuse IntraGAN with OuterGAN for Text Generation
Related Theses
★ Taiwan 50 Trend Analysis: Prediction Based on a Multi-LSTM Model Architecture ★ Gold Price Prediction and Analysis Based on Multiple Recurrent Neural Network Models
★ Incremental Learning for Defect Detection in Industry 4.0 ★ A Study on Recurrent Neural Networks for Predicting Computer Component Sales Prices
★ A Study on LSTM Networks for Phishing Website Prediction ★ A Study on Deep-Learning-Based Frequency-Hopping Signal Recognition
★ Opinion Leader Discovery in Dynamic Social Networks ★ Applying Deep Learning Models to Virtual Metrology of Machines in Industry 4.0
★ A Novel NMF-Based Movie Recommendation with Time Decay ★ Category-Based Sequence-to-Sequence Model for POI Travel Itinerary Recommendation
★ A DQN-Based Reinforcement Learning Model for Neural Network Architecture Search ★ Neural Network Architecture Optimization Based on Virtual Reward Reinforcement Learning
★ Generative Adversarial Network Architecture Search ★ Neural Network Architecture Search Optimization Using a Progressive Genetic Algorithm
★ Enhanced Model Agnostic Meta Learning with Meta Gradient Memory ★ Stock Price Prediction Using Recurrent Neural Networks Combined with Leading Industrial Wastewater Indicators
Files Full text viewable in the system (available after 2027-7-1)
Abstract (Chinese) Natural language generation models have attracted considerable attention and developed rapidly in recent years, and they have practical commercial applications such as automatic caption generation for images on social media and template generation for news reports. Accordingly, natural language generation places great emphasis on the quality of the generated text and on how closely it resembles human writing style. However, natural language generation currently faces four major problems: unstable training, reward sparsity, mode collapse, and exposure bias. These problems keep the generated text from reaching the expected quality and prevent the model from accurately learning writing style. We therefore propose the GAN² model, an innovative two-level generative adversarial network built by combining IntraGAN and OuterGAN. IntraGAN serves as the generator of OuterGAN and combines beam search with IntraGAN's discriminator to optimize the generated sequence. The sequence generated by IntraGAN is passed to OuterGAN, where an improved comparative discriminator computes the reward, strengthening the signal that guides generator updates and making update information easier to propagate; the model is continuously refined through iterative adversarial training. In addition, a memory mechanism is introduced to stabilize training and optimize performance. This study evaluates the model on three datasets with three evaluation metrics, showing that it outperforms several well-known models and achieves excellent generation quality. The experiments also demonstrate that every technique adopted in the architecture helps improve generation quality. Finally, we examine the influence of the model's parameters and identify the best configuration for optimizing the generated results.
Abstract (English) Natural language generation (NLG) has flourished in recent research and has several commercial applications, such as text descriptions of images on social media and templates for news reports. NLG research concentrates on improving text quality and generating sequences that resemble human writing style. However, NLG suffers from four issues: unstable training, reward sparsity, mode collapse, and exposure bias. These issues degrade text quality and prevent the model from learning an accurate writing style. As a result, we propose a novel GAN² model, a two-level architecture constructed from IntraGAN and OuterGAN on the basis of generative adversarial networks (GANs). IntraGAN serves as the generator of OuterGAN and employs beam search together with its own discriminator to optimize the generated sequence. The generated sequence is then passed to OuterGAN, where an improved comparative discriminator computes the reward, strengthening the reward signal and making generator updates easier. The two networks are refined through iterative adversarial training. Moreover, we introduce a memory mechanism that stabilizes the training process and improves training efficiency. We conduct experiments on three datasets with three evaluation metrics. The results show that our model outperforms state-of-the-art baseline models and that each component of our model helps improve text quality. Finally, we discuss the influence of the model's parameters and find the best configuration for improving the generated results.
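To make the two-level setup described in the abstract more concrete, the following is a minimal, illustrative PyTorch sketch of nested adversarial text generation with a REINFORCE-style reward taken from a comparative discriminator. All names here (IntraGenerator, ComparativeDiscriminator, the vocabulary size, the random placeholder corpus) are assumptions for illustration only and do not reflect the thesis implementation; the inner beam search, IntraGAN's own discriminator, and the memory mechanism are omitted for brevity.

```python
# Sketch (not the thesis code): an LSTM generator stands in for IntraGAN, an
# outer comparative discriminator scores generated vs. real sequences, and the
# score is used as a REINFORCE-style reward to update the generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, SEQ_LEN, BATCH = 1000, 32, 64, 20, 16

class IntraGenerator(nn.Module):
    """Autoregressive LSTM generator standing in for IntraGAN."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def sample(self, batch):
        tok = torch.zeros(batch, 1, dtype=torch.long)   # <bos> token id 0
        state, seq, logps = None, [], []
        for _ in range(SEQ_LEN):
            h, state = self.lstm(self.emb(tok), state)
            dist = torch.distributions.Categorical(logits=self.out(h[:, -1]))
            tok = dist.sample().unsqueeze(1)
            seq.append(tok)
            logps.append(dist.log_prob(tok.squeeze(1)))
        return torch.cat(seq, 1), torch.stack(logps, 1)  # (B, T), (B, T)

class ComparativeDiscriminator(nn.Module):
    """Scores a (candidate, reference) pair; higher = candidate looks more real."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.enc = nn.LSTM(EMB, HID, batch_first=True)
        self.score = nn.Linear(2 * HID, 1)

    def forward(self, cand, ref):
        _, (hc, _) = self.enc(self.emb(cand))
        _, (hr, _) = self.enc(self.emb(ref))
        return self.score(torch.cat([hc[-1], hr[-1]], dim=-1)).squeeze(-1)

gen, disc = IntraGenerator(), ComparativeDiscriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
real = torch.randint(0, VOCAB, (BATCH, SEQ_LEN))        # placeholder corpus batch

for step in range(3):                                   # iterative adversarial training
    # Outer discriminator update: real should outrank fake and vice versa.
    fake, _ = gen.sample(BATCH)
    d_loss = (F.binary_cross_entropy_with_logits(disc(real, fake.detach()), torch.ones(BATCH))
              + F.binary_cross_entropy_with_logits(disc(fake.detach(), real), torch.zeros(BATCH)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: comparative score acts as a REINFORCE reward.
    fake, logps = gen.sample(BATCH)
    reward = torch.sigmoid(disc(fake, real)).detach()   # (B,)
    g_loss = -(reward.unsqueeze(1) * logps).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    print(f"step {step}: d_loss={d_loss.item():.3f} g_loss={g_loss.item():.3f}")
```

In the actual GAN² design, the plain sampling step would be replaced by beam search guided by IntraGAN's discriminator, and the reward signal would be shaped and stabilized by the memory mechanism; the sketch only shows the reward-driven generator update that the abstract describes.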
Keywords (Chinese) ★ Deep Learning (深度學習)
★ Generative Adversarial Network (生成對抗網路)
★ Natural Language Generation (自然語言生成)
Keywords (English) ★ Deep Learning
★ Generative Adversarial Network
★ Natural Language Generation
Table of Contents Abstract (Chinese) i
Abstract ii
Acknowledgements iii
Table of Contents iv
List of Figures v
List of Tables vi
1. Introduction 1
2. Related Work 8
2.1 Natural Language Generation (NLG) 8
2.2 Addressing NLG problems using Generative Adversarial Network (GAN) 10
3. Proposed Method 13
3.1 GAN2 Framework 13
3.2 Outer-GAN Model 16
3.3 Intra-GAN Model 20
4. Experiments and Evaluation 24
4.1 Evaluation Metrics and Baseline Models 26
4.2 Performance Comparison 31
4.3 Memory Influence Discussion 35
4.4 Reward Setting Analysis 37
4.5 IntraGAN Significance Discussion 40
4.6 Ablation Study 41
4.7 Parameters Setting 47
4.8 Case Study 54
5. Conclusion 59
References 60
References [1] I. Alsmadi, N. Aljaafari, M. Nazzal, S. Alhamed, A. H. Sawalmeh, C. P. Vizcarra, A. Khreishah, M. Anan, A. Algosaibi, M. A. Al-Naeem, A. Aldalbahi, and A. Al-Humam, “Adversarial Machine Learning in Text Processing: A Literature Survey,” IEEE Access, vol. 10, pp. 17043–17077, 2022, doi: 10.1109/ACCESS.2022.3146405.
[2] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein Generative Adversarial Networks,” in Proceedings of the 34th International Conference on Machine Learning, Jul. 2017, pp. 214–223.
[3] A. Belz, “Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models,” Natural Language Engineering, vol. 14, no. 4, pp. 431–455, Oct. 2008, doi: 10.1017/S1351324907004664.
[4] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language Models are Few-Shot Learners,” in Advances in Neural Information Processing Systems, 2020, vol. 33, pp. 1877–1901.
[5] A. Cahill, M. Forst, and C. Rohrer, “Stochastic Realisation Ranking for a Free Word Order Language,” in Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07), Saarbrücken, Germany, Jun. 2007, pp. 17–24.
[6] A. Celikyilmaz, E. Clark, and J. Gao, “Evaluation of Text Generation: A Survey,” arXiv:2006.14799 [cs], May 2021.
[7] T. Che, Y. Li, R. Zhang, R. D. Hjelm, W. Li, Y. Song, and Y. Bengio, “Maximum-Likelihood Augmented Discrete Generative Adversarial Networks,” arXiv:1702.07983 [cs], Feb. 2017.
[8] J. Chen, Y. Wu, C. Jia, H. Zheng, and G. Huang, “Customizable text generation via conditional text generative adversarial network,” Neurocomputing, vol. 416, pp. 125–135, Nov. 2020, doi: 10.1016/j.neucom.2018.12.092.
[9] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, Jun. 2019, pp. 4171–4186. doi: 10.18653/v1/N19-1423.
[10] L. Dong, N. Yang, W. Wang, F. Wei, X. Liu, Y. Wang, J. Gao, M. Zhou, and H.-W. Hon, “Unified Language Model Pre-training for Natural Language Understanding and Generation,” in Advances in Neural Information Processing Systems, 2019, vol. 32.
[11] R. Fathony and N. Goela, “Discrete Wasserstein Generative Adversarial Networks (DWGAN),” presented at the International Conference on Learning Representations, Feb. 2018.
[12] W. Fedus, I. Goodfellow, and A. M. Dai, “MaskGAN: Better Text Generation via Filling in the _______,” presented at the International Conference on Learning Representations, Feb. 2018.
[13] A. Gatt and E. Krahmer, “Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation,” Journal of Artificial Intelligence Research, vol. 61, pp. 65–170, Jan. 2018, doi: 10.1613/jair.5477.
[14] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative Adversarial Nets,” in Advances in Neural Information Processing Systems, 2014, vol. 27.
[15] A. Graves, “Generating Sequences With Recurrent Neural Networks,” arXiv:1308.0850 [cs], Jun. 2014.
[16] J. Guo, S. Lu, H. Cai, W. Zhang, Y. Yu, and J. Wang, “Long Text Generation via Adversarial Training with Leaked Information,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Art. no. 1, Apr. 2018.
[17] T. He, J. Zhang, Z. Zhou, and J. Glass, “Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation?,” in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, Nov. 2021, pp. 5087–5102. doi: 10.18653/v1/2021.emnlp-main.415.
[18] S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, Nov. 1997, doi: 10.1162/neco.1997.9.8.1735.
[19] F. Huang, J. Guan, P. Ke, Q. Guo, X. Zhu, and M. Huang, “A Text GAN for Language Generation with Non-Autoregressive Generator,” presented at the International Conference on Learning Representations, Sep. 2020.
[20] T. D. Kulkarni, K. Narasimhan, A. Saeedi, and J. Tenenbaum, “Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation,” in Advances in Neural Information Processing Systems, 2016, vol. 29.
[21] M. J. Kusner and J. M. Hernández-Lobato, “GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution,” arXiv:1611.04051 [cs, stat], Nov. 2016.
[22] I. Langkilde, “Forest-Based Statistical Sentence Generation,” presented at the ANLP-NAACL 2000, 2000.
[23] C.-Y. Lin, “ROUGE: A Package for Automatic Evaluation of Summaries,” in Text Summarization Branches Out, Barcelona, Spain, Jul. 2004, pp. 74–81.
[24] K. Lin, D. Li, X. He, Z. Zhang, and M. Sun, “Adversarial Ranking for Language Generation,” in Advances in Neural Information Processing Systems, 2017, vol. 30.
[25] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common Objects in Context,” in Computer Vision – ECCV 2014, Cham, 2014, pp. 740–755. doi: 10.1007/978-3-319-10602-1_48.
[26] S. Mangal, P. Joshi, and R. Modak, “LSTM vs. GRU vs. Bidirectional RNN for script generation,” arXiv:1908.04332 [cs], Aug. 2019.
[27] E. Montahaei, D. Alihosseini, and M. Soleymani Baghshah, “DGSAN: Discrete generative self-adversarial network,” Neurocomputing, vol. 448, pp. 364–379, Aug. 2021, doi: 10.1016/j.neucom.2021.03.097.
[28] W. Nie, N. Narodytska, and A. Patel, “RelGAN: Relational Generative Adversarial Networks for Text Generation,” presented at the International Conference on Learning Representations, Sep. 2018.
[29] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, “Bleu: a Method for Automatic Evaluation of Machine Translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, Pennsylvania, USA, Jul. 2002, pp. 311–318. doi: 10.3115/1073083.1073135.
[30] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language Models are Unsupervised Multitask Learners,” 2019.
[31] E. Reiter, R. Dale, S. Bird, and B. Boguraev, Building Natural Language Generation Systems. Cambridge, GBR: Cambridge University Press, 2009.
[32] G. Rizzo and T. H. M. Van, “Adversarial text generation with context adapted global knowledge and a self-attentive discriminator,” Information Processing & Management, vol. 57, no. 6, p. 102217, Nov. 2020, doi: 10.1016/j.ipm.2020.102217.
[33] Z. Shi, X. Chen, X. Qiu, and X. Huang, “Toward Diverse Text Generation with Inverse Reinforcement Learning,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), 2018, pp. 4361–4367.
[34] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to Sequence Learning with Neural Networks,” in Advances in Neural Information Processing Systems, 2014, vol. 27.
[35] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All you Need,” in Advances in Neural Information Processing Systems, 2017, vol. 30.
[36] Q. Wu, L. Li, and Z. Yu, “TextGAIL: Generative Adversarial Imitation Learning for Text Generation,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 16, Art. no. 16, May 2021.
[37] Y. Wu and J. Wang, “Text Generation Service Model Based on Truth-Guided SeqGAN,” IEEE Access, vol. 8, pp. 11880–11886, 2020, doi: 10.1109/ACCESS.2020.2966291.
[38] Y. Yang, X. Dan, X. Qiu, and Z. Gao, “FGGAN: Feature-Guiding Generative Adversarial Networks for Text Generation,” IEEE Access, vol. 8, pp. 105217–105225, 2020, doi: 10.1109/ACCESS.2020.2993928.
[39] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized Autoregressive Pretraining for Language Understanding,” in Advances in Neural Information Processing Systems, 2019, vol. 32.
[40] H. Yin, D. Li, X. Li, and P. Li, “Meta-CoTGAN: A Meta Cooperative Training Paradigm for Improving Adversarial Text Generation,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, Art. no. 05, Apr. 2020, doi: 10.1609/aaai.v34i05.6490.
[41] L. Yu, W. Zhang, J. Wang, and Y. Yu, “SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient,” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31, no. 1, Art. no. 1, Feb. 2017.
[42] W. Zhou, T. Ge, K. Xu, F. Wei, and M. Zhou, “Self-Adversarial Learning with Comparative Discrimination for Text Generation,” presented at the International Conference on Learning Representations, Sep. 2019.
[43] Y. Zhu, S. Lu, L. Zheng, J. Guo, W. Zhang, J. Wang, and Y. Yu, “Texygen: A Benchmarking Platform for Text Generation Models,” in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, New York, NY, USA, 2018, pp. 1097–1100. doi: 10.1145/3209978.3210080.
Advisor Yi-Cheng Chen (陳以錚)   Date of Approval 2022-7-21