Master's/Doctoral Thesis 106423058: Detailed Record

Name: Jin-Hao Huang (黃晉豪)    Department: Information Management
Thesis Title: Attention-Assisted Data Augmentation for Text Classification
Related Theses
★ A Web-Based Collaborative Instructional Design Platform: The Case of the Junior High Grade 1-9 Curriculum
★ Applying Content Management Mechanisms to Frequently Asked Questions (FAQ)
★ Applying Mobile Multi-Agent Technology to Course Scheduling Systems
★ A Study of Access Control Mechanisms and Domestic Information Security Regulations
★ On Introducing NFC Mobile Transaction Mechanisms into Credit Card Systems
★ App-Based Recommendation Services in E-Commerce: The Case of Company P
★ Building a Service-Oriented System to Improve Production Processes: The Case of Company W's PMS
★ Planning and Deploying a TSM Platform for NFC Mobile Payment
★ Keyword Marketing for Semiconductor Distributors: The Case of Company G
★ A Study of Domestic Track-and-Field Competition Information Systems: The Case of the 2014 National Intercollegiate Track and Field Open
★ Evaluating the Deployment of a Pallet and Container Tracking System for Airline Ramp Operations: The Case of Company F
★ A Study of Information Security Management Maturity after ISMS Adoption: The Case of Company B
★ Applying Data Mining Techniques to Movie Recommendation: The Case of Online Video Platform F
★ Using BI Visualization Tools for Security Log Analysis: The Case of Company S
★ An Empirical Study of a Real-Time Analysis System for Privileged Account Login Behavior
★ Detecting and Handling Anomalous Mail System Usage: The Case of Company T
  1. This electronic thesis is approved for immediate open access.
  2. The open-access full text is licensed only for personal, non-commercial retrieval, reading, and printing for academic research purposes.
  3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese): Data augmentation has been shown to improve model prediction accuracy in many research fields. In natural language processing, however, most augmentation methods are largely random or require substantial prior linguistic knowledge. This study therefore proposes a data augmentation method guided by an attention mechanism and examines whether attention affects data augmentation for text classification. The experimental results confirm that the proposed method effectively improves classifier accuracy, raising it by 10% with only 500 training examples.
Abstract (English): Data augmentation is a strategy for increasing the quantity of data in order to improve model performance. Although this strategy is widely used in natural language processing, current text augmentation methods either involve considerable randomness or require many human-defined rules. In this work, we propose a novel approach that augments data according to attention weights, requiring no human-defined rules while avoiding that randomness. Our approach increases the accuracy of the classifier and demonstrates the feasibility of using attention weights as a basis for data augmentation.
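The method's details live in the thesis full text, not in this record, but the abstract's core idea can be sketched. The following is a minimal illustrative sketch, not the thesis implementation: it assumes, in the spirit of the TF-IDF-assisted replacement surveyed in Section 2-3-4, that tokens receiving low attention weight from a trained classifier (such as the hierarchical attention network of Section 3-2) can be replaced with synonyms, while the high-attention words that carry the label are kept. The function name attention_augment, the synonym table, and the threshold value are all hypothetical.

    import random

    def attention_augment(tokens, attn_weights, synonyms, threshold=0.1, seed=0):
        """Replace low-attention tokens with synonyms (illustrative sketch).

        tokens       -- word strings for one sentence
        attn_weights -- one attention weight per token, assumed to come from
                        a trained attention-based classifier
        synonyms     -- hypothetical word-to-candidates table (e.g., WordNet)
        threshold    -- weights below this mark a token as replaceable
        """
        rng = random.Random(seed)
        augmented = []
        for token, weight in zip(tokens, attn_weights):
            candidates = synonyms.get(token, [])
            if weight < threshold and candidates:
                augmented.append(rng.choice(candidates))  # swap a low-attention word
            else:
                augmented.append(token)  # keep words the classifier attends to
        return augmented

    # Toy usage with made-up weights and a made-up synonym table.
    tokens = ["the", "movie", "was", "absolutely", "wonderful"]
    weights = [0.02, 0.10, 0.03, 0.25, 0.60]  # attention peaks on "wonderful"
    synonyms = {"the": ["this"], "was": ["is"], "movie": ["film"]}
    print(attention_augment(tokens, weights, synonyms))
    # -> ['this', 'movie', 'is', 'absolutely', 'wonderful']

How such a threshold should be set is studied in the thesis itself (Experiment 1, Section 4-3-1); the toy values above are for illustration only.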
Keywords (Chinese) ★ Data augmentation (資料增益)
★ Text classification (文本分類)
★ Natural language processing (自然語言處理)
★ Attention mechanism (注意力機制)
Keywords (English) ★ Data augmentation
★ Attention Mechanism
★ Text classification
★ Natural language processing
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Organization
2. Literature Review
2-1 Image Data Augmentation
2-1-1 Augmentation by Transforming the Original Image
2-1-2 Deep Learning-Based Augmentation
2-2 Text Data Augmentation
2-2-1 Original-Sentence Augmentation
2-2-2 Back-Translation
2-3 Augmentation Assisted by External Knowledge
2-3-1 Stop-Word-List-Assisted Augmentation
2-3-2 LDA (Latent Dirichlet Allocation)-Assisted Augmentation
2-3-3 Part-of-Speech-Assisted Augmentation
2-3-4 TF-IDF-Assisted Augmentation
2-4 Attention Mechanism
2-4-1 Development of the Attention Mechanism
3. Research Method
3-1 Research Process
3-2 Hierarchical Attention Mechanism
3-3 Attention-Based Data Augmentation
3-4 Downstream Classifier
4. Experimental Design and Analysis
4-1 Preprocessing, Datasets, and Downstream Models
4-2 Experimental Environment
4-3 Experimental Design and Results
4-3-1 Experiment 1: Setting the Attention Threshold
4-3-2 Experiment 2: Attention-Based vs. Linguistic-Knowledge-Based Augmentation
4-3-3 Experiment 3: Performance of Attention across Augmentation Methods
4-3-4 Experiment 4: Attention-Generated Data vs. Original Data
4-4 Summary of Experiments
5. Conclusions and Future Directions
5-1 Conclusions
5-2 Research Limitations
5-3 Future Research Directions
References
Advisor: She-Jen Lin (林熙禎)    Approval Date: 2020-7-28