Master's/Doctoral Thesis 106423012 Detailed Record
Name: Hao-Cheng Huang (黃晧誠)    Department: Information Management
Thesis Title: 中文筆順預訓練效能之研究 (A Study on the Effectiveness of Stroke Order in Chinese Pre-training)
Related Theses
★ A Web-based Collaborative Instructional Design Platform: The Case of the Junior High Grade 1-9 Curriculum
★ Applying Content Management Mechanisms to Frequently Asked Questions (FAQ)
★ Applying Mobile Multi-Agent Technology to Course Scheduling Systems
★ A Study of Access Control Mechanisms and Domestic Information Security Regulations
★ On Introducing an NFC Mobile Transaction Mechanism into Credit Card Systems
★ App-based Recommendation Services in E-commerce: The Case of Company P
★ Building a Service-Oriented System to Improve Production Processes: The Case of Company W's PMS System
★ Planning and Deployment of a TSM Platform for NFC Mobile Payment
★ Keyword Marketing for Semiconductor Distributors: The Case of Company G
★ A Study of Domestic Track-and-Field Competition Information Systems: The Case of the 2014 National Intercollegiate Track and Field Open
★ Evaluating the Deployment of a Ramp-Operations ULD Tracking Management System for Airline Ground Handling: The Case of Company F
★ A Study of Information Security Management Maturity under an Information Security Management System: The Case of Company B
★ Applying Data Mining Techniques to Movie Recommendation: The Case of Online Video Platform F
★ Applying BI Visualization Tools to Security Log Analysis: The Case of Company S
★ An Empirical Study of a Real-time Analysis System for Privileged Account Login Behavior
★ Detection and Handling of Anomalous Email System Usage: The Case of Company T
File: Full text not available in this system (access permanently restricted)
Abstract (Chinese): Pre-training is extremely important in natural language processing, yet recent transfer-learning research on Chinese is comparatively scarce, and most existing work relies on feature-based models with static embeddings. This study therefore proposes incorporating a deeper feature of Chinese, stroke order, into the input dimensions so that sub-character features can be learned. Building on the recently proposed feature-based pre-training model ELMo and the fine-tuning pre-training model BERT, we modify both to examine the effect of stroke order on Chinese pre-training, proposing the ELMo+S and BERT+S models, which capture stroke features with a convolutional neural network. Finally, the models are evaluated on the downstream XNLI and LCQMC datasets; the results show that stroke features do not noticeably help either pre-training model.
Abstract (English): Pre-training is extremely important in natural language processing. However, Chinese studies on transfer learning remain scarce, and most of them use feature-based methods with static embeddings. This study therefore proposes using a deeper Chinese feature, strokes, integrating them into the input dimensions to learn sub-character characteristics, building on the recently proposed pre-training models ELMo (a feature-based method) and BERT (a fine-tuning method). We propose the ELMo+S and BERT+S models, which capture stroke features with a convolutional neural network. Finally, the results show that stroke features are not significantly helpful for these two pre-training models on the downstream XNLI and LCQMC datasets.
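
For illustration, the following is a minimal PyTorch sketch of the stroke-feature (+S) input layer the abstracts describe: each character's stroke-order sequence is embedded, run through a convolutional neural network, max-pooled into a sub-character feature vector, and combined with the ordinary character embedding. The class name StrokeCNNEmbedding, all dimensions, the stroke-class count, and the kernel sizes are illustrative assumptions, not the thesis's actual configuration.

    import torch
    import torch.nn as nn

    class StrokeCNNEmbedding(nn.Module):
        """Character embedding plus a CNN over each character's stroke-order
        sequence (a hypothetical sketch of the +S input layer)."""
        def __init__(self, num_stroke_classes=5, vocab_size=21128,
                     stroke_dim=32, char_dim=128, kernel_sizes=(2, 3, 4)):
            super().__init__()
            # Index 0 is reserved for padding short stroke sequences;
            # five basic stroke classes is one common scheme (as in cw2vec).
            self.stroke_emb = nn.Embedding(num_stroke_classes + 1, stroke_dim,
                                           padding_idx=0)
            self.char_emb = nn.Embedding(vocab_size, char_dim)
            # One 1-D convolution per kernel size; outputs are max-pooled
            # over the stroke axis and concatenated.
            out_per_conv = char_dim // len(kernel_sizes)
            self.convs = nn.ModuleList(
                nn.Conv1d(stroke_dim, out_per_conv, k) for k in kernel_sizes
            )
            self.proj = nn.Linear(out_per_conv * len(kernel_sizes), char_dim)

        def forward(self, char_ids, stroke_ids):
            # char_ids:   (batch, seq_len)              one id per character
            # stroke_ids: (batch, seq_len, max_strokes) 0-padded stroke ids,
            #             with max_strokes >= the largest kernel size
            b, t, s = stroke_ids.shape
            x = self.stroke_emb(stroke_ids.view(b * t, s))  # (b*t, s, stroke_dim)
            x = x.transpose(1, 2)                           # (b*t, stroke_dim, s)
            pooled = [conv(x).max(dim=2).values for conv in self.convs]
            stroke_feat = self.proj(torch.cat(pooled, dim=1)).view(b, t, -1)
            # Combine the sub-character (stroke) features with the
            # character embedding before feeding the ELMo/BERT encoder.
            return self.char_emb(char_ids) + stroke_feat

Combining by addition is only one plausible choice; the thesis's ELMo+S and BERT+S may concatenate the stroke vector with the input embedding instead, and Experiment 2 in the table of contents suggests the kernel sizes were themselves a tuned variable.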
Keywords (Chinese): ★ 預訓練 (pre-training)
★ 表徵 (representation)
★ 自然語言處理 (natural language processing)
★ 中文 (Chinese)
★ 筆順 (stroke order)
Keywords (English): ★ Pre-training
★ Representation
★ Natural language processing
★ Chinese
★ Stroke
Table of Contents:
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Figures vii
List of Tables ix
1. Introduction 1
1-1 Research Background 1
1-2 Research Motivation 2
1-3 Research Objectives 4
1-4 Thesis Organization 5
2. Related Work 6
2-1 Feature Extraction Models 6
2-1-1 CNN 6
2-1-2 LSTM 10
2-1-3 Transformer 14
2-2 Pre-training 18
2-2-1 Feature-based Approaches 18
2-2-2 Fine-tuning Approaches 20
2-3 Chinese 23
2-3-1 Feature-based Approaches 23
2-3-2 Representations 24
2-4 Summary 25
3. Research Method 26
3-1 Research Framework 26
3-2 Data Preprocessing 27
3-2-1 Simplified/Traditional Conversion 27
3-2-2 Stroke Order 27
3-3 Pre-training Models 28
3-3-1 ELMo+S 28
3-3-2 BERT+S 30
3-4 Downstream Task Models 30
3-4-1 ELMo Downstream Model 30
3-4-2 BERT Downstream Model 32
3-5 Model Evaluation 33
4. Experiments and Results 34
4-1 Preprocessing and Datasets 34
4-1-1 Stroke-Order Mapping Table 34
4-1-2 Vocabulary 35
4-1-3 External Pre-training Corpora 35
4-1-4 Downstream Task Datasets 36
4-1-5 Stroke-Sequence Lengths of Each Dataset 38
4-2 Experimental Environment 43
4-3 Experimental Design and Results 44
4-3-1 Experiment 1: Effect of Simplified vs. Traditional Script and Stroke-Sequence Length 44
4-3-2 Experiment 2: Effect of the CNN and Kernel Size 47
4-3-3 Experiment 3: Effect of Highway Networks 48
4-3-4 Experiment 4: Effect of Stroke Order on Pre-trained Models 49
5. Conclusions and Future Work 52
5-1 Conclusions 52
5-2 Research Limitations 53
5-3 Future Research Directions 54
References 55
English References 55
References:
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. ArXiv:1607.06450 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1607.06450
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135–146. https://doi.org/10.1162/tacl_a_00051
Bonaccorso, G., Fandango, A., & Shanmugamani, R. (2018). Python Advanced Guide to Artificial Intelligence: Expert machine learning systems and intelligent agents using Python. Packt Publishing.
Botha, J., & Blunsom, P. (2014). Compositional morphology for word representations and language modelling. International Conference on Machine Learning, 1899–1907.
Cer, D., Yang, Y., Kong, S., Hua, N., Limtiaco, N., John, R. S., … Kurzweil, R. (2018). Universal Sentence Encoder. ArXiv:1803.11175 [Cs]. Retrieved from http://arxiv.org/abs/1803.11175
Chen, X., Xu, L., Liu, Z., Sun, M., & Luan, H. (2015). Joint Learning of Character and Word Embeddings. Twenty-Fourth International Joint Conference on Artificial Intelligence, 1236–1242. IJCAI.
Collobert, R., & Weston, J. (2008). A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. Proceedings of the 25th International Conference on Machine Learning. ACM, 8.
Conneau, A., & Kiela, D. (2018). SentEval: An Evaluation Toolkit for Universal Sentence Representations. ArXiv:1803.05449 [Cs]. Retrieved from http://arxiv.org/abs/1803.05449
Conneau, A., Lample, G., Rinott, R., Williams, A., Bowman, S. R., Schwenk, H., & Stoyanov, V. (2018). XNLI: Evaluating Cross-lingual Sentence Representations. ArXiv:1809.05053 [Cs]. Retrieved from http://arxiv.org/abs/1809.05053
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv:1810.04805 [Cs]. Retrieved from http://arxiv.org/abs/1810.04805
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Howard, J., & Ruder, S. (2018). Universal Language Model Fine-tuning for Text Classification. ArXiv:1801.06146 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1801.06146
Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1746–1751. https://doi.org/10.3115/v1/D14-1181
Kiros, R., Zhu, Y., Salakhutdinov, R., Zemel, R. S., Torralba, A., Urtasun, R., & Fidler, S. (2015). Skip-Thought Vectors. ArXiv:1506.06726 [Cs]. Retrieved from http://arxiv.org/abs/1506.06726
LeCun, Y. (1989). Generalization and network design strategies. In Connectionism in perspective (Vol. 19). Citeseer.
Li, Y., Li, W., Sun, F., & Li, S. (2015). Component-Enhanced Chinese Character Embeddings. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 829–834. https://doi.org/10.18653/v1/D15-1098
Liu, X., Chen, Q., Deng, C., Zeng, H., Chen, J., Li, D., & Tang, B. (2018). LCQMC: A Large-scale Chinese Question Matching Corpus. Proceedings of the 27th International Conference on Computational Linguistics, 1952–1962. Retrieved from http://www.aclweb.org/anthology/C18-1166
Luong, T., Socher, R., & Manning, C. (2013). Better word representations with recursive neural networks for morphology. Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 104–113.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv Preprint ArXiv:1301.3781.
Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. Retrieved from http://www.aclweb.org/anthology/D14-1162
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. ArXiv:1802.05365 [Cs]. Retrieved from http://arxiv.org/abs/1802.05365
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. OpenAI Technical Report.
Rücklé, A., Eger, S., Peyrard, M., & Gurevych, I. (2018). Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations. ArXiv:1803.01400 [Cs]. Retrieved from http://arxiv.org/abs/1803.01400
Cao, S., Lu, W., Zhou, J., & Li, X. (2018). cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18).
Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Highway Networks. ArXiv:1505.00387 [Cs]. Retrieved from http://arxiv.org/abs/1505.00387
Su, T., & Lee, H. (2017). Learning Chinese Word Representations From Glyphs Of Characters. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 264–273. https://doi.org/10.18653/v1/D17-1025
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems 30 (pp. 5998–6008). Retrieved from http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
Williams, A., Nangia, N., & Bowman, S. (2018). A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1112–1122. https://doi.org/10.18653/v1/N18-1101
Wu, W., Meng, Y., Han, Q., Li, M., Li, X., Mei, J., … Li, J. (2019). Glyce: Glyph-vectors for Chinese Character Representations. ArXiv:1901.10125 [Cs]. Retrieved from http://arxiv.org/abs/1901.10125
Yin, R., Wang, Q., Li, P., Li, R., & Wang, B. (2016). Multi-Granularity Chinese Word Embedding. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 981–986. Retrieved from https://aclweb.org/anthology/D16-1100
Yu, J., Jian, X., Xin, H., & Song, Y. (2017). Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 286–291. Retrieved from https://www.aclweb.org/anthology/D17-1027
Yu, S., Kulkarni, N., Lee, H., & Kim, J. (2017). Syllable-level neural language model for agglutinative language. ArXiv Preprint ArXiv:1708.05515.
Zhuang, H., Wang, C., Li, C., Li, Y., Wang, Q., & Zhou, X. (2018). Chinese Language Processing Based on Stroke Representation and Multidimensional Representation. IEEE Access, 6, 41928–41941. https://doi.org/10.1109/ACCESS.2018.2860058
Advisor: 林熙禎    Review Date: 2019-7-19