Multi-Stack Convolution with Gating Mechanism for Chinese Named Entity Recognition

Name

Chih-Hao Chang (張智皓)

Department

In-service Master's Program, Department of Computer Science and Information Engineering

Thesis Title

Multi-Stack Convolution with Gating Mechanism for Chinese Named Entity Recognition
(應用門控機制與多層卷積深度學習模型於中文命名實體辨識之研究)

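This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.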

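Abstract (translated from Chinese)

Traditional machine-learning-based Chinese named entity recognition (NER) systems typically extract a large number of hand-crafted features from Chinese text, sometimes supplemented by expert-designed, entity-specific keyword dictionaries, and then use linear statistical and probabilistic models to select the important features and derive Chinese semantic rules. This approach has two obvious drawbacks. First, extracting features from large volumes of Chinese text is time-consuming, laborious, and complex. Second, model quality depends entirely on the discriminative power of the hand-designed features. As a result, the semantic ambiguity of Chinese and the presence of unknown words make it difficult to raise precision.
Languages also differ in structure: English uses spaces to mark word boundaries, whereas Chinese has no explicit word delimiter. Chinese characters and words are nonetheless strongly interdependent, and their meaning varies with the surrounding context (the same character with different meanings, or one word with several senses). Recognizing Chinese named entities in a large corpus is therefore both highly challenging and full of potential.
To address these challenges and drawbacks, this study builds a Chinese NER system on a deep learning architecture. First, a deep learning model pre-trains a character and word embedding dictionary on a large body of text through unsupervised learning. The dictionary maps characters and words to numeric vectors; multiple stacked convolution layers then extract textual features hierarchically, with a gating mechanism between layers to generalize the features. Feature information is thus extracted automatically without any feature engineering. The aim is to reduce the dependence of NER on hand-crafted features and to remove the need to design Chinese-specific recognition features. The method is applied effectively to recognizing entity types.
The training data consist of the SIGHAN Bakeoff-3 corpus [1] and online articles collected with a custom web crawler. Electronic copies of printed newspaper articles [31] serve as test data and as the benchmark for evaluating each model. In the experiments, the proposed model achieves an F1-measure of 90.76% overall on SIGHAN and 90.42% on the newspaper data.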

Abstract (English)

Traditional machine-learning approaches to Chinese named entity recognition usually rely on large numbers of hand-crafted features, and even on entity-specific dictionaries created by experts, and then use linear statistical and probabilistic models to select important features and derive Chinese semantic rules. Two obvious flaws can be observed. First, extracting features from Chinese text is extremely time-consuming and complicated. Second, the usefulness of the models depends entirely on the discriminative power of the hand-crafted features. As a result, the semantic ambiguity characteristic of Chinese and the presence of unknown words make it difficult to improve precision.
English uses spaces to separate words, whereas Chinese has no explicit word delimiter. Chinese words are nevertheless highly interdependent and take on different meanings depending on context (homographs, polysemy). Recognizing Chinese named entities in large corpora is therefore both a great challenge and an opportunity.
To address these challenges and flaws, this study uses a deep learning architecture for Chinese named entity recognition. First, a deep learning model pre-trains a character and word embedding vocabulary on a large amount of text through unsupervised learning. The vocabulary then maps characters and words to numeric vectors, and multi-stack convolution layers extract textual features hierarchically. A gating mechanism between layers generalizes the features, so that features are extracted automatically without any feature engineering. The aim is to reduce the dependence of named entity recognition on hand-crafted features and to avoid designing Chinese-specific recognition features. The method can be applied effectively to recognizing different entity types.
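The embedding lookup, stacked convolution, and gating steps described above can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation: it assumes the gate takes the gated-linear-unit form of Dauphin et al. [14], h = (X*W + b) ⊗ σ(X*V + c), and the toy vocabulary, dimensions, kernel width, random initialization, and all function names (`embed`, `conv1d_same`, `gated_conv1d`, `stacked_gated_conv`, `init_layer`) are hypothetical choices made for the example.

```python
import numpy as np

def embed(chars, vocab, table):
    """Map each character to its embedding row; unknown characters use index 0."""
    ids = [vocab.get(ch, 0) for ch in chars]
    return table[ids]                          # shape: (seq_len, emb_dim)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_same(x, w, b):
    """1-D convolution along the sequence with zero padding (odd kernel width).

    x: (seq_len, d_in), w: (kernel, d_in, d_out), b: (d_out,)
    """
    kernel = w.shape[0]
    pad = kernel // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))   # zeros before and after the sequence
    out = np.empty((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        window = padded[t:t + kernel]          # (kernel, d_in)
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return out

def gated_conv1d(x, params):
    """Assumed gate: a linear path scaled elementwise by a sigmoid gate."""
    linear = conv1d_same(x, params["W"], params["b"])
    gate = sigmoid(conv1d_same(x, params["V"], params["c"]))
    return linear * gate

def stacked_gated_conv(x, layers):
    """Apply the gated convolution layers one after another."""
    for params in layers:
        x = gated_conv1d(x, params)
    return x

def init_layer(rng, kernel, d_in, d_out):
    """Random parameters for one gated layer (illustrative initialization only)."""
    scale = 1.0 / np.sqrt(kernel * d_in)
    return {
        "W": rng.normal(scale=scale, size=(kernel, d_in, d_out)),
        "b": np.zeros(d_out),
        "V": rng.normal(scale=scale, size=(kernel, d_in, d_out)),
        "c": np.zeros(d_out),
    }

rng = np.random.default_rng(0)
vocab = {"<unk>": 0, "中": 1, "央": 2, "大": 3, "學": 4}   # toy vocabulary
emb_dim, hidden, kernel = 8, 16, 3
table = rng.normal(size=(len(vocab), emb_dim))            # stands in for pre-trained embeddings
layers = [init_layer(rng, kernel, emb_dim, hidden),
          init_layer(rng, kernel, hidden, hidden)]

features = stacked_gated_conv(embed("中央大學", vocab, table), layers)
print(features.shape)  # (4, 16)
```

Each row of `features` is a per-character vector. In a tagger, these vectors would be passed on to further layers that predict an entity label for each character.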
This study uses the SIGHAN Bakeoff-3 corpus and internet articles collected with a customized crawler as training data. Electronic files of newspaper articles are used as test data and as the benchmark for evaluating the models. The results show that the proposed model reaches an F1-measure of 90.76% overall on SIGHAN and 90.42% on the newspaper articles.
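For reference, the F1-measure is the harmonic mean of precision P and recall R, F1 = 2PR / (P + R). The sketch below computes it at the entity level for a single BIO-tagged sequence. The BIO tagging scheme, the exact-match criterion, the toy tag sequences, and the helper names (`extract_entities`, `f1_score`) are assumptions for illustration, since the abstract does not state how matches are scored in the thesis.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    entities, start, etype = set(), None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, etype))    # end index is exclusive
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities

def f1_score(gold_tags, pred_tags):
    """Entity-level precision, recall, and F1 with exact span and type match."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy example: one of three predicted entities matches one of two gold entities.
gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG"]
p, r, f = f1_score(gold, pred)
print(f"{p:.2f} {r:.2f} {f:.2f}")  # 0.33 0.50 0.40
```

A corpus-level score would pool the correct, predicted, and gold entity counts over all sentences before computing the ratios, rather than averaging per-sentence F1 values.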

Keywords

★ Deep Learning (深度學習)
★ Named Entity Recognition (命名實體辨識)
★ Convolutional Neural Networks (卷積神經網路)
★ Gating Mechanism (門控機制)

Table of Contents

Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Motivation
1.2 Objectives and Contributions
1.3 Chapter Overview
2. Related Work
3. System Architecture
3.1 Data Preparation
3.2 Natural Language Processing (NLP)
3.2.1 Data Processing
3.2.2 Data Analysis
3.2.3 Character/Word Embedding
3.3 Deep Learning Architecture
3.3.1 Input Layer
3.3.2 Embedding Layer
3.3.3 Stacked Convolution Layer
3.3.4 Recurrent Neural Network Layer
3.3.5 Output Layer
4. Experiments and System Performance
4.1 Evaluation Method
4.2 Model Evaluation
4.3 Factor Analysis and Discussion
5. Conclusion and Future Work
References

References

[1] Levow, G.A.: The third international Chinese language processing bakeoff: word segmentation and named entity recognition. In: Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing, pp. 108–117 (2006).
[2] Sunita Sarawagi, “Information Extraction,” Foundations and Trends® in Databases, pp. 261–377, 2008.
[3] L. Satish and B.I. Gururaj. 1993. Use of hidden Markov models for partial discharge pattern classification. Electrical Insulation, IEEE Transactions on 28, 2 (Apr 1993), 172–182.
[4] Gideon S. Mann and Andrew McCallum. 2010. Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data. J. Mach. Learn. Res. 11 (March 2010), 955–984.
[5] Andrew McCallum and Wei Li. 2003. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-enhanced Lexicons. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 -Volume 4 (CONLL ’03). Association for Computational Linguistics, Stroudsburg, PA, USA, 188–191.
[6] Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF Models for Sequence Tagging [OL]. arXiv preprint arXiv:1508.01991.
[7] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of EMNLP-2014, pages 1532–1543, Doha, Qatar, October.
[8] Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[9] Ronan Collobert, Jason Weston, Leon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. The Journal of Machine Learning Research, 12:2493–2537.
[10] Nanyun Peng and Mark Dredze. 2015. Named entity recognition for chinese social media with jointly trained embeddings. In Proceedings of EMNLP-2015, pages 548–554, Lisbon, Portugal, September.
[11] Shen, Y., Yun, H., Lipton, Z., Kronrod, Y., & Anandkumar, A. (2018). Deep active learning for named entity recognition. arXiv preprint arXiv:1707.05928v3.
[12] Ma X, Hovy E. End-to-End Sequence Labeling via Bi-directional LSTM-CNNs-CRF [OL]. (2016) arXiv preprint arXiv:1603.01354.
[13] Glorot, Xavier and Bengio, Yoshua. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2010.
[14] Yann N. Dauphin, Angela Fan, Michael Auli, and David Grangier. 2016. Language modeling with gated convolutional networks. arXiv preprint arXiv:1612.08083.
[15] Wang, C., and Xu, B. (2017). Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation. arXiv preprint arXiv:1711.04411.
[16] Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
[17] Lev Ratinov and Dan Roth. 2009. Design challenges and misconceptions in named entity recognition. In Proceedings of CoNLL-2009, pages 147–155.
[18] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. 2016. Neural architectures for named entity recognition. In Proceedings of NAACL-2016, San Diego, California, USA, June.
[19] Joohui An, Seungwoo Lee, and Gary Geunbae Lee. 2003. Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics -Volume 2 (ACL’03). Association for Computational Linguistics, Stroudsburg, PA, USA, 165–168.
[20] Salton, G., Wong, A., Yang, C. S., “A Vector Space Model for Automatic Indexing,” Commun. ACM, vol. 18, 1975, pp. 613–620.
[21] L. Bottou. Stochastic gradient learning in neural networks. In Proceedings of Neuro-Nîmes. EC2, 1991.
[22] Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne Hubbard, and Lawrence D Jackel. 1989. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4):541–551.
[23] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” Advances in Neural Information Processing Systems. 2012.
[24] Sepp Hochreiter, Jürgen Schmidhuber, “Long Short-Term Memory”, in Neural Computation 9(8):1735-80, December 1997.
[25] Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.
[26] John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), pages 282–289.
[27] TensorFlow, https://www.tensorflow.org/
[28] CRF++: Yet Another CRF toolkit, http://crfpp.sourceforge.net/
[29] Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. JMLR 15(1):1929–1958.
[30] Chuanhai Dong, Jiajun Zhang, Chengqing Zong, Masanori Hattori, and Hui Di. 2016. Character based LSTM-CRF with radical-level features for Chinese named entity recognition. In International Conference on Computer Processing of Oriental Languages. Springer, pages 239–250.
[31] Y. Y. Huang, C.H. Chung, “A Tool for Web NER Model Generation Based on Google Snippets,” Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, pp. 148–163, 2015.
[32] Luong, T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 1412-1421).
[33] Jieba, https://github.com/fxsjy/jieba
[34] Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh Annual Conference of the International Speech Communication Association.
[35] Daqian Wei, Bo Wang, Gang Lin, Dichen Liu, Zhaoyang Dong, Hesen Liu, and Yilu Liu. Research on unstructured text data mining and fault classification based on rnn-lstm with malfunction inspection report. Energies, 10(3), 2017.

Advisor

Chia-Hui Chang (張嘉惠)

Approval Date

2018-9-13
