DC Field | Value | Language
dc.contributor | 資訊管理學系 | zh_TW
dc.creator | 黃晧誠 | zh_TW
dc.creator | Hao-Cheng Huang | en_US
dc.date.accessioned | 2019-07-19T07:39:07Z | |
dc.date.available | 2019-07-19T07:39:07Z | |
dc.date.issued | 2019 | |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=106423012 | |
dc.contributor.department | 資訊管理學系 | zh_TW |
dc.description | 國立中央大學 | zh_TW
dc.description | National Central University | en_US
dc.description.abstract | 預訓練(Pre-training)在自然語言處理極為重要,然而中文在較新的自然語言處理遷移學習研究較少,且多數是基於特徵及靜態嵌入方法之模型,因此本研究提出利用中文更深層的特徵——筆順,納入輸入維度以學習子字元之特徵,並以近期提出基於特徵方法 ELMo 及微調方法 BERT 的預訓練模型為基礎進行修改,試探討筆順對於中文預訓練模型的影響,提出利用卷積類神經網路模型考量筆順特徵(Stroke)之 ELMo+S 及 BERT+S 模型。最後,使用下游任務 XNLI 及 LCQMC 資料集進行評估,結果顯示筆順特徵對於這兩種預訓練模型並無明顯幫助。 | zh_TW
dc.description.abstract | Pre-training is extremely important in natural language processing. However, there are relatively few studies on transfer learning for Chinese, and most of them use feature-based and static embedding methods. Therefore, this study proposes to use a deeper feature of Chinese, the stroke, incorporating it into the input dimensions to learn sub-character features, and modifies the recently proposed pre-training models ELMo (a feature-based method) and BERT (a fine-tuning method) to examine how strokes affect Chinese pre-trained models. We propose the ELMo+S and BERT+S models, which incorporate stroke features through a convolutional neural network. Finally, evaluation on the downstream XNLI and LCQMC datasets shows that stroke features do not significantly help these two pre-training models. | en_US
dc.subject | 預訓練 | zh_TW
dc.subject | 表徵 | zh_TW
dc.subject | 自然語言處理 | zh_TW
dc.subject | 中文 | zh_TW
dc.subject | 筆順 | zh_TW
dc.subject | Pre-training | en_US
dc.subject | Representation | en_US
dc.subject | Natural language processing | en_US
dc.subject | Chinese | en_US
dc.subject | Stroke | en_US
dc.title | 中文筆順預訓練效能之研究 | zh_TW
dc.language.iso | zh-TW | zh-TW |
dc.type | 博碩士論文 | zh_TW
dc.type | thesis | en_US
dc.publisher | National Central University | en_US
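
The abstract above describes augmenting the inputs of ELMo/BERT with stroke features learned by a convolutional neural network (the "+S" in ELMo+S and BERT+S). Below is a minimal sketch, assuming PyTorch, of what such a stroke-feature embedding layer could look like. The `StrokeCNNEmbedding` name, the five-class stroke vocabulary, the default dimensions, and concatenation as the fusion method are illustrative assumptions, not the thesis's exact design.

```python
# A minimal sketch of a stroke-feature embedding: a 1-D CNN over each
# character's stroke sequence produces a sub-character vector that is
# concatenated with the ordinary character embedding. Assumptions for
# illustration only, not the thesis's exact architecture.
import torch
import torch.nn as nn

class StrokeCNNEmbedding(nn.Module):
    def __init__(self, num_strokes=6, stroke_dim=16, char_vocab=21128,
                 char_dim=128, out_channels=32, kernel_sizes=(2, 3, 4),
                 max_strokes=30):
        super().__init__()
        self.max_strokes = max_strokes
        # Index 0 is padding; 1..5 are the five basic stroke classes
        # (横/竖/撇/点/折) — an assumed encoding of stroke order.
        self.stroke_emb = nn.Embedding(num_strokes, stroke_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # One 1-D convolution per kernel size, max-pooled over the
        # stroke axis, in the style of character-level CNN encoders.
        self.convs = nn.ModuleList(
            [nn.Conv1d(stroke_dim, out_channels, k) for k in kernel_sizes]
        )
        self.out_dim = char_dim + out_channels * len(kernel_sizes)

    def forward(self, char_ids, stroke_ids):
        # char_ids:   (batch, seq_len) character indices
        # stroke_ids: (batch, seq_len, max_strokes) 0-padded stroke codes
        b, t, s = stroke_ids.shape
        x = self.stroke_emb(stroke_ids.reshape(b * t, s))   # (b*t, s, d)
        x = x.transpose(1, 2)                               # (b*t, d, s)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        stroke_feat = torch.cat(pooled, dim=1).reshape(b, t, -1)
        # Fuse by concatenating stroke features onto the char embedding.
        return torch.cat([self.char_emb(char_ids), stroke_feat], dim=2)
```

In an ELMo+S/BERT+S-style setup, vectors like these would stand in for the plain character embeddings fed into the encoder; per the abstract, the thesis finds this stroke augmentation yields no significant gain on XNLI or LCQMC.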