Master's/Doctoral Thesis 106423012: Complete Metadata Record

DC Field | Value | Language
dc.contributor | 資訊管理學系 (Department of Information Management) | zh_TW
dc.creator | 黃晧誠 | zh_TW
dc.creator | Hao-Cheng Huang | en_US
dc.date.accessioned | 2019-07-19T07:39:07Z
dc.date.available | 2019-07-19T07:39:07Z
dc.date.issued | 2019
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=106423012
dc.contributor.department | 資訊管理學系 (Department of Information Management) | zh_TW
dc.description | 國立中央大學 | zh_TW
dc.description | National Central University | en_US
dc.description.abstract | Pre-training is extremely important in natural language processing, yet recent transfer-learning research on Chinese is scarce, and most existing models rely on feature-based methods and static embeddings. This study therefore proposes incorporating a deeper feature of Chinese, stroke order, into the input dimensions to learn sub-character features. Building on the recently proposed pre-trained models ELMo (a feature-based method) and BERT (a fine-tuning method), we modify both to investigate the effect of stroke order on Chinese pre-trained models, and propose the ELMo+S and BERT+S models, which capture stroke features with a convolutional neural network. Finally, evaluation on the downstream XNLI and LCQMC datasets shows that stroke features provide no clear benefit to either pre-trained model. | zh_TW
dc.description.abstract | Pre-training is extremely important in natural language processing. However, Chinese studies on transfer learning are relatively few, and most of them use feature-based methods with static embeddings. Therefore, this study proposes using a deeper Chinese feature, strokes, integrated into the input dimensions to learn sub-character characteristics, building on the recently proposed pre-trained models ELMo (a feature-based method) and BERT (a fine-tuning method). We propose the ELMo+S and BERT+S models, which consider stroke features through a convolutional neural network. Finally, the results show that stroke features are not significantly helpful for these two pre-trained models on the downstream XNLI and LCQMC datasets. | en_US
dc.subject | 預訓練 (pre-training) | zh_TW
dc.subject | 表徵 (representation) | zh_TW
dc.subject | 自然語言處理 (natural language processing) | zh_TW
dc.subject | 中文 (Chinese) | zh_TW
dc.subject | 筆順 (stroke order) | zh_TW
dc.subject | Pre-training | en_US
dc.subject | Representation | en_US
dc.subject | Natural language processing | en_US
dc.subject | Chinese | en_US
dc.subject | Stroke | en_US
dc.title | 中文筆順預訓練效能之研究 (A Study on the Performance of Chinese Stroke-Order Pre-training) | zh_TW
dc.language.iso | zh-TW | zh-TW
dc.type | 博碩士論文 (master's/doctoral thesis) | zh_TW
dc.type | thesis | en_US
dc.publisher | National Central University | en_US
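
The abstracts above describe fusing stroke-order features into the input representations of ELMo and BERT through a convolutional neural network (the proposed ELMo+S and BERT+S models). The thesis itself is not reproduced in this record, so the following is only a minimal sketch of one way a stroke CNN could be wired into a character-embedding layer; the layer sizes, vocabulary size, stroke codes, and the fusion-by-addition step are illustrative assumptions, not the author's actual implementation.

```python
import torch
import torch.nn as nn

class StrokeCNN(nn.Module):
    """Encode each character's stroke-order sequence into a fixed-size
    vector with a 1-D convolution and max-pooling over the stroke axis."""
    def __init__(self, num_stroke_types=6, stroke_dim=32, out_dim=768, kernel_size=3):
        super().__init__()
        # Stroke code 0 is reserved for padding (an assumed convention).
        self.stroke_emb = nn.Embedding(num_stroke_types, stroke_dim, padding_idx=0)
        self.conv = nn.Conv1d(stroke_dim, out_dim, kernel_size, padding=kernel_size // 2)

    def forward(self, stroke_ids):
        # stroke_ids: (batch, chars, max_strokes) integer stroke codes
        b, c, s = stroke_ids.shape
        x = self.stroke_emb(stroke_ids.view(b * c, s))   # (b*c, strokes, stroke_dim)
        x = self.conv(x.transpose(1, 2))                 # (b*c, out_dim, strokes)
        x = x.max(dim=-1).values                         # (b*c, out_dim)
        return x.view(b, c, -1)                          # (batch, chars, out_dim)

# Hypothetical fusion: add the stroke feature to the character embedding
# before a pre-trained encoder (ELMo- or BERT-style) consumes the sequence.
vocab_size, hidden = 21128, 768                        # illustrative sizes only
char_emb = nn.Embedding(vocab_size, hidden)
stroke_cnn = StrokeCNN(out_dim=hidden)

char_ids = torch.randint(1, vocab_size, (2, 10))       # (batch, chars)
stroke_ids = torch.randint(0, 6, (2, 10, 20))          # (batch, chars, strokes)
fused = char_emb(char_ids) + stroke_cnn(stroke_ids)    # (batch, chars, hidden)
print(fused.shape)                                     # torch.Size([2, 10, 768])
```

Addition is only one possible fusion choice; concatenation followed by a projection would work equally well in this sketch. The fused embeddings would then replace the plain character embeddings at the encoder's input.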
