A Robust and Generalizable Framework for Chinese Named Entity Recognition


Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98307

Title:	A Robust and Generalizable Framework for Chinese Named Entity Recognition
Author:	王俊顏; Wang, Chun-Yen
Contributor:	Department of Computer Science and Information Engineering (資訊工程學系)
Keywords:	Chinese Named Entity Recognition; Natural Language Processing
Date:	2025-07-22
Uploaded:	2025-10-17 12:37:06 (UTC+8)
Publisher:	National Central University (國立中央大學)
Abstract:	Chinese Named Entity Recognition (NER), a fundamental task in natural language understanding, plays a key role in transforming unstructured text into structured knowledge. However, its efficacy is frequently constrained by ambiguities inherent in the Chinese language, such as the absence of explicit word boundaries and capitalization, which pose significant challenges to recognition accuracy. In response, we introduce a robust and generalizable framework for Chinese NER. The proposed framework uses a pre-trained language model to generate deep contextual embeddings. These embeddings are integrated with a contextual awareness module designed to enhance positional understanding and resolve ambiguity. Furthermore, the framework incorporates an adversarial training technique to improve model robustness and a conditional random field (CRF) layer to ensure the structural coherence of the final output. To validate the effectiveness of the framework, we conducted comprehensive evaluations on datasets from the social media and healthcare domains. The experimental results show that the proposed framework outperforms current state-of-the-art models in Chinese NER.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	20

All items in NCUIR are protected by the original copyright.
