Abstract: | With the rapid development of technology, expressing opinions online has become more convenient. As a result, anyone interested in a particular domain can monitor the volume of online discussion and analyze it in various ways. However, counting mentions alone rarely yields an accurate assessment, because the opinion expressed in a comment may not actually be directed at the person it mentions. In addition, users write colloquially on social media and often do not follow formal grammar, and data collected from a new domain takes considerable time and money to label. This paper therefore aims to identify the correct opinion targets in such data and to use a model trained on labeled data to help label data from new domains.
We therefore design the model architecture around transfer learning and use multi-task learning to perform named entity recognition (NER) of Chinese singers together with aspect-based sentiment analysis (ABSA). The model is built from a Parameter Generation Network (PGN) combined with a Gradient Reversal Layer (GRL). We label the data with the Tie/Break scheme to improve the accuracy of Chinese word segmentation, and we use Dynamic Weight Average (DWA) to adjust each task's weight according to the rate of change of its loss.
Experimental results show that when only the NER task is considered, our Extended Parameter Generation Network (E-PGN) reaches an F1 score of 90%, an improvement over the 86% achieved by IBHB. After the ABSA task is added, the average F1 score reaches 78%, which differs from IBHB's performance by 22 percentage points and represents a substantial improvement. |
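
The abstract says that Dynamic Weight Average (DWA) sets each task's weight from the rate of change of its loss. The following is a minimal sketch of the commonly published DWA formulation, not the authors' implementation. The function name `dwa_weights`, the temperature of 2.0, and the loss values are all assumptions made for illustration. Each task's loss ratio between the two most recent epochs is passed through a softmax and scaled so that the weights sum to the number of tasks.

```python
import math


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Return one weight per task; the weights sum to the number of tasks.

    prev_losses      -- per-task losses from epoch t-1
    prev_prev_losses -- per-task losses from epoch t-2
    temperature      -- softmax temperature (assumed value, not from the source)
    """
    if len(prev_losses) != len(prev_prev_losses):
        raise ValueError("both loss lists must have one entry per task")

    # Loss change rate: a ratio near or above 1 means the task is improving slowly.
    ratios = [cur / old for cur, old in zip(prev_losses, prev_prev_losses)]

    # Softmax over the ratios, scaled by the number of tasks.
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    num_tasks = len(ratios)
    return [num_tasks * e / total for e in exps]


# Hypothetical epoch losses for two tasks, in the order [NER, ABSA].
weights = dwa_weights(prev_losses=[0.40, 0.90], prev_prev_losses=[0.50, 0.95])
ner_weight, absa_weight = weights
```

In this made-up example the ABSA loss falls more slowly than the NER loss, so ABSA receives the larger weight for the next epoch. The total training loss would then be the weighted sum of the per-task losses.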