中大機構典藏 - NCU Institutional Repository (downloads of theses and dissertations, past exam papers, journal articles, and research projects): Item 987654321/81058


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81058


    Title: 應用歌手辨識及角色標注於輿情意見目標分析之研究;Singer Recognition and Semantic Role Labeling for Opinion Target Extraction from Social Network
    Authors: 黎桂如;Li, Gui-Ru
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Deep Learning;Named Entity Recognition;Semantic Role Labeling;Opinion Target Detection
    Date: 2019-06-28
    Issue Date: 2019-09-03 15:31:45 (UTC+8)
    Publisher: National Central University (國立中央大學)
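    Abstract: Measuring online buzz is a common technique in market research, and a typical approach treats the number of times something is mentioned as an indicator of its popularity. But is mention count alone really a sufficient measure of online buzz? The true opinion target of a clause may not be the person who is mentioned, so this thesis aims to identify opinion targets in social network data. Because colloquial writing on social networks does not follow formal conventions, extracting opinion targets is a challenging task for a model.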
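    To address these problems, this study uses deep learning architectures for Chinese singer recognition (Singer Name Recognition, NER) and semantic role labeling (Semantic Role Labeling, SRL), and applies custom rules to perform opinion target detection (Opinion Target Detection, OTD) on clauses. We use a deep learning model for singer recognition and compare how Word2Vec character embeddings and BERT embeddings affect its performance. For the SRL task, we build and train the model using the additional features of Zhou et al. [38] and the highway network architecture of Zhang et al. [37], with the aim of improving performance. Finally, for the OTD task, we design custom rules that merge the NER results with the SRL results to detect opinion targets.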
    The training data are posts collected from social networking sites with a custom crawler. The test data are posts randomly selected from the same sites and serve as the benchmark for evaluating the models. Experimental results show that our singer recognition model reaches an F1 of 44% on unseen singers, the SRL model reaches an F1 of 71% when labeling the semantic roles in a clause, and the OTD task reaches a precision of 73%.
    Social networks are a good resource for collecting public opinion given the diversity and variety of their content, especially user-generated content (UGC). UGC is any type of content created by users, such as pictures, videos, text, and comments. Opinions extracted from UGC can form the basis of commercial policy, so extracting them correctly is an important problem. A common method is to treat the number of times an entity is mentioned as an indicator of network volume. This raises two questions about network volume: Are the opinions really about the target entities? And is the number of opinions sufficient for network volume analysis?
    UGC has two notable features: entity names are written in varied forms, and sentences are fragmentary. The former means that an entity may appear as a nickname or contain punctuation, which can degrade NER performance. The latter means that users omit words from their sentences, which can degrade SRL performance. These NER and SRL problems in turn degrade opinion target detection. Recognizing entities and semantic roles in large UGC corpora is therefore a major challenge.
    In this study, we combine Named Entity Recognition (NER) and Semantic Role Labeling (SRL) to detect opinion targets (OTD) in UGC. For the NER task, we compare the performance of CRF++ and neural network models. For the SRL task, we use highway connections and additional features to improve performance. Finally, we design rules that combine the NER and SRL results for the OTD task. The results show that our NER model achieves 44% F1 on out-of-vocabulary entity extraction. On the SRL and OTD tasks, we achieve 71% F1 and 73% precision, respectively.
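
The abstract does not state the rules used to combine the NER and SRL outputs, so the following is only a minimal sketch of what one such rule could look like. It assumes that both models return character-offset spans, that SRL arguments carry PropBank-style labels such as `A0` and `A1`, and that a recognized singer counts as the opinion target only when it overlaps an argument whose role is in a chosen set. The names `Span`, `overlaps`, `detect_opinion_target`, and the default role set are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Span:
    """A labeled character span: [start, end) offsets into the clause."""
    start: int
    end: int
    label: str  # e.g. "SINGER" for NER output, "A0"/"A1"/"V" for SRL output


def overlaps(a: Span, b: Span) -> bool:
    """Return True when the two half-open spans share at least one character."""
    return a.start < b.end and b.start < a.end


def detect_opinion_target(
    clause: str,
    entities: List[Span],
    roles: List[Span],
    target_roles: FrozenSet[str] = frozenset({"A1"}),  # assumed rule, not the thesis's
) -> Optional[str]:
    """Return the first recognized entity that overlaps a target-role argument.

    Returns None when no entity fills a target role, i.e. the entity is
    mentioned in the clause but is not treated as the opinion target.
    """
    for role in roles:
        if role.label not in target_roles:
            continue
        for entity in entities:
            if overlaps(entity, role):
                return clause[entity.start:entity.end]
    return None


if __name__ == "__main__":
    # The singer fills the A1 slot, so this hypothetical rule selects it.
    liked = detect_opinion_target(
        "I adore Adele",
        entities=[Span(8, 13, "SINGER")],
        roles=[Span(0, 1, "A0"), Span(2, 7, "V"), Span(8, 13, "A1")],
    )
    print(liked)  # Adele

    # The singer is mentioned, but only as A0, so no target is returned.
    mentioned = detect_opinion_target(
        "Adele praised the band",
        entities=[Span(0, 5, "SINGER")],
        roles=[Span(0, 5, "A0"), Span(6, 13, "V"), Span(14, 22, "A1")],
    )
    print(mentioned)  # None
```

The second call illustrates the motivation given in the abstract: a clause can mention an entity without that entity being the target of the opinion, which a pure mention count would not distinguish.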
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
