
    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/72173

    Title: Summarization Based on Inter-Document Differences, with an Implementation for Chinese (以文件間差異為基礎並實作中文摘要)
    Authors: Huang, Ching-Jie (黃慶杰)
    Contributors: Department of Information Management
    Keywords: Inter-document difference; Sentence position; Extractive summarization; Multi-document summarization; Chinese summarization; Topic tracking
    Date: 2016-07-25
    Issue Date: 2016-10-13 14:30:18 (UTC+8)
    Publisher: National Central University
    Abstract: This study proposes a multi-document summarization method based on inter-document differences. Unlike approaches that apply a single-layer architecture to multi-document summarization, it addresses two problems: summary sentences being drawn from only one or a few sub-concept topics, and, when a single topic is tracked, summary sentences being taken from related sentences in unrelated documents. Both single-document and multi-document summaries are produced with an unsupervised, graph-based extractive method whose semantic term network is built from the latest Wikipedia dataset. Building on single-document summarization, the multi-document summary uses the sentence-position feature, selecting in turn the first sentence of each single-document summary. Processing the documents in different orders yields two kinds of concept summary, topic development and topic focus, which gives the summaries a wider range of applications. Experiments on the updated Wikipedia dataset used for the term network found that the update had no significant effect on the study's parameter values. On the DUC 2002 English summarization data, compared with the other participants, the method ranked above the middle for single-document summaries and in the middle for multi-document summaries. For Chinese summarization, the study used a news dataset from BBC Chinese. Because a title expresses a document's topic, the study treats the title as the document's concept topic and evaluates topic tracking by computing the similarity between the concept topic and the query topic; the experiments indicate that this approach effectively identifies related documents. Applied to topic-focus and topic-development news, the topic-focus summaries concentrate on the primary topic, while the topic-development summaries effectively extract sub-topic concepts across documents.
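The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: sentence similarity here is plain bag-of-words cosine rather than the Wikipedia-derived semantic term network, the ranking is a generic PageRank-style iteration, and every name (`tokenize`, `rank_sentences`, `track_topic`, `summarize_collection`, the `threshold` value, the `order_key` parameter) is a hypothetical choice made for illustration. The tokenizer handles only ASCII words, so Chinese text would first need a word segmenter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: simple ASCII word tokens; Chinese input needs a segmenter.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    # PageRank-style scores over a sentence-similarity graph
    # (a stand-in for the thesis's graph-based method).
    n = len(sentences)
    if n == 0:
        return []
    vecs = [Counter(tokenize(s)) for s in sentences]
    weights = [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize_document(sentences, k=2):
    # Single-document summary: the k highest-scoring sentences, best first.
    scores = rank_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in order[:k]]


def track_topic(documents, query, threshold=0.1):
    # Keep documents whose title (treated as the concept topic) is similar
    # enough to the query topic. The threshold is an arbitrary assumption.
    q = Counter(tokenize(query))
    return [
        d for d in documents
        if cosine(Counter(tokenize(d["title"])), q) >= threshold
    ]


def summarize_collection(documents, query, order_key=None):
    # Multi-document summary: the first sentence of each related document's
    # single-document summary. order_key (hypothetical) sets the processing
    # order; the thesis's actual orderings are not given in the abstract.
    related = track_topic(documents, query)
    if order_key is not None:
        related = sorted(related, key=order_key)
    return [
        summarize_document(d["sentences"])[0]
        for d in related
        if d["sentences"]
    ]
```

Each document is assumed to be a dict with a `title` string and a `sentences` list. Calling `summarize_collection(documents, "election")` would drop documents whose titles share no terms with the query before any sentence is selected, which mirrors the abstract's goal of not drawing summary sentences from unrelated documents.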
    Appears in Collections: [Graduate Institute of Information Management] Doctoral and Master's Theses

    All items in NCUIR are protected by copyright, with all rights reserved.
