
    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/74807


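    Title: 中文文件串流之摘要擷取研究 (A Study of Summary Extraction from Chinese Document Streams)
    Author: Chang, Sheng-Hui (張昇暉)
    Contributor: Department of Information Management
    Keywords: Dynamic Summarization; Extractive Summarization; Single-Document Summarization; Multi-Document Summarization; Chinese Summarization; General Summarization
    Date: 2017-07-26
    Upload time: 2017-10-27 14:40:10 (UTC+8)
    Publisher: National Central University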
    Abstract: With the rapid development of news media, news now arrives as a continuous stream of documents. Earlier approaches to news summarization used an NGD-based method to find topic keywords that are highly correlated with the keywords in the title. Because that step issues its queries through the Solr full-text search system, it takes a long time. The unsupervised graph-based summarization method, for its part, builds sentence networks that fall short of expectations, so summary quality still has room to improve. Moreover, when techniques developed for English automatic summarization are applied directly to Chinese, both quality and efficiency are lower than expected.
    In this study, I strengthen Chinese word segmentation by adding Chinese part-of-speech recognition, adopt TextRank-based keyword extraction and link analysis, and take sentence position into account. This approach not only improves the quality of single-document summaries but also runs considerably faster.
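    To make the single-document step concrete, the following is a minimal sketch of TextRank-style sentence ranking with a position bonus. It is not the thesis implementation: the function names (`similarity`, `rank_sentences`, `summarize`), the `position_weight` parameter, and the linear position bonus are assumptions chosen for illustration, and sentences are assumed to be already segmented into word lists (the part-of-speech-aware segmentation step is outside this sketch).

```python
import math


def similarity(a, b):
    """Word-overlap similarity between two tokenized sentences,
    normalized by the log of their lengths (TextRank-style)."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences contain a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def rank_sentences(sentences, damping=0.85, iterations=50, position_weight=0.2):
    """Score sentences by weighted link analysis over a sentence graph,
    then boost earlier sentences (hypothetical position feature)."""
    n = len(sentences)
    if n == 0:
        return []
    # Edge weights of the sentence network; no self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    # Assumed position feature: a bonus that decays linearly with index.
    return [s * (1 + position_weight * (1 - i / n))
            for i, s in enumerate(scores)]


def summarize(sentences, k):
    """Return the indices of the k best sentences, in document order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


if __name__ == "__main__":
    doc = [
        ["新聞", "摘要", "系統"],
        ["摘要", "系統", "品質"],
        ["新聞", "摘要", "品質"],
        ["天氣", "晴朗"],
    ]
    print(summarize(doc, 2))
```

    Returning indices in document order keeps the extracted sentences readable as a short text. How strongly position should count is a tuning choice, and the value used here is arbitrary.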
    Finally, building on the single-document summarization method, I combine sentence clustering with a waterfall architecture to produce dynamic multi-document summaries. The method generates summaries that evolve over time and filters out redundant information across documents.
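    The idea of a summary that evolves with the stream while filtering redundancy can be sketched as follows. This is an illustrative stand-in, not the method of the thesis: greedy threshold-based grouping with Jaccard similarity replaces whatever sentence clustering the author used, and the class name `StreamingSummary`, its `threshold` parameter, and the choice of the earliest sentence as cluster representative are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two tokenized sentences."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


class StreamingSummary:
    """Keeps one representative sentence per group of near-duplicates
    as documents arrive over time (hypothetical design)."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.clusters = []  # each cluster is a list; item 0 is its representative

    def update(self, candidate_sentences):
        """Feed the candidate sentences selected from one new document
        (for example by a single-document summarizer)."""
        for sent in candidate_sentences:
            for cluster in self.clusters:
                if jaccard(sent, cluster[0]) >= self.threshold:
                    cluster.append(sent)  # redundant: absorbed by an existing group
                    break
            else:
                self.clusters.append([sent])  # new information: start a new group
        return self.summary()

    def summary(self):
        """Current summary: the representative of every cluster."""
        return [cluster[0] for cluster in self.clusters]


if __name__ == "__main__":
    stream = StreamingSummary(threshold=0.5)
    stream.update([["地震", "發生", "台北"]])
    for sentence in stream.update([["地震", "發生", "台北", "市"], ["救援", "隊", "抵達"]]):
        print("".join(sentence))
```

    Each call to `update` corresponds to one new document in the stream, so the summary grows only when a document contributes a sentence that is not already covered.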
    Appears in Collections: [Graduate Institute of Information Management] Master's and Doctoral Theses

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     320


    All items in NCUIR are protected by the original copyright.
