NCU Institutional Repository - theses and dissertations, past exam papers, journal articles, and research projects for download: Item 987654321/74807


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/74807


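    Title: 中文文件串流之摘要擷取研究 (A Study of Summary Extraction from Chinese Document Streams)
    Author: 張昇暉; Chang, Sheng-Hui
    Contributor: Department of Information Management
    Keywords: Dynamic Summarization; Extractive Summarization; Single-Document Summarization; Multi-Document Summarization; Chinese Summarization; General Summarization
    Date: 2017-07-26
    Upload time: 2017-10-27 14:40:10 (UTC+8)
    Publisher: National Central University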
    Abstract: With the rapid growth of news media, news now arrives as a continuous stream of documents. Earlier work used an NGD-based method to find topic keywords that are highly correlated with the title keywords. Because that step issues its queries through the Solr full-text search system, it takes a long time. The unsupervised graph-based summarization method also built sentence networks that fell short of expectations, so summary quality still has room to improve. In addition, applying techniques developed for English automatic summarization directly to Chinese delivers neither the expected quality nor the expected efficiency.
    In this study, I strengthen Chinese word segmentation by adding Chinese part-of-speech recognition, apply TextRank-based keyword extraction and link analysis, and take sentence position into account. Together these changes improve the quality of single-document summaries and substantially increase speed.
    Finally, building on the single-document method, I combine sentence clustering with a waterfall architecture to perform dynamic multi-document summarization. The approach produces summaries that evolve over time and filters redundant information across documents.
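
    The abstract names the ingredients (graph-based ranking in the TextRank family, a sentence position feature, and redundancy filtering) but not their exact formulation. The following is a minimal sketch under stated assumptions, not the thesis's implementation: it ranks already-segmented sentences with a PageRank-style iteration over cosine similarities, applies a hypothetical position boost, and greedily skips near-duplicate sentences. The function names, the `position_weight` formula, and the `redundancy_threshold` value are all illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50, position_weight=0.3):
    """Score tokenized sentences with a PageRank-style iteration plus a position boost."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(s) for s in sentences]
    # Weighted, undirected sentence graph: edge weight = cosine similarity.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out_weight[j] * scores[j]
                            for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    # Assumed position feature: earlier sentences receive a larger boost.
    return [s * (1 + position_weight / (i + 1)) for i, s in enumerate(scores)]


def summarize(sentences, k=2, redundancy_threshold=0.6):
    """Return indices of up to k top-ranked sentences, skipping near-duplicates."""
    scores = rank_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if len(chosen) >= k:
            break
        bag = Counter(sentences[i])
        # Keep a sentence only if it is not too similar to any already chosen one.
        if all(cosine(bag, Counter(sentences[j])) < redundancy_threshold
               for j in chosen):
            chosen.append(i)
    return sorted(chosen)  # restore original sentence order
```

    Each sentence is passed in as a list of tokens, so a Chinese word segmenter (with or without part-of-speech filtering) would run before this step. In a streaming setting, one could re-run `summarize` over the previous summary's sentences plus those of each newly arrived document, which is one possible reading of the waterfall arrangement rather than a confirmed description of it.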
    Appears in Collections: [Graduate Institute of Information Management] Electronic Thesis & Dissertation


