應用語意之字詞分群於多文件自動摘要之方法;Applying semantic clustering of words on multiple documents summarization method


Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/74775

Title:	應用語意之字詞分群於多文件自動摘要之方法;Applying semantic clustering of words on multiple documents summarization method
Authors:	林栗岑;Lin, Li-Tsen
Contributors:	Department of Information Management
Keywords:	Multi-document summarization;Extract-based summarization;WordNet;Concept extraction
Date:	2017-07-06
Issue Date:	2017-10-27 14:39:02 (UTC+8)
Publisher:	National Central University
Abstract:	The spread of the Internet has changed how we receive information and made it far easier to obtain. Readily available information also creates problems: when faced with a huge volume of it, people cannot quickly and effectively find what they need. This study therefore proposes a multi-document automatic summarization method that applies semantic clustering of words; it automatically identifies the key points of the documents and generates a summary so that readers can grasp the content quickly. Because a document usually covers many subtopics, we use WordNet to compute the semantic relationships between words and cluster the words to uncover the latent concepts of the documents. The weight of each concept represents its importance to the documents. We then combine sentence word weights, sentence concepts, and sentence position into a sentence score, and extract the sentences that contain important concepts and carry more information as the summary. In the experiments, we run Task 2 on the DUC 2004 news document set, generate 665-byte summaries, and evaluate their quality with ROUGE metrics.
Appears in Collections:	[Graduate Institute of Information Management] Electronic Thesis & Dissertation
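
The pipeline outlined in the abstract (word clustering, concept weights, sentence scoring, extraction under a byte budget) can be illustrated with a minimal sketch. This is not the thesis implementation. The record does not give the clustering algorithm, the weighting formulas, or how the three sentence features are combined, so everything below is an assumption made for illustration: `TOY_SIMILARITY` is a hand-written stand-in for a WordNet-based similarity measure, the greedy single-pass grouping in `cluster_words` stands in for the clustering step, and `alpha`, `beta`, `gamma`, and the `threshold` value are hypothetical parameters. Only the 665-byte budget comes from the abstract.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet-based word similarity (assumption:
# the thesis derives these values from WordNet; here they are hard-coded).
TOY_SIMILARITY = {
    frozenset(("storm", "hurricane")): 0.9,
    frozenset(("storm", "flood")): 0.7,
    frozenset(("hurricane", "flood")): 0.6,
    frozenset(("aid", "relief")): 0.9,
}


def similarity(a, b):
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def cluster_words(words, threshold=0.5):
    """Greedy single-pass grouping: a word joins the first cluster that
    already holds a sufficiently similar word, else it starts a new one."""
    clusters = []
    for word in words:
        for cluster in clusters:
            if any(similarity(word, member) >= threshold for member in cluster):
                cluster.add(word)
                break
        else:
            clusters.append({word})
    return clusters


def tokenize(sentence):
    """Lowercase whitespace tokenizer that strips trailing punctuation."""
    tokens = (t.strip(".,;:!?").lower() for t in sentence.split())
    return [t for t in tokens if t]


def summarize(documents, budget_bytes=665, alpha=1.0, beta=1.0, gamma=1.0):
    """documents: list of documents, each a list of sentence strings."""
    counts = Counter(t for doc in documents for s in doc for t in tokenize(s))
    total = sum(counts.values())
    clusters = cluster_words(sorted(counts))
    # Assumed concept weight: share of all token occurrences in the cluster.
    concept_weight = [sum(counts[w] for w in c) / total for c in clusters]

    scored = []
    for doc in documents:
        for position, sentence in enumerate(doc):
            tokens = tokenize(sentence)
            if not tokens:
                continue
            token_set = set(tokens)
            # Mean relative frequency of the sentence's words.
            word_score = sum(counts[t] for t in tokens) / (len(tokens) * total)
            # Sum of the weights of every concept the sentence touches.
            concept_score = sum(
                w for c, w in zip(clusters, concept_weight) if c & token_set
            )
            # Earlier sentences in a document score higher.
            position_score = 1.0 / (position + 1)
            score = alpha * word_score + beta * concept_score + gamma * position_score
            scored.append((score, sentence))

    # Take the best-scoring sentences that still fit in the byte budget
    # (one extra byte per sentence for a separator).
    summary, used = [], 0
    for _, sentence in sorted(scored, key=lambda pair: -pair[0]):
        size = len(sentence.encode("utf-8")) + 1
        if sentence not in summary and used + size <= budget_bytes:
            summary.append(sentence)
            used += size
    return summary
```

Calling `summarize(documents)` with each document given as a list of sentence strings returns the selected sentences in score order. The sketch omits stop-word removal and redundancy control, so a real system would need both before its ROUGE scores could be meaningful.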

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML	360	View/Open
