A Method for Multi-Document Automatic Summarization Applying Relevance Terms of Sentences

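Name

Chia-Chen Huang (黃家榛)

Department

Department of Information Management

Thesis Title

A Method for Multi-Document Automatic Summarization Applying Relevance Terms of Sentences (應用語句字詞關係於多文件自動摘要之方法)
(Original English title: Applying relevance terms of sentences on multiple documents summarization)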

Files

Full text: never released for browsing in the repository system

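Abstract (Chinese, translated)

In an era of a highly developed Internet, news spreads continuously across the web, and the sheer volume of news forces readers to spend time reading full articles before they find the information they want. This study therefore proposes a multi-document automatic summarization method that applies the relevance terms of sentences. The method automatically identifies the key points of the documents and presents them as a summary, saving readers the time of reading the full text. Each sentence in a document is treated as one transaction, and an association rule algorithm is used to mine frequent itemsets. The frequent itemsets are then used to compute relevance terms. Finally, sentences are scored according to the relevance terms they contain, and the highest-scoring sentences are extracted to form the summary, with the aim of improving the accuracy of selecting the best sentences from multiple documents. The study uses the DUC 2004 news document collection to run the DUC 2004 Task 2 experiment, producing 665-byte summaries. Evaluated with ROUGE, the proposed method shows potential for improving multi-document automatic summarization.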
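
The pipeline described in the abstract (sentences as transactions, frequent itemsets, relevance terms, sentence scoring, a 665-byte budget) can be illustrated with a minimal Python sketch. The record does not give the thesis's actual formulas, so several parts here are assumptions made only for illustration: the tiny stopword list, the brute-force itemset counting (in place of a real Apriori implementation), the rule that a term's relevance weight is the summed support of the multi-term frequent itemsets containing it, and the greedy selection. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed toy stopword list; the thesis's preprocessing is not described here.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was"}

def to_transaction(sentence):
    """Turn one sentence into a transaction: its set of lowercase content words."""
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return frozenset(w for w in words if w and w not in STOPWORDS)

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force count of itemsets up to max_size; keep those meeting min_support."""
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for items in combinations(sorted(t), k):
                counts[items] += 1
    return {items: c for items, c in counts.items() if c >= min_support}

def relevance_weights(itemsets):
    """Assumed weighting: sum the support of multi-term frequent itemsets per term."""
    weights = Counter()
    for items, support in itemsets.items():
        if len(items) >= 2:
            for term in items:
                weights[term] += support
    return weights

def summarize(sentences, min_support=2, byte_limit=665):
    """Greedily pick the highest-scoring sentences that fit in byte_limit bytes."""
    transactions = [to_transaction(s) for s in sentences]
    weights = relevance_weights(frequent_itemsets(transactions, min_support))
    scores = [sum(weights[t] for t in trans) for trans in transactions]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in ranked:
        size = len(sentences[i].encode("utf-8")) + 1  # +1 for a separating space
        if scores[i] > 0 and used + size <= byte_limit:
            chosen.append(i)
            used += size
    return [sentences[i] for i in sorted(chosen)]  # restore original order
```

Calling `summarize(sentences)` on a list of sentences pooled from several documents returns the selected sentences in their original order. Counting every term pair by brute force is only practical for small inputs; a real system would prune candidates level by level as Apriori does.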

Abstract (English)

With the rapid development of the Internet, news spreads worldwide in minutes. The
presence of too much information makes it hard for us to understand an issue, and we spend
too much time reading the news to get what we want. Therefore, in this research, we aim to
produce an extract-based summary that gives readers a quick overview of the news. We use
association rules to extract the relevance terms of sentences and apply them to
multi-document summarization. The experimental results show that applying relevance terms
of sentences to multi-document summarization could be effective in improving the
precision of summarization.
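
The thesis evaluates summary quality with ROUGE. As a rough illustration of what such a score measures, the sketch below computes a simplified ROUGE-1 recall: the fraction of reference unigrams that also appear in the candidate summary, with counts clipped. It is an assumption-laden simplification, not the official ROUGE package: it uses whitespace tokenization, a single reference, and no stemming or stopword handling, and the function name is hypothetical.

```python
from collections import Counter

def rouge_1_recall(candidate, reference):
    """Simplified ROUGE-1 recall: clipped unigram overlap / reference length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Each reference word is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0
```

For example, comparing the candidate "the storm hit the coast" with the reference "the storm damaged the coast" matches four of the five reference words.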

Keywords (Chinese, translated)

★ Multi-document summarization
★ Extractive summarization
★ Association rules

Keywords (English)

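Table of Contents

1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Research Scope and Limitations
1-4-1 Research Scope
1-4-2 Research Limitations
1-5 Thesis Organization
2. Literature Review
2-1 Multi-Document Summarization Techniques
2-2 Applications of Graph Network Models in Summarization
3. Research Method
3-1 System Architecture
3-2 Document Analysis
3-2-1 Document Preprocessing
3-2-2 Term Weight Calculation
3-3 Converting Sentences into Transaction Data
3-4 Relevance Term Calculation
3-5 Sentence Selection
4. Experimental Analysis
4-1 Experimental Environment
4-2 Experimental Dataset
4-3 Evaluation Metrics
4-4 Experimental Design and Procedure
4-4-1 Experiment 1 Procedure Design
4-4-2 Experiment 2 Procedure Design
4-5 Experimental Results
4-5-1 Results of Experiment 1-1
4-5-2 Results of Experiment 1-2
4-5-3 Results of Experiment 2-1
4-5-4 Results of Experiment 2-2
4-6 Discussion of Experimental Results
5. Conclusion
5-1 Conclusions and Contributions
5-2 Future Research Directions
References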

Advisor

周世傑

Approval Date

2015-07-21