The application of the semantic information of terms residing in query expansion and original query for document re-ranking


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81316

Title:	應用查詢擴展字詞及原始查詢字詞之語意資訊於文件重排序之方法; The application of the semantic information of terms residing in query expansion and original query for document re-ranking
Author:	蔡丞祐; Tsai, Cheng-You
Contributor:	Department of Information Management
Keywords:	Information Retrieval; Relevance Feedback; Query Expansion; Word2Vec; Document Re-ranking
Date:	2019-07-23
Upload time:	2019-09-03 15:44:00 (UTC+8)
Publisher:	National Central University
Abstract:	In recent years, the growth of the Internet has let users obtain information quickly through information retrieval. Although information has become easy to obtain, helping users find what they need more accurately and efficiently remains an important issue. In the field of relevance feedback, Rocchio's algorithm is the most widely used: it generates new query terms by analyzing the frequency of terms in relevant and non-relevant documents. However, Rocchio's method relies only on term frequency and does not consider the semantic information between terms. Many semantics-based studies have been proposed recently, built on the idea of mining the deeper semantic relationships between terms. Taking the user's original query and relevance feedback as its basis, this study therefore uses Word2Vec to compute the semantic similarity between query expansion terms and extracts their concepts with a clustering algorithm. It then calculates the importance of each concept from the semantic relationship between the original query terms and the query expansion concepts. Finally, it computes how well each document matches the query expansion concepts and uses that score to re-rank the results retrieved by query expansion. Experiments show that, in precision at the top five and top ten documents, the proposed method improves on the Rocchio algorithm by 30% and 32%, respectively, and on the method proposed by Cai by a further 9% and 4%.
Appears in Collections:	[Graduate Institute of Information Management] Theses and Dissertations
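
The pipeline summarized in the abstract (group the expansion terms into concepts, weight each concept by its similarity to the original query, score documents by concept match, then re-rank) can be illustrated with a minimal sketch. This is not the thesis's implementation. The abstract does not name the clustering algorithm or give the scoring formulas, so the greedy threshold clustering, the mean-cosine concept importance, and the coverage-based document score below are all assumptions chosen for illustration. The two-dimensional vectors are hand-made stand-ins for trained Word2Vec embeddings, and every function name is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def centroid(terms, vectors):
    """Component-wise mean of the vectors of the given terms."""
    dim = len(vectors[terms[0]])
    return [sum(vectors[t][i] for t in terms) / len(terms) for i in range(dim)]

def cluster_terms(terms, vectors, threshold=0.8):
    """Assumed clustering step: put each term into the first concept whose
    centroid is at least `threshold` similar, otherwise start a new concept."""
    concepts = []
    for term in terms:
        for concept in concepts:
            if cosine(vectors[term], centroid(concept, vectors)) >= threshold:
                concept.append(term)
                break
        else:
            concepts.append([term])
    return concepts

def concept_importance(concept, query_terms, vectors):
    """Assumed weighting: mean similarity between concept terms and query terms."""
    sims = [cosine(vectors[c], vectors[q]) for c in concept for q in query_terms]
    return sum(sims) / len(sims)

def concept_score(doc_tokens, concepts, weights):
    """Assumed document score: importance-weighted share of each concept's
    terms that occur in the document."""
    tokens = set(doc_tokens)
    score = 0.0
    for concept, weight in zip(concepts, weights):
        matched = sum(1 for t in concept if t in tokens)
        score += weight * matched / len(concept)
    return score

def rerank(docs, query_terms, expansion_terms, vectors, threshold=0.8):
    """Return document ids ordered by descending concept score (ties by id)."""
    concepts = cluster_terms(expansion_terms, vectors, threshold)
    weights = [concept_importance(c, query_terms, vectors) for c in concepts]
    scored = [(concept_score(tokens, concepts, weights), doc_id)
              for doc_id, tokens in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Toy stand-ins for Word2Vec embeddings (hypothetical values).
vectors = {
    "car": [0.9, 0.2],
    "engine": [0.95, 0.15],
    "vehicle": [1.0, 0.0],
    "cat": [0.1, 1.0],
    "jungle": [0.2, 0.9],
}
query_terms = ["car"]                                    # original query
expansion_terms = ["engine", "vehicle", "cat", "jungle"]  # e.g. from relevance feedback
docs = {
    "d1": ["cat", "jungle", "river"],
    "d2": ["engine", "vehicle", "repair"],
    "d3": ["engine", "cat"],
}

ranking = rerank(docs, query_terms, expansion_terms, vectors)
print(ranking)  # ['d2', 'd3', 'd1']
```

In this toy run the expansion terms split into a vehicle-like concept and an animal-like concept. Because the original query term is much closer to the first, the document that covers it fully ranks first, even though every document contains at least one expansion term. A purely frequency-based score would not make that distinction.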

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	205

All items in NCUIR are protected by the original copyright.
