Name: 楊宗翰 (Tsung-han Yang)
Department: Department of Information Management
Thesis title: A Query Expansion and Document Re-Ranking System Based on Single User Profile (以單一使用者興趣檔為基礎的查詢擴展與文件重排序系統)
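- This electronic thesis is authorized for immediate open access.
- Once open, the electronic full text is licensed only for users to search, read, and print for personal, non-profit academic research purposes.
- Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.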
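Abstract (translated from the Chinese) In an era of information explosion, the volume of information on the Internet grows by the day, and helping users filter out useless information while reducing the burden of browsing Web pages has become an important issue. This study therefore proposes a query expansion and document re-ranking system based on a single user profile to realize personalized information retrieval. A web crawler captures the pages the user has browsed to build a single multi-topic user profile, which avoids the traditional requirement of splitting the user's topics of interest into several single-topic profiles. When the user enters keywords, the system recommends personalized expansion terms according to the user profile so as to filter out documents that do not match the user's interests. Expansion terms are produced by matching against one profile only; compared with the traditional approach, which must first decide which topic profile the keywords belong to before expanding the query, this improves execution efficiency. The search results are then re-ranked according to the user profile so that documents of interest to the user come first, helping the user find useful and interesting personalized results more quickly and reducing the burden of browsing documents.
The experimental results confirm that automatic query expansion based on a single user profile does improve retrieval performance, and that retrieval performance improves substantially more once the search results are re-ranked by document relevance. They show that, compared with traditional personalized information retrieval, the proposed system effectively improves the execution efficiency of personalized query expansion, recommends expansion terms according to the user's interests, and filters out most irrelevant pages, so that the user obtains the information actually wanted and spends less effort browsing Web documents.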
Abstract (English) With the rapid growth of information on the Internet, helping users filter out useless information and reducing the burden of browsing have become important issues. This study therefore proposes a query expansion and document re-ranking system based on a single user profile to achieve personalized information retrieval. A web crawler collects the Web pages the user has browsed and builds a single multi-topic user profile, which removes the traditional need to build a separate profile for each topic of interest. When the user submits a query, the system recommends personalized expansion words based on the user profile in order to filter out documents that do not match the user's interests. Because only one profile has to be consulted to produce expansion words, the system is more efficient than the traditional process, which must first determine which topic profile the query belongs to and only then produce expansion words. The search results are then re-ranked against the user profile, which helps the user quickly find useful, relevant, personalized results and reduces the burden of browsing.
The experimental results show that automatic query expansion based on a single user profile improves retrieval performance, and that re-ranking the search results afterwards improves it significantly further. The proposed system makes personalized query expansion more efficient, recommends expansion words according to the user's interests, and filters out irrelevant Web documents, so that the user obtains the information actually needed and the burden of browsing is effectively reduced.
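The pipeline described in the abstract (one profile, expansion words drawn from it, results re-ranked against it) can be illustrated with a minimal Python sketch. This is not the thesis's method: the thesis builds its profile with the Nootropia algorithm, whereas the sketch substitutes plain document frequency for term weights and raw co-occurrence counts for term links. The function names `build_profile`, `expand_query`, and `rerank`, the whitespace tokenization, and the sample pages are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_profile(visited_pages):
    """Build ONE profile from all visited pages, whatever their topic.

    Assumption: document frequency stands in for the term weight and
    co-occurrence counts stand in for the links between terms.
    """
    weights = Counter()
    links = Counter()
    for page in visited_pages:
        terms = sorted(set(page.lower().split()))  # naive whitespace tokens
        weights.update(terms)
        links.update(combinations(terms, 2))  # unordered pairs, (a, b) with a < b
    return weights, links


def expand_query(query, weights, links, k=2):
    """Suggest up to k profile terms most strongly linked to the query terms."""
    query_terms = set(query.lower().split())
    scores = Counter()
    for (a, b), count in links.items():
        if a in query_terms and b not in query_terms:
            scores[b] += count * weights[b]
        elif b in query_terms and a not in query_terms:
            scores[a] += count * weights[a]
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def rerank(results, weights):
    """Order results by the summed profile weight of their distinct terms."""
    def score(doc):
        return sum(weights.get(term, 0) for term in set(doc.lower().split()))
    return sorted(results, key=score, reverse=True)


if __name__ == "__main__":
    pages = [
        "jaguar car engine review",
        "jaguar car dealer price",
        "car engine repair guide",
    ]
    weights, links = build_profile(pages)
    print(expand_query("jaguar", weights, links))  # ['car', 'engine']
    results = ["jaguar habitat rainforest cat", "jaguar car price list"]
    print(rerank(results, weights))  # the car-related result comes first
```

Because every topic lives in the same `weights` and `links` structures, `expand_query` never has to decide which topic profile a query belongs to, which is the efficiency argument the abstract makes for a single profile.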
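Keywords (Chinese) ★ 文件重排序 (document re-ranking)
★ 資訊過濾 (information filtering)
★ 查詢擴展 (query expansion)
★ 使用者興趣檔 (user profile)
Keywords (English) ★ Document Re-ranking
★ Query Expansion
★ User Profile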
★ Information Filtering
Table of contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Research Limitations
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Information Retrieval
2.2 User Profiles
2.2.1 Traditional Construction Methods
2.2.2 The Nootropia Algorithm: Building a Single Multi-Topic User Profile
2.3 Query Expansion
2.4 Document Re-ranking
2.4.1 Traditional Document Re-ranking Methods
2.4.2 Personalized Document Re-ranking with the Nootropia Algorithm
Chapter 3 System Analysis and Design
3.1 System Architecture
3.2 Document Preprocessing
3.3 Term Extraction
3.4 Profile Construction
3.4.1 Computing Term Weights
3.4.2 Computing Associations Between Terms
3.4.3 Ranking by Term Weight
3.5 Expansion Term Recommendation
3.6 Search Result Re-ranking
Chapter 4 System Implementation and Validation
4.1 Evaluation Criteria
4.2 Experimental Design, Results, and Analysis
4.2.1 Experimental Design
4.2.2 Experimental Results
4.2.3 Analysis of Experimental Results
Chapter 5 Conclusions and Future Work
5.1 Conclusions and Contributions
5.2 Future Research Directions
References
Advisor: 薛義誠 (Yih-chearng Shiue)
Approval date: 2010-7-12