Automatic Query Expansion Algorithm Based on a Non-Randomness Model

Name

Szu-Jui Huang (黃思瑞)

Department

Department of Information Management

Thesis Title

以非隨機模型為基礎之自動查詢擴展演算法
(Automatic Query Expansion Based on a Non-Randomness Model)

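This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.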

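Abstract (Chinese)

Automatic query expansion has been shown in many information retrieval studies to improve retrieval effectiveness. The technique mainly addresses the word mismatch problem in retrieval: poor retrieval effectiveness caused by differences between the terms a user supplies in a query and the terms used in the documents.
This study extends the non-randomness concept of probabilistic models to propose a query expansion algorithm. It uses the top-ranked documents from the initial retrieval as the source of expansion terms, re-weights the candidate terms and selects the expansion terms within the Rocchio framework, and then expands the query.
Experiments and analysis on the single-topic Cranfield test collection and the multi-topic npl test collection show that the proposed method effectively improves retrieval effectiveness. In addition, this study conducts detailed experiments and analysis on the parameters that affect query expansion performance: the number of documents in the pseudo-relevant set, the number of expansion terms, and the parameters of the Rocchio framework.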
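For orientation, the Rocchio framework mentioned above is usually written in the following textbook form. This is the standard formulation, not necessarily the exact variant used in the thesis; the symbols $\alpha$, $\beta$, $R$, and $k$ are notation assumed here, and the thesis's own non-randomness term score is not given in this record.

```latex
% Standard Rocchio update under pseudo-relevance feedback (assumed notation):
%   \vec{q}_0 : original query vector
%   R         : the k top-ranked documents of the initial retrieval,
%               treated as if they were relevant
%   \alpha, \beta : weights of the original query and of the feedback terms
\[
  \vec{q}_{\mathrm{new}}
    = \alpha \, \vec{q}_0
    + \beta \, \frac{1}{|R|} \sum_{\vec{d} \in R} \vec{d},
  \qquad |R| = k .
\]
% The classical formula also subtracts a term
% \gamma \frac{1}{|N|} \sum_{\vec{d} \in N} \vec{d} over known non-relevant
% documents N; with pseudo-relevance feedback no such judgments exist,
% so \gamma is commonly set to 0.
```

The three parameter groups the abstract lists map onto this form as $k$ (size of the pseudo-relevant set), the number of highest-weighted new terms kept from the sum, and the weights $\alpha$ and $\beta$.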

Abstract (English)

Automatic query expansion addresses the word mismatch problem: the words users provide in a query are not consistent with the words used by document authors. Word mismatch can result in poor retrieval effectiveness. Many automatic query expansion techniques have been developed and shown to improve retrieval effectiveness.
We apply the concept of non-randomness from the probabilistic model to devise a method for automatic query expansion. Top-ranked documents retrieved in the initial retrieval are used as the source of expansion terms. The candidate expansion terms are re-weighted and selected within the Rocchio framework.
Experimental results show that our approach can significantly improve retrieval effectiveness. The experiments also consider and analyze the parameters that can influence the performance of automatic query expansion, including the number of selected documents, the number of expansion terms, and the parameters of the Rocchio framework.
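The pipeline the abstract outlines (initial retrieval, top-ranked documents as the term source, Rocchio-style re-weighting, term selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's ENR algorithm: raw term frequency and cosine ranking stand in for the non-randomness term score, and the function names, the parameters `k`, `m`, `alpha`, `beta`, and the toy documents are all hypothetical.

```python
from collections import Counter
import math


def tf_vector(text):
    """Bag-of-words term-frequency vector for whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query, docs, k=2, m=2, alpha=1.0, beta=0.75):
    """Expand `query` with `m` terms taken from the `k` top-ranked documents.

    Assumption: plain term frequency is the term score here; the thesis
    uses its own non-randomness-based score instead.
    """
    q = tf_vector(query)
    vecs = [tf_vector(d) for d in docs]

    # Initial retrieval: rank all documents against the original query.
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(q, vecs[i]), reverse=True)
    top = [vecs[i] for i in ranked[:k]]  # pseudo-relevant set

    # Centroid of the pseudo-relevant documents.
    centroid = Counter()
    for vec in top:
        for term, weight in vec.items():
            centroid[term] += weight / len(top)

    # Candidate terms are those not already in the query; keep the m
    # highest-weighted, breaking ties alphabetically for determinism.
    candidates = sorted((t for t in centroid if t not in q),
                        key=lambda t: (-centroid[t], t))
    chosen = candidates[:m]

    # Rocchio-style re-weighting: alpha * original + beta * feedback.
    expanded = {t: alpha * w for t, w in q.items()}
    for term in list(q) + chosen:
        expanded[term] = expanded.get(term, 0.0) + beta * centroid.get(term, 0.0)
    return expanded


if __name__ == "__main__":
    toy_docs = [
        "wing flow pressure",
        "wing flow drag",
        "library catalog search",
    ]
    print(expand_query("wing", toy_docs))
```

In this toy run the query "wing" picks up terms that co-occur with it in the two top-ranked documents, while vocabulary from the unrelated third document is left out. Varying `k`, `m`, `alpha`, and `beta` corresponds to the parameter groups the abstract says were studied.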

Keywords (Chinese)

★ Pseudo-relevance feedback
★ Automatic query expansion
★ Information retrieval

Keywords (English)

★ Query expansion
★ Pseudo-relevance feedback
★ Information retrieval

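Table of Contents

Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Scope and Limitations
1.4 Research Process
1.5 Thesis Organization
Chapter 2 Literature Review
2.1 Automatic Query Expansion
2.2 Basic Framework of Automatic Query Expansion
2.3 Term Re-weighting and Query Reformulation
2.4 Non-Random Distribution Theory
Chapter 3 System Design
3.1 Research Concept
3.2 Extended Non-Randomness Model Query Expansion Algorithm
3.3 Functional System Architecture
Chapter 4 Experimental Analysis
4.1 Test Collections, Baseline Retrieval Algorithms, and Experimental Platform
4.2 Effectiveness of the ENR Query Expansion Algorithm
4.3 Effect of Different Baseline Algorithms on Query Expansion
4.4 Analysis of the Degree of Retrieval Improvement
4.5 Analysis of Queries with No Retrieval Results after Expansion
4.6 Parameter Combination Analysis
4.7 Per-Query Parameter Optimization Analysis
4.8 Cross-Collection Analysis of Different Query Expansion Algorithms
Chapter 5 Conclusion
5.1 Research Conclusions and Contributions
5.2 Future Research Directions
References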

Advisor

Shih-Chieh Chou (周世傑)

Approval Date

2007-1-23
