Thesis/Dissertation 104423040 Detailed Record




Author 蔡啓詳 (Ci-Siang Cai)   Department Information Management
Thesis Title 應用查詢擴展之字詞語意資訊於文件重排序之方法
(The application of the semantic information of terms residing in query expansion for document re-ranking)
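Related Theses
★ A decision support system for creating SMS rules to prevent credit card fraud
★ A comparison of the effectiveness of different retrieval strategies
★ An investigation of factors influencing the knowledge-sharing process
★ Construction and evaluation of a retrieval agent system with sharing functions
★ A study of computer attitudes and learning self-efficacy among juvenile offenders
★ A study of software metrics issues using the AHP method
★ Optimizing an intrusion rule base
★ A study on improving the efficiency and quality of business information retrieval
★ Using the analytic hierarchy process to assess key factors in the banking industry's adoption of enterprise application integration (EAI) systems
★ A study applying genetic algorithms to find near-optimal layouts of forced-convection devices in cluster computer rooms
★ The Development of a CASE Tool with Knowledge Management Functions
★ A fast search index tree based on the PAT tree
★ A compound-noun-based approach to building document concepts
★ Using user interest profiles to examine the importance of adjective position in review classification
★ Using semi-structured information and user feedback to help users filter web document search results
★ A study on classifying user reviews with a vector space model built from feature-opinion pairs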
Files Full text: never open to the public
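Abstract (Chinese) In the field of relevance feedback, the Rocchio algorithm is widely used and studied in query expansion research because it is easy to implement and performs reasonably well. The algorithm uses term-frequency information from relevant and non-relevant documents to generate new query expansion terms and then runs the retrieval again. During this second retrieval, however, the information available in the query expansion terms is not used in the retrieval process, so whether other exploitable information exists among the expansion terms and can be applied to retrieval remains an interesting question. In recent years, a series of studies on semantic search have been proposed; they consider the semantic concepts covered by the query terms rather than the query terms alone. This study therefore builds on existing natural language processing research and uses the term semantic information in WordNet to compute the semantic similarity among query expansion terms. It then extracts the concept information of the query expansion through clustering, computes the degree of concept matching between each document and the query expansion with the proposed method, and uses that value to revise the ranking produced by the original query expansion. Finally, experiments show that the proposed method achieves better retrieval effectiveness than the original query expansion in all cases.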
Abstract (English) In the field of relevance feedback, Rocchio's query expansion method is simple and effective, and it has been widely used in information retrieval. Rocchio's algorithm produces a query expansion from term frequencies in the feedback documents provided by the user and uses it to retrieve documents. Although Rocchio's method is effective, information carried by the expansion terms, such as their semantics, is not utilized in the retrieval process. This study analyzes the term information in the query expansion and uses it for document re-ranking. Recently, semantic search, which is concerned with the semantic meaning of the query terms, has become increasingly popular. Building on NLP techniques, this study uses WordNet, a large lexical database of English, to calculate the semantic similarity between query expansion terms and extracts the concepts of the query expansion with a clustering algorithm. The proposed method calculates a concept score between the query expansion and each document and uses that score for document re-ranking. Experimental results show that the proposed method is effective for document retrieval.
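As a rough illustration of the Rocchio update that the abstract refers to, the sketch below computes an expanded query from term-weight dictionaries. It is not taken from the thesis: the function name `rocchio`, the dictionary representation, and the default weights `alpha`, `beta`, and `gamma` are assumptions chosen for illustration, and the WordNet similarity, clustering, and concept-score steps of the proposed method are not shown because this record does not give their formulas.

```python
from collections import Counter


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an expanded query as a dict of term -> weight.

    query:        dict of term -> weight for the original query
    relevant:     list of dicts (term -> weight), one per relevant document
    non_relevant: list of dicts (term -> weight), one per non-relevant document
    alpha, beta, gamma: illustrative weights (assumed, not from the thesis)
    """
    new_query = Counter()

    # Keep the original query, scaled by alpha.
    for term, weight in query.items():
        new_query[term] += alpha * weight

    # Move toward the centroid of the relevant documents.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)

    # Move away from the centroid of the non-relevant documents.
    for doc in non_relevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(non_relevant)

    # Negative weights are commonly dropped; only positive terms are kept.
    return {term: w for term, w in new_query.items() if w > 0}


# Hypothetical toy data: term frequencies for one query and three documents.
expanded = rocchio(
    {"cat": 1.0},
    relevant=[{"cat": 1.0, "pet": 2.0}, {"pet": 2.0, "dog": 1.0}],
    non_relevant=[{"car": 4.0, "dog": 1.0}],
    alpha=1.0, beta=0.5, gamma=0.5,
)
```

In this toy run the term "pet" enters the expanded query because it is frequent in the relevant documents, while "dog" and "car" end up with negative weights and are discarded. Expansion terms of this kind are what the thesis then clusters into concepts.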
Keywords ★ Information Retrieval
★ Relevance Feedback
★ Query Expansion
★ WordNet
★ Document Re-ranking
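Thesis Contents
Chinese Abstract
English Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background and Motivation
1-2 Research Objectives
1-3 Research Scope and Limitations
1-4 Thesis Organization
2. Literature Review
2-1 Relevance Feedback
2-1-1 Background and Applications of Relevance Feedback
2-1-2 Rocchio Algorithm
2-2 Query Expansion
2-2-1 Local Query Expansion
2-2-2 Global Query Expansion
2-3 WordNet
2-4 Clustering Research
2-4-1 K-Means
2-4-2 Affinity Propagation
2-4-3 Semantic Clustering of Terms
3. Research Method
3-1 System Architecture
3-2 Method Design
3-2-1 Processing of Original Query Results
3-2-2 Semantic Clustering of Query Expansion Terms
3-2-3 Re-ranking
4. Experimental Design
4-1 Experimental Data
4-2 Evaluation Metrics
4-3 Experimental Parameter Settings
4-3-1 Number of Concepts
4-3-2 Re-ranking Algorithm Parameters
4-4 Experimental Procedure
4-4-1 Procedure of Experiment 1
4-4-2 Procedure of Experiment 2
4-5 Experimental Results
4-5-1 Results of Experiment 1
4-5-2 Results of Experiment 2
4-6 Discussion of Experimental Results
5. Conclusion
5-1 Conclusions and Contributions
5-2 Future Research Directions
References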
Advisor 周世傑 (Shih-Chieh Chou)   Approval Date 2017-7-6