Thesis 994203048 Detailed Record




Author 張弘杰 (Hong-Kiat Chong)   Department Information Management
Thesis Title 應用負相關回饋資訊於文件重排序之分析
(An analysis of the application of non-relevance feedback in document ranking)
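Related Theses
★ A Decision Support System for Building SMS Rules to Prevent Credit Card Fraud
★ A Comparison of the Effectiveness of Different Retrieval Strategies
★ An Investigation of Factors Influencing the Knowledge-Sharing Process
★ Construction and Evaluation of a Retrieval Agent System with Sharing Functions
★ A Study of Computer Attitudes and Learning Self-Efficacy among Juvenile Offenders
★ A Study of Software Metrics Issues Using the AHP Method
★ Optimizing an Intrusion Rule Base
★ A Study on Improving the Efficiency and Quality of Business Information Retrieval
★ Measuring Key Factors in the Banking Industry's Adoption of Enterprise Application Integration (EAI) Systems Using the Analytic Hierarchy Process
★ Applying Genetic Algorithms to Find Near-Optimal Layouts of Forced-Convection Devices in Cluster Computer Rooms
★ The Development of a CASE Tool with Knowledge Management Functions
★ A Fast Search Index Tree Based on the PAT Tree
★ A Compound-Noun-Based Approach to Building Document Concepts
★ Using User Interest Profiles to Examine the Importance of Adjective Position in Review Classification
★ Using Semi-Structured Information and User Feedback to Help Users Filter Web Document Search Results
★ A Study on Classifying User Reviews with a Vector Space Model Built from Feature-Opinion Pairs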
Full Text Not available (never open for public access)
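Abstract (translated from Chinese) Although non-relevance feedback information is generally considered to be of limited value, it can still help improve information retrieval performance. Some studies have attempted to use non-relevance feedback information to re-rank document retrieval results and have shown preliminary effectiveness. Based on a theoretical review, this study holds that applying non-relevance feedback information to the re-ranking of retrieval results may be affected by data distribution conditions such as data dispersion. The purpose of this study is therefore to analyze how data distribution conditions such as dispersion relate to the application of non-relevance feedback information. To this end, the study analyzes the dispersion and related properties of the initial retrieval results, determines whether the data distribution is directly related to the application of non-relevance feedback information, and describes the influence of the data distribution on that application. The experimental results indicate that the dispersion of the document data has no linear relationship with the effectiveness of applying non-relevance feedback information, but that document sets in which relevant and non-relevant documents differ only slightly adversely affect its application. Based on these findings, the study proposes several directions for future research.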
Abstract (English) Although non-relevance feedback information is generally considered to be of little use in information retrieval, it can still be applied. Some studies have tried using non-relevance feedback information in document re-ranking. In this research, our goal is to reveal the relation between the data distribution and the application of non-relevance feedback, based on the theory we reviewed. To do so, we focus on analyzing the distribution of the initial retrieval results and on the direct links between the distribution scenario and the application of non-relevance feedback. The final results show that there is no linear relationship between the distribution of the text data and the application of non-relevance feedback, and that the degree of difference between relevant and non-relevant documents in the dataset can affect the application of non-relevance feedback. Based on these results, we propose some directions for future study.
Keywords ★ Non-relevance feedback
★ Information retrieval
★ Document re-ranking
★ Document analysis
Table of Contents
Abstract (Chinese)
Abstract
Acknowledgments
Chapter 1 Introduction
1-1 Research Background and Motivation
1-2 Research Objectives
1-3 Research Scope and Limitations
Chapter 2 Literature Review
2-1 Information Retrieval
2-2 Relevance Feedback
2-3 Term Sensitivity
2-4 Non-Relevance Feedback Information for Re-ranking
2-6 Document Clustering
2-6-1 Cluster Hypothesis
2-6-2 Number-of-Clusters Hypothesis
Chapter 3 System Analysis and Hypotheses
3-1 Hypotheses on Factors Affecting Performance
3-1-1 Effect of Dispersion on Non-Relevant Documents
3-1-2 Frequency of Terms Appearing in Documents
3-2 Analysis Methods and Experimental Design
3-2-1 Analysis Method for the Dataset Distribution Problem
3-2-2 Analysis Method for Terms Found Only in the Non-Relevant Dictionary Appearing in Documents
3-3 Parameter Settings
Chapter 4 Experimental Results
4-1 Experimental Data
4-2 Experimental Results
4-2-2 Clustering Results for Each Topic
4-2-3 Proportion of Terms from Each Topic's NRO Dictionary File Appearing in Documents
4-3 Discussion of Experimental Results
Chapter 5 Conclusion
5-1 Research Limitations and Contributions
5-2 Future Research Directions
References
Advisor 周世傑 (Shihchieh Chou)   Approval Date 2012-7-23
