A study of the application of the term characteristics in the semi-structure document of user’s feedback

Name

Tsung-Yu Yang (楊宗瑜)

Department

Department of Information Management

Thesis title

A study of the application of the term characteristics in the semi-structure document of user’s feedback
(探討使用者回饋之半結構化文件字詞特性於檢索文件的應用)

Files

Full text: browse in the system (never open to the public)

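Abstract (translated from Chinese)

With the rapid development of the Internet in recent years, it has clearly become an indispensable source of information in daily life. As the Internet has grown, however, the amount of information on it has also increased sharply, resulting in information overload. How to extract suitable information according to a user's interests and preferences and provide it to the user has therefore become an important issue.
This study uses a relevance feedback mechanism to analyze, from the semi-structured documents in a user's relevance feedback, which term characteristics correctly reflect the user's positive and negative interests. These term characteristics are then applied to adjust term weights in the documents the user retrieves, so that a user profile containing the user's basic interests can correctly identify whether a retrieved document matches the user's needs and preferences.
The experiments show that two term characteristics, "in the title and only in relevant documents" and "in the title and only in non-relevant documents", are able to reflect the user's positive and negative interests, and that they improve the user profile's ability to discriminate among retrieved documents.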
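The idea summarized in the abstract can be illustrated with a small sketch. This is not the thesis's actual formula, which this record does not give; it is one possible reading under stated assumptions. The input format (dicts with `title`, `body`, and `relevant`), the `EMPHASIS` factor, the signed user profile, and the function names are all hypothetical. The sketch collects title terms that occur only in relevant or only in non-relevant feedback documents, emphasizes those terms in a retrieved document's vector, and compares the result with the profile by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical emphasis factor for feedback-sensitive terms; not from the thesis.
EMPHASIS = 2.0


def title_term_sets(feedback):
    """Split title terms of judged documents into positive and negative sets.

    Each feedback item is a dict with 'title' and 'body' term lists and a
    boolean 'relevant' flag (an assumed input format).
    """
    rel_title, rel_all, non_title, non_all = set(), set(), set(), set()
    for doc in feedback:
        title = set(doc["title"])
        everything = title | set(doc["body"])
        if doc["relevant"]:
            rel_title |= title
            rel_all |= everything
        else:
            non_title |= title
            non_all |= everything
    positive = rel_title - non_all  # in a relevant title, absent from non-relevant docs
    negative = non_title - rel_all  # in a non-relevant title, absent from relevant docs
    return positive, negative


def adjusted_vector(doc, positive, negative):
    """Term-frequency vector with feedback-sensitive terms emphasized."""
    counts = Counter(doc["title"] + doc["body"])
    sensitive = positive | negative
    return {t: tf * (EMPHASIS if t in sensitive else 1.0) for t, tf in counts.items()}


def cosine(profile, vector):
    """Cosine similarity between a (possibly signed) profile and a document vector."""
    dot = sum(w * vector.get(t, 0.0) for t, w in profile.items())
    norm = math.sqrt(sum(w * w for w in profile.values())) * math.sqrt(
        sum(w * w for w in vector.values())
    )
    return dot / norm if norm else 0.0


# Toy feedback: one relevant and one non-relevant judged document.
feedback = [
    {"title": ["python", "tutorial"], "body": ["python", "code", "guide"], "relevant": True},
    {"title": ["snake", "care"], "body": ["python", "snake", "pet"], "relevant": False},
]
positive, negative = title_term_sets(feedback)

# Assumed profile: positive weights for liked terms, negative for disliked ones.
profile = {"python": 1.0, "tutorial": 1.0, "snake": -1.0}
candidate = {"title": ["python", "snake"], "body": ["care"]}
score = cosine(profile, adjusted_vector(candidate, positive, negative))
```

In this toy setup, emphasizing the sensitive terms pushes a document about snakes further from the profile and a tutorial document closer to it. The signed profile is a design choice of the sketch: with an unsigned profile, scaling a negative term's weight would not by itself lower the cosine score.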

Abstract (English)

With its rapid development, the Internet has become an essential source of information. However, the explosive growth of information on the Internet also leaves users facing information overload. Therefore, how to extract relevant information and provide it to the user has become an important issue.
In this study, we analyze the information residing in relevance feedback and semi-structured documents to construct term sets with different characteristics. We then apply these term characteristics to adjust the term weights of retrieved documents in information retrieval. Experimental results show that the proposed method can enhance the effectiveness of information retrieval.

Keywords (translated from Chinese)

★ User profile
★ Search result filtering
★ Personalized search
★ Term sensitivity

Keywords (English)

★ Personalized search
★ Sensitivity
★ Search result filtering
★ User profile

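Table of contents

Chapter 1 Introduction
Section 1 Research motivation
Section 2 Research objectives
Section 3 Research scope and limitations
Section 4 Research process
Section 5 Thesis structure
Chapter 2 Literature review
Section 1 Research on information retrieval
Section 2 Research on semi-structured information
Section 3 Semi-structured information extraction
Section 4 Relevance feedback
Section 5 Term sensitivity
Chapter 3 System architecture
Section 1 Web page analyzer
Section 2 Body text analyzer
Section 3 Document feature builder
Section 4 Term set builder
Section 5 Similarity calculator
Chapter 4 Experimental analysis
Section 1 Experimental design and procedure
Section 2 Experimental analysis and results
Chapter 5 Conclusion
Section 1 Research conclusions and contributions
Section 2 Future research directions
References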

Advisor

Shih-Chieh Chou (周世傑)

Approval date

2010-7-26