Thesis/Dissertation 994403002: Detailed Record




Author: DAI, ZHANG-TING (戴彰廷)   Department: Department of Information Management
Thesis title: A Study on Construction and Application of Specialty-Term Information for Document Re-ranking
(建構與應用特殊語詞資訊於文件重排序之研究)
Abstract  Conventional relevance feedback studies mainly classify the terms appearing in retrieved documents as either relevant or irrelevant. This study differentiates those terms more finely, according to how they appear in the relevant and irrelevant documents, to derive relevance-related term information, and demonstrates the applicability of that information when it is combined with query expansion methods that are already known to be effective.
Two methods, CASTI and CASTI-PSO, were developed to apply this information within a conventional query expansion approach. Both methods first differentiate terms by their appearances in the retrieved documents, then use the derived term information to enhance query expansion, and finally apply the expanded query to document re-ranking. To show that the methods can be realized and to validate them, an information retrieval system was built and both methods were implemented on it. Formal tests were then conducted to examine the distinguishing capability and usability of the derived term information. The experimental results show that, compared with Rocchio's method under two parameter settings (Rocc1 and Rocc2), the proposed methods substantially improve document re-ranking performance. In mean average precision (MAP), CASTI improved on Rocc1 (α=1, β=1, γ=0) by 42% and CASTI-PSO improved on it by 51%. Against Rocc2 (α=1, β=0.75, γ=0.15), CASTI improved by 10% and CASTI-PSO by 17%.
The importance of the study lies in showing that term information derived from this finer differentiation is usable beyond the current use of term appearances in relevant and irrelevant documents, and in initiating a query expansion technique that applies it. Because the derived information resides at the bottom of the information hierarchy of relevance feedback, any method that applies relevance feedback information could also make use of it.
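For readers unfamiliar with the Rocchio baselines named in the abstract, the following is a minimal sketch of the standard textbook Rocchio query expansion under the two parameter settings reported there (Rocc1 and Rocc2). It does not implement CASTI or CASTI-PSO, whose details are not given in this record. The toy term weights, the clipping of negative weights to zero, and the inner-product scoring used for re-ranking are illustrative assumptions, not the thesis's actual setup.

```python
# Textbook Rocchio: q' = alpha*q + beta*mean(relevant) - gamma*mean(non_relevant).
# Vectors are dicts mapping term -> weight. All data below is made up.

def rocchio(query, relevant, non_relevant, alpha, beta, gamma):
    """Return the expanded query vector as a dict of term -> weight."""
    terms = set(query)
    for doc in relevant + non_relevant:
        terms.update(doc)
    expanded = {}
    for t in terms:
        rel = sum(d.get(t, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        non = sum(d.get(t, 0.0) for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        w = alpha * query.get(t, 0.0) + beta * rel - gamma * non
        if w > 0:  # assumption: negative weights are dropped, a common convention
            expanded[t] = w
    return expanded

def score(doc, query):
    """Inner product between a document vector and a query vector (assumed scoring)."""
    return sum(w * doc.get(t, 0.0) for t, w in query.items())

def rerank(docs, query):
    """Sort documents by descending score against the expanded query."""
    return sorted(docs, key=lambda d: score(d, query), reverse=True)

# Parameter settings quoted in the abstract.
ROCC1 = {"alpha": 1.0, "beta": 1.0, "gamma": 0.0}
ROCC2 = {"alpha": 1.0, "beta": 0.75, "gamma": 0.15}

# Hypothetical toy data.
QUERY = {"swarm": 1.0}
RELEVANT = [{"swarm": 1.0, "particle": 2.0}, {"particle": 1.0, "optimization": 1.0}]
NON_RELEVANT = [{"swarm": 1.0, "bee": 2.0}]

q1 = rocchio(QUERY, RELEVANT, NON_RELEVANT, **ROCC1)
q2 = rocchio(QUERY, RELEVANT, NON_RELEVANT, **ROCC2)
ranking = rerank(RELEVANT + NON_RELEVANT, q2)
```

With γ=0, Rocc1 ignores the non-relevant documents entirely, whereas Rocc2 subtracts a small share of their average term weights, which is why a term that appears only in non-relevant documents never enters either expanded query in this sketch.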
Keywords ★ Term appearance
★ Term weight modification
★ Query expansion
★ Query modification
★ Relevance feedback
★ Information retrieval
★ Document re-ranking
★ Particle swarm optimization
★ Self-adaptive algorithm
Table of Contents
CHINESE ABSTRACT
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
EXPLANATION OF SYMBOLS
1. INTRODUCTION
2. RELATED WORKS
2.1 Relevance Feedback
2.2 Particle Swarm Optimization
2.3 Research Question and Purpose
3. THE METHOD
3.1 Analysis
3.2 Embodiment
3.2.1 CASTI
3.2.2 CASTI-PSO
4. EXPERIMENTS
4.1 Experimental parameters and settings
4.2 Data Presentation and Analysis
4.2.1 The selection of the SI value
4.2.2 Comparisons of CASTI, Rocchio’s method and the initial query
4.2.3 Comparisons of CASTI-PSO, CASTI, Rocchio’s method and the initial query
4.3 Discussions
4.3.1 CASTI
4.3.2 CASTI-PSO
4.4 Implications
5. CONCLUSION
REFERENCES
Advisor: Shihchieh Chou (周世傑)   Approval date: 2019-8-7

For questions about this thesis, contact the Extension Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.