Abstract: | Conventional relevance feedback studies mainly classify the terms that appear in retrieved documents as either relevant or irrelevant. This study differentiates those terms further, according to how they appear in relevant and irrelevant documents, in order to generate relevance-related term information, and demonstrates the applicability of that information in combination with current, effective query expansion methods. Two methods, CASTI and CASTI-PSO, were developed to use the derived information within a conventional query expansion approach, a technology that has proven effective in enhancing information retrieval. Both methods first differentiate terms by their appearances in the retrieved documents, then use the resulting information to enhance the expanded query, and finally apply the expanded query to document re-ranking. To show that the methods can be realized and to support their evaluation, an information retrieval system was developed and both methods were implemented on it. Formal tests were then conducted to examine the distinguishing capability of the proposed information. 
The experimental results show that the proposed methods substantially improve document re-ranking performance over the conventional query expansion method alone, here the Rocchio method under two parameter settings (Rocc1 and Rocc2). In mean average precision (MAP), CASTI improved by 42% and CASTI-PSO by 51% over Rocc1 (α=1, β=1, γ=0); CASTI improved by 10% and CASTI-PSO by 17% over Rocc2 (α=1, β=0.75, γ=0.15). The importance of the study lies in disclosing the applicability of the proposed information beyond the current usage of term appearances in relevant and irrelevant documents, and in initiating a query expansion technology that applies this information. Because the derived information resides at the bottom of the information hierarchy of relevance feedback, any technology that applies relevance feedback information could consider using it. |
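
For readers unfamiliar with the baselines, the following is a minimal sketch of the standard Rocchio query expansion formula under the two parameter settings named in the abstract (Rocc1 and Rocc2). It is not the CASTI or CASTI-PSO method, whose details the abstract does not give. The function name `rocchio`, the dictionary-based term vectors, the toy documents, and the choice to drop non-positive weights are all assumptions made for illustration.

```python
def rocchio(query, relevant, irrelevant, alpha, beta, gamma):
    """Return expanded query weights as a dict of term -> weight.

    Standard Rocchio update: alpha * original query
    + beta * mean of relevant document vectors
    - gamma * mean of irrelevant document vectors.
    Terms whose weight is not positive are dropped (an assumed,
    commonly used convention, not something stated in the abstract).
    """
    terms = set(query)
    for doc in relevant + irrelevant:
        terms.update(doc)

    expanded = {}
    for term in terms:
        rel_mean = (
            sum(doc.get(term, 0.0) for doc in relevant) / len(relevant)
            if relevant else 0.0
        )
        irr_mean = (
            sum(doc.get(term, 0.0) for doc in irrelevant) / len(irrelevant)
            if irrelevant else 0.0
        )
        weight = alpha * query.get(term, 0.0) + beta * rel_mean - gamma * irr_mean
        if weight > 0:
            expanded[term] = weight
    return expanded


# Hypothetical toy data: term weights for a query and two judged documents.
query = {"jaguar": 1.0}
relevant_docs = [{"jaguar": 1.0, "cat": 2.0}]
irrelevant_docs = [{"jaguar": 1.0, "car": 2.0}]

# Parameter settings taken from the abstract.
rocc1 = rocchio(query, relevant_docs, irrelevant_docs, alpha=1, beta=1, gamma=0)
rocc2 = rocchio(query, relevant_docs, irrelevant_docs, alpha=1, beta=0.75, gamma=0.15)
```

With γ=0, Rocc1 ignores the irrelevant documents entirely, whereas Rocc2 subtracts a small share of their term weights. Both settings treat every term in a judged document the same way, which is the coarse relevant/irrelevant use of term information that the study sets out to refine.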