引用本體論至相關文件檢索之研究; Applying Ontology to Relevant Document Discovery


Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/13298

Title:	引用本體論至相關文件檢索之研究; Applying Ontology to Relevant Document Discovery
Author:	施亮如; Liang-Lu Shih
Contributor:	Graduate Institute of Information Management
Keywords:	Relevant Document Discovery; Ontology Extraction; Ontology Mapping; Ontology
Date:	2006-06-28
Uploaded:	2009-09-22 15:28:24 (UTC+8)
Publisher:	National Central University Library
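Abstract:	Relevant document retrieval has been widely discussed, and a variety of methods and techniques have been proposed or deployed in online document retrieval systems. Most approaches let the user enter a query, preprocess the query string, and then run full-text matching to find relevant documents; alternatively, they offer searches over specific fields such as title, abstract, keywords, and references, and convert those fields into a particular model for similarity computation, for example a vector model with TF/IDF to compute document similarity. On the whole, these methods come from the field of Information Retrieval. The Semantic Web is an emerging research area that has been combined with other fields, including knowledge management, agent communication, and web services, to produce a range of applications. Its core concept is the ontology: an ontology uses a markup language to express the semantics of specific content fully, so that the content is both human-readable and open to further processing by computer systems. Most existing relevant document retrieval methods still handle the semantic properties of document content only to a limited extent, and few publications discuss applying ontology to relevant document retrieval, which motivated this study. This study designs an architecture that applies ontology to relevant document retrieval and implements a prototype system. The system takes a document as input and outputs the documents related to it. Processing consists of several steps: (1) convert the input document into an ontology format; (2) if the input document already exists in the system, output its related documents directly; (3) if the input document does not exist in the system, compute the similarity between the input document and the documents already in the system. The study designs two similarity measures and uses a genetic algorithm to compute the weight assigned to each measure's result, which yields the final similarity value.
Research on relevant document discovery is practical and attracts many researchers, and a range of solutions has been proposed. Some have been adopted in real-world settings such as electronic article publishers, which offer search options including keyword, full-text, phrase, and Boolean-expression search for retrieving documents. Most relevant document discovery techniques originate in the domain of information retrieval. The core concept of the Semantic Web is the ontology, which has been applied in various domains such as web services, agent communication, and knowledge management. However, few papers have applied ontology to relevant document discovery. In this paper, therefore, ontology is applied to relevant document discovery, and a prototype system is built to implement the proposed method. Given a user-selected document as input, the prototype system generates a number of closely related documents that are already stored in the repository.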
The process of the prototype system can be divided into the following main steps: (1) transforming the input text document into OWL format; (2) determining whether the input document already exists in the system's ontology repository; (3) if the input document does not exist in the ontology repository, calculating the similarity between the input ontology and the documents stored in the ontology repository, and retrieving the related documents with the higher similarity values. Ontology extraction and similarity calculation are the core components that apply the concept of ontology in the prototype system. The objective of ontology extraction is to transform TXT-format documents into OWL format according to the characteristics of ontology. Similarity calculation is composed of two methods, concept similarity and instance similarity, both of which are proposed and implemented in the prototype system.
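The lookup-then-rank flow described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the `DocOntology` class, the `find_related` function, and the `related_cache` mapping are hypothetical names; plain Jaccard overlap stands in for the concept and instance similarity measures, which the abstract does not define; and the two weights are fixed at 0.5 here, whereas the abstract says they are obtained with a genetic algorithm. Step (1), the conversion to OWL, is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocOntology:
    """Hypothetical stand-in for a document already converted to OWL (step 1)."""
    doc_id: str
    concepts: frozenset   # names of ontology concepts (classes) in the document
    instances: frozenset  # names of ontology instances (individuals)


def jaccard(a, b):
    """Set overlap in [0, 1]. A placeholder measure, not the thesis's formula."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(x, y, w_concept=0.5, w_instance=0.5):
    """Weighted sum of a concept-level score and an instance-level score."""
    return (w_concept * jaccard(x.concepts, y.concepts)
            + w_instance * jaccard(x.instances, y.instances))


def find_related(query, repository, related_cache=None, top_k=3,
                 w_concept=0.5, w_instance=0.5):
    """Return the ids of the documents most related to `query`.

    repository:    dict mapping doc_id -> DocOntology (assumed structure)
    related_cache: dict mapping doc_id -> precomputed list of related ids
    """
    related_cache = related_cache or {}
    # Step 2: a known document returns its stored related documents directly.
    if query.doc_id in related_cache:
        return list(related_cache[query.doc_id])
    # Step 3: otherwise score the query against every stored document.
    scored = [
        (combined_similarity(query, doc, w_concept, w_instance), doc_id)
        for doc_id, doc in repository.items()
        if doc_id != query.doc_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep the highest-scoring documents and drop those with no overlap at all.
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]
```

Keeping the two scores separate until the final weighted sum means the weights can be tuned independently of the similarity measures themselves, which is where a search procedure such as a genetic algorithm would plug in.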
Appears in Collections:	[Graduate Institute of Information Management] Master's and Doctoral Theses


All items in NCUIR are protected by the original copyright.
