NCU Institutional Repository (中大機構典藏): theses and dissertations, past exam papers, journal articles, and research projects for download: Item 987654321/48961


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/48961


    Title: A Formal Concept Analysis-Based Document Representation and its Application on Document Clustering (以形式概念分析為基礎之文件向量模型建立方式及其於文件分群之應用)
    Author: Chin-Yi Cheng (鄭敬譯)
    Contributor: Graduate Institute of Information Management (資訊管理研究所)
    Keywords: conceptual relationship; document clustering; formal concept; formal concept analysis; information retrieval; document vector; vector space model
    Date: 2011-07-12
    Upload time: 2012-01-05 15:11:32 (UTC+8)
    Abstract: With the continual improvement of Internet-related technology, more and more information, especially text-based information, is becoming available online. To help people quickly find the information they need, techniques such as information retrieval, document classification, and document clustering have been developed. Most implementations of these techniques draw upon Salton's vector space model (VSM), in which documents or query strings are represented by vectors. Most VSM-based implementations use the individual terms extracted from the documents or query strings as the dimensions of the vectors, and the frequency with which each term appears in the document or query string as the value of the corresponding dimension. These so-called bag-of-terms methods ignore conceptual relationships between terms, such as synonyms, hypernyms, and hyponyms, which can improve the effectiveness of information retrieval, document classification, and document clustering. To construct such term relationships automatically for a given document set, this study applies formal concept analysis (FCA) to build a term ontology that captures hierarchical conceptual relationships together with synonym-like relationships. We also develop a document representation method that uses this ontology to represent documents as concept-based vectors. To evaluate the usability and effectiveness of the method, we use document clustering as the application in which the generated concept-based vectors are tested.
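The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy corpus, the function names `term_frequency_vectors` and `formal_concepts`, and the brute-force enumeration are all assumptions made for illustration. The first function builds bag-of-terms vectors (one dimension per term, value = term frequency); the second lists the formal concepts of the document-term context, where a concept pairs a set of documents (extent) with the set of terms they all share (intent). How the thesis weights concepts to form concept-based vectors is not stated in this record, so the sketch stops at concept enumeration.

```python
from itertools import combinations

# Hypothetical toy corpus for illustration; not taken from the thesis.
docs = {
    "d1": "apple banana apple",
    "d2": "banana cherry",
    "d3": "apple banana cherry",
}

def term_frequency_vectors(docs):
    """Bag-of-terms VSM: one dimension per term, value = term frequency."""
    vocab = sorted({term for text in docs.values() for term in text.split()})
    vectors = {
        name: [text.split().count(term) for term in vocab]
        for name, text in docs.items()
    }
    return vocab, vectors

def formal_concepts(docs):
    """Enumerate formal concepts (extent, intent) by brute force.

    Exponential in the number of documents, so only suitable for toy inputs.
    """
    terms_of = {name: frozenset(text.split()) for name, text in docs.items()}
    all_terms = frozenset().union(*terms_of.values())
    names = sorted(docs)
    concepts = set()
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            # Intent: terms shared by every document in the subset
            # (all terms when the subset is empty).
            intent = all_terms
            for name in subset:
                intent = intent & terms_of[name]
            # Extent: every document that contains all terms of the intent.
            extent = frozenset(n for n in names if intent <= terms_of[n])
            concepts.add((extent, intent))
    return concepts

vocab, vectors = term_frequency_vectors(docs)
concepts = formal_concepts(docs)
```

Ordering the resulting concepts by inclusion of their extents gives the concept lattice, which is the kind of hierarchical structure the abstract refers to when it mentions hypernym-like and hyponym-like relationships between terms.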
    Appears in Collections: [Graduate Institute of Information Management] Theses and Dissertations

    Files in This Item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     1052


    All items in NCUIR are protected by original copyright.
