NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, and research projects available for download): Item 987654321/81324


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/81324


    Title: The Application of Semantic Analysis Information in Relevance Feedback for Document Classification
    Author: Chen, Chun-Chao (陳君櫂)
    Contributor: Department of Information Management
    Keywords: Information Retrieval; Relevance Feedback; LDA; Word2Vec; Semantic Analysis
    Date: 2019-07-23
    Uploaded: 2019-09-03 15:44:25 (UTC+8)
    Publisher: National Central University
    Abstract: In information retrieval, relevance feedback algorithms extract important terms from the list of relevant documents returned by the user and use them as feedback features. The vector space model is often used to represent the term features of a document. However, this approach considers only term frequency: it ignores the semantic relationship between terms and documents, and it makes no use of the semantic information in the original query terms. In recent years, research on semantic search has emerged with the aim of uncovering the implicit semantic relationships between terms. This study therefore develops a document feature extraction method based on semantic information. It uses a topic model to extract the topic information implicit in relevant and non-relevant documents and selects the topic words that better represent the user's information need. It then uses the neural network model Word2Vec to analyze the semantic information between the original query terms and the topic words, also takes into account the term-appearance situation of the topic words, and finally assigns an appropriate weight to each topic word.
    The experimental results show that the classification precision of the proposed method is 27 percentage points higher than that of the baseline, and that the method can identify representative, important topic words and thereby retrieve documents that better match the user's information need.
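
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption for illustration only: the linear combination with a hypothetical `alpha` parameter, the use of the maximum cosine similarity to any query term, the definition of the term-appearance score as the fraction of relevant documents containing the word, and the toy two-dimensional vectors standing in for trained Word2Vec embeddings. The topic words are taken as given, as if a topic model had already produced them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def appearance_ratio(term, docs):
    """Fraction of documents (each a list of tokens) that contain the term."""
    if not docs:
        return 0.0
    return sum(1 for doc in docs if term in doc) / len(docs)


def weight_topic_words(query_terms, topic_words, vectors, relevant_docs, alpha=0.5):
    """Assumed scoring, not the thesis's formula:
    alpha * (best similarity to a query term) + (1 - alpha) * appearance ratio.
    """
    weights = {}
    for word in topic_words:
        sims = [
            cosine(vectors[word], vectors[q])
            for q in query_terms
            if word in vectors and q in vectors
        ]
        semantic = max(sims) if sims else 0.0
        appearance = appearance_ratio(word, relevant_docs)
        weights[word] = alpha * semantic + (1 - alpha) * appearance
    return weights


# Toy stand-ins for Word2Vec embeddings (hypothetical values).
vectors = {
    "retrieval": [1.0, 0.0],
    "search": [0.8, 0.6],
    "banana": [0.0, 1.0],
}
# Toy relevant documents, already tokenized.
relevant_docs = [["search", "engine"], ["search", "index"], ["banana", "bread"]]

weights = weight_topic_words(["retrieval"], ["search", "banana"], vectors, relevant_docs)
for word, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{word}: {weight:.3f}")
```

In this toy setup the topic word that is both semantically close to the query and frequent in the relevant documents receives the larger weight; in practice the vectors would come from a trained embedding model and the topic words from a topic model such as LDA.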
    Appears in collections: [Graduate Institute of Information Management] Theses and Dissertations

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    202


    All items in NCUIR are protected by the original copyright.
