DC Field | Value | Language |
DC.contributor | 資訊工程學系在職專班 | zh_TW |
DC.creator | 翁梓勝 | zh_TW |
DC.creator | Tzu-Sheng Weng | en_US |
dc.date.accessioned | 2016-07-20T07:39:07Z | |
dc.date.available | 2016-07-20T07:39:07Z | |
dc.date.issued | 2016 | |
dc.identifier.uri | http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=103552021 | |
dc.contributor.department | 資訊工程學系在職專班 | zh_TW |
DC.description | 國立中央大學 | zh_TW |
DC.description | National Central University | en_US |
dc.description.abstract | 最近這幾年來,隨著網際網路 (World Wide Web) 的發展,社群問答的網站在最近這段時間也成長的非常多,大量的問答網站擁有非常多的資訊形成網路線上一個很有價值的知識寶庫,然而有一個現象,這些網站都會遇到的就是會有重複的問題,因此問題檢索的主要任務就是用來協助從存檔裡面找出之前已經被回答過的相關問題,然而詞語上同義詞性質的多樣性是問題檢索的一個極大挑戰,有些研究方法利用計算新的問題以及存檔問題之間相互關係的機率來處理這樣的狀況,另外也有許多研究是著重在字串之間的相似度。
在這篇論文裡,我們提出了一個方法首先利用 CBoW 的模型使用華碩 ROG 論壇的資料庫來做訓練資料,然後利用訓練出來的資料計算輸入的新問題以及存檔的問題之間的相似程度,與其他研究不同的地方在於我們將問題的標題以及問題的完整描述分開來看,將他們當作是兩個不同的特徵來做計算,另外我們也將使用者的榮譽點數拿來當做我們評估的一個要素, 我們的實驗顯示,對 ROG 論壇的資料庫做出來的結果優於其他的方法。 | zh_TW |
dc.description.abstract | In recent years, community-based question answering (cQA) sites have grown rapidly. The number of large-scale Q&A sites has increased significantly, and the information on these sites forms a valuable online knowledge pool. However, such sites commonly suffer from duplicate questions. The task of question retrieval is to find previously answered, semantically similar questions in cQA archives. Synonymous lexical variation, however, poses a major challenge for question retrieval. Some approaches address this issue by calculating the probability of correlation between new questions and archived questions. Much recent research has also focused on surface string similarity among questions.
In this paper, we propose a method that first trains a continuous bag-of-words (CBoW) model on data from Asus’s Republic of Gamers (ROG) forum and then computes the similarity between a new question and the archived Q&As in our database. Unlike most other studies, we treat the title and the full description of each archived question as two separate features and compute the similarity to the new question for each. In addition, we factor user reputation into our ranking model. Our experimental results on the ROG forum dataset show that our CBoW model with reputation features outperforms other top methods. | en_US |
DC.subject | 社群 | zh_TW |
DC.subject | 論壇 | zh_TW |
DC.subject | 問題 | zh_TW |
DC.subject | 檢索 | zh_TW |
DC.subject | Question | en_US |
DC.subject | Retrieval | en_US |
DC.subject | Community | en_US |
DC.subject | Forum | en_US |
DC.title | 社群論壇之問題檢索 | zh_TW |
dc.language.iso | zh-TW | zh-TW |
DC.title | Question Retrieval of Community Forum | en_US |
DC.type | 博碩士論文 | zh_TW |
DC.type | thesis | en_US |
DC.publisher | National Central University | en_US |
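
The ranking idea described in the abstract (word embeddings, separate title and description similarity, plus user reputation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the toy vectors stand in for embeddings a trained CBoW model would supply, averaging word vectors is one common way to embed a sentence, and the weighted sum with weights `w_title`, `w_desc`, and `w_rep` is a hypothetical combination rule. Reputation is assumed to be pre-normalized to the range 0 to 1.

```python
import math

# Toy 3-dimensional vectors standing in for embeddings learned by a CBoW model.
# (Hypothetical values; a real model would be trained on forum text.)
TOY_VECTORS = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook": [0.8, 0.2, 0.1],
    "overheats": [0.1, 0.9, 0.0],
    "hot": [0.2, 0.8, 0.1],
    "keyboard": [0.0, 0.1, 0.9],
    "backlight": [0.1, 0.0, 0.8],
}
DIM = 3


def embed(text):
    """Average the vectors of known words; return a zero vector if none are known."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(query, post, w_title=0.5, w_desc=0.4, w_rep=0.1):
    """Combine title similarity, description similarity, and reputation.

    The weights are illustrative assumptions, not values from the thesis.
    """
    q = embed(query)
    title_sim = cosine(q, embed(post["title"]))
    desc_sim = cosine(q, embed(post["description"]))
    return w_title * title_sim + w_desc * desc_sim + w_rep * post["reputation"]


def rank(query, archive):
    """Return archived posts sorted from most to least relevant."""
    return sorted(archive, key=lambda post: score(query, post), reverse=True)


# Hypothetical archive entries; "reputation" is assumed normalized to [0, 1].
EXAMPLE_ARCHIVE = [
    {"title": "keyboard backlight", "description": "keyboard backlight", "reputation": 0.5},
    {"title": "notebook overheats", "description": "notebook hot", "reputation": 0.5},
]
```

Calling `rank("laptop hot", EXAMPLE_ARCHIVE)` orders the archive by the combined score. Keeping title and description as separate terms lets a short, precise title count independently of a long, noisy description.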