Item-Based Collaborative Filtering Combined with Google Similarity


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/68826

Title:	Item-Based Collaborative Filtering Combined with Google Similarity
Author:	陳旻駿 (Chen, Min-Chun)
Contributor:	Department of Information Management
Keywords:	Collaborative Filtering; NGD; Recommendation Systems; Prediction
Date:	2015-07-27
Uploaded:	2015-09-23 14:43:36 (UTC+8)
Publisher:	National Central University
Abstract:	Most previous research on collaborative filtering analyzes data collected by the system itself (local resources), using the user rating matrix for similarity analysis and prediction. The performance and accuracy of item-based collaborative filtering, for example, depend entirely on how much data the rating matrix holds and how complete it is. When the data are insufficient, the sparsity problem arises, and cold start is unavoidable when the analysis rests on local resources alone. This thesis proposes a new perspective: finding an additional data source to assist item-based collaborative filtering. Whether under normal conditions or when facing a sparse matrix or newly added items, this additional source can be used to compute more accurate similarities, and the predictions from the two data sources are combined to improve the accuracy of the final prediction or recommendation. We use the World Wide Web (WWW), a large and readily available body of data, as the external source (global resources). Among the many reviews and discussions on the Internet, the more often two products are discussed or evaluated in the same article, the more similar they are taken to be. This thesis uses the Google Similarity computation to obtain the similarity between two products as reflected on the WWW, mitigating the problems that arise from using local resources alone.
Appears in collections:	[Graduate Institute of Information Management] Theses and Dissertations
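
The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The Normalized Google Distance (NGD) formula is the standard one computed from search hit counts. Everything else is an assumption made for illustration: the mapping from distance to similarity (`exp(-d)`), the weighted-average item-based predictor, the linear blend weight `alpha`, and all function names. Hit counts are passed in as plain numbers; no search API is called.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts.

    fx, fy: number of pages containing each term; fxy: pages containing both;
    n: total number of indexed pages (assumed larger than fx and fy).
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # the terms never co-occur: treat as maximally distant
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def ngd_similarity(fx, fy, fxy, n):
    # Assumption: map the distance to a similarity in [0, 1] with exp(-d).
    return math.exp(-ngd(fx, fy, fxy, n))


def predict(user_ratings, target, sim):
    """Item-based prediction: similarity-weighted average of the user's ratings.

    user_ratings: dict item -> rating; sim: function (item_a, item_b) -> float.
    Returns None when no rated item has a nonzero similarity to the target.
    """
    num = den = 0.0
    for item, rating in user_ratings.items():
        if item == target:
            continue
        s = sim(target, item)
        num += s * rating
        den += abs(s)
    return num / den if den else None


def blended_prediction(local_pred, global_pred, alpha=0.5):
    # Assumption: linear blend; fall back to whichever prediction exists,
    # e.g. the global one for a cold-start item with no local ratings.
    if local_pred is None:
        return global_pred
    if global_pred is None:
        return local_pred
    return alpha * local_pred + (1 - alpha) * global_pred
```

In this sketch, `predict` would be called once with a similarity function derived from the rating matrix and once with one derived from `ngd_similarity`, and the two results passed to `blended_prediction`.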

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	745

All items in NCUIR are protected by original copyright.
