Application and Analysis of Autoencoder in Recommender Systems

Name

Ter-Jen Hsieh (謝得人)

Department

Department of Mathematics

Thesis title

Application and Analysis of Autoencoder in Recommender Systems (自編碼器於推薦系統之應用分析)

Abstract

This thesis studies the application of autoencoders, a member of the neural network family, to recommender systems. It has two parts. The first examines how model performance changes with hyperparameters such as the hidden-layer dimension, the number of hidden layers, and the strength of regularization and dropout. The second builds a hybrid model in which the features extracted by the autoencoder serve as preprocessing for a content-based filtering algorithm, and analyzes its performance. The experiments use the MovieLens 1M dataset, which contains 1,000,209 ratings given by 6,040 users to 3,706 movies. Models are trained to predict a user's rating of a movie and are evaluated by RMSE. The results show that increasing the hidden-layer dimension tends to cause overfitting, whereas increasing the number of hidden layers speeds up convergence and improves performance; regularization and dropout are markedly effective at preventing overfitting. The hybrid model, which uses the autoencoder to reduce dimensionality and extract user features, performs slightly better than traditional collaborative filtering.
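The setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the single tanh hidden layer, the linear output, the encoding of missing ratings as 0, dropout on the hidden layer, plain gradient descent, and all names (`masked_rmse`, `train_autoencoder`, `hidden_dim`, `l2`, `dropout`) are assumptions made for illustration. It shows where the hidden-layer dimension, the L2 regularization strength and the dropout rate enter the model, and how RMSE can be computed over observed ratings only.

```python
import numpy as np


def masked_rmse(pred, ratings, mask):
    """RMSE over observed entries only (mask is True where a rating exists)."""
    diff = (pred - ratings)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


def train_autoencoder(ratings, mask, hidden_dim=8, l2=1e-3, dropout=0.2,
                      lr=0.01, epochs=300, seed=0):
    """Fit a one-hidden-layer autoencoder on user rating vectors.

    Assumptions (not taken from the thesis): tanh hidden units, linear
    output, missing ratings fed in as 0, inverted dropout on the hidden
    layer, full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n_items = ratings.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_items, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, (hidden_dim, n_items))
    b2 = np.zeros(n_items)
    n_obs = mask.sum()

    for _ in range(epochs):
        # Forward pass with inverted dropout on the hidden activations.
        h = np.tanh(ratings @ W1 + b1)
        keep = (rng.random(h.shape) >= dropout) / (1.0 - dropout)
        h_drop = h * keep
        out = h_drop @ W2 + b2

        # Loss: mean squared error on observed entries + L2 weight penalty.
        g_out = 2.0 * (out - ratings) * mask / n_obs
        g_h = (g_out @ W2.T) * keep * (1.0 - h ** 2)
        gW2 = h_drop.T @ g_out + 2.0 * l2 * W2
        gW1 = ratings.T @ g_h + 2.0 * l2 * W1

        W2 -= lr * gW2
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * g_h.sum(axis=0)

    def encode(r):
        # Low-dimensional user features (no dropout at inference time).
        return np.tanh(r @ W1 + b1)

    def predict(r):
        return encode(r) @ W2 + b2

    return encode, predict


# Toy data standing in for a user-item matrix (hypothetical, not MovieLens).
rng = np.random.default_rng(1)
toy = rng.integers(1, 6, size=(20, 6)).astype(float)  # ratings in 1..5
observed = rng.random(toy.shape) < 0.8                # about 80% observed
toy = toy * observed                                  # missing ratings -> 0

encode, predict = train_autoencoder(toy, observed)
print("training RMSE:", masked_rmse(predict(toy), toy, observed))
print("user feature shape:", encode(toy).shape)
```

The value printed here is a training error on toy data; a real evaluation would report RMSE on held-out ratings. The `encode` output is the kind of reduced user representation that a hybrid model could hand to a downstream filtering step.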

Keywords

★ autoencoder
★ recommender system
★ feature extraction
★ overfitting
★ hybrid model

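Table of contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of contents
List of figures
List of tables
1. Introduction
1-1 Research background
1-2 Research motivation
1-3 Research questions and methods
2. Literature review
2-1 Recommender systems
2-1-1 Basic architecture
2-1-2 Evaluation metrics
2-1-3 Practical challenges
2-1-4 Classification of recommendation algorithms
2-2 Collaborative filtering
2-2-1 Similarity-based methods
2-2-2 Dimensionality reduction
2-3 Content-based filtering
2-4 Neural networks
2-4-1 Multilayer perceptron
2-4-2 Autoencoder
3. Experimental design and results
3-1 Dataset overview
3-2 Data preprocessing
3-3 Model architecture and training
3-4 Effect of hyperparameters on the model
3-4-1 Effect of the hidden-layer dimension
3-4-2 Effect of the number of hidden layers
3-4-3 Effect of regularization and dropout strength
3-5 Hybrid model
4. Conclusions and outlook
4-1 Observations and discussion of results
4-2 Future work
References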

Advisor

洪盟凱

Approval date

2019-1-24
