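Recommender systems are widely used by mainstream online service providers (e.g., Amazon, Spotify, Netflix) to increase the visibility of their services and products, and thereby encourage users to purchase products or keep using the service. Benefiting from mature Internet technology and continuing advances in big-data techniques, recommender systems have gradually moved from analyzing traditional transaction data (popular purchased items) to using various algorithms that predict users' preferences for songs and so deliver personalized recommendations. This study uses users' song ratings from Yahoo! Music. Taking collaborative filtering, an algorithm now widely used for personalized recommendation, as the baseline, it builds hybrid models by supplementing the baseline with two algorithms that derive item similarity from user behavior: association rules and Word2vec. It also accounts for two practical conditions:
1. Temporal ordering: the training and validation sets are split using the real-life split concept.
2. A limited number of recommended items: the top-k results are evaluated with map@5 and map@10.
The results show that both methods improve accuracy. The techniques in this thesis are implemented on Apache Spark, which brings significant benefits when processing large datasets.

Recommender systems are widely used in the online entertainment industry. By building such a system, service providers like Amazon, Spotify, and Netflix can expose as many products or as much content to their users as possible. The more satisfied their users are, the more user engagement they gain. Take digital music services: traditionally, the system recommended music based on historical records or metadata. As technology has improved, we can easily process large datasets such as user-rating data or user-behavior data and apply data mining algorithms such as collaborative filtering (CF) to produce personalized recommendations. In this study, the Yahoo! Music dataset is used. First, we tune the performance of the collaborative filtering algorithm and treat it as the baseline of our recommendation system. Second, we restructure the user-rating data in order to apply two algorithms, Frequent-Pattern Growth (FP-growth) and Word2vec, that find the similarity between songs. Finally, the hybrid models combine the results of CF with those of FP-growth or Word2vec, and both evaluation metrics, map@5 and map@10, improve. Moreover, our approach is implemented on the Apache Spark framework, which helps when dealing with larger real-world datasets.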
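The two evaluation conditions named in the abstract, a time-ordered train/validation split and top-k scoring with map@5 and map@10, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `(user, song, rating, timestamp)` tuple layout, and the single global cutoff are assumptions made for illustration. The average-precision normalization used here, dividing by `min(k, number of relevant items)`, is one common convention, and the thesis may use a different one.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical record layout: (user_id, song_id, rating, timestamp).
Rating = Tuple[str, str, float, int]


def split_by_time(ratings: Sequence[Rating], cutoff: int) -> Tuple[List[Rating], List[Rating]]:
    """Time-ordered split: ratings before `cutoff` train, the rest validate.

    A single global cutoff is an assumption standing in for the
    real-life split concept mentioned in the abstract.
    """
    train = [r for r in ratings if r[3] < cutoff]
    valid = [r for r in ratings if r[3] >= cutoff]
    return train, valid


def average_precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """AP@k for one user, normalized by min(k, |relevant|) (assumed convention)."""
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    seen: Set[str] = set()
    for rank, item in enumerate(recommended[:k], start=1):
        # Count each relevant item once, even if it is recommended twice.
        if item in relevant and item not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
        seen.add(item)
    return score / min(k, len(relevant))


def map_at_k(recs: Dict[str, Sequence[str]], truth: Dict[str, Set[str]], k: int) -> float:
    """Mean of AP@k over users that have at least one relevant item."""
    users = [u for u, items in truth.items() if items]
    if not users:
        return 0.0
    total = sum(average_precision_at_k(recs.get(u, []), truth[u], k) for u in users)
    return total / len(users)
```

With these helpers, `map_at_k(recs, truth, 5)` and `map_at_k(recs, truth, 10)` would give the two reported metrics for any model that produces a ranked song list per user, where `truth` is built from the validation half of the split.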