Abstract: | In recent years, e-commerce auction platforms have become the venue where many sellers choose to open online stores. As e-commerce grows, competition intensifies. If a customer's purchase behavior (what to buy and from whom) can be predicted, a seller can retain customers and increase revenue in the most cost-effective way. In prior studies, classification models such as logistic regression (LR) have had difficulty predicting product categories from which a customer has not yet purchased. Approaches that combine collaborative filtering with sequential pattern mining (SPM) can find products that similar customers may like, but they tend to suffer from data sparsity. This study proposes an RFM-based hybrid prediction system that combines a product-category prediction model built with LR and the purchase patterns of most customers found by SPM, in order to predict the sellers and product categories from which a customer is likely to purchase in the future.
We target Taobao, the largest cross-strait online auction platform, and its best-selling product category, women's apparel. Using web content mining, we collected the buyer and seller trading records disclosed on the platform between January 1, 2013 and April 1, 2013. First, we identify the parameters used in RFM-SPM and then determine the most appropriate weights for combining LR and SPM in the Hybrid system. We then use precision, recall, and F1 measures to compare the three prediction systems, RFM-LR, RFM-SPM, and Hybrid, and finally compare them again after clustering the customers. The results show that Hybrid achieves the highest values on all three measures among the three systems in predicting the seller (0.75) and the seller × product category combination (0.6), while RFM-SPM scores the lowest. In predicting the purchase behavior of customer clusters, Hybrid again performs best in terms of F1, reaching 0.75 to 0.82 for the low-F/high-M cluster and 0.9 for the low-F/low-M cluster. |