Rapid advances in information technology have lowered the barrier to data collection and tightened the connection between companies and their customers, so understanding customers has become the most important issue for today's marketers. Smart retail analyzes customers' shopping and browsing behavior across all channels, which lets companies deliver targeted marketing information to members with different attributes. Customer segmentation, which divides customers according to their characteristics, has therefore become an important method. Most marketers perform segmentation with the RFM (Recency, Frequency, Monetary) model and cluster analysis, but this approach still leaves room for improvement: it emphasizes recent purchases and rarely incorporates data beyond transactions. This study therefore applies several machine learning methods to variables drawn from multiple aspects of customer behavior.

The analysis is based on data from Taiwan's retail industry. I use extreme gradient boosting (XGBoost) to identify the key features, then compare several clustering algorithms (K-means, Hierarchical Clustering, Birch, and SOM) to find the most suitable segmentation method and result. I then discuss how each variable differs across the resulting groups and conduct association rule analysis, which covers not only market basket analysis but also the association between web browsing and promotions.
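For readers unfamiliar with the RFM baseline referred to above, the three variables can be derived per customer from a transaction log. The following is a minimal sketch rather than the implementation used in this study; the function name `rfm_table`, the `(customer_id, purchase_date, amount)` tuple layout, and the toy transactions are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Return {customer_id: {"recency", "frequency", "monetary"}}.

    recency   = days between `as_of` and the customer's latest purchase
    frequency = number of purchases
    monetary  = total amount spent
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)

    for customer_id, purchase_date, amount in transactions:
        # Track the most recent purchase date seen for each customer.
        previous = last_purchase.get(customer_id)
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount

    return {
        cid: {
            "recency": (as_of - last_purchase[cid]).days,
            "frequency": frequency[cid],
            "monetary": monetary[cid],
        }
        for cid in last_purchase
    }


# Hypothetical toy data, not taken from the study's data set.
transactions = [
    ("A", date(2020, 1, 5), 120.0),
    ("A", date(2020, 3, 1), 80.0),
    ("B", date(2020, 2, 10), 300.0),
]
rfm = rfm_table(transactions, as_of=date(2020, 3, 31))
```

A table like `rfm` would typically be scaled and then passed to a clustering algorithm; additional variables beyond these three would be appended as extra columns.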
The results show that, beyond the three features of the RFM model, two additional variables, Period and NES, are important and help compensate for the shortcomings of the original model, and that the Birch algorithm performs well in clustering this data set. The clustering results divide customers into five segments with distinct features and purchase behaviors. By combining the clustering results with the association rules, retailers can design suitable marketing campaigns, improve their customer relationships, and maximize customer value.
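To make the association rule step more concrete, the sketch below computes support, confidence, and lift for item pairs in a handful of baskets. It is an illustrative simplification under stated assumptions, not the procedure used in this study: the function name `pair_rules`, the thresholds, and the toy baskets are hypothetical, and only rules with a single item on each side are considered.

```python
from itertools import combinations


def pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence, lift) tuples for item pairs.

    support    = share of baskets containing both items
    confidence = support(lhs and rhs) / support(lhs)
    lift       = confidence / support(rhs)
    """
    n = len(baskets)
    item_count = {}
    pair_count = {}

    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Evaluate the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            lift = confidence / (item_count[rhs] / n)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence, lift))

    # Strongest associations (highest lift) first.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Hypothetical toy baskets, not taken from the study's data set.
baskets = [
    ["diapers", "wipes"],
    ["diapers", "wipes", "milk"],
    ["milk", "bread"],
    ["bread"],
    ["diapers", "wipes", "bread"],
]
rules = pair_rules(baskets)
```

In this toy data only the diapers/wipes pair clears the support threshold, and its lift above 1 indicates the two items appear together more often than independence would predict. The same counting would apply if the "items" were browsing or promotion events rather than products.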