Name: 郭明豪 (Min-hao Kuo)
Department: Department of Information Management
Thesis title: 從顧客購買資料中挖掘RFM序列樣式 (Discovering RFM sequential patterns from customers' purchasing data)
Abstract (Chinese, translated): Sequential pattern mining is an important data mining method whose goal is to find time-related behavioral patterns in sequence databases. It has been applied in many domains, including web-log analysis, customer behavior research, analysis of scientific experiment processes, and medical record analysis. Although much research on sequential pattern mining exists, these studies usually consider only the frequency of a pattern, because frequency is the only measure available to filter out uninteresting patterns. This rests on one premise: the minimum support must not be set too high, or many valuable patterns will not be found. Setting it low, however, leads to combinatorial explosion and therefore to too many rules. To solve this problem, this study uses the RFM (Recency, Frequency and Monetary) concept, which marketing scholars use for customer and market segmentation, to mine valuable sequential patterns, and proposes an algorithm called RFM-Apriori. The algorithm modifies the traditional sequential mining algorithm Apriori so that, in addition to frequency, it considers two extra constraints, the most recent purchase time (Recency) and the purchase amount (Monetary), and mines the patterns that satisfy all three (RFM-patterns). Considering these two additional factors ensures that every mined pattern is both recent and valuable. Experimental evaluation shows that the proposed method is more efficient than the traditional method and gives users a more effective way to find valuable patterns.

Abstract (English): Sequential pattern mining is an important data mining task that discovers time-related behaviors in sequence databases. It has been applied in many domains, including web-log analysis, analysis of customer purchase behavior, process analysis of scientific experiments, and medical record analysis. Although much work has been done on sequential pattern mining, most of it discovers sequential patterns based on frequency alone. For this reason, the minimum support must be set to a low value; otherwise, many valuable patterns may not be found. Unfortunately, doing so may cause combinatorial explosion and produce too many rules. To resolve this dilemma, this research uses the concept of Recency, Frequency and Monetary (RFM), which marketing researchers commonly use for customer or market segmentation, to find valuable sequential patterns. The proposed algorithm, RFM-Apriori, modifies the traditional sequential pattern mining algorithm Apriori so that, in addition to frequency, it considers two further constraints, the last purchasing time (Recency) and the purchase amount (Monetary), to discover the RFM-patterns. Considering these two additional factors ensures that all discovered patterns are recently active and profitable. The empirical evaluation shows that the proposed method is computationally efficient and offers users an effective means of discovering valuable patterns.

Keywords (Chinese): ★ classification
Keywords (English): ★ sequential patterns
★ data mining
★ constraint-based mining
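To make the three constraints named in the abstract concrete, the following is a minimal Python sketch that checks whether one given candidate pattern satisfies frequency, recency, and monetary thresholds. It is not the thesis's RFM-Apriori algorithm: the record does not give the formal definitions, so the data layout (one `(time, item, amount)` event per purchase, in increasing time order), the reading of recency as "the last matched purchase is no older than `min_time`", the reading of monetary as "the matched purchases sum to at least `min_amount`", and all function and variable names are assumptions made for illustration.

```python
from math import inf

def best_recent_value(events, pattern, min_time):
    """Largest total amount over all occurrences of `pattern` (as an ordered
    subsequence of `events`) whose last matched event has time >= min_time.
    Returns -inf when no such occurrence exists. Assumed, simplified semantics."""
    k = len(pattern)
    best = [-inf] * (k + 1)   # best[j]: best total for the first j pattern items
    best[0] = 0.0
    for time, item, amount in events:
        # Go from the end of the pattern backwards so one event fills one slot.
        for j in range(k, 0, -1):
            if pattern[j - 1] != item or best[j - 1] == -inf:
                continue
            if j == k and time < min_time:
                continue      # recency: the final purchase is too old
            best[j] = max(best[j], best[j - 1] + amount)
    return best[k]

def rfm_support(db, pattern, min_time, min_amount):
    """Fraction of customers with a recent, sufficiently valuable occurrence."""
    hits = sum(
        1 for events in db.values()
        if best_recent_value(events, pattern, min_time) >= min_amount
    )
    return hits / len(db)

def is_rfm_pattern(db, pattern, min_support, min_time, min_amount):
    """True when the pattern meets the frequency threshold under both constraints."""
    return rfm_support(db, pattern, min_time, min_amount) >= min_support

# Hypothetical toy data: customer id -> list of (time, item, amount).
db = {
    "c1": [(1, "a", 10), (5, "b", 30), (9, "b", 5)],
    "c2": [(2, "a", 20), (3, "b", 50)],
    "c3": [(4, "a", 15), (8, "b", 40)],
}

if __name__ == "__main__":
    print(rfm_support(db, ["a", "b"], min_time=5, min_amount=40))
    print(is_rfm_pattern(db, ["a", "b"], min_support=0.5, min_time=5, min_amount=40))
```

In this toy data, customer `c2` buys `a` then `b` for a large amount but too early to pass the recency threshold, so the pattern counts for only two of the three customers. A frequency-only check would count all three.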
Table of contents:
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORKS
2.1 SEQUENTIAL PATTERN MINING
2.2 RFM
CHAPTER 3 DEFINITION
CHAPTER 4 ALGORITHMS
4.1 CANDIDATE GENERATION
4.2 COUNTING SUPPORT BY TRAVERSING AN INVERSE CANDIDATE TREE
4.2.1 Inverse Candidate Tree
4.2.2 Counting Support for Candidates
4.3 RFM-APRIORI ALGORITHM - EXAMPLE
CHAPTER 5 EXPERIMENTS
5.1 SYNTHETIC DATA GENERATION AND REAL-LIFE DATASET
5.2 PERFORMANCE EVALUATION
CHAPTER 6 CONCLUSION
Advisor: 陳彥良 (Yen-Liang Chen)
Approval date: 2007-7-2