Thesis/Dissertation 92443003: Detailed Record




Author 胡雅涵 (Ya-Han Hu)   Department Information Management
Thesis title 使用以限制為基礎的序列規則方法的顧客購買行為研究
(The Research of Customer Purchase Behavior Using Constraint-based Sequential Pattern Mining Approach)
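Related theses (English translations of the titles in parentheses)
★ 零售業商業智慧之探討 (A Study of Business Intelligence in the Retail Industry)
★ 有線電話通話異常偵測系統之建置 (Building an Anomaly Detection System for Fixed-Line Telephone Calls)
★ 資料探勘技術運用於在學成績與學測成果分析 -以高職餐飲管理科為例 (Applying Data Mining Techniques to the Analysis of School Grades and Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program)
★ 利用資料採礦技術提昇財富管理效益 -以個案銀行為主 (Using Data Mining Techniques to Improve Wealth Management Performance: A Case Study of a Bank)
★ 晶圓製造良率模式之評比與分析-以國內某DRAM廠為例 (Comparison and Analysis of Wafer Fabrication Yield Models: The Case of a Domestic DRAM Manufacturer)
★ 商業智慧分析運用於學生成績之研究 (A Study of Applying Business Intelligence Analysis to Student Grades)
★ 運用資料探勘技術建構國小高年級學生學業成就之預測模式 (Using Data Mining Techniques to Build a Prediction Model of Academic Achievement for Upper-Grade Elementary School Students)
★ 應用資料探勘技術建立機車貸款風險評估模式之研究-以A公司為例 (A Study of Building a Motorcycle Loan Risk Assessment Model with Data Mining Techniques: The Case of Company A)
★ 績效指標評估研究應用於提升研發設計品質保證 (Applying Performance Indicator Evaluation to Improve R&D Design Quality Assurance)
★ 基於文字履歷及人格特質應用機械學習改善錄用品質 (Applying Machine Learning to Text Résumés and Personality Traits to Improve Hiring Quality)
★ 以關係基因演算法為基礎之一般性架構解決包含限制處理之集合切割問題 (A General Framework Based on a Relation Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling)
★ 關聯式資料庫之廣義知識探勘 (Generalized Knowledge Discovery in Relational Databases)
★ 考量屬性值取得延遲的決策樹建構 (Decision Tree Construction Considering Delays in Acquiring Attribute Values)
★ 從序列資料中找尋偏好圖的方法 - 應用於群體排名問題 (A Method for Finding Preference Graphs from Sequence Data, Applied to Group Ranking Problems)
★ 利用分割式分群演算法找共識群解群體決策問題 (Finding Consensus Groups with a Partitional Clustering Algorithm to Solve Group Decision Problems)
★ 以新奇的方法有序共識群應用於群體決策問題 (A Novel Ordered Consensus Group Method Applied to Group Decision Problems)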
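  1. The author has agreed to make this electronic thesis openly accessible immediately.
  2. Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.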

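Abstract (translated from the Chinese) Sequential pattern mining is an important method in the field of data mining; its goal is to find time-related behavioral patterns in sequence databases. In recent years, sequential pattern mining has been applied in a variety of domains to uncover useful information, for example marketing decisions, medical record analysis, and sales analysis. Most earlier sequential pattern mining methods focus only on the frequency of sequential patterns, mainly because earlier sequence analysis assumed that sequence data do not change over time. In practice, however, business sales data are highly dynamic and complex, so sequential behavior often changes over time. To address this problem, we divide it into two sub-problems: sequential pattern mining in a business-to-business (B2B) environment and sequential pattern mining in a business-to-customer (B2C) environment. We treat them separately because the sequence data in each environment have their own characteristics. We then introduce three new concepts: recency, repetition, and compactness. Recency lets the discovered sequential patterns reflect the most recent behavior; repetition ensures that the number of times a pattern occurs within a sequence meets the minimum required by the user; and compactness ensures that a pattern occurs within a user-defined time interval. For the two environments, we use these three concepts to define two distinct kinds of sequential patterns and develop two efficient algorithms. We also carry out a thorough experimental evaluation. The results show that the two proposed algorithms are efficient and that, when sequence data are highly dynamic, they find more interesting sequential patterns than traditional methods.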
Abstract (English) Sequential pattern mining is an important data-mining method for determining time-related behavior in sequence databases. The information obtained from sequential pattern mining can be used in marketing, medical record analysis, sales analysis, and so on. Existing methods focus only on frequency because they assume that sequences’ behaviors do not change over time. Business sales environments, however, are highly dynamic and complicated, so sequences’ behaviors may change over time. In this study, we first divide this problem into two sub-problems, sequential pattern mining in the business-to-business (B2B) environment and in the business-to-customer (B2C) environment, because each has unique sequence characteristics. Then, three new concepts, recency, repetition, and compactness, are incorporated into traditional sequential pattern mining to discover meaningful patterns in these two environments. Recency causes patterns to adapt quickly to the latest behaviors in sequence databases. Repetition ensures that the number of occurrences of a pattern in a data sequence exceeds user-specified thresholds. Compactness ensures reasonable time spans for the discovered patterns. Two new patterns as well as efficient algorithms are presented in this dissertation. Thorough empirical evaluations are also given. The results show that the proposed methods are computationally efficient and that they are more advantageous than traditional methods when sequences’ behaviors change over time.
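The three constraints named in the abstract can be illustrated with a small sketch. This is not the dissertation's CFR-PostfixSpan or CFR2-apriori algorithm; it is a simplified check under assumptions made only for illustration: each event carries a single item rather than an itemset, events are sorted by timestamp, occurrences are counted without overlap, and the names `compact_occurrences`, `satisfies`, `support`, `max_span`, `min_repetition`, and `recent_after` are hypothetical.

```python
from typing import List, Sequence, Tuple

Event = Tuple[int, str]  # (timestamp, item); assumption: one item per event


def compact_occurrences(seq: Sequence[Event], pattern: Sequence[str],
                        max_span: int) -> List[Tuple[int, int]]:
    """Return (start_time, end_time) of non-overlapping occurrences of
    `pattern` in `seq` whose time span is at most `max_span` (compactness)."""
    found = []
    i = 0
    while i < len(seq):
        if seq[i][1] != pattern[0]:
            i += 1
            continue
        # Match the remaining items at the earliest later positions, which
        # gives the smallest possible end time for this starting event.
        pos, k = i, 1
        while k < len(pattern) and pos + 1 < len(seq):
            pos += 1
            if seq[pos][1] == pattern[k]:
                k += 1
        if k == len(pattern) and seq[pos][0] - seq[i][0] <= max_span:
            found.append((seq[i][0], seq[pos][0]))
            i = pos + 1  # resume after this occurrence (no overlap)
        else:
            i += 1
    return found


def satisfies(seq: Sequence[Event], pattern: Sequence[str], max_span: int,
              min_repetition: int, recent_after: int) -> bool:
    """True if the pattern occurs compactly at least `min_repetition` times
    (repetition) and its last occurrence ends at or after `recent_after`
    (recency)."""
    occ = compact_occurrences(seq, pattern, max_span)
    return bool(occ) and len(occ) >= min_repetition and occ[-1][1] >= recent_after


def support(db: Sequence[Sequence[Event]], pattern: Sequence[str],
            max_span: int, min_repetition: int, recent_after: int) -> float:
    """Fraction of data sequences in which the pattern meets all constraints."""
    hits = sum(satisfies(s, pattern, max_span, min_repetition, recent_after)
               for s in db)
    return hits / len(db)


# Toy database: three customers, each a time-ordered list of purchases.
db = [
    [(1, "a"), (2, "b"), (10, "a"), (11, "b")],  # compact, repeated, recent
    [(1, "a"), (9, "b"), (10, "a"), (30, "b")],  # occurrences are too spread out
    [(1, "a"), (2, "b"), (3, "a"), (4, "b")],    # compact and repeated, but old
]
print(support(db, ["a", "b"], max_span=3, min_repetition=2, recent_after=8))
```

A frequency-only miner would count the pattern `a, b` as present in all three toy sequences. With the three constraints applied, only the first sequence supports it, which is the kind of filtering the abstract describes.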
Keywords (translated from the Chinese) ★ Sequence data
★ Constraint-based data mining method
★ Temporal sequence database
Keywords (English) ★ Constraint-based mining
★ Temporal database
★ Sequential pattern
Table of Contents
List of Illustrations
List of Tables
Chapter 1. Introduction
1.1. Motivations and Research Objectives
1.2. Considering Time Constraints on Sequential Pattern Mining in B2C Environment
1.3. Considering Time Constraints on Sequential Pattern Mining in B2B Environment
1.4. Organization of the Dissertation
Chapter 2. Literature Review
2.1. Sequential Pattern Mining: An Overview
2.1.1. Improve the Efficiency in Sequential Pattern Mining Process
2.1.2. Extend the Mining of Sequential Pattern to Other Time-Related Patterns
2.2. Data Mining in a Changing Environment
2.3. Constraint-based Sequential Pattern Mining
2.4. Discussion
Chapter 3. The Problem of Sequential Pattern Mining in B2C Environment
3.1. Problem Definition
3.2. The CFR-PostfixSpan Algorithm
3.3. Experimental Study
3.3.1. Data
3.3.2. Performance Measures
3.3.3. Experimental Setup
3.4. Results and Discussions
3.5. Summary
Chapter 4. The Problem of Sequential Pattern Mining in B2B Environment
4.1. Problem Definition
4.2. Algorithm
4.2.1. The CFR2-apriori Algorithm
4.2.2. The Support Counting Process
4.3. Performance Evaluation
4.3.1. Synthetic Data Generation and Real-Life Data
4.3.2. Performance Evaluation
4.4. Summary
Chapter 5. Conclusions and Future Research
References
Appendix
Advisor 陳彥良 (Yen-Liang Chen)   Approval date 2007-7-4

For questions about this thesis, contact the Outreach Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.