A Fuzzy Mining Process for Discovering Sequential Patterns

Name

Cheng-Kui Huang (黃正魁)

Department

Department of Information Management

Thesis Title

A Fuzzy Mining Process for Discovering Sequential Patterns (模糊探勘程序來挖掘序列樣式)

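Related Theses

★ A Study of Business Intelligence in the Retail Industry
★ Building an Anomaly Detection System for Fixed-Line Telephone Calls
★ Applying Data Mining Techniques to Analyze School Grades and General Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program
★ Using Data Mining Techniques to Improve Wealth Management Performance: A Case Study of a Bank
★ Evaluation and Analysis of Wafer Manufacturing Yield Models: The Case of a Domestic DRAM Fab
★ A Study of Applying Business Intelligence Analysis to Student Grades
★ Using Data Mining Techniques to Build a Prediction Model of Academic Achievement for Upper-Grade Elementary School Students
★ A Study of Building a Motorcycle Loan Risk Assessment Model with Data Mining Techniques: The Case of Company A
★ Applying Performance Indicator Evaluation to Improve R&D Design Quality Assurance
★ Applying Machine Learning to Text Résumés and Personality Traits to Improve Hiring Quality
★ A General Framework Based on a Relational Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling
★ Generalized Knowledge Discovery in Relational Databases
★ Decision Tree Construction Considering Delays in Acquiring Attribute Values
★ A Method for Finding Preference Graphs from Sequence Data, Applied to the Group Ranking Problem
★ Using a Partitional Clustering Algorithm to Find Consensus Groups for Group Decision-Making Problems
★ A Novel Approach to Ordered Consensus Groups Applied to Group Decision-Making Problems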

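This electronic thesis is authorized for immediate open access.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.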

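Abstract (Chinese)

As data volumes have grown rapidly, data mining has been used to cope with data overload and to discover useful, novel, and potentially valuable patterns in existing data. When mining quantitative data, however, we may run into the sharp boundary problem, in which every value is forced into an all-or-nothing (0 or 1) partition, and traditional data mining methods cannot resolve it. To address this problem, many researchers have used fuzzy sets to mine patterns that carry quantitative data, particularly in sequential pattern mining [18][25]. To take a more general view of how data mining and fuzzy set theory can be combined to mine sequential patterns, this study proposes a fuzzy mining work process that guides the discovery of sequential patterns, the Fuzzy Mining Process for Discovering Sequential Patterns (FMPDSP). The process is intended to build a bridge for cooperation between the two fields, so that the research steps of fuzzy sequential pattern mining can be understood and analyzed. In addition, this study presents three different studies of fuzzy sequential patterns to demonstrate that the new process is workable and general, and to guide new research that combines the two fields.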

Abstract (English)

As data volumes grow, data mining has been introduced to address the data overload problem and to discover valid, novel, and potentially useful patterns in existing data. When mining quantitative data, however, we may encounter the sharp boundary problem, which traditional data mining techniques cannot overcome. In view of this weakness, many studies have applied fuzzy sets to discover a variety of quantitative patterns, especially in sequential pattern mining [18][25]. We therefore propose a work process, the Fuzzy Mining Process for Discovering Sequential Patterns (FMPDSP), that takes a more general view of combining the data mining and fuzzy set fields to discover sequential patterns. The purpose of the process is to establish a cooperative relationship between the two fields so that the investigative steps of fuzzy sequential pattern mining can be understood and analyzed. Three studies are presented to demonstrate that FMPDSP is workable and general, and that it can guide future studies in both fields.
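The sharp boundary problem mentioned in the abstract can be illustrated with a small sketch. It is not taken from the dissertation: the linguistic term "short", the 7-day crisp cutoff, and the 5-to-10-day fuzzy ramp are all assumptions chosen only to show how a crisp partition treats two nearly identical time gaps very differently, while a fuzzy membership function does not.

```python
def crisp_short(gap_days: float, cutoff: float = 7.0) -> float:
    """Crisp (0 or 1) membership: a gap counts as 'short' only up to the cutoff.

    The 7-day cutoff is an illustrative assumption.
    """
    return 1.0 if gap_days <= cutoff else 0.0


def fuzzy_short(gap_days: float, full: float = 5.0, zero: float = 10.0) -> float:
    """Fuzzy membership in 'short': 1 up to `full`, falling linearly to 0 at `zero`.

    The 5-day and 10-day breakpoints are illustrative assumptions.
    """
    if gap_days <= full:
        return 1.0
    if gap_days >= zero:
        return 0.0
    return (zero - gap_days) / (zero - full)


# Two gaps that differ by a fraction of a day land on opposite sides of the
# crisp cutoff, but receive almost the same fuzzy membership degree.
for gap in (6.9, 7.1):
    print(gap, crisp_short(gap), round(fuzzy_short(gap), 2))
```

With the crisp function the two gaps are classified as entirely "short" and entirely "not short"; with the fuzzy function both are "short" to a similar degree, which is the behaviour fuzzy sets are meant to provide when mining quantitative data.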

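Keywords (Chinese)

★ data mining (資料探勘)
★ sequential patterns (序列樣式)
★ fuzzy sets (模糊集合)
★ time interval (時間區間)
★ multi-level (多階層)
★ quantitative data (數量資料)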

Keywords (English)

★ multi-level
★ time interval
★ fuzzy sets
★ sequential patterns
★ data mining
★ quantitative data

Table of Contents

CHAPTER 1 INTRODUCTION
1.1 DESCRIPTION OF THE PROCESS
1.2 FUZZY APPLICATION
1.3 ORGANIZATION OF THE DISSERTATION
CHAPTER 2 RELATED WORKS AND BACKGROUND
2.1 DATA MINING
2.1.1 Sequential Patterns Mining
2.2 FUZZY SETS
2.3 FUZZY DATA MINING
CHAPTER 3 DISCOVERING FUZZY TIME-INTERVAL SEQUENTIAL PATTERNS
3.1 RESEARCH PROBLEM
3.2 PROBLEM DEFINITION
3.3 ALGORITHMS FOR MINING FUZZY TIME-INTERVAL SEQUENTIAL PATTERNS
3.3.1 The FTI-Apriori Algorithm
3.3.2 The FTI-PrefixSpan Algorithm
3.3.3 The Post-ftiapriori Algorithm
3.4 EXPERIMENTAL RESULTS AND PERFORMANCE STUDY
3.5 SUMMARY
3.5.1 Implications for Academic Researchers
3.5.2 Implications for Business Practitioners
3.5.3 Future Works
CHAPTER 4 DISCOVERING FUZZY MULTI-LEVEL SEQUENTIAL PATTERNS
4.1 RESEARCH PROBLEM
4.2 LITERATURE REVIEW FOR TAXONOMY
4.3 PROBLEM DEFINITION
4.4 ALGORITHMS FOR MINING FUZZY MULTI- AND CROSS-LEVEL SEQUENTIAL PATTERNS
4.4.1 Fuzzy Multi-level Sequential Mining Algorithm
4.4.2 Fuzzy Cross-level Sequential Patterns
4.5 EXPERIMENTAL RESULTS AND PERFORMANCE STUDY
4.5.1 Synthetic Dataset
4.5.2 Real Dataset
4.6 SUMMARY
4.6.1 Implications for Academic Researchers
4.6.2 Implications for Business Practitioners
4.6.3 Future Works
CHAPTER 5 DISCOVERING FUZZY QUANTITATIVE SEQUENTIAL PATTERNS
5.1 RESEARCH PROBLEM
5.2 PROBLEM DEFINITION
5.3 ALGORITHMS FOR MINING FUZZY-BASED SEQUENTIAL PATTERNS WITH QUANTITATIVE DATA
5.3.1 The Hong et al. Algorithm
5.3.2 The Divide-and-conquer Fuzzy Sequential Mining Algorithm
5.4 EXPERIMENTAL RESULTS AND PERFORMANCE STUDY
5.4.1 Synthetic Dataset
5.4.2 Real Dataset
5.5 SUMMARY
5.5.1 Implications for Academic Researchers
5.5.2 Implications for Business Practitioners
5.5.3 Future Works
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS
REFERENCES
APPENDIXES
PUBLICATION LIST

References

[1] R. Agrawal, S. Ghosh, T. Imielinski, B. Iyer, and A. Swami, “An interval classifier for database mining applications,” In Proc. 18th Int. Conf. Very Large Data Bases, pp. 560-573, Aug. 1992.
[2] R. Agrawal, and R. Srikant, “Fast algorithms for mining association rules,” In Proc. of 1994 Int. Conf. Very Large Data Bases, pp. 487-499, 1994.
[3] R. Agrawal, and R. Srikant, “Mining sequential patterns,” In Proc. of 1995 Int. Conf. Data Engineering, pp. 3-14, 1995.
[4] W. H. Au, and K. C. C. Chan, “Mining fuzzy association rules,” In Proc. 6th Int. Conf. Information Knowledge Management, Las Vegas, NV, pp. 209-215, 1997.
[5] W. H. Au, and K. C. C. Chan, “An effective algorithm for discovering fuzzy rules in relational databases,” In Proc. IEEE Int. Conf. Fuzzy Systems, vol. II, pp. 1314-1319, 1998.
[6] W. H. Au, and K. C. C. Chan, “FARM: A data mining system for discovering fuzzy association rules,” In Proc. FUZZ-IEEE’99, vol. 3, pp. 22-25, 1999.
[7] W. H. Au, and K. C. C. Chan, “Mining fuzzy association rules in a bank-account database,” IEEE Transactions on Fuzzy Systems, vol. 11, pp. 238-248, 2003.
[8] R. E. Bellman and L. A. Zadeh, “Decision-making in a fuzzy environment,” Management Science, vol. 17(4), pp. 141-164, 1970.
[9] G. Bojadziev and M. Bojadziev, “Fuzzy logic for business, finance, and management,” World Scientific Publishing Co., Inc. River Edge, NJ, USA, 1997.
[10] P. K. Chan and S. J. Stolfo, “Learning arbiter and combiner trees from partitioned data for scaling machine learning,” In Proc. First Int. Conf. Knowledge Discovery and Data Mining (KDD ‘95), pp. 39-44, Aug. 1995.
[11] D. Cheung, S. D. Lee, and B. Kao, “A general incremental technique for maintaining discovered association rules,” In Proc. of the Fifth Int. Conf. on Database Systems for Advanced Applications (DASFAA '97), pp. 185-194, Melbourne, Australia, March 1997.
[12] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, “Introduction to algorithms,” 2nd Edition, MIT Press, 2001.
[13] G. Chen, and Q. Wei, “Fuzzy association rules and the extended mining algorithms,” Information Sciences, vol. 147, pp. 201-228, 2002.
[14] M. S. Chen and P. S. Yu, “Using multi-attribute predicates for mining classification rules,” IBM Research Report, 1995.
[15] M. S. Chen, J. Han, and P. S. Yu, “Data mining: an overview from a database perspective,” IEEE Transactions on Knowledge and Data Engineering, vol. 8(6), pp. 866-883, 1996.
[16] Y. L. Chen, S. S. Chen, and P. Y. Hsu, “Mining hybrid sequential patterns and sequential rules,” Information Systems, vol. 27, no. 5, pp. 345-362, 2002.
[17] Y. L. Chen, M. C. Chiang, and M. T. Ko, “Discovering time-interval sequential patterns in sequence databases,” Expert Systems with Applications, vol. 25, no. 3, pp. 343-354, 2003.
[18] Y. L. Chen, and T. C. K. Huang, “Discovering fuzzy time-interval sequential patterns in sequence databases,” IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol. 35(5), pp. 959-972, 2005.
[19] S. L. Chuang and L. F. Chien, “Enriching web taxonomies through subject categorization of query terms from search engine logs,” Decision Support Systems, vol. 35, pp. 113-127, 2003.
[20] Y. H. Cho and J. K. Kim, “Application of web usage mining and product taxonomy to collaborative recommendations in e-commerce,” Expert Systems with Applications, vol. 26, pp. 233-246, 2004.
[21] A. W. C. Fu, M. H. Wong, S. C. Sze, W. C. Wong, W. L. Wong, and W. K. Yu, “Finding fuzzy sets for the mining of fuzzy association rules for numerical attributes,” In Proc. Int. Symposium Intelligent Data Engineering Learning (IDEAL’98), Hong Kong, pp. 263-268, 1998.
[22] A. Gupta, V. Harinarayan, and D. Quass, “Aggregate-query processing in data warehousing environment,” In Proc. 21st Int. Conf. Very Large Databases, pp. 358-369, Zurich, Sept., 1995.
[23] M. Garofalakis, R. Rastogi, and K. Shim, “SPIRIT: sequential pattern mining with regular expression constraint,” In Int. Conf. Very Large Databases, Morgan Kaufmann, pp. 223-234, 1999.
[24] T. P. Hong, C. S. Kuo, and S. C. Chi, “Mining association rules from quantitative data,” Intelligent Data Analysis, vol. 3, pp. 363-376, 1999.
[25] T. P. Hong, C. S. Kuo, and S. C. Chi, “Mining fuzzy sequential patterns from quantitative data,” In The 1999 IEEE Int. Conf. on Systems, Man, and Cybernetics, vol. 3, pp. 962-966, 1999.
[26] T. P. Hong, K. Y. Lin, and S. L. Wang, “Fuzzy data mining for interesting generalized association rules,” Fuzzy Sets and Systems, vol. 138, pp. 255-269, 2003.
[27] T. P. Hong, K. Y. Lin, and B. C. Chien, “Mining fuzzy multiple-level association rules from quantitative data,” Applied Intelligence, vol. 18, pp. 79-90, 2003.
[28] V. Harinarayan, J. D. Ullman, and A. Rajaraman, “Implementing data cubes efficiently,” In Proc. 1996 ACM SIGMOD Int. Conf. Management Data, pp. 205-216, Montreal, Canada, June 1996.
[29] J. Han, Y. Cai, and N. Cercone, “Data-driven discovery of quantitative rules in relational databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 5, pp. 29-40, 1993.
[30] J. Han, Y. Fu, W. Wang, J. Chiang, W. Gong, K. Koperski, D. Li, Y. Lu, A. Rajan, N. Stefanovic, B. Xia, and O. R. Zaiane, “DBMiner: a system for mining knowledge in large relational databases,” In Proc. Int. Conf. Data Mining and Knowledge Discovery (KDD ‘96), pp. 250-255, Portland, Ore., Aug. 1996.
[31] J. Han and Y. Fu, “Exploration of the power of attribute-oriented induction in data mining,” U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, eds., Advances in Knowledge Discovery and Data Mining, pp. 399-421, AAAI/MIT Press, 1996.
[32] J. Han and Y. Fu, “Mining multiple-level association rules in large databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 11(5), pp. 1-8, 1999.
[33] J. Han, G. Dong, and Y. Yin, “Efficient mining of partial periodic patterns in time series database,” In Proc. of the Int. Conf. on Data Engineering, pp. 106-115, 1999.
[34] J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, and M. C. Hsu, “FreeSpan: frequent pattern-projected sequential pattern mining,” In Proc. of 2000 Int. Conf. on Knowledge Discovery and Data Mining, pp. 355-359, 2000.
[35] J. Han, and M. Kamber, “Data mining: concepts and techniques,” Academic Press, 2001.
[36] J. Han, J. Pei, Y. Yin, and R. Mao, “Mining frequent patterns without candidate generation: a frequent-pattern tree approach,” Data Mining and Knowledge Discovery, vol. 8(1), pp. 53-87, 2004.
[37] M. Kamber, J. Han, and J. Y. Chiang, “Metarule-guided mining of multi-dimensional association rules using data cubes,” In Proc. 1997 Int. Conf. Knowledge Discovery and Data Mining (KDD ‘97), pp. 207-210, Newport Beach, CA, Aug. 1997.
[38] C. Kim, J. H. Lim, R. Ng, and K. Shim, “SQUIRE: sequential pattern mining with quantities,” In Proc. of the 20th Int. Conf. on Data Engineering, Boston, USA, pp. 827-827, 2004.
[39] C. M. Kuok, A. Fu, and M. H. Wong, “Mining fuzzy association rules in databases,” SIGMOD Record, vol. 27(1), pp. 41-46, 1998.
[40] J. H. Lee, and H. L. Kwang, “An extension of association rules using fuzzy sets,” presented at the IFSA’97, Prague, Czech Republic, 1997.
[41] J. W. T. Lee, “An ordinal framework for data mining of fuzzy rules,” In FUZZ IEEE 2000, San Antonio, TX, pp. 399-404, 2000.
[42] G. Liu, H. Lu, Y. Xu, and J. X. Yu, “Ascending frequency ordered prefix-tree: efficient mining of frequent patterns,” In Proc. of the Eighth Int. Conf. on Database Systems for Advanced Applications, pp. 65-72, 2003.
[43] Y. Li, Z. A. Bandar, and D. McLean, “An approach for measuring semantic similarity between words using multiple information sources,” IEEE Transactions on Knowledge and Data Engineering, vol. 15(4), pp. 871-882, 2003.
[44] J. Liu, Y. Pan, K. Wang, and J. Han, “Mining frequent item sets by opportunistic projection,” In Proc. of 2002 Int. Conf. on Knowledge Discovery in Databases (KDD'02), pp. 229-238, Edmonton, Canada, July 2002.
[45] M. Mehta, R. Agrawal, and J. Rissanen, “SLIQ: a fast scalable classifier for data mining,” In Proc. Int. Conf. Extending Database Technology (EDBT ‘96), Avignon, France, Mar. 1996.
[46] H. Mannila and H. Toivonen, “Levelwise search and borders of theories in knowledge discovery,” Data Mining and Knowledge Discovery, pp. 241-258, 1997.
[47] H. Mannila, H. Toivonen, and A. I. Verkamo, “Discovery of frequent episodes in event sequences,” Data Mining and Knowledge Discovery, pp. 259-289, 1997.
[48] S. Medasani, J. Kim, and R. Krishnapuram, “An overview of membership function generation techniques for pattern recognition,” International Journal of Approximate Reasoning, vol. 19, pp. 391-417, 1998.
[49] S. Mitra, S. K. Pal, and P. Mitra, “Data mining in soft computing framework: a survey,” IEEE Transactions on Neural Networks, vol. 13(1), pp. 3-14, 2002.
[50] R. T. Ng, L. V. S. Lakshmanan, and J. Han, “Exploratory mining and pruning optimizations of constrained associations rules,” In Proc. 1998 ACM SIGMOD Int. Conf. Management of Data, pp. 13-24, Seattle, Washington, June 1998.
[51] G. Piatetsky-Shapiro, “Discovery, analysis, and presentation of strong rules,” G. Piatetsky-Shapiro and W. J. Frawley, eds., Knowledge Discovery in Databases, pp. 229-238, AAAI/MIT Press, 1991.
[52] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal, “Discovering frequent closed itemsets for association rules,” In Proc. Seventh Int. Conf. Database Theory (ICDT ’99), pp. 398-416, Jan. 1999.
[53] S. Parthasarathy, M. J. Zaki, M. Ogihara, and S. Dwarkadas, “Incremental and interactive sequence mining,” In Proc. of the Eighth Int. Conf. on Information and Knowledge Management, pp. 251-258, Kansas City, Missouri, United States, 1999.
[54] H. Pinto, J. Han, J. Pei, and K. Wang, “Multi-dimensional sequential pattern mining,” In Proc. of the Int. Conf. on Information and Knowledge Management, pp. 81-88, 2001.
[55] J. Pei, J. Han, B. Mortazavi-Asl, and H. Zhu, “Mining access patterns efficiently from web logs,” In Proc. of 2000 Pacific-Asia Conf. on Knowledge Discovery and Data Mining, pp. 396-407, 2000.
[56] J. Pei, J. Han, and W. Wang, “Mining sequential patterns with constraints in large databases,” In Proc. of the Int. Conf. on Information and Knowledge Management, pp. 18-25, 2002.
[57] J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, “Mining sequential patterns by pattern-growth: the PrefixSpan approach,” IEEE Transactions on Knowledge and Data Engineering, vol. 16(11), pp. 1424-1440, 2004.
[58] J. R. Quinlan, “Induction of decision trees,” Machine Learning, vol. 1, pp. 81-106, 1986.
[59] J. R. Quinlan, “C4.5: Programs for machine learning,” Morgan Kaufmann, 1993.
[60] T. J. Ross, “Fuzzy logic with engineering applications,” McGraw-Hill, Inc., 1995.
[61] Y. U. Ryu, “Dynamic construction of product taxonomy hierarchies for assisted shopping in the electronic marketplace,” Hawaii Int. Conf. on System Sciences, vol. 5, pp. 196-204, 1998.
[62] R. Srikant and R. Agrawal, “Mining generalized association rules,” In Proc. 1995 Int. Conf. Very Large Data Bases, pp. 407-419, Zurich, Sept. 1995.
[63] R. Srikant, and R. Agrawal, “Mining sequential patterns: generalizations and performance improvements,” In Proc. of the Fifth Int. Conf. on Extending Database Technology, pp. 3-17, 1996.
[64] R. Srikant, and R. Agrawal, “Mining quantitative association rules in large relational tables,” In Proc. of the 1996 ACM SIGMOD Int. Conf. on Management of Data, pp. 1-12, 1996.
[65] M. Vazirgiannis, “A classification and relationship extraction scheme for relational databases based on fuzzy logic,” In Proc. Research Development Knowledge Discovery Data Mining, Melbourne, Australia, pp. 414-416, 1998.
[66] J. Widom, “Research problems in data warehousing,” In Proc. Fourth Int. Conf. Information and Knowledge Management, pp. 25-30, Baltimore, Nov. 1995.
[67] H. J. Watson and M. N. Frolick, “Determining information requirements for an EIS,” MIS Quarterly, vol. 17(3), pp. 255-269, 1993.
[68] J. Wang and J. Han, “BIDE: efficient mining of frequent closed sequences,” In Proc. 2004 Int. Conf. on Data Engineering (ICDE'04), Boston, MA, March 2004.
[69] W. P. Yan and P. Larson, “Eager aggregation and lazy aggregation,” In Proc. 21st Int. Conf. Very Large Data Bases, pp. 345-357, Zurich, Sept. 1995.
[70] J. Yang, W. Wang, and P. S. Yu, “Mining asynchronous periodic patterns in time series data,” In Proc. of the ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 275-279, 2000.
[71] J. S. Yue, E. Tsang, D. Yeung, and S. Daming, “Mining fuzzy association rules with weighted items,” In Proc. IEEE Int. Conf. Systems, Man, Cybernetics, Nashville, TN, pp. 1906-1911, 2000.
[72] X. Yan, J. Han, and R. Afshar, “CloSpan: mining closed sequential patterns in large databases,” In SIAM Int. Conf. on Data Mining, San Francisco, CA, USA, 2003.
[73] C. C. Yu, and Y. L. Chen, “Mining sequential patterns from multi-dimensional sequence data,” IEEE Transactions on Knowledge and Data Engineering, vol. 17(1), pp. 136-140, 2005.
[74] L. A. Zadeh, “Fuzzy sets,” Information and Control, vol. 8, pp. 338-353, 1965.
[75] W. Zhang, “Mining fuzzy quantitative association rules,” In Proc. 11th Int. Conf. Tools Artificial Intelligence, Chicago, IL, pp. 99-102, 1999.
[76] M. Zaki, “SPADE: An efficient algorithm for mining frequent sequences,” Machine Learning, vol. 40, pp. 31-60, 2001.
[77] M. Zhang, B. Kao, D. W.-L. Cheung, and C. L. Yip, “Efficient algorithms for incremental update of frequent sequences,” In Proc. Pacific-Asia Conf. Knowledge Discovery Data Mining, pp. 186-197, 2002.
[78] Q. Zheng, K. Xu and S. Ma, “When to update the sequential patterns of stream data?,” In Proc. 7th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD), Korea, LNAI 2637, pp. 545-550, 2003.
[79] Q. Zhao and S. S. Bhowmick, “Sequential pattern mining: a survey,” Technical Report, CAIS, Nanyang Technological University, Singapore, No. 2003118, 2003.

Advisor

Yen-Liang Chen (陳彥良)

Approval Date

2006-06-06
