博碩士論文 93426008 詳細資訊


姓名 高詩惠(Shih-Hui Kao)  查詢紙本館藏   畢業系所 工業管理研究所
論文名稱 在跨概念階層中挖掘於產品銷售期間內之非重複性的交易間關聯規則
(Mining non-redundant inter-transaction cross-level association rules with appearance period)
檔案 [Endnote RIS 格式]    [Bibtex 格式]    至系統瀏覽論文 ( 永不開放)
摘要(中) 在過去的關聯規則研究中,大部分的研究都著重在概念階層中最底層的交易內的挖掘工作。在這裡,我們將著重在挖掘跨概念階層的非重複性的交易間關聯規則。意為我們可以在不同交易間與任一層概念階層中挖掘到此種關聯規則。為避免產生大量不有趣的冗餘重複規則,我們沿用林聖傑(2005)計算概念階層中父層產品與子層產品的趣度來避免產生帶有冗餘重複資訊的關聯規則。
在零售店中並非所有產品都擁有相同或相似的特性。有些商品整年皆販售,有些商品則依照季節或特定節日銷售。為了尋找販售期間短但有趣的產品,我們將依照各產品的銷售期間的交易記錄來計算支持度。同時也根據其產品特性設定不同的門檻限制。為了提升挖掘的效率與有趣度,我們選擇在產生關聯規則前使用gap事先篩選有趣的產品來產生關聯規則。而非傳統上將所有的關聯規則產出後再進行關聯規則間的有趣度比較。
本研究提出一個以FP-tree為基礎的演算法,名為ITCL_FP-tree,結合產品銷售期間、多門檻限制與gap來採擷跨階層的非重複性交易間規聯規則。我們利用實際的資料驗證出ITCL_FP-tree在使用gap的情況下可以刪除50 %到70 %不等的冗餘重複或不有趣的規則。其決定frequent items的運算時間與傳統的演算法不相上下,但產生規則所需的運算時間則大幅減少。當資料量越大時,產生規則就越有效率。從實驗的結果中可以顯示使用gap的確可以幫助使用者更有效率地挖掘出有趣但不帶有冗餘重複資訊的交易間的關聯規則。
摘要(英) Most of previous studies on mining association rules are mining intra-transaction associations at the atomic level of concept hierarchy. In this study, we will mine the non-redundant inter-transaction cross-level association rules. An inter-transaction cross-level association rule describes the association relationships among different transactions and the rules among concepts at any level of a hierarchy. Additional step in pruning redundant rule is usually carried out after rules are found. However, this kind of mining may cause generating a large number of potential redundant rules. In retailing, an item may not be carried in the entire year in the shop. Therefore, mining the rules under such situations requires solving the rare item problem.
Since all items in the database may not have the same natures or similar frequencies. In real-life applications, some items may appear very frequently and others may appear rarely. To find frequent items which appear rarely, we first identify the appearance period of each item, and then calculate the item’s support value. Multiple minimum support (MIS) is used to reflect the distinct nature of each item. In order to mine interesting rules and to improve the mining efficiency, we adopt the concept of gap to prune redundant and uninteresting items before rule generation rather than remove uninteresting rules after rule mining.
Finally, we implement an FP-tree based algorithm, ITCL_FP-tree, on real data. Our experiment shows that we can prune out almost 50 to 70 percent of the redundant and uninteresting rules. The runtime of determining frequent items or generating rule is shorter than the one by using the traditional mining procedures even when the number of transactions is large. The result indicates that we can discover inter-transaction association rules with non-redundant knowledge.
關鍵字(中) ★ FP-tree演算法
★ 跨概念階層
★ 冗餘重複規則
★ 銷售期間
★ 多門檻限
關鍵字(英) ★ redundant rule
★ inter-transaction rule
★ cross-level association rule
★ gap
★ FP-tree algorithm
★ multiple minimum support
★ appearance period
論文目次 Table of Contents ii
List of Figures v
List of Tables vi
Chapter 1 Introduction 1
1.1Motivation and Background 1
1.2Problem Description 3
1.3 Research Objectives 5
1.4Methodology 5
Chapter 2 Literature Review 7
2.1 Association rule mining among multiple and cross concept hierarchy 7
2.2 Mining inter-transaction association rules 8
2.3 Redundant rule pruning among concept hierarchy 9
Chapter 3 Methodology 11
3.1 Frequent itemsets generation 11
3.1.1 Concept hierarchy construction 11
3.1.2 Appearance period construction 11
3.2 Pruning frequent but redundant items 16
3.3 Inter-transaction association rule mining 19
3.3.1 Transforming the transactions into ITCL-transactions 19
3.3.2 Generating inter-transaction cross-level rules using ITCL_FP-tree 22
3.4 Presentation of interesting rules 25
Chapter 4 Experiment Evaluation and Performance Study 27
4.1Environment of Experiments 27
4.2Result and Analysis of Experiments 27
Chapter 5 Conclusion and Future Research 44
5.1 Conclusion 44
5.2 Future Research 45
References46
Appendix : Algorithms 49
參考文獻 [1] An Chen and Huilin Ye. 2004. Multiple-Level Sequential Pattern Discovery from Customer Transaction Database. In International Journal of Computation Intelligence Volume 1 Number 1 2004.
[2] Anthony K. H. Tung, H. Lu, J. Han and L. Feng. 2003. Efficient Mining of Intertransaction Association Rules. In IEEE Transactions on knowledge and data engineering, vol.15, no.1, January/February 2003.
[3] B. Liu, W. Hsu and Y. Ma. 1999. Mining Association Rule with Multiple Minimum supports. In Proc. 1999 ACM SIGMOD Int’l Conf. on Management of Data, San Diego, CA, USA.
[4] B. Liu, W. Hsu and Y. Ma. 1999. Pruning and Summarizing the Discovered Associations. In S Surajit Chaudhuri abd David Madigan, editors, Proceedings of Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, page 125-134, N.Y., August 15-18 1999. ACM Press.
[5] D. Shah, L. V.S. Lakshmanan, K. Ramamritham and S. Sudarshan. 1999. Interestingness and Pruning of Mined Patterns. Indian Institute of Technology, Bombay.
[6] H. Lu, L. Feng, J. Han. 2000. Beyond Intratransaction Association Analysis: Mining Multidimensional Intertransaction Association Rules. In ACM Transactions on Information Systems, Vol. 18, No. 4, October 2000, pp. 423-454.
[7] I. Pramudiono and M. Kitsuregawa. 2004. FP-tax: Tree Structure based generalized association rule mining. In Proceeding of the 9th ACM SIGMO Workshop on Research Issues in Data Mining and Knowledge Discovery, June 2004.
[8] J. Deogun and L. Jiang. 2005. Prediction Mining – An Approach to Mining
Association Rules for Prediction. In Rough Sets, Fuzzy Sets, Data Mining, and
Granular Computing: 10th International Conference, RSFDGrC 2005, Regina,
Canada, August 31 - September 3, 2005, Proceedings, Part II.
[9] J. Han, Y. Fu. 1995. Discovery of Multiple-Level Association Rules from Large
Databases. in Proc. of the 21st VLDB Conf., Zurich, Switzerland.
[10] J. Han. 1995. Mining Knowledge at Multiple Concept Levels. In School of
Computer Science Simon Fraser University Brutish Columbia, Canada V5A 1S6.
[11] J. Han, J. Pei, and Y. Yin. 2000. Mining frequent patterns without candidate generation. In Proceeding of the ACM SIGMOD International Conference on Management of Data (SIGMOD’00), 2000.
[12] J. Han, M. Kamber 2001. Data Mining: Concepts and Techniques. In Morgan
Kaufmann.
[13] J. M. de Graaf, W.A. Kosters, J. J.W. Witteman. 2000. Interesting Association
Rules in Multiple Taxonomies. In Leiden Institute of Advanced Computer Science Universiteit Leiden.
[14] K.-L Ong, W.-K Ng and E.-P. Lim. 2001. Mining Multi-Level Rules with Recurrent Items Using FP’-Tree. In Proc. Of the 3rd Int. Conf. on Information, Communications and Signal Processing, Singapore, Oct. 2001.
[15] M. J. Zaki. 2000. Generating Non-Redundant Association Rules. Computer
Science Department Rensselaer Polytechnic Institute, Troy NY 12180.
[16] Q. Zhao and S.S. Bhowmick. 2003. Association Rule Mining: A Survey. In Technical Report, CAIS, Nanyang Technological University Singapore, No. 2003116, 2003
[17] R. Agrawal, T. Imielinski, A. Swami. 1993. Mining Association Rules Between Sets of Items in Large Databases. In Proc. 1993 ACM SIGMOD Int’l Conf. on Management of Data, Washington, D.C. pp.207-216.
[18] R. Agrawal, H. Mannila, R. Shikant, H. Toivonen. A.I. Verkamo. 1996. Fast Discovery of Association Rules. In Advance in Knowledge Discovery and Data Minng. pp.307-328.
[19] R. Srikant and R. Agrawal. 1995. Mining Generalized association rules. Research Report RJ 9963, IBM Almaden Research Center, San Jose, California, June 1995.
[20] R. Srikant and R. Agrawal. 1995. Mining Sequential Patterns, In Proceedings of the 11th International Conference on Data Engineering ( ICDE’95), 1995, pp.3-14.
[21] R. J. Bayardo and R. Agrawal. 1999. Mining the Most Interesting Rules. In Proc. of the 5th ACM SIGDD International Conference on Knowledge Discovery and Data Mining, page 145-154, August 1999.
[22] R. Hilderman and H. Hamilton. 1999. Knowledge Discovery and Interestingness Measure: A Survey. Technique Report CS 99-04, Department of Computer Science, University of Regina, 1999.
[23] S. Jaroszewicz and D. A. Simovici. 2001. A General Measure of rule Interestingness. In Proc. Of PKDD 2001, Freiburg, Germany, volume 2168 of Lecture Notes in Computer Science, page 253-265. Springer, September 2001.
[24] S. Luhr, G. West and S. Vankatesh. 2005. An Extended Frequent Pattern Tree for Intertransation Association Rule Mining. In Technical Report 2005/1, Department of Computing, Curtin University of Technology, 2005.
[25] Y. Bastide, N. Oasquier, R. Taouil, G. Stumme and L. Lahal. 2000. Mining
Minimal Non-Redundant Association Rules using Frequent Closed Itemsets. In
Proceedings of the 1st International Conference on CL (6th International Conference on Database System). In J. Lloyd, V. Dahl, U. Furbach, M. Kerber, K.-K. Lau, C. Palamidessi, L.M. Pereira, Y. Sagiv, P.J. Stuckey (Eds.), Computational Logic-CL, LNAL 1861, Springer, Heidelberg, 2000.
[26] Y. H. Hu and Y.L. Chen. 2003. Discovering and Maintaining Association Rules with Multiple Minimum Supports. Department of Information Management, National Central University Chung-Li 320, Taiwan, R.O.C.
指導教授 沈國基(Gwo-Ji Sheen) 審核日期 2006-7-10
推文 facebook   plurk   twitter   funp   google   live   udn   HD   myshare   reddit   netvibes   friend   youpush   delicious   baidu   

若有論文相關問題,請聯絡國立中央大學圖書館推廣服務組 TEL:(03)422-7151轉57407,或E-mail聯絡