Abstract (English) |
Recent work has produced approaches that clearly outperform the original Apriori-like algorithms for mining association rules. Mining frequent patterns (itemsets) plays an important role in discovering association rules. In the past, Apriori-like methods were adopted to mine frequent itemsets, but these approaches are inefficient because they repeatedly scan the database and iteratively check a large set of candidates by pattern matching. A compact structure, called the FP-tree, was developed to overcome these disadvantages of Apriori-like algorithms. The FP-growth approach, proposed by J. Han et al., mines frequent itemsets directly from this structure. Although FP-growth is a relatively efficient approach for mining frequent itemsets, its results may be too detailed to be useful to managers or policymakers. We propose that the lower-level frequent itemsets and the compressed data within an FP-tree can be further generalized to mine higher-level frequent itemsets. Our basic idea is to exploit the properties and structure of the FP-tree together with an existing concept hierarchy over the mined items. We then provide efficient evolution algorithms that transform the original FP-tree into a higher-level FP-tree. In our approach, the transformed FP-tree still retains the properties of the original FP-tree. With these approaches, we can effectively transform a lower-level FP-tree into a higher-level one, providing more generalized information to managers. Our experimental results also show the effectiveness of the proposed methods. |
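To make the idea of lifting an FP-tree to a higher concept level concrete, the following is a minimal sketch in Python. It is not the evolution algorithm of this thesis, which modifies the tree in place. It is a naive baseline that reads the weighted paths out of a lower-level tree, maps each item to its parent in a concept hierarchy, and rebuilds a tree from the generalized paths. The names `FPNode`, `build_fp_tree`, `paths_with_counts`, `generalize`, and the sample hierarchy are all hypothetical, and the sketch assumes the lower-level tree was built without pruning, since items pruned there could no longer contribute to higher-level counts.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and its children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def insert(root, items, count=1):
    """Insert an ordered item list into the tree, adding `count` along the path."""
    node = root
    for item in items:
        child = node.children.get(item)
        if child is None:
            child = FPNode(item, node)
            node.children[item] = child
        child.count += count
        node = child


def build_fp_tree(weighted_transactions, min_support):
    """Build an FP-tree from (items, count) pairs, dropping infrequent items."""
    freq = Counter()
    for items, count in weighted_transactions:
        for item in set(items):
            freq[item] += count
    keep = {item for item, c in freq.items() if c >= min_support}
    root = FPNode(None)
    for items, count in weighted_transactions:
        # Order by descending frequency, breaking ties by item name.
        ordered = sorted(set(items) & keep, key=lambda i: (-freq[i], i))
        insert(root, ordered, count)
    return root


def paths_with_counts(node, prefix=()):
    """Yield (path, n), where n transactions end exactly at the path's last node."""
    for child in node.children.values():
        path = prefix + (child.item,)
        ending_here = child.count - sum(c.count for c in child.children.values())
        if ending_here > 0:
            yield path, ending_here
        yield from paths_with_counts(child, path)


def generalize(root, hierarchy, min_support):
    """Rebuild a higher-level tree by replacing each item with its parent concept."""
    lifted = [
        ([hierarchy.get(item, item) for item in path], n)
        for path, n in paths_with_counts(root)
    ]
    return build_fp_tree(lifted, min_support)


# Hypothetical concept hierarchy and transactions, for illustration only.
hierarchy = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
}
transactions = [
    (["skim milk", "white bread"], 1),
    (["whole milk", "wheat bread"], 1),
    (["skim milk", "wheat bread"], 1),
    (["whole milk"], 1),
]
low = build_fp_tree(transactions, min_support=1)  # no pruning at the lower level
high = generalize(low, hierarchy, min_support=2)
```

In this toy example the four lower-level paths collapse into a single higher-level path, milk followed by bread, which is the kind of more general summary the abstract describes. Rebuilding from paths avoids rescanning the database but does not reuse the tree structure itself, which is where an in-place transformation would be expected to save work.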
References |
[AIS93b] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. 1993 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'93), pages 207-216, Washington, DC, May 1993.
[AS94b] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 1994 Int. Conf. Very Large Data Bases (VLDB'94), pages 487-499, Santiago, Chile, Sept. 1994.
[CD97] S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26:65-74, March 1997.
[HPY00] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pages 1-12, Dallas, TX, May 2000.
[KMR+94] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and I. Verkamo. Finding interesting rules from large sets of discovered association rules. In Proc. 3rd Int. Conf. Information and Knowledge Management (CIKM'94), pages 401-408, Gaithersburg, MD, Nov. 1994.
[PCY95a] J. S. Park, M. S. Chen, and P. S. Yu. An effective hash-based algorithm for mining association rules. In Proc. 1995 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'95), pages 175-186, San Jose, CA, May 1995.
[PS91a] G. Piatetsky-Shapiro. Discovery, analysis and presentation of strong rules. In G. Piatetsky-Shapiro and W. J. Frawley, editors, Knowledge Discovery in Databases, pages 229-238. Cambridge, MA: AAAI/MIT Press, 1991.
[SON95] A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In Proc. 1995 Int. Conf. Very Large Data Bases (VLDB'95), pages 432-443, Zurich, Switzerland, Sept. 1995.
[STA98] S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'98), pages 343-354, Seattle, WA, June 1998.
[Toi96] H. Toivonen. Sampling large databases for association rules. In Proc. 1996 Int. Conf. Very Large Data Bases (VLDB'96), pages 134-145, Bombay, India, Sept. 1996. |