Mining Self-derivable Multilevel FP-tree From a Transactional Database


Name

Yi-Ming Lee (李翊銘)

Department

Department of Computer Science and Information Engineering

Thesis Title

Mining Self-derivable Multilevel FP-tree From a Transactional Database
(Original Chinese title: 從交易資料庫中以自我推導方式探勘具有多層次FP-tree)

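Related Theses

★ Applying self-organizing map networks and back-propagation networks to mine potential customers in telecommunication databases
★ Enterprise email classification based on social network features
★ Temporal behavior analysis of mobile network users
★ A study on mining multilevel influence propagation in social networks
★ An integrated information management and analysis system based on peer-to-peer technology
★ A study of data locality strategies for different large-scale astronomical applications on distributed cloud platforms
★ A study on applying data warehousing techniques to discover knowledge in peer-to-peer network environments
★ A study on constructing passive storage-capacity migration policies for lifecycle management systems
★ A study on applying service mining to discover composite services
★ Improving intrusion detection systems using frequent event sequences in weighted suffix trees
★ Efficient processing of continuous aggregate queries on data warehouses
★ Intrusion detection systems: using function-based system call sequences
★ Efficient multidimensional and multilevel association rule mining on data cubes
★ Course recommendation based on community relations and weights in e-learning
★ Identifying inactive users in social network services
★ Finding sequences with similar variations in astronomical observation records using hierarchical weighted suffix trees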


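The author has agreed to make this electronic thesis openly available immediately.
The openly available full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the text without authorization.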

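Abstract (Chinese)

In association rule mining, recent studies have presented methods that outperform Apriori-like algorithms. Mining frequent patterns plays an important role in discovering association rules. In the past, Apriori-like methods were used to mine frequent patterns, but they are inefficient because they scan the database repeatedly and iteratively generate large candidate sets through pattern matching. A compact structure called the FP-tree was developed to address these shortcomings. With the FP-growth method proposed by J. Han, frequent patterns can be mined more conveniently. Although FP-growth is more efficient than some other approaches, its results may be too detailed for managers and decision makers. We propose mining frequent patterns at higher levels of abstraction; that is, the lower-level frequent patterns and the compact structure can be further generalized and simplified. Our basic idea is to exploit the properties and structure of the FP-tree and to mine according to an existing user-defined hierarchy. To this end, we provide efficient promotion methods that turn the original FP-tree into a higher-level FP-tree. In our approach, the transformed higher-level FP-tree retains the properties of the original FP-tree. With these methods, a lower-level FP-tree can be converted into a higher-level one, giving managers generalized information. Experimental results also demonstrate the effectiveness of the proposed methods.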
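To illustrate the inefficiency the abstract attributes to Apriori-like methods, here is a minimal sketch of level-wise mining that counts how many full passes over the database it makes. The function name `apriori`, its return shape, and the brute-force candidate join are assumptions made for illustration; this is not code from the thesis.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise frequent itemset mining; returns (frequent, scans)."""
    transactions = [frozenset(t) for t in transactions]
    # Level-1 candidates: every distinct single item.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent, scans, k = {}, 0, 1
    while candidates:
        scans += 1  # one full pass over the database per level
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Build k-item candidates from items that are still frequent, and
        # prune any candidate that has an infrequent (k-1)-subset.
        items = sorted({i for c in level for i in c})
        candidates = {
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, scans
```

Each level needs its own scan, so a database whose longest frequent itemset has k items is read at least k times, and the candidate set can grow quickly in between.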

Abstract (English)

Recent work has presented approaches that outperform the original Apriori-like algorithms for mining association rules. Mining frequent patterns (itemsets) plays an important role in discovering association rules. In the past, Apriori-like methods were adopted to mine frequent itemsets, but these approaches are inefficient because they scan the database repeatedly and iteratively check a large set of candidates by pattern matching. A compact structure called the FP-tree was developed to overcome these disadvantages. The FP-growth approach, proposed by J. Han, makes mining frequent itemsets easier. Although FP-growth is a relatively efficient approach, the results it produces may be too detailed to satisfy managers or policymakers. We propose that lower-level frequent itemsets and the compressed data within an FP-tree can be generalized further in order to mine higher-level frequent itemsets. Our basic idea is to exploit the properties and structure of the FP-tree according to an existing conceptual hierarchy on the mined items. We then provide efficient evolution algorithms that transform the original FP-tree into a higher-level FP-tree. In our approaches, the transformed FP-tree retains the properties of the primitive FP-tree. With these approaches, a lower-level FP-tree can be transformed into a higher-level one, providing more generalized information to managers. Our experimental results also show the effectiveness of the proposed methods.
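As a rough illustration of the structures the abstract refers to, the sketch below builds a basic FP-tree and then obtains a higher-level tree by mapping every item to its parent concept and rebuilding from scratch. This is the rebuilding baseline, not the thesis's evolution algorithms, which the record does not describe in detail. The names `Node`, `build_fp_tree`, `generalize`, and `count_nodes`, and the dictionary form of the hierarchy, are assumptions made for illustration.

```python
from collections import Counter


class Node:
    def __init__(self, item, parent=None):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree; returns (root, header), header maps item -> nodes."""
    # Pass 1: count item supports (each item once per transaction).
    support = Counter(i for t in transactions for i in set(t))
    keep = {i for i, n in support.items() if n >= min_support}
    root, header = Node(None), {}
    # Pass 2: insert each transaction with items in descending-support order.
    for t in transactions:
        ordered = sorted(set(t) & keep, key=lambda i: (-support[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header


def generalize(transactions, hierarchy):
    """Replace each item by its parent concept (items without one stay)."""
    return [{hierarchy.get(i, i) for i in t} for t in transactions]


def count_nodes(node):
    """Number of item nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())
```

Calling `build_fp_tree(generalize(transactions, hierarchy), min_support)` yields a higher-level tree, which is typically smaller because sibling items collapse into one concept. Rebuilding this way requires rescanning the transactions, which is the cost an evolution approach working directly on the existing tree would aim to avoid.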

Keywords

★ multilevel association rule
★ FP-growth
★ FP-tree
★ Apriori
★ association rule mining

Table of Contents

CHINESE ABSTRACT
ABSTRACT
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
1 INTRODUCTION
2 BACKGROUND AND RELATED WORK
2.1 A DATA WAREHOUSE VS. DATA MINING
2.2 ASSOCIATION RULES MINING
2.3 APRIORI ALGORITHM AND RELATED IMPROVEMENT
2.4 FREQUENT PATTERN TREE
3 PROBLEM DESCRIPTION
4 THE PROPOSED EVOLUTION APPROACHES
4.1 THE EVOLUTION APPROACHES
4.2 VALIDATION OF THE PROCESSES
5 EXPERIMENT
5.1 GENERATION OF SYNTHETIC DATA
5.2 GENERATION OF A CONCEPTUAL HIERARCHY
5.3 COMPARISON OF RUN TIME FOR MINING FREQUENT ITEMSETS
5.4 COMPARISON OF REDUCED SIZES
5.5 COMPARISON OF EVOLUTION AND REBUILDING
5.6 COMPARISON OF NUMBER OF RULES
6 CONCLUSION
REFERENCE

References

[AIS93b] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. 1993 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'93), pages 207-216, Washington, DC, May 1993.
[AS94b] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 1994 Int. Conf. Very Large Data Bases (VLDB'94), pages 487-499, Santiago, Chile, Sept. 1994.
[CD97] S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26:65-74, March 1997.
[HPY00] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. In Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pages 1-12, Dallas, TX, May 2000.
[KMR+94] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and I. Verkamo. Finding interesting rules from large sets of discovered association rules. In Proc. 3rd Int. Conf. Information and Knowledge Management (CIKM'94), pages 401-408, Gaithersburg, MD, Nov. 1994.
[PCY95a] J. S. Park, M. S. Chen, and P. S. Yu. An effective hash-based algorithm for mining association rules. In Proc. 1995 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'95), pages 175-186, San Jose, CA, May 1995.
[PS91a] G. Piatetsky-Shapiro. Discovery, analysis and presentation of strong rules. In G. Piatetsky-Shapiro and W. J. Frawley, editors, Knowledge Discovery in Databases, pages 229-238, Cambridge, MA: AAAI/MIT Press, 1991.
[SON95] A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In Proc. 1995 Int. Conf. Very Large Data Bases (VLDB'95), pages 432-443, Zurich, Switzerland, Sept. 1995.
[STA98] S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'98), pages 343-354, Seattle, WA, June 1998.
[Toi96] H. Toivonen. Sampling large databases for association rules. In Proc. 1996 Int. Conf. Very Large Data Bases (VLDB'96), pages 134-145, Bombay, India, Sept. 1996.

Advisor

Meng-Feng Tsai (蔡孟峰)

Approval Date

2006-10-3
