FP-tree(Frequent Pattern Tree)的調整維護技術研究

以作者查詢圖書館館藏

、以作者查詢臺灣博碩士

、以作者查詢全國書目

、勘誤回報

、線上人數：83

、訪客IP：18.117.192.64

姓名

張元哲(Yuan-Che Chang ) 查詢紙本館藏

畢業系所

資訊管理研究所

論文名稱

FP-tree(Frequent Pattern Tree)的調整維護技術研究

相關論文

★ 零售業商業智慧之探討	★ 有線電話通話異常偵測系統之建置
★ 資料探勘技術運用於在學成績與學測成果分析 -以高職餐飲管理科為例	★ 利用資料採礦技術提昇財富管理效益 -以個案銀行為主
★ 晶圓製造良率模式之評比與分析－以國內某DRAM廠為例	★ 商業智慧分析運用於學生成績之研究
★ 運用資料探勘技術建構國小高年級學生學業成就之預測模式	★ 應用資料探勘技術建立機車貸款風險評估模式之研究－以A公司為例
★ 績效指標評估研究應用於提升研發設計品質保證	★ 基於文字履歷及人格特質應用機械學習改善錄用品質
★ 以關係基因演算法為基礎之一般性架構解決包含限制處理之集合切割問題	★ 關聯式資料庫之廣義知識探勘
★ 考量屬性值取得延遲的決策樹建構	★ 從序列資料中找尋偏好圖的方法 - 應用於群體排名問題
★ 利用分割式分群演算法找共識群解群體決策問題	★ 以新奇的方法有序共識群應用於群體決策問題

檔案

[Endnote RIS 格式]

[Bibtex 格式]

[相關文章]

[文章引用]

[完整記錄]

[館藏目錄]

[檢視]

[下載]

本電子論文使用權限為同意立即開放。
已達開放權限電子全文僅授權使用者為學術研究之目的，進行個人非營利性質之檢索、閱讀、列印。
請遵守中華民國著作權法之相關規定，切勿任意重製、散佈、改作、轉貼、播送，以免觸法。

摘要(中)

在資料庫中挖掘association rules是相當普遍的資料挖掘課題，而這類的研究主要可以分成兩大類：(1)利用Apriori-like的方法產生candidate set，並找出符合minimum support的large itemsets，再依據large itemsets產生association rules；(2)使用Non Apriori-like的方法，找出large itemsets。這兩類的方法中，第(2)類的方法在效率上明顯優於第(1)類，而在第(2)類方法中，又以FP-tree（Frequent Pattern Tree）的結構結合FP-growth algorithm的方法最新、最有效率也最為著名，如今它已經實際運用於DBMiner中，而且有很好的效率表現。
然而上述的兩類方法都需要掃描資料庫兩次以上，在超大型的交易資料庫上進行這樣的動作將相當費時，因此如果能夠依照交易資料庫的異動情況，以incremental的方式修改large itemsets來維護association rules的正確性，又不需要重新掃描資料庫，便相當有價值。過去所提出的incremental維護方法，均是針對第(1)類方法所進行的研究，而第(2)類方法目前都還沒有人提出相關的incremental維護方法，本研究便是針對第(2)類方法中的FP-tree結構，提出incremental維護演算法FPI（FP-tree Incremental），可以在資料庫發生insert、delete或update時，不需要重新掃描整個資料庫即可很有效率地動態調整FP-tree，使它維持正確的結構。另外一種常見的情況是產生association rule的minimum support變小時，由於FP-tree並未包含原先minimum support下屬於infrequent item的node，故需要重新掃描資料庫來建構新的FP-tree，FPI在這種情況下也可以達到動態調整的目的，而不用浪費掃描整個資料庫的I/O時間來重新建構FP-tree。
由於FP-tree通常遠小於交易資料庫本身而可以放在主記憶體中，所以一旦完成FP-tree的建構，以FP-growth來產生frequent patterns就非常的迅速，因此它所花費的主要時間便在掃描資料庫的I/O和建構FP-tree之上，本研究所提供的FPI演算法可以即時地維護FP-tree正確結構，使association rules可以快速產生。

關鍵字(中)

★ 資料挖掘

關鍵字(英)

★ association rules
★ Data Mining
★ FP-tree

論文目次

圖目錄 3
表目錄 4
1.緒論 5
2.FP-tree的建構方式 8
3.FP-tree的維護架構 12
4.FPI演算法 15
5.FPI的functions 25
5.1 FP-insert function 25
5.2 FP-delete function 28
5.3 FP-move-up function 31
5.4 FP-go-down function 38
6.實驗評估及效率研究 46
6.1 minimum support值向下調整時的比較 46
6.2 deleted and added transactions的個別比較 47
7.結論 49
8.參考文獻 50

參考文獻

[1] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. Proc. Int'l Conf. Very Large Data Bases, 487-499 (September 1994).
[2] J.S. Park, M.S. Chen, and P.S. Yu. An Effective Hash-Based Algorithm for Mining Association Rules. Proc. ACM-SIGMOD Int'l Conf. Management of Data, 175-186 ( May 1995).
[3] A. Savasere, E. Omiecinski, and S. Navathe. An Efficient Algorithm for Mining Association Rules in Large Databases. Proc. Int'l Conf. Very Large Data Bases, 432-444 (Sept. 1995).
[4] S. Brin, R. Motwani, J. Ullman and S. Tsur. Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proc. of the 1997 ACM-SIGMOD Conf. on Management of Data, 255-264 (1997).
[5] Mohammed Javeed Zaki, Srinivasan Parthasarathy, Wei Li and Mitsunori Ogihara. Evaluation of sampling for data mining of association rules. Technical Report 617, Computer Science Dept., U. Rochester, (May 1996).
[6] G. Gunopulos, H. Mannila and S. Saluja. Discovering All Most Specific Sentences by Randomized Algorithms. In Proc. of the 6th Int'l Conf. on Database Theory, 215-229 (1997).
[7] J. Roberto and Jr. Bayardo. Efficiently Mining Long Patterns from Databases. In Proc. of the ACM-SIGMOD Int'l Conf. on Management of Data, 85-93 (1998).
[8] Nicolas Pasquier, Yves Bastide, Rafik Taouil and Lotfi Lakhal. Efficient mining of association rules using closed itemset lattices. Information Systems, Volume: 24, Issue: 1, 25-46 (March 1999)
[9] S.J. Yen and A.L.P. Chen. An Efficient Approach to Discovery Knowledge from Large Database. Proceeding of the IEEE/ACM International Conference on Parallel and Distributed Information Systems, 8-18 (1996).
[10] R. Agarwal, C. Aggarwal, and V. V. V. Prasad. A tree projection algorithm for generation of frequent itemsets. In J. Parallel and Distributed Computing, (2000).
[11] Jiawei Han , Jian Pei and Yiwen Yin. Mining Frequent Patterns without Candidate Generation. Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), 1-12 (May 2000).
[12] J. Pei and J. Han. Can We Push More Constraints into Frequent Pattern Mining? Proc. 2000 Int. Conf. on Knowledge Discovery and Data Mining (KDD'00), Boston, MA, (August 2000).
[13] J. Han and J. Pei. Mining Frequent Patterns by Pattern-Growth: Methodology and Implications. ACM SIGKDD Explorations (Special Issue on Scaleble Data Mining Algorithms), 2(2) (December 2000).
[14] J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, (April 2001).
[15] E. M. Rains. Increasing subsequences and the classical groups. Elec. J. Combin. 5 (1998).

指導教授

陳彥良(Yen-Liang Chen)

審核日期

2001-7-4

推文