Extending SWF for Incremental Association Mining by Incorporating Previously Discovered Information (遞增資料關聯式規則探勘之改進)

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/8615

Title:	Extending SWF for Incremental Association Mining by Incorporating Previously Discovered Information (遞增資料關聯式規則探勘之改進)
Author:	Shi-Hsan Yang (楊士賢)
Contributor:	Graduate Institute of Computer Science and Information Engineering
Keywords:	Association Rules (關聯式規則);Data Mining (資料探勘)
Date:	2002-06-20
Upload Time:	2009-09-22 11:31:47 (UTC+8)
Publisher:	National Central University Library
Abstract:	In practical applications, data mining has evolved from mining static databases to mining dynamic databases, and incremental mining of association rules was among the first topics in this area to receive wide attention. Algorithms recently proposed for incremental association rule mining include FUP2, MAAP, PELICAN, and SWF, of which SWF outperforms the other algorithms of its kind. In this thesis we propose two algorithms that improve SWF: FI_SWF and CI_SWF. By storing the frequent itemsets and their supports from the previous mining run, the current run only needs to scan the changed portion of the database to obtain the new supports of the stored itemsets. This not only reduces the time of the final database scan in SWF but also speeds up candidate itemset generation. Experiments show that the improved SWF algorithms do reduce execution time. Although our algorithms need more storage space to keep the previous frequent itemsets or candidate itemsets, their maximum memory usage is comparable to that of SWF. In practice, when data mining becomes a repeated and frequent task, execution time matters even more, and the algorithms proposed in this thesis offer an effective and simple way to perform it. Incremental mining of association rules from dynamic databases refers to the maintenance and utilization of the knowledge discovered in previous mining operations. Sliding-window filtering (SWF) is a technique proposed for filtering false candidate 2-itemsets by segmenting a transaction database into several partitions. SWF computes a set of candidate 2-itemsets that is close to the set of frequent 2-itemsets. Therefore, it is possible to generate several candidate k-itemsets for one database scan. Such a database scan reduction technique greatly increases the performance of frequent itemset discovery. In this paper, we extend SWF by incorporating previously discovered information and propose two algorithms to boost the performance of incremental mining. The first algorithm, FI_SWF (SWF with Frequent Itemset), reuses the frequent itemsets (and their counts) of the previous mining task, as FUP2 does, to reduce the number of new candidate itemsets that have to be checked. The second algorithm, CI_SWF (SWF with Candidate Itemset), reuses the candidate itemsets (and their counts) from the previous mining task. Experimental studies are performed to evaluate the performance of the new algorithms. The study shows that the new incremental algorithms are significantly faster than SWF. More importantly, the need for more disk space to store the previously discovered knowledge does not increase the maximum memory required during execution.
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Theses and Dissertations
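
The abstract's central idea, reusing stored itemsets and their counts so that only the changed part of the database has to be scanned, can be illustrated with a minimal Python sketch. This is not the thesis's FI_SWF or CI_SWF algorithm: it shows only the count-maintenance step, it does not discover itemsets that newly become frequent, and the names `update_supports` and `still_frequent` and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set

Itemset = FrozenSet[str]


def update_supports(
    stored_counts: Dict[Itemset, int],
    removed: Iterable[Set[str]],
    added: Iterable[Set[str]],
) -> Dict[Itemset, int]:
    """Refresh stored support counts by scanning only the changed transactions.

    stored_counts: itemset -> count saved from the previous mining run (assumed).
    removed/added: transactions deleted from / appended to the database.
    """
    new_counts = dict(stored_counts)
    for txn in removed:
        for itemset in new_counts:
            if itemset <= txn:  # itemset is contained in the removed transaction
                new_counts[itemset] -= 1
    for txn in added:
        for itemset in new_counts:
            if itemset <= txn:  # itemset is contained in the added transaction
                new_counts[itemset] += 1
    return new_counts


def still_frequent(
    counts: Dict[Itemset, int], db_size: int, min_support: float
) -> Set[Itemset]:
    """Keep the stored itemsets whose updated count meets the support threshold."""
    threshold = min_support * db_size
    return {itemset for itemset, count in counts.items() if count >= threshold}


if __name__ == "__main__":
    # Toy data: counts from a previous run over 4 transactions.
    stored = {frozenset({"a"}): 3, frozenset({"a", "b"}): 2}
    updated = update_supports(
        stored,
        removed=[{"a", "b"}],
        added=[{"a"}, {"a", "b", "c"}],
    )
    # The database now holds 4 - 1 + 2 = 5 transactions.
    print(still_frequent(updated, db_size=5, min_support=0.5))
```

The unchanged transactions are never read here; the cost of the update depends only on the size of the removed and added portions and on the number of stored itemsets.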

All items in NCUIR are protected by their original copyright.
