Efficient Mining of Closed Frequent Continuities in Temporal Databases


Name

Kuo-Zui Lin (林國瑞)

Department

Department of Computer Science and Information Engineering

Thesis Title

ClosedPROWL: Efficient Mining of Closed Frequent Continuities in Temporal Databases
(時序資料庫中緊密頻繁連續事件型樣之有效探勘)

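This electronic thesis is authorized for immediate open access.
Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.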

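Abstract (Chinese)

Pattern mining has long been an important topic in data mining. Most early work, such as frequent itemset mining, looked for associations among items within a single transaction. More recently, to predict and analyze behavioral trends in databases more effectively, researchers have turned to inter-transaction association mining, which describes relationships among items in different transactions. A continuity is one kind of inter-transaction association pattern: it explicitly describes the relative positions and ordering of different transactions. Because continuities cross transaction boundaries, the number of potential patterns and rules grows sharply, which both lowers the efficiency of the overall algorithm and makes the mining results hard to use. We therefore choose to mine closed frequent continuities. Closed frequent continuities are a representative subset of the frequent continuities: they are relatively few in number, yet all frequent continuities can be enumerated from them, so they remove redundant information without losing completeness. In this thesis we propose an efficient algorithm, ClosedPROWL, which mainly uses a projected window list technique to mine closed frequent continuities. Experimental results show that, on both synthetic and real datasets, our algorithm achieves better performance and scalability than previous methods.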
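To make the notion of a continuity concrete, the following minimal Python sketch counts the support of such a pattern. It rests on assumptions that the abstract does not spell out: the database is modeled as a list of event sets, one per time slot; a pattern lists one event per consecutive slot, with `None` as a don't-care position; and support is the number of start positions at which the pattern matches. The names `matches` and `support` and the toy data are illustrative only.

```python
from typing import Optional, Sequence, Set, Tuple

# A pattern holds one entry per consecutive time slot; None means "don't care".
Pattern = Tuple[Optional[str], ...]


def matches(db: Sequence[Set[str]], start: int, pattern: Pattern) -> bool:
    """Return True if `pattern` matches the window of `db` beginning at `start`."""
    if start + len(pattern) > len(db):
        return False  # the window would run past the end of the sequence
    return all(p is None or p in db[start + k] for k, p in enumerate(pattern))


def support(db: Sequence[Set[str]], pattern: Pattern) -> int:
    """Count the start positions at which `pattern` matches (assumed definition)."""
    return sum(matches(db, t, pattern) for t in range(len(db)))


# Toy temporal database: the set of events observed in each time slot.
db = [{"A"}, {"B"}, {"A", "C"}, {"B"}, {"A"}, {"B", "C"}]

print(support(db, ("A", "B")))        # "A" immediately followed by "B"
print(support(db, ("A", None, "A")))  # "A", any slot, then "A" again
```

Unlike an intra-transaction itemset, a pattern here spans several time slots, which is why the number of candidate patterns grows so quickly with the window length.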

Abstract (English)

Mining frequent patterns in temporal databases is a fundamental problem in data mining. Over the past few years, a considerable number of studies have addressed frequent itemset mining, which considers only relationships among items in the same transaction. Recently, researchers have begun to focus on inter-transaction associations, which describe relationships among different transactions. A continuity is a kind of inter-transaction association that describes definite temporal relationships among different transactions. Since continuities break the barrier of transactions, the number of potential patterns increases drastically. An alternative is to mine closed frequent continuities. Mining closed frequent patterns has the same power as mining the complete set of frequent patterns, while substantially reducing the number of redundant rules generated and increasing the effectiveness of mining. In this paper, we propose an efficient algorithm, ClosedPROWL, for mining closed frequent continuities using the projected window list technique. Experimental evaluation on both real-world and synthetic datasets shows that our algorithm is more efficient and scalable than previously proposed algorithms.
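The idea of keeping only closed patterns can be illustrated with a small Python sketch. This is not the ClosedPROWL algorithm itself. It is a simplified stand-in under stated assumptions: the database is a list of event sets (one per time slot), a pattern has one event or a `None` wildcard per slot, patterns are grown depth-first by carrying the list of start positions where they still match (a loose analogue of a projected window list), and a pattern counts as closed when no proper super-pattern within the length bound has the same support. All function names, the `max_len` bound, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Pattern = Tuple[Optional[str], ...]  # None marks a "don't care" slot


def mine_frequent(db: Sequence[Set[str]], min_sup: int, max_len: int) -> Dict[Pattern, int]:
    """Depth-first growth; each pattern carries the start positions where it matches."""
    items = sorted(set().union(*db))
    result: Dict[Pattern, int] = {}

    def grow(pattern: Pattern, starts: List[int]) -> None:
        if pattern[-1] is not None:  # only report patterns ending in a real event
            result[pattern] = len(starts)
        if len(pattern) == max_len:
            return
        offset = len(pattern)
        # Keep only matches whose next slot still lies inside the sequence.
        projected = [s for s in starts if s + offset < len(db)]
        for item in items:
            new_starts = [s for s in projected if item in db[s + offset]]
            if len(new_starts) >= min_sup:
                grow(pattern + (item,), new_starts)
        if len(projected) >= min_sup:  # wildcard extension keeps every projected start
            grow(pattern + (None,), projected)

    for item in items:
        starts = [t for t, slot in enumerate(db) if item in slot]
        if len(starts) >= min_sup:
            grow((item,), starts)
    return result


def is_subpattern(p: Pattern, q: Pattern) -> bool:
    """True if p is contained in q at some offset (wildcards in p match anything)."""
    return any(
        all(a is None or a == q[d + k] for k, a in enumerate(p))
        for d in range(len(q) - len(p) + 1)
    )


def closed_only(frequent: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Drop every pattern that has a proper super-pattern with equal support."""
    return {
        p: s for p, s in frequent.items()
        if not any(p != q and s == t and is_subpattern(p, q) for q, t in frequent.items())
    }


db = [{"A"}, {"B"}, {"A", "C"}, {"B"}, {"A"}, {"B", "C"}]
frequent = mine_frequent(db, min_sup=2, max_len=3)
closed = closed_only(frequent)
print(len(frequent), "frequent patterns,", len(closed), "closed patterns")
```

On this toy input the closed set is smaller than the full frequent set, and each dropped pattern is contained in a kept pattern with the same support. The quadratic comparison in `closed_only` is for illustration only; a practical miner would check closure during the search.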

Keywords

★ Pattern Mining
★ Closed Frequent Continuities
★ Inter-Transaction Association Mining
★ Data Mining

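Table of Contents

Chapter 1 Introduction
1.1. Motivation and Objectives
1.2. Contributions
1.3. Thesis Organization
Chapter 2 Related Work
2.1. Frequent Episode Mining
2.1.1. WINEPI Algorithm
2.1.2. MINEPI Algorithm
2.2. Periodic Pattern Mining
2.2.1. LSI Algorithm
2.2.2. SMCA Algorithm
2.3. Frequent Continuity Mining
2.3.1. FITI Algorithm
Chapter 3 Problem Definition
Chapter 4 The ClosedPROWL Algorithm
4.1. Framework of the ClosedPROWL Algorithm
4.2. Mining Closed Frequent Event Sets
4.3. Encoding Closed Frequent Event Sets and Database Transformation
4.4. Mining Closed Frequent Continuities
4.4.1. Mining Process
4.4.2. Search Space Pruning Techniques
4.4.3. Closure Checking Scheme for Continuities
4.4.4. Worked Example
4.5. Correctness Analysis of the ClosedPROWL Algorithm
Chapter 5 Experimental Results
5.1. Synthetic Data
5.1.1. Data Generator
5.1.2. Performance and Scalability Analysis
5.2. Real World Data
Chapter 6 Conclusion
References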


Advisor

Chia-Hui Chang (張嘉惠)

Approval Date

2004-7-15
