Thesis/Dissertation 955202072 Detailed Record




Name 楊翊彬 (Yi-Bin Yang)   Department Computer Science and Information Engineering
Thesis Title Efficient Workload for Multidimensional and Multilevel Association Rule Mining on Data Cubes
(有效率的在資料方體上進行多維度及多層次的關聯規則探勘)
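Related Theses
★ Applying Self-Organizing Map Networks and Back-Propagation Networks to Mine Potential Customers from Telecommunication Databases
★ Enterprise E-mail Classification Based on Social Network Features
★ A Study on Mining Multilevel Influence Propagation in Social Networks
★ An Integrated Information Management and Analysis System Based on Peer-to-Peer Technology
★ A Study of Data Locality Strategies for Different Large-Scale Astronomical Applications on Distributed Cloud Platforms
★ A Study on Applying Data Warehousing Techniques to Discover Knowledge in Peer-to-Peer Network Environments
★ Mining Self-Derivable Multilevel FP-tree from a Transactional Database
★ A Study on Constructing a Passive Storage-Capacity Migration Policy for Lifecycle Management Systems
★ A Study on Applying Service Mining to Discover Composite Services
★ Improving Intrusion Detection Systems Using Frequent Event Sequences in Weighted Suffix Trees
★ Efficient Computation of Continuous Aggregation Queries on Data Warehouse
★ Intrusion Detection Systems: Using Function-Based System Call Sequences
★ Course Recommendation Based on Social Relations and Weights in E-Learning
★ Finding Inactive Users in Social Network Services
★ Finding Sequences with Similar Variations in Astronomical Observation Records Using Hierarchical Weighted Suffix Trees
★ Mining Phonological Association Rules in Chinese Character Pronunciation Systems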
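  1. This electronic thesis is authorized for immediate open access.
  2. Open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.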

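Abstract (Chinese) Association rule mining plays an important role in today's decision support systems. In the past,
the backend of a decision support system was usually a very large transactional database, and much
research was devoted to improving the efficiency of association rule mining. In recent years, many
decision support systems have moved their backend platform from traditional very large transactional
databases to multidimensional data warehouse systems. A data warehouse is usually managed and
maintained centrally, so when many users perform data mining, a single data warehouse supplies the
historical information to all of them. Under a multidimensional and multilevel database schema,
responding quickly to association rule mining requests from people with different needs becomes an
important issue. Among these mining requests, we observe that the same computation tends to be
requested many times. This thesis proposes a multidimensional and multilevel data mining system that
treats multiple association rule mining requests as one workload. It decomposes and analyzes the
requests in the workload and reorders their execution, managing and reusing the parts that the
requests have in common, so as to improve the efficiency of the overall workload and present mining
results to users sooner.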
Abstract (English) Association rule mining plays an important role in decision support systems: it finds
interesting rules in a huge amount of historical data. In the past, when decision support
systems used transactional databases as backends, research focused on improving the
performance of association rule mining. Nowadays, decision support systems often
come with several frontends and a data warehouse as the backend. The frontends send
preprocessed user queries and fetch the requested data from the warehouse, while
the central data warehouse has to respond to a series of requests from different users, serving
historical data in multiple dimensions and at multiple levels. Efficiently answering mining queries on
different dimensions and different levels of abstraction is an important issue for decision
support systems. We observe that an analysis process consists of
a series of related queries and that many mining queries share common computation results.
We propose an association rule mining system framework that processes queries as a
workload, managing and optimizing materialized tables and reusing results among queries
to complete the entire workload efficiently.
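The idea of reusing results among queries can be illustrated with a minimal sketch. It is not the thesis's framework or its Derivational Relation Map. It only assumes that each mining query needs the counts of value combinations over some set of dimensions, and that a coarser count table can be derived from a cached finer one instead of rescanning the fact data. The fact table, the dimension names, `AggregateCache`, and `run_workload` are all hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical toy fact table; dimension names and values are illustrative only.
FACTS = [
    {"city": "Taipei", "category": "drink", "quarter": "Q1"},
    {"city": "Taipei", "category": "drink", "quarter": "Q2"},
    {"city": "Taipei", "category": "snack", "quarter": "Q1"},
    {"city": "Taoyuan", "category": "drink", "quarter": "Q1"},
    {"city": "Taoyuan", "category": "snack", "quarter": "Q2"},
    {"city": "Taipei", "category": "drink", "quarter": "Q1"},
]


class AggregateCache:
    """Keeps count tables per set of dimensions and reuses them when possible."""

    def __init__(self, facts):
        self.facts = facts
        self.tables = {}     # frozenset of dimension names -> Counter of value tuples
        self.base_scans = 0  # how many times the raw facts had to be scanned

    def counts(self, dims):
        dims = tuple(sorted(dims))  # value tuples follow this sorted dimension order
        key = frozenset(dims)
        if key in self.tables:
            return self.tables[key]
        table = Counter()
        # Cached tables over a strict superset of the requested dimensions.
        supersets = [k for k in self.tables if key < k]
        if supersets:
            # Roll up from the cached superset table that has the fewest rows.
            source = min(supersets, key=lambda k: len(self.tables[k]))
            source_dims = tuple(sorted(source))
            positions = [source_dims.index(d) for d in dims]
            for values, n in self.tables[source].items():
                table[tuple(values[p] for p in positions)] += n
        else:
            # Nothing reusable is cached, so scan the raw facts once.
            self.base_scans += 1
            for row in self.facts:
                table[tuple(row[d] for d in dims)] += 1
        self.tables[key] = table
        return table


def run_workload(cache, queries):
    """Answer the queries with the most dimensions first so later ones can reuse them."""
    ordered = sorted(queries, key=len, reverse=True)
    return {frozenset(q): cache.counts(q) for q in ordered}


cache = AggregateCache(FACTS)
results = run_workload(cache, [{"city"}, {"city", "category"}, {"category"}])
print(cache.base_scans, dict(results[frozenset({"city"})]))
```

Reordering the workload so that the two-dimensional query runs first lets both one-dimensional queries be answered from its cached table, which is the kind of shared computation the abstract describes. A real system would also have to decide which tables to keep materialized under a storage budget, which this sketch ignores.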
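Keywords (Chinese) ★ workload sequence (工作量序列)
★ multidimensional association rule (多維度關聯規則)
★ multilevel association rule (多層次關聯規則)
Keywords (English) ★ workload sequence
★ multilevel association rule
★ multidimensional association rule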
Table of Contents Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Listings
1 Introduction
1.1 Purpose
1.2 Paper Organization
2 Background and Related Works
2.1 Data Warehouses for Data Mining and OLAP
2.2 Multidimensional Data Model and Schemas
2.3 Conceptual Hierarchies
2.4 Association Rules and Frequent Patterns
2.5 Mining Multidimensional Association Rules
3 Association Rule Mining on Data Cubes
3.1 Mining Multidimensional Association Rules
3.1.1 Finding Frequencies of Patterns
3.1.2 Rule Generation
3.2 Caching Aggregated Tables
4 Problem Description
4.1 A Cost Model
4.1.1 The Experiments
4.2 A Motivating Example
4.3 Problem Definition
5 Proposed Solution
5.1 Mining System Framework Overview
5.1.1 Preprocessing
5.1.2 Mining
5.2 The Derivational Relation Map
5.2.1 The Architecture of a Derivational Relation Map
5.2.2 Constructing the DRM
5.3 Mining Association Rules Using the DRM
6 Experimental Results
6.1 The Environment
6.1.1 Data Warehouse Population
6.2 Performance Evaluations
7 Conclusions and Future Works
References
Advisor 蔡孟峰 (Meng-Feng Tsai)   Approval Date 2008-7-21

For questions about this thesis, please contact the Extension Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.