Thesis/Dissertation 92441002: Detailed Record




Name: 劉育津 (Yu-Chin Liu)   Department: Business Administration
Thesis title: 自企業資料庫挖掘和彙整商情規則之研究
(Mining and Summarizing Rules from Raw Data of Enterprise Systems)
Files: full text is never open to access through the system.
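Abstract (translated from Chinese): With the development of e-commerce, enterprises face unprecedented global competition, and using information technology to create competitive advantage has become a new challenge for every firm. Researchers and practitioners have therefore worked on mining business intelligence from the data accumulated in enterprise information systems. However, most current data mining techniques preprocess the data first: transaction data are exported from the information system, converted into files of a fixed format, and only then mined with a suitable algorithm.
Some studies show, however, that data preprocessing is the most resource-intensive part of the whole data mining process. This study therefore proposes methods that integrate data mining techniques into enterprise transaction databases; in other words, the raw transaction data are mined directly. Because the resources spent on data format conversion can be greatly reduced, business intelligence can be delivered to managers more efficiently and in real time to support sound decisions.
This study comprises two methods. The first (FPN) develops a way to find frequent patterns directly in an enterprise's raw transaction tables and then generate association rules efficiently. Its main features are that the preprocessing data transformation is accounted for within the existing information system, and that a more compact data structure, the FPN-tree, is proposed to store and search frequent-pattern information, together with an efficient algorithm for generating frequent patterns, so that enterprises can obtain useful information more quickly. In addition, the method does not need to rebuild the data structure when the support threshold is adjusted, and it can be extended to find frequent patterns for a particular product.
The second method (Char) starts from the observation that relational databases are widely used in enterprise information systems and addresses how to summarize characteristic rules from relational tables. This study proposes a computation based on a redundancy value (Redundancy) that lets business users set a single intuitive threshold and obtain the main characteristic rules of a table. Decision makers can then follow the discovered rules in various sales analyses, with the aim of improving competitiveness.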
Abstract (English): As data mining techniques are explored extensively, incorporating discovered knowledge into business operations can lead to competitive advantages. Most data mining techniques today are designed to work on transformed data files: the raw data tables must be converted into specific formats before mining methods can be applied, and previous work has pointed out that such data transformation usually consumes substantial resources. Therefore, this study proposes new methods that integrate mining algorithms directly with enterprise transaction databases.
In this study, two methods are proposed to discover knowledge from the raw data of enterprise systems. The first, named FPN, is developed to mine frequent patterns from transaction tables. Traditionally, data mining techniques have seldom been applied in real time. In many cases, however, decisions have to be made within a short time; for example, decisions on promoting fresh agricultural goods in retail stores must be made daily, within one or two hours. The FPN method therefore has the following advantages for supporting real-time mining in enterprise systems: (i) raw data of enterprise systems are used directly; (ii) when the threshold is tuned, only newly qualified data are read and the data structure built for the original data is kept intact; (iii) product assortments centered on a particular product can be mined effectively; and (iv) the mining algorithm performs better than popular mining algorithms.
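To make the setting concrete, here is a minimal sketch of mining frequent patterns straight from raw transaction rows, without first exporting them to a separate basket-file format. It is not the FPN-tree or the thesis's algorithm: it is a naive enumeration, and the `(order_id, product)` row layout, the function names, and the sample data are all assumptions made for illustration. Support counts are computed once and then filtered, so trying a different threshold does not require re-reading the rows.

```python
from collections import defaultdict
from itertools import combinations


def count_itemsets(rows, max_size=3):
    """Count the support of every itemset (up to max_size items)
    from raw (order_id, product) rows. Hypothetical helper."""
    baskets = defaultdict(set)
    for order_id, product in rows:
        # Group the raw rows into one basket per order.
        baskets[order_id].add(product)

    counts = defaultdict(int)
    for items in baskets.values():
        ordered = sorted(items)  # canonical order, so each itemset has one key
        for size in range(1, min(max_size, len(ordered)) + 1):
            for itemset in combinations(ordered, size):
                counts[itemset] += 1
    return dict(counts), len(baskets)


def frequent_patterns(counts, n_orders, min_support):
    """Filter precomputed counts by a relative support threshold."""
    return {s: c for s, c in counts.items() if c / n_orders >= min_support}


# Illustrative data only: four orders stored as raw transaction rows.
rows = [
    (1, "milk"), (1, "bread"),
    (2, "milk"), (2, "bread"), (2, "eggs"),
    (3, "milk"),
    (4, "bread"), (4, "eggs"),
]

counts, n_orders = count_itemsets(rows)
high = frequent_patterns(counts, n_orders, 0.75)  # stricter threshold
low = frequent_patterns(counts, n_orders, 0.5)    # relaxed threshold, same counts
```

Enumerating every itemset grows exponentially with basket size, which is the cost that compact tree structures such as the one described above are designed to avoid; the sketch only shows the input and output shapes.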
The second method, Char, is proposed to find characteristics from database tables, such as customer tables or product tables. In contrast to traditional data generalization or induction methods, Char does not need a concept tree in advance and can generate a manual set of characteristic rules that are precise enough to describe the main characteristics of the data. Simulation results show that Char finds characteristic rules efficiently and that the rules remain consistent regardless of the number of records and attributes in the dataset.
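The abstract does not define how Char computes its redundancy value, so the sketch below does not attempt it. As a loosely related stand-in, it summarizes a table by listing the attribute-value pairs whose share of the records meets a single user-set threshold, which only illustrates the idea of describing a table with one intuitive threshold and no concept tree. The function name, the record layout, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def characteristic_pairs(records, threshold):
    """Return (attribute, value, share) triples for attribute values that
    cover at least `threshold` of the records. Hypothetical helper; this is
    a simple coverage filter, not the thesis's redundancy measure."""
    n = len(records)
    counts = Counter(
        (attr, val) for rec in records for attr, val in rec.items()
    )
    result = [
        (attr, val, c / n)
        for (attr, val), c in counts.items()
        if c / n >= threshold
    ]
    # Highest share first; ties broken by attribute name, then value.
    return sorted(result, key=lambda t: (-t[2], t[0], str(t[1])))


# Illustrative customer table only.
customers = [
    {"region": "north", "tier": "gold"},
    {"region": "north", "tier": "silver"},
    {"region": "north", "tier": "gold"},
    {"region": "south", "tier": "gold"},
]

rules = characteristic_pairs(customers, 0.75)
```

With this data, raising the threshold above 0.75 leaves no pairs, which shows how a single threshold controls how much of the table the summary has to cover.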
Keywords (translated from Chinese): ★ data mining
★ enterprise databases
★ characteristic rules
Keywords (English): ★ frequent patterns
★ data mining
★ characteristic rules
Table of Contents
Abstract (Chinese)
Abstract
Acknowledgements
Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Mining Association Rules from Raw Data of Enterprise Systems
1.2 Summary Characteristic Rules from Tabular Data of Enterprise Systems
Chapter 2 Related Work
2.1 Literature Review on Mining Association Rules
2.2 Literature Review on Mining Characteristic Rules
Chapter 3 The FPN Method
3.1 The Data Preparation
3.2 Data Structures and Functions
3.3 The FPN Method
3.3.1 The FPN-tree Construction
3.3.2 The Frequent Pattern Generation Phase
3.4 Extra Support of Real-time Mining
3.5 The Completeness and Correctness of the FPN Method
3.5.1 The Completeness and Compactness of the FPN-tree
3.5.2 The Completeness and Correctness of the FPDiscovery
Chapter 4 Simulation Results of the FPN Method
4.1 Datasets and Characteristics
4.2 Performance Evaluation
4.2.1 Performance Evaluation on Adjusting Minimum Support Thresholds
4.2.2 Performance Evaluation versus FPGrowth and ECLAT
Chapter 5 The Char Method
5.1 A Formal Definition of Characteristic Rules
5.2 The Attribute Preparation Step
5.3 The Char Tree Construction Step
5.3.1 Preliminaries and Definitions
5.3.2 The Char Algorithm of Constructing Char Tree
5.4 The Correctness of Char_Algorithm
Chapter 6 Simulation Results of the Char Method
6.1 Complexity
6.1.1 The Effect of Char_thresholds on Time Complexity
6.1.2 The Effect of Numbers of Records on Time Complexity
6.1.3 The Effect of Numbers of Potential Rules on Time Complexity
6.2 Rule Stability
6.2.1 Rule Stability with Respect to Char_thresholds
6.2.2 Rule Stability with Respect to Number of Records
6.2.3 Rule Stability with Respect to Numbers of Potential Characteristic Rules
6.3 Experiments on the Real-life Datasets
Chapter 7 Conclusion
Bibliography
Advisor: 許秉瑜 (Ping-Yu Hsu)   Approval date: 2006-6-6

For questions about this thesis, please contact the Promotion Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.