A Heuristic Algorithm for Density-Based Hyper-Rectangle Covers

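Name

蔣秉芳 (Ping-Fang Chiang)

Department

Department of Computer Science and Information Engineering

Thesis Title

A Heuristic Algorithm for Density-Based Hyper-Rectangle Covers
(Efficient Classification Using Density-Based Hyper-Rectangle Covers)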

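This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.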

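Abstract (Chinese)

In data modeling and machine learning, data can be mapped into a Euclidean hyperspace and then covered entirely by exclusive hyper-rectangles, which serve as rules or knowledge for classifying data. However, previous work on building such exclusive hyper-rectangles often either takes too much time or runs quickly at the cost of too much accuracy. Building on the efficiency of the greedy algorithm, this thesis attempts to construct hyper-rectangle covers with high classification accuracy without sacrificing much efficiency. It proposes two different heuristic methods, one aimed at larger data sets and one at better classification decisions, to meet the differing demands of massive data and high precision. The thesis also provides a method for converting a hyper-rectangle cover into disjunctive normal form (DNF), so that the resulting model is more readable. Finally, it discusses an inherent limitation of hyper-rectangle modeling and suggests a possible direction for future improvement.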
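The DNF conversion mentioned in the abstract can be illustrated with a minimal sketch. It assumes each hyper-rectangle is stored as a pair of lower and upper corners and that the attributes have names; the functions `box_to_clause` and `cover_to_dnf` and the `AND`/`OR` text format are hypothetical choices for illustration, not the procedure given in the thesis. Each box becomes a conjunction of interval conditions, and the cover of one class becomes the disjunction of its boxes.

```python
from typing import List, Sequence, Tuple

# A box is (lower corner, upper corner); this representation is an assumption.
Box = Tuple[Sequence[float], Sequence[float]]


def box_to_clause(box: Box, names: Sequence[str]) -> str:
    """Render one axis-aligned box as a conjunction of interval conditions."""
    lo, hi = box
    terms = [f"{l} <= {n} <= {h}" for n, l, h in zip(names, lo, hi)]
    return "(" + " AND ".join(terms) + ")"


def cover_to_dnf(boxes: List[Box], names: Sequence[str]) -> str:
    """Render a cover (a union of boxes) as a disjunction of conjunctions."""
    return " OR ".join(box_to_clause(b, names) for b in boxes)


if __name__ == "__main__":
    cover = [([0, 0], [1, 1]), ([5, 5], [6, 6])]
    print(cover_to_dnf(cover, ["x", "y"]))
```

A point belongs to the class exactly when at least one clause holds, which is why a union of boxes maps onto DNF rather than CNF.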

Abstract (English)

In data modeling and machine learning, exclusive hyper-rectangles that enclose data of different classes in a Euclidean hyperspace have been widely studied as rules or knowledge for data classification. However, prior hyper-rectangle-based algorithms either spend too much time constructing hyper-rectangles to obtain better classification results, or sacrifice classification accuracy in return for shorter execution time. To address this problem, this thesis proposes a hyper-rectangle-covering method that produces good classification results yet executes efficiently. To serve both larger data sets and more accurate results, it extends the idea into two novel alternative heuristic methods that meet the different demands of precise classification and massive data. It also provides a procedure for translating hyper-rectangle covers into disjunctive normal form (DNF), which is more readable for humans. Finally, it points out an inherent restriction of algorithms that use hyper-rectangles for data modeling and proposes a possible research direction for overcoming it.
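To make the notion of an exclusive hyper-rectangle cover concrete, here is a minimal greedy sketch. It is a generic illustration, not the thesis's Natural Division or Simulated Crystallization heuristics, whose details are not given in this record. The names `contains`, `extend`, and `greedy_cover` are hypothetical, and the sketch assumes that no point of the target class coincides with a point of another class.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def contains(box: Box, p: Point) -> bool:
    """True if p lies inside the closed axis-aligned box."""
    lo, hi = box
    return all(l <= x <= h for l, x, h in zip(lo, p, hi))


def extend(box: Box, p: Point) -> Box:
    """Smallest box that contains both the old box and p."""
    lo, hi = box
    return ([min(l, x) for l, x in zip(lo, p)],
            [max(h, x) for h, x in zip(hi, p)])


def greedy_cover(own: List[Point], others: List[Point]) -> List[Box]:
    """Cover every point of `own` with boxes that hold no point of `others`."""
    uncovered = list(own)
    boxes: List[Box] = []
    while uncovered:
        seed = uncovered[0]
        box: Box = (list(seed), list(seed))
        for p in uncovered[1:]:
            candidate = extend(box, p)
            # Keep the growth only if the box stays exclusive.
            if not any(contains(candidate, q) for q in others):
                box = candidate
        boxes.append(box)
        # The seed is always inside its box, so the loop terminates.
        uncovered = [p for p in uncovered if not contains(box, p)]
    return boxes


if __name__ == "__main__":
    print(greedy_cover([(0, 0), (1, 1), (5, 5), (6, 6)], [(3, 3)]))
```

A new point can then be classified by checking which class's boxes contain it. The exclusivity test scans every other-class point for each candidate growth step, which illustrates the time-versus-accuracy trade-off the abstract describes.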

Keywords (Chinese)

★ Data Mining
★ Hyper-Rectangle
★ Data Classification

Keywords (English)

★ Hyper-Rectangle
★ Data Classification
★ Data Mining

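Table of Contents

Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Preface
1-1 Background
1-2 Problem Definition and Implementation Goals
1-3 Contributions
1-4 Thesis Organization
Chapter 2 Related Work
2-1 Popular Classification Methods
2-2 Rectangle Greedy Cover
2-3 RGC Implementations Using Exhaustive Enumeration
2-4 RGC Implementations Without Exhaustive Enumeration
Chapter 3 Algorithm Architecture
3-1 Natural Division
3-2 Simulated Crystallization
3-3 Classification
3-4 Readability
3-5 Bottlenecks and Limitations
Chapter 4 Experimental Results and Analysis
4-1 Accuracy
4-2 Efficiency
Chapter 5 Conclusions and Future Directions
References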

Advisor

王尉任 (Wei-Jen Wang)

Approval Date

2012-8-1
