Thesis 89522034: Detailed Record




Name 張毓美 (Yu-Mei Chang)   Department Department of Computer Science and Information Engineering
Thesis title 應用卡方獨立性檢定於關連式分類問題
(Association Based Classification Using Chi-Square Independence Test)
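File
  1. The author has agreed to make this electronic thesis openly available immediately.
  2. Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.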

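Abstract (Chinese) Classification has long been a central problem in machine learning. In recent years, with the rise of association rule mining, more and more studies have applied association rule mining techniques to classification. In this thesis, we study several association based classification methods and propose a new one, called ACC (Association based Classification using Chi-square independence test). ACC uses association rule mining to find all frequent and interesting itemsets and uses these itemsets to model the relations between attributes. In addition, ACC applies the chi-square independence test to examine the relation between attributes and the class, so that only class-related frequent itemsets are retained for prediction. We run experiments on 13 datasets from the UCI machine learning repository and compare our method (ACC) with NB and LB, two efficient and accurate methods. The results show that our method outperforms NB and LB on most datasets and is likewise an efficient and accurate classifier.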
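As a rough illustration of the "frequent itemsets" step mentioned in the abstract, the sketch below runs a level-wise (Apriori-style) search over a list of transactions. It is not taken from the thesis: the function name `frequent_itemsets`, the `min_support` parameter, and the toy data are assumptions made for this example, and the thesis's own interestingness measure is not reproduced here.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Hypothetical helper for illustration; support is the fraction of
    transactions that contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return result


# Toy transactions (assumed data, not from the thesis).
toy = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
found = frequent_itemsets(toy, min_support=0.5)
```

The prune step relies on the fact that any superset of an infrequent itemset is also infrequent, which keeps the candidate set small at each level.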
Abstract (English) Classification has long been one of the key problems in machine learning research. Because association rule mining is an important and highly active area of data mining research, a growing number of classification methods are based on association rule mining techniques. In this thesis, we study several association based classification methods and compare these classifiers. We present a new method, called ACC (Association based Classification using Chi-square independence test), to solve classification problems.
ACC finds frequent and interesting itemsets, which describe the relations between attributes. Moreover, it applies the chi-square independence test to retain class-related itemsets for predicting new data objects. In addition, ACC handles missing values by taking into account the probability that a missing value occurs. We evaluate our method on 13 datasets from the UCI machine learning repository. We compare ACC with NB and LB, two state-of-the-art classifiers, and the experimental results show that our method is a highly effective and accurate classifier.
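To make the chi-square independence test concrete, here is a minimal sketch of how one might check whether an itemset is related to the class label. It is an assumption-laden illustration, not the thesis's algorithm: the names `chi_square_statistic` and `is_class_related`, the record layout (a set of items plus a label), and the default threshold are all chosen for this example. The default critical value 3.841 is the 0.05 level for one degree of freedom, so it only applies to two-class data.

```python
from collections import Counter


def chi_square_statistic(records, itemset):
    """Pearson chi-square statistic for 'itemset present?' versus class label.

    records: list of (items, label) pairs, where items is a set (assumed layout).
    """
    itemset = frozenset(itemset)
    # Observed counts of the contingency table: (present?, label) -> count.
    observed = Counter((itemset <= items, label) for items, label in records)
    n = len(records)

    row_totals = Counter()  # totals per presence value
    col_totals = Counter()  # totals per class label
    for (present, label), count in observed.items():
        row_totals[present] += count
        col_totals[label] += count

    stat = 0.0
    for present in row_totals:
        for label in col_totals:
            # Expected count under independence of itemset and class.
            expected = row_totals[present] * col_totals[label] / n
            stat += (observed[(present, label)] - expected) ** 2 / expected
    return stat


def is_class_related(records, itemset, critical_value=3.841):
    """Keep the itemset if independence is rejected at the chosen level.

    3.841 is the 0.05 critical value for 1 degree of freedom (two classes);
    pass a different value for more classes or another significance level.
    """
    return chi_square_statistic(records, itemset) > critical_value
```

An itemset that occurs equally often in every class yields a statistic of 0 and would be discarded, while one that occurs only with a single class yields a large statistic and would be kept for prediction.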
Keywords (Chinese) ★ Data Mining
★ Association Rules
★ Classification
Keywords (English) ★ Association Rules
★ Classification
★ Data Mining
Table of contents 1 Introduction
1.1 Association Rule Mining
1.2 Concepts of Association Based Classification
1.3 Method and Goal
1.4 Organization of the Thesis
2 Related Work
2.1 NB - Naïve Bayes Classifier
2.2 LB - Large Bayes Classifier
2.3 CBA - Classification Based on Associations
2.4 CMAR - Classification Based on Multiple Association Rules
2.5 Comparison
2.6 Summary
3 Classification Method
3.1 Learning Phase
3.1.1 Discovering Frequent Itemsets
3.1.2 Discovering Interesting Itemsets
3.1.3 Discovering Class-Related Itemsets
3.2 Classification Phase
3.3 Learning Algorithm
3.4 Classification Algorithm
3.5 Zero Counts Smoothing
4 Experimental Results and Discussion
4.1 Parameter Setting
4.2 Experimental Results
4.3 Discussion
4.3.1 The Effect of Missing Value
4.3.2 The Effect of Parameter Setting
5 Conclusion and Future Work
Advisor 張嘉惠 (Chia-Hui Chang)   Approval date 2002-7-13