Data Mining by Query-Based Back-Propagation Neural Networks

Name

Liang-Bin Lai (賴良賓)

Department

Department of Information Management

Thesis Title

Data Mining by Query-Based Back-Propagation Neural Networks
(詢問式倒傳遞類神經網路在資料挖掘的應用)

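Related Theses

★ Generalized Knowledge Discovery in Relational Databases
★ Mining Rules Relating Financial Information to Stock Prices in a Collaborative Commerce Model
★ A Content-Based Proxy Server Algorithm
★ An Efficient Association Rule Mining Method Using a Cluster-Compressed Tree
★ A Study on Improving Proxy Server Prefetching Efficiency with Data Mining
★ A Study on Using Data Mining Techniques to Assist Software Refactoring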

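This electronic thesis is authorized for immediate open access.
Full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.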

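Abstract (translated from Chinese)

Data mining has been widely adopted by enterprises to gain early insight from large volumes of disorganized business data. Neural networks are among the techniques most commonly used in data mining today and have also been applied widely in other problem domains, such as steering vehicles, recognizing DNA sequences, arranging cargo space allocation, and predicting exchange rates. Although neural networks have many successful applications in both supervised and unsupervised learning problems, applying them directly to data mining raises two major issues: the time needed to induce a model from a large data set, and the comprehensibility of the learned model. The difficulty of understanding a trained model is the most frequent criticism of neural networks, but in recent years researchers have developed many workable solutions; Shavlik and Lu, for example, have proposed highly successful methods for extracting rules from trained neural networks. By contrast, research on reducing the long training time that neural networks need to process large data sets is scarce. This study addresses that problem by applying a query-based back-propagation learning method to classification problems in data mining, so as to reduce the time that conventional neural networks spend on large data sets. The core idea of query-based learning is to concentrate learning on the areas the learner does not yet understand, much as Confucius advocated teaching each student according to aptitude (因材施教), so it can often achieve twice the result with half the effort. The goal of this study is to train back-propagation neural networks with a query-based learning method in order to improve learning efficiency and model accuracy. The thesis considers several data mining applications, including diagnostic data for heart disease, breast cancer, and diabetes, and demographic data from e-commerce. It evaluates learning efficiency, prediction accuracy, and prediction reliability using several important measures, including the contingency table and the receiver operating characteristic (ROC) curve. The experimental results show that the query-based back-propagation neural network is clearly better than the original back-propagation neural network in both training time and classification results. In future work, we hope to extend the query-based learning method to other data mining techniques.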
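
The two evaluation measures named above, the contingency table and the ROC curve, can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names and the sample labels and scores are hypothetical, and the sketch assumes binary labels (1 = positive) with at least one positive and one negative example.

```python
def contingency_table(labels, scores, threshold):
    """Return (tp, fp, tn, fn) for binary labels at the given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_curve(labels, scores):
    """Return ROC points (false positive rate, true positive rate).

    Each distinct score is used as a threshold, from highest to lowest.
    Assumes labels contain at least one positive and one negative.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp, fp, _, _ = contingency_table(labels, scores, t)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of ROC points ordered by threshold."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical classifier outputs for six cases (1 = disease present).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(contingency_table(labels, scores, 0.5))  # (3, 1, 2, 0)
print(area_under_curve(roc_curve(labels, scores)))
```

A single contingency table describes one operating threshold, while the ROC curve summarizes all thresholds at once, which is why the two are usually reported together for diagnostic data.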

Abstract (English)

The central focus of data mining in enterprises is to gain insight into large collections of data in order to make good predictions and sound decisions. Neural networks have been applied to a wide variety of problem domains, such as steering motor vehicles, recognizing genes in uncharacterized DNA sequences, scheduling payloads for the space shuttle, and predicting exchange rates. Advantages of neural networks include a high tolerance to noisy data and the ability to classify patterns on which they have not been trained. Neural networks have been successfully applied to a wide range of supervised and unsupervised learning problems. However, when they are applied to data mining, there are two fundamental considerations: the comprehensibility of learned models and the time required to induce models from large data sets. For the first problem, many approaches have been proposed for extracting rules from trained neural networks. In this thesis, we focus on the second problem. We introduce a query-based learning algorithm to improve neural networks' performance in data mining. Results show that the proposed algorithm can significantly reduce the training set cardinality. Our future work is to apply this learning procedure to other data mining schemes.
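
The general idea of query-based training can be sketched as a loop that trains on a small labeled set, then asks an oracle to label only the candidate points the network is least sure about. The sketch below is an assumption-laden illustration, not the thesis's algorithm: it swaps in a simple "output closest to 0.5" selection rule over a fixed candidate pool. The `TinyMLP` class, the `query_based_training` function, the toy oracle, and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One-hidden-layer perceptron trained by plain back-propagation."""

    def __init__(self, n_in, n_hidden, rng):
        # Each weight row carries one extra entry for the bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        inputs = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train(self, data, epochs=200, lr=0.5):
        for _ in range(epochs):
            for x, target in data:
                inputs = list(x) + [1.0]
                h, y = self.forward(x)
                # Squared-error gradient at the output unit.
                delta_out = (y - target) * y * (1.0 - y)
                # Hidden deltas are computed before the output weights change.
                delta_hid = [delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                             for j in range(len(h))]
                for j, v in enumerate(h + [1.0]):
                    self.w2[j] -= lr * delta_out * v
                for j, row in enumerate(self.w1):
                    for i, v in enumerate(inputs):
                        row[i] -= lr * delta_hid[j] * v


def query_based_training(oracle, pool, n_initial=4, rounds=5, per_round=2, seed=0):
    """Grow the training set by querying the oracle on uncertain points only."""
    rng = random.Random(seed)
    candidates = [list(p) for p in pool]
    rng.shuffle(candidates)
    labeled = [(x, oracle(x)) for x in candidates[:n_initial]]
    unlabeled = candidates[n_initial:]
    net = TinyMLP(len(candidates[0]), 3, rng)
    for _ in range(rounds):
        net.train(labeled)
        # Points whose output is nearest 0.5 lie closest to the decision boundary.
        unlabeled.sort(key=lambda x: abs(net.predict(x) - 0.5))
        queries, unlabeled = unlabeled[:per_round], unlabeled[per_round:]
        labeled += [(x, oracle(x)) for x in queries]
    net.train(labeled)
    return net, labeled


# Hypothetical toy problem: the oracle labels points above the line x0 + x1 = 1.
pool = [[i / 5.0, j / 5.0] for i in range(6) for j in range(6)]
oracle = lambda x: 1 if x[0] + x[1] > 1.0 else 0
net, labeled = query_based_training(oracle, pool)
print(len(labeled), "of", len(pool), "candidate points were labeled")
```

With these settings the network sees 14 labeled points instead of all 36, which mirrors the abstract's claim in spirit only: the saving comes from labeling fewer, more informative examples.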

Keywords (Chinese)

★ data mining (資料挖掘)
★ query-based learning (詢問式學習法)
★ back-propagation neural networks (倒傳遞類神經網路)

Keywords (English)

★ query-based learning
★ backpropagation neural networks
★ data mining

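Table of Contents

Abstract
Acknowledgments
Chapter 1 Introduction
1.1 Research Background
1.2 Research Problem
1.3 Research Motivation and Objectives
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Data Mining
2.2 Learning Algorithms
2.3 Scalability of Learning Algorithms
2.4 Multilayer Perceptrons and Back-Propagation Learning
Chapter 3 Query-Based Learning
3.1 Inverse Back-Propagation and Confusion-Boundary Search
3.2 Gradient Conjugate Data and Oracle Queries
3.3 A New Query Oracle
3.4 Query-Based Back-Propagation Learning
Chapter 4 Experiments and Results
4.1 Experimental Design
4.2 Application to Disease Diagnosis
4.3 Application to Demographic Data
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References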

References

【1】 I-Cheng Yeh (葉怡成), (2001), 類神經網路模式應用與實作 [Application and Implementation of Neural Network Models], 儒林書局.
【2】 M. Berry and G. Linoff, (1997), Data Mining Techniques: For Marketing, Sales and Customer Support, John Wiley & Sons.
【3】 The Technology Review Ten (2001). MIT technology review, 2001 January/February. Retrieved March 15, 2003, from http://www.technologyreview.com/magazine/jan01/tr10_toc.asp.
【4】 TIME (2000). Vision of the 21st Century - What Will Be the 10 Hottest Jobs? Retrieved June 21, 2002, from http://dmlab.snu.ac.kr/press/TIME_com Visions of the 21st Century -- Our Work, Our World -- May 1, 2000.htm
【5】 IDC (2002, Mar). Data Mining Software Market Forecaster. Retrieved March 27, 2003, from http://www.idc.com/getdoc.jhtml?containerId=PT046.
【6】 J. Shavlik, R. Mooney and G. Towell, (1991), "Symbolic and neural net learning algorithms: An empirical comparison," Machine Learning, Vol. 6, pp.111-143.
【7】 Hongjun Lu, R. Setiono, and Huan Liu, (1996), "Effective data mining using neural networks," IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, pp.957-961.
【8】 H. Chung, P. Gray, (1999) "Special Section: Data Mining", Journal of Management Information Systems, Vol. 16, pp.13-16.
【9】 Ray-I Chang and Pei-Yung Hsiao, (1997), "Unsupervised query-based learning of neural networks using selective-attention and self-regulation," IEEE Trans. Neural Networks, vol.8, no.2, pp.205-217.
【10】 U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, (1996) "Advances in Knowledge Discovery and Data Mining," AAAI/MIT Press.
【11】 M.S. Chen, J. Han, and P.S. Yu, (1996), "Data mining: an overview from a database perspective," IEEE Transactions on Knowledge and Data Engineering, 8(6), pp.866-883.
【12】 B. Rajagopalan and R. Krovi, (2002), "Benchmarking data mining algorithms," Journal of Database Management, 13(1), pp. 25-35.
【13】 U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, (1996), "From data mining to knowledge discovery in databases," AI Magazine, 17(3), pp. 37-54.
【14】 D. E. Rumelhart, G. E. Hinton, and R. J. Williams, (1986), "Learning internal representations by error propagation," in Parallel Distributed Processing, Vol. 1, pp. 318-362, Cambridge, MA: MIT Press.
【15】 L. Valiant, (1984), "A theory of the learnable," Communications of the Association for Computing Machinery, 27, pp.1134-1142.
【16】 Lyman, Peter and Hal R. Varian, (2000), How Much Information, Retrieved March 27, from http://www.sims.berkeley.edu/how-much-info.
【17】 F. Provost, and V. Kolluri, (1999), "A Survey of Methods for Scaling Up Inductive Algorithms," Data Mining and Knowledge Discovery 3(2), pp.131-169.
【18】 H. Czap, (2001), "Construction and interpretation of multi-layer-perceptrons," IEEE International Conference on Systems, Man, and Cybernetics, Vol.5, pp.3349-3354.
【19】 J. Hwang, J. Choi, S. Oh, R. Marks II, (1991), "Query-based learning applied to partially trained multilayer perceptrons," IEEE Transactions on Neural Networks, Vol. 2, pp.131-136.
【20】 E. W. Saad, J. J. Choi, J. L. Vian, and D. C. Wunsch II, (1999), "Efficient training techniques for classification with vast input space," IJCNN '99. International Joint Conference on Neural Networks, Vol.2, pp.1333-1338.
【21】 T. Oates, and D. Jensen, (1997), "The Effects of Training Set Size on Decision Tree Complexity," In Proceedings of The Fourteenth International Conference on Machine Learning, pp.254-262.
【22】 E. B. Baum, (1991), "Neural-net algorithms that learn in polynomial time from examples and queries," IEEE Transactions on Neural Networks, Vol. 1, pp.5-19.
【23】 M. Wann, T. Hediger, and N. Greenbaun, (1990), "The influence of training sets on generalization in feed-forward neural networks," In Proc. Int. Joint. Conf. Neural Networks, San Diego, CA, pp.137-142.
【24】 A. Linden, and J. Kindermann, (1989), "Inversion of multilayer nets," IJCNN, International Joint Conference on Neural Networks, vol.2, pp.425-430.
【25】 R. Reed, R. J. Marks II and S. Oh, (1995), "Similarities of error regularization, sigmoid gain scaling, target smoothing, and training with jitter," IEEE Transactions on Neural Networks, Volume: 6 Issue: 3, pp.529-538.
【26】 J. M. DeLeo, and S. J. Rosenfeld, (2001), "Essential roles for receiver operating characteristic (ROC) methodology in classifier neural network applications," in Proc. Int. Joint Conf. Neural Networks, Vol.4, pp. 2730-2731.
【27】 K. Woods and K. W. Bowyer, (1997), "Generating ROC curves for artificial neural networks," IEEE Trans. Medical Imaging, vol. 16, no. 3, pp.329-337.
【28】 MedCalc (2003). Statistical software including ROC curve analysis and comparison of ROC curves. Retrieved March 27, 2003, from http://www.medcalc.be.
【29】 C. Blake, E. Keogh, C. J. Merz, (1998), UCI Repository of machine learning databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, Irvine, University of California, Department of Information and Computer Science.
【30】 Department of Health, Executive Yuan (行政院衛生署) (2003), Summary of Cause-of-Death Statistics for 2001 (民國90年死因統計結果摘要), Retrieved March 27, 2003, from http://www.doh.gov.tw/statistic/index.htm.

Advisor

Ray-I Chang (張瑞益)

Approval Date

2003-6-26
