A Study on Applying Genetic Algorithms to the Discretization of Continuous Attributes

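Name

Hsien-Lian Chiu (邱獻良)

Department

Department of Information Management

Thesis Title

A Study on Applying Genetic Algorithms to the Discretization of Continuous Attributes
(Apply Genetic Algorithms to Discretization)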

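The author has authorized this electronic thesis for immediate open access.
Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.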

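Abstract (translated from Chinese)

Discretizing continuous attributes can be viewed as the problem of selecting a set of cut points for the attributes. Most previous studies aimed to find a minimal set of cut points while preserving the consistency of the data. However, maintaining too high a level of consistency may lead a classification algorithm to induce too many rules that generalize poorly. Discretization should therefore take generality into account alongside consistency, because rules that generalize well are easy to understand and explain. This study proposes a discretization method based on genetic algorithms whose goal is to efficiently find an optimal compromise set of cut points under both consistency and generality considerations. Two sets of experiments were designed, using data sets from the machine learning repository of the University of California, Irvine. The empirical results show that the method can produce simplified discretizations according to the user's requirements and can help classification algorithms induce rules that generalize well and also achieve high predictive accuracy.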
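To make the trade-off between consistency and generality concrete, here is a minimal sketch, not taken from the thesis. It assumes a hypothetical toy data set and uses the number of distinct (discretized row, class) pairs as a crude stand-in for the number of rules a classifier would induce. The function names `discretize` and `count_rules` are illustrative only.

```python
from bisect import bisect_right

def discretize(row, cuts_per_attr):
    """Replace each continuous value by the index of the interval it falls in."""
    return tuple(bisect_right(cuts, value) for value, cuts in zip(row, cuts_per_attr))

def count_rules(rows, labels, cuts_per_attr):
    """Crude proxy for rule count: distinct (discretized row, class) pairs."""
    return len({(discretize(r, cuts_per_attr), y) for r, y in zip(rows, labels)})

# Hypothetical toy data: two continuous attributes, two classes.
rows = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5), (4.0, 6.5)]
labels = ["a", "a", "b", "b"]

# Each inner list holds the sorted cut points of one attribute.
fine = [[1.5, 2.5, 3.5], [5.25, 5.75, 6.25]]  # every candidate cut point kept
coarse = [[2.5], []]                           # a single cut on the first attribute

print(count_rules(rows, labels, fine))    # 4: one specific rule per row
print(count_rules(rows, labels, coarse))  # 2: one general rule per class
```

Both cut-point sets separate the two classes perfectly on this toy data, yet the coarse one does so with a single cut and half as many rule-like patterns, which is the kind of simpler, more general result the abstract argues for.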

Abstract (English)

Discretization of continuous attributes is one of the main problems to be solved in data
mining. Discretization can be viewed as the problem of selecting a set of cut points for the
attributes. Past studies concentrated on finding a minimal set of cut points while maintaining
the fidelity of the original data. However, maintaining too high a level of
consistency may yield too many unnecessary rules that are not general. Generality is
an important aspect of discretization because general rules are usually useful and easy
to interpret. In this paper, a genetic algorithm based approach is proposed whose aim
is to efficiently find an optimal compromise discretization between the generality
and consistency criteria. Two sets of experiments were conducted with this approach on
data sets from the UCI Machine Learning Repository. The empirical results demonstrate
that the GA approach can generate the simplest discretization result according to the
requirements of the decision maker and help the classifier induce general rules with high
predictive accuracy.
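The record does not give the thesis's actual encoding, operators, or fitness function, so the following is only a sketch of how a GA search over cut points might look under stated assumptions. It assumes a binary chromosome with one bit per candidate cut point (midpoints between adjacent distinct values), binary tournament selection, one-point crossover, bit-flip mutation, and elitism. The fitness, consistency minus a weighted share of cuts kept, is a hypothetical stand-in for the compromise between consistency and generality. All names, parameters, and the toy data are illustrative.

```python
import random
from bisect import bisect_right
from collections import defaultdict

def candidate_cuts(rows):
    """Candidate cut points: midpoints between adjacent distinct values of each attribute."""
    cands = []
    for j in range(len(rows[0])):
        vals = sorted({r[j] for r in rows})
        cands += [(j, (a + b) / 2) for a, b in zip(vals, vals[1:])]
    return cands

def consistency(rows, labels, cuts_per_attr):
    """Fraction of rows whose discretized form is shared only by rows of one class."""
    seen = defaultdict(set)
    keys = [tuple(bisect_right(c, v) for v, c in zip(r, cuts_per_attr)) for r in rows]
    for k, y in zip(keys, labels):
        seen[k].add(y)
    return sum(len(seen[k]) == 1 for k in keys) / len(rows)

def fitness(bits, cands, rows, labels, weight):
    """Assumed trade-off: consistency minus a penalty on the share of cuts kept."""
    cuts = [[] for _ in rows[0]]
    for bit, (j, c) in zip(bits, cands):
        if bit:
            cuts[j].append(c)  # candidates are already ascending within each attribute
    return consistency(rows, labels, cuts) - weight * sum(bits) / max(len(bits), 1)

def evolve(rows, labels, weight=0.5, pop_size=20, generations=40, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    cands = candidate_cuts(rows)
    n = len(cands)

    def score(bits):
        return fitness(bits, cands, rows, labels, weight)

    # Start from the "keep every cut" chromosome plus random ones.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=score)[:]]                # elitism: carry over the best
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)   # binary tournament selection
            p2 = max(rng.sample(pop, 2), key=score)
            x = rng.randrange(1, n) if n > 1 else 0   # one-point crossover position
            child = p1[:x] + p2[x:]
            nxt.append([b ^ 1 if rng.random() < p_mut else b for b in child])  # bit-flip mutation
        pop = nxt
    best = max(pop, key=score)
    return best, cands, score(best)

# Hypothetical toy data: two continuous attributes, two classes.
rows = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5), (4.0, 6.5)]
labels = ["a", "a", "b", "b"]
best, cands, best_score = evolve(rows, labels)
kept = [cut for bit, cut in zip(best, cands) if bit]
print(kept, round(best_score, 3))
```

Raising `weight` pushes the search toward fewer cut points (more general results), while lowering it favors consistency. This is one plausible way to let a decision maker steer the compromise the abstract describes.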

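Keywords (translated from Chinese)

★ rough sets
★ genetic algorithm
★ attribute discretization
★ induction of classification rules

Keywords (English)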

★ rule induction
★ rough set theory
★ genetic algorithm
★ discretization

Table of Contents

1 Introduction
2 Related Works
2.1 Discretization Method
2.2 Rough Set Theory
2.2.1 Information Table
2.2.2 Approximation of Sets
2.2.3 Reduction of Attributes
2.2.4 Decision Rules
2.2.5 Decision Support Using Decision Rules
2.3 Genetic Algorithms
2.3.1 Encoding
2.3.2 Initialization
2.3.3 Selection
2.3.4 Crossover
2.3.5 Mutation
2.3.6 Replacement
2.3.7 Termination Criterion
3 GA-Based Discretization Approach
3.1 Definition of the Discretization Problem
3.2 Genetic Algorithms for Discretization Problems
3.2.1 Encoding
3.2.2 Initialization of the Population
3.2.3 Selection, Crossover and Mutation Operations
3.2.4 Fitness Function
3.2.5 Terminating Criterion
4 Experiments
4.1 Real World Data for the Experiments
4.2 Experiment 1
4.3 Experiment 2
4.4 Validation Method
4.5 Procedure of the Experiments
5 Experiment Results
5.1 Empirical Results of Experiment 1
5.2 Empirical Results of Experiment 2
6 Conclusions
6.1 Contributions
6.2 Limitations
6.3 Future Works

Advisor

Jiah-Shing Chen (陳稼興)

Approval Date

2005-7-1
