Decision Tree Induction with Time Stamp of Information Acquisition

Name

Sheng-Pao Hsu (徐聖堡)

Department

Department of Information Management

Thesis Title

Decision Tree Induction with Time Stamp of Information Acquisition
(考量屬性值取得延遲的決策樹建構)

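Related Theses

★ A Study of Business Intelligence in the Retail Industry
★ Building an Anomaly Detection System for Landline Telephone Calls
★ Applying Data Mining Techniques to the Analysis of School Grades and Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program
★ Using Data Mining Techniques to Improve Wealth Management Effectiveness: A Case Study of a Bank
★ Evaluation and Analysis of Wafer Manufacturing Yield Models: The Case of a Domestic DRAM Manufacturer
★ A Study of Business Intelligence Analysis Applied to Student Grades
★ Using Data Mining Techniques to Build a Prediction Model of Academic Achievement for Upper-Grade Elementary School Students
★ Applying Data Mining Techniques to Build a Risk Assessment Model for Motorcycle Loans: The Case of Company A
★ A Study of Performance Indicator Evaluation Applied to Improving R&D Design Quality Assurance
★ Applying Machine Learning to Text Résumés and Personality Traits to Improve Hiring Quality
★ A General Framework Based on a Relation Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling
★ Generalized Knowledge Discovery in Relational Databases
★ A Method for Finding Preference Graphs from Sequence Data, Applied to the Group Ranking Problem
★ Using a Partitional Clustering Algorithm to Find Consensus Groups for Group Decision-Making Problems
★ A Novel Method for Ordered Consensus Groups Applied to Group Decision-Making Problems
★ Community Mining Using Interaction Information in Social Networks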

Files

Full text in the system: never open to the public

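Abstract (translated from the Chinese)

Data mining can extract useful information and rules from tens of millions of records, helping decision makers identify target customer groups and make better decisions. Classification is one of the most widely applied data mining techniques: it builds a classification model from known data and their class attributes, then uses that model to predict the class of unclassified data. The decision tree is the most commonly used classification model because its rules are easy to understand and it is fast and simple to build. A traditional decision tree, however, can only be grown once the complete historical data is available, whereas managers want to anticipate possible future changes at an earlier point in time.
This study therefore assigns a delay time to the acquisition of each attribute in each record and builds TStree, a decision tree that takes the time stamp concept into account. Through TStree, managers and decision makers expect to find rules that balance the decision time point against classification accuracy, so that they can anticipate possible future changes.
The experimental results show that the TStree algorithm can make decisions early using only part of the data, while achieving accuracy close to that of traditional C4.5.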
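The trade-off the abstract describes, deciding earlier versus classifying more accurately, can be illustrated with a small sketch. This is not the TStree algorithm itself, which the record does not spell out. The attribute names, delay values, records, and the helper functions `available_attributes`, `decision_time`, and `training_accuracy` are all hypothetical. The sketch assumes only that a rule cannot fire before its slowest attribute has arrived.

```python
from collections import Counter, defaultdict

# Hypothetical delay times (e.g. in days) before each attribute value arrives.
DELAY = {"a1": 1, "a2": 3, "a3": 7}

# Made-up records: attribute values plus a class label.
RECORDS = [
    ({"a1": "x", "a2": "p", "a3": "u"}, "up"),
    ({"a1": "x", "a2": "p", "a3": "v"}, "up"),
    ({"a1": "x", "a2": "q", "a3": "u"}, "down"),
    ({"a1": "y", "a2": "q", "a3": "v"}, "down"),
    ({"a1": "y", "a2": "p", "a3": "v"}, "down"),
    ({"a1": "y", "a2": "q", "a3": "u"}, "up"),
]


def available_attributes(deadline):
    """Attributes whose values have arrived by the given deadline."""
    return sorted(a for a, d in DELAY.items() if d <= deadline)


def decision_time(attributes):
    """A rule can fire only once its slowest attribute has arrived."""
    return max((DELAY[a] for a in attributes), default=0)


def training_accuracy(attributes):
    """Majority-vote accuracy when records are grouped by the given attributes."""
    groups = defaultdict(Counter)
    for values, label in RECORDS:
        key = tuple(values[a] for a in attributes)
        groups[key][label] += 1
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(RECORDS)


if __name__ == "__main__":
    for deadline in (0, 1, 3, 7):
        attrs = available_attributes(deadline)
        print(deadline, attrs, decision_time(attrs), round(training_accuracy(attrs), 3))
```

On this toy data, waiting longer makes more attributes usable and raises the achievable accuracy, which is the tension a time-aware tree has to balance when choosing split attributes.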

Abstract (English)

Data mining can discover useful information and rules from tens of millions of records, and it can identify target customers and support better decisions. Classification is one of the most widely applied data mining techniques: it builds a classification model from available data and their class attributes and uses this model to predict the class of unclassified data. The decision tree is the most commonly used classification model because it generates rules that are easy to understand and is fast and simple to build. A traditional decision tree needs the complete historical data before it can be grown, but managers hope to make decisions earlier.
Therefore, this study assigns a delay time (time stamp) to each attribute and builds a decision tree, TStree, that takes the time stamp concept into account. Decision makers hope to use TStree to find rules that combine an early decision point with good classification accuracy.
The results show that, with only part of the complete data set, decisions can be made early through the TStree algorithm, and that its accuracy is close to that of traditional algorithms (e.g., ID3 and C4.5).
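For context on the baseline the abstract compares against: C4.5 ranks candidate split attributes by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below shows that criterion only, on made-up data. The function names `entropy` and `gain_ratio` are illustrative, and the sketch leaves out the rest of C4.5 (continuous attributes, missing values, pruning) as well as whatever time-aware criterion TStree uses.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in a list."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, normalised by split information."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute values themselves
    return gain / split_info if split_info > 0 else 0.0


if __name__ == "__main__":
    labels = ["yes", "yes", "no", "no"]
    print(gain_ratio(["a", "a", "b", "b"], labels))  # attribute separates the classes
    print(gain_ratio(["a", "b", "a", "b"], labels))  # attribute carries no class information
```

An attribute that separates the classes perfectly scores highest, while one that is independent of the class scores zero.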

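Keywords (Chinese)

★ early decision time (提早決策時間)
★ delay time (延遲時間)
★ time stamp (時間標籤)
★ classification (分類)
★ decision tree (決策樹)

Keywords (English)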

★ decision time early
★ delay time
★ time stamp
★ classification
★ decision tree

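Table of Contents

Table of Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Problem
1-3 Expected Results
1-4 Research Process
1-5 Thesis Organization
2. Literature Review
2-1 C4.5 Algorithm
2-2 Data Mining
2-3 Classification
2-4 Decision Trees
2-5 Related Research
2-5-1 Early Decision Making
2-5-2 Decision Tree Literature on Cost-sensitive Analysis
2-5-3 Decision Tree Literature on Time Series
2-5-4 Decision Tree Literature on the Stock Market
3. Problem Description and Related Definitions
4. TStree Algorithm
4-1 Basic Concepts of the TStree Algorithm
4-2 Framework of the TStree Algorithm
4-3 Worked Example of the TStree Algorithm
4-3-1 Before Classification: Confidence and the Root Node Class
4-3-2 Computing the Classification Confidence of Attribute Values, i.e., Confax(Dni)
4-3-3 Criteria for Selecting the Splitting Attribute of a Node
4-3-4 Criteria for Selecting the Splitting Attribute of a Node
4-3-5 Determining the Splitting Attribute of the Root Node: a2
4-3-6 Repeating the Above Steps to Grow the TStree Decision Tree
5. Experimental Evaluation
5-1 Development Environment
5-2 Data Sources
5-3 Evaluation Criteria
5-4 Data Preprocessing
5-4-1 Assigning Intermediate Attribute Values
5-4-2 Assigning Class Label Values
5-4-3 Findings When Determining Class Label Values
5-5 Experimental Results
5-5-1 Scenario 1: Predicting the Stock Price on the 7th Trading Day
5-6 Analysis of Results
6. Conclusions and Suggestions
6-1 Conclusions
6-2 Research Contributions
6-3 Future Research
References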

Advisor

Yen-Liang Chen (陳彥良)

Approval Date

2010-6-3
