Abstract (English) |
Because the birth rate in our country has declined year after year and medical regulations restrict the therapeutic applications of cord blood and umbilical cord stem cells, competition in the stem cell storage industry is extremely fierce. In such a competitive industry, how to use lists of potential customers effectively and convert those potential customers into actual stem cell storage customers is an important operational issue for storage providers. This study focuses on the analysis of potential customer placements. Using a collected list of potential customers and the results of data mining with supervised machine learning techniques, we aim to provide useful reference information to corporate decision makers and business units. Traditional rules of thumb lack objective reliability and validity, so relying on them leads to inaccurate targeting of customers and lost profit opportunities. The aims of this study are to identify the best prediction model for potential customer placements using supervised machine learning techniques, to compare single classifiers with classifier ensembles, and to provide the case company, and other companies in related industries, with reference information for predicting potential customers.
To find the best prediction model for potential customer placements, the experiments were designed to use the Weka data mining software to compare different classification techniques. We adopted four single classification techniques: decision tree, support vector machine, artificial neural network, and k-nearest neighbor. In addition, the Bagging and AdaBoost methods were employed to construct classifier ensembles from the four single classifiers.
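To make the single-versus-ensemble comparison concrete, the following is a minimal sketch of a k-nearest neighbor classifier and a Bagging ensemble built on top of it. It is not the Weka configuration used in the study: it is plain Python, the toy data and the "yes"/"no" labels are hypothetical, and the choices of k = 3 and 11 bootstrap models are arbitrary assumptions for illustration.

```python
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Label a query by majority vote among its k nearest training rows."""
    # Each training row is (features, label); distance is squared Euclidean.
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def bagging_predict(train, query, n_models=11, k=3, seed=0):
    """Bagging: train one k-NN per bootstrap sample, then majority-vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) rows with replacement.
        sample = [rng.choice(train) for _ in train]
        votes.append(knn_predict(sample, query, k))
    return Counter(votes).most_common(1)[0][0]

def accuracy(train, test, predict):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict(train, features) == label for features, label in test)
    return hits / len(test)

# Hypothetical toy data: two well-separated groups of potential customers.
TOY_TRAIN = (
    [((0.1 * i, 0.2 * i), "no") for i in range(10)]
    + [((5 + 0.1 * i, 5 + 0.2 * i), "yes") for i in range(10)]
)
TOY_TEST = [((0.5, 0.5), "no"), ((5.5, 6.0), "yes")]

if __name__ == "__main__":
    print("single k-NN accuracy:", accuracy(TOY_TRAIN, TOY_TEST, knn_predict))
    print("bagged k-NN accuracy:", accuracy(TOY_TRAIN, TOY_TEST, bagging_predict))
```

On real customer data the two variants would generally differ, which is the comparison the experiments are designed to measure. AdaBoost differs from Bagging in that it reweights misclassified training rows between rounds instead of resampling uniformly.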
The experimental results show that, with the 2014 training data, the nearest neighbor classifier provides the best performance. Among the classifier ensembles, the Bagging-based artificial neural network and the AdaBoost-based decision tree perform best. In particular, the classification accuracy and the area under the receiver operating characteristic (ROC) curve reach about 0.68 and 0.7, respectively. We therefore suggest that the case company first adopt the nearest neighbor algorithm to predict potential customer placements, and use the Bagging-based neural network and the AdaBoost-based decision tree models alongside it for the same prediction. |