Applying Data Mining Techniques to Predict Potential Customers: A Case Study of a Cord Blood Company

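Name

黃柏中 (Huang-Po Chung)

Department

Department of Information Management (In-service Master's Program)

Thesis title

Applying Data Mining Techniques to Predict Potential Customers: A Case Study of a Cord Blood Company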

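Related theses

★ Building a Sales Forecasting Model for Commercial Multifunction Printers Using Data Mining Techniques
★ Applying Data Mining Techniques to Resource Allocation Forecasting: A Case Study of a Computer OEM Support Unit
★ Applying Data Mining Techniques to Flight Delay Analysis in the Airline Industry: A Case Study of Company C
★ Security Control of New Products in the Global Supply Chain: A Case Study of Company C
★ Applying Data Mining to the Semiconductor Laser Industry: A Case Study of Company A
★ Applying Data Mining Techniques to Predict Warehouse Storage Time of Air Export Cargo: A Case Study of Company A
★ Optimizing YouBike Rebalancing Operations Using Data Mining Classification Techniques
★ The Effect of Feature Selection on Different Data Types
★ Applying Data Mining to B2B Corporate Websites: A Case Study of Company T
★ Customer Investment Analysis and Recommendations for Financial Derivatives: Integrating Clustering and Association Rule Techniques
★ Building a Computer-Aided Classification Model for Liver Ultrasound Images Using Convolutional Neural Networks
★ An Identity Recognition System Based on Convolutional Neural Networks
★ A Comparative Analysis of Error Rates of Electric Energy Data Imputation Methods in Energy Management Systems
★ Development of an Employee Sentiment Analysis and Management System
★ Data Cleaning for the Class Imbalance Problem: A Machine Learning Perspective
★ Applying Data Mining Techniques to the Analysis of Passenger Self-Service Check-in: A Case Study of Airline C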

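The access permission for this electronic thesis is: approved for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.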

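Abstract (Chinese)

Taiwan's birth rate continues to decline every year, and therapeutic use of cord blood and umbilical cord stem cells is constrained by medical regulations, so competition in the industry is intense. In such a competitive industry, making effective use of potential customer lists to convert potential customers into stem cell storage customers is an important operational issue for stem cell storage providers. This study focuses on the placement analysis of potential customers: based on the potential customer list data collected so far, it uses information derived from supervised data mining techniques to give the case company's decision makers and sales units a basis for action. Traditional rules of thumb may lack objective reliability and validity in practice, leading to inaccurate targeting of customers and lost profit opportunities. The objectives of this study are to use supervised data mining techniques to build a prediction model for potential customer placement, to compare single classification techniques with multiple classification techniques, and to provide a reference for the case company and related industries in predicting potential customers.
The experiments use the Weka data mining software to compare different classification techniques. Four single classification techniques are used: decision tree, support vector machine, neural network, and nearest neighbor. They are combined with the multiple classification techniques Bagging and AdaBoost for validation, in an attempt to obtain the best prediction model for potential customer placement.
The experimental results show that, for the 2014 training data, the nearest neighbor algorithm performs best among the single classification techniques, while among the multiple classification techniques the Bagging-based neural network and the AdaBoost-based decision tree perform best. In the Weka results, accuracy (Correctly Classified Instances) and the receiver operating characteristic (ROC) curve value generally reach about 0.68 and 0.7 respectively, which gives the models reasonable reference value. This study therefore recommends that the case company, when predicting potential customer placement in the future, give priority to the nearest neighbor algorithm among single classification techniques, together with the Bagging-based neural network and the AdaBoost-based decision tree among multiple classification techniques.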

Abstract (English)

Because the birth rate in our country continues to decline every year and therapeutic applications of cord blood and umbilical cord stem cells are restricted by medical regulations, competition in the industry is extremely fierce. In such a competitive industry, using the list of potential customers effectively so that potential customers become stem cell storage customers is an important operational issue for stem cell storage providers. This study focuses on the analysis of potential customer placements. Based on the collected list of potential customers, it applies supervised machine learning techniques for data mining to give corporate decision makers and the business units that execute their decisions a reference to act on. Because they lack objective reliability and validity, traditional rules of thumb can place target customers inaccurately and cost the company profit opportunities. The aims of this study are to discover the best prediction model for potential customer placements using supervised machine learning techniques; to compare single and multiple classification techniques; and to provide the case company and related industries with a reference for predicting potential customers.
To find the best prediction model for potential customer placements, the experiments use the Weka data mining software to compare different classification techniques. Four single classification techniques are adopted: decision tree, support vector machine, artificial neural network, and k-nearest neighbor. In addition, the Bagging and AdaBoost methods are employed to construct classifier ensembles of the four single classifiers.
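As a rough illustration of how a classifier ensemble wraps a single classifier, the following minimal sketch applies Bagging to a nearest neighbor base classifier. It is not the thesis's Weka setup: the pure-Python implementation, the choice of 1-NN, the function names, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def nearest_neighbor_predict(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    # `train` is a list of (features, label) pairs; squared Euclidean distance.
    closest = min(
        train,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    return closest[1]


def bagging_predict(train, query, n_models=11, seed=0):
    """Majority vote of 1-NN models, each built on a bootstrap resample."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Bootstrap: draw len(train) rows with replacement.
        sample = [rng.choice(train) for _ in train]
        votes.append(nearest_neighbor_predict(sample, query))
    return Counter(votes).most_common(1)[0][0]


# Hypothetical toy data: label 1 = became a storage customer, 0 = did not.
toy_train = [
    ((0.0, 0.1), 0), ((0.2, 0.0), 0), ((0.1, 0.3), 0),
    ((1.0, 0.9), 1), ((0.8, 1.0), 1), ((0.9, 0.7), 1),
]

single_label = nearest_neighbor_predict(toy_train, (0.9, 0.9))
bagged_label = bagging_predict(toy_train, (0.9, 0.9))
```

Each bootstrap model sees a slightly different resample of the training rows, and the ensemble label is the majority vote. AdaBoost differs in that it reweights misclassified rows between rounds rather than resampling uniformly.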
The experimental results show that, with the 2014 training data, the nearest neighbor classifier performs best among the single classifiers. Among the classifier ensembles, the Bagging-based artificial neural network and the AdaBoost-based decision tree perform best. In particular, the classification accuracy and the receiver operating characteristic (ROC) curve value reach about 0.68 and 0.7, respectively. We therefore suggest that the case company adopt the nearest neighbor algorithm first for predicting potential customer placements, and use the Bagging-based neural network and AdaBoost-based decision tree models alongside it.
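The two reported measures can be computed as in the sketch below. It assumes the ROC figure refers to the area under the ROC curve, which the abstract does not state explicitly, and it uses made-up labels and scores rather than the thesis data. The area is computed through its rank interpretation: the fraction of positive/negative pairs in which the positive instance receives the higher score.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores (not from the thesis).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.5, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold is an assumption

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen decision threshold, while the ROC area summarizes ranking quality across all thresholds, which is why the two are usually reported together.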

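Keywords (Chinese)

★ 資料探勘 (data mining)
★ 監督式學習技術 (supervised learning techniques)
★ 預測模型 (prediction model)
★ 單一分類技術與多重分類技術 (single classification techniques and multiple classification techniques)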

Keywords (English)

★ data mining
★ supervised machine learning techniques
★ prediction model
★ single classification technique and multiple classification technique

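Table of Contents

Chinese Abstract
English Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Subject and Scope
1.3.1 Introduction to the Case Company
1.3.2 Scope of Data Collection
1.3.3 Limitations of Data Collection
1.4 Thesis Structure
Chapter 2 Literature Review
2.1 Introduction to Data Mining
2.1.1 Definition of Data Mining
2.1.2 Data Mining Methods
2.1.3 The Data Mining Process
2.2 Relationship Between Product Pricing Strategy and the Business Cycle
2.2.1 Product Pricing Strategy
2.2.2 Introduction to Business Cycle Indicators
2.2.3 Monitoring Indicator
2.2.4 Monitoring Lights
Chapter 3 Research Method
3.1 Research Framework
3.2 Case Company
3.3 Data Collection
3.4 Data Preprocessing
3.5 Data Mining Software
3.6 Data Mining Classification Techniques
3.6.1 Single Classification Techniques
3.6.1.1 Decision Tree
3.6.1.2 Support Vector Machine (SVM)
3.6.1.3 Neural Network (Multilayer Perceptron)
3.6.1.4 Nearest Neighbor Algorithm (KNN)
3.6.2 Multiple Classification Techniques
3.6.2.1 Bagging
3.6.2.2 AdaBoost
Chapter 4 Results
4.1 Results of Single Classification Techniques
4.2 Results of Multiple Classification Techniques
4.3 Discussion
Chapter 5 Conclusion
5.1 Research Conclusions
5.2 Research Contributions
5.3 Research Limitations and Future Research Directions
5.3.1 Research Limitations
5.3.2 Suggested Future Research Directions
References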

Advisor

蔡志豐 (Tsai)

Approval date

2016-6-7
