Performance Comparison for Distributed Computing Model in Long-Running Financial Analysis Computation

Name

何峻昇 (JYUN SHENG HE)

Department

Department of Information Management

Thesis Title

Performance Comparison for Distributed Computing Model in Long-Running Financial Analysis Computation
(針對長時運算財務分析模型的分散式運算模式效率之比較)

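The access permission for this electronic thesis is: open immediately.
Openly accessible electronic full text is licensed only for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.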

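Abstract (Chinese)

In the global financial field, many studies apply existing financial analysis models, but they face technical problems: for example, computing complex financial analysis models and processing big data consume a great deal of computing time. The existing literature on distributed computing for CPU-intensive and I/O-intensive problems offers no fair standard or development process, so developers cannot tell which distributed computing approach suits which type of intensive workload, or how to carry out distributed computing at lower cost.
This study addresses the CPU-intensive and I/O-intensive problems faced in the financial field. It applies distributed computing models to make financial analysis models and big-data processing more efficient and thereby cut the long computing times, and it compares the efficiency of three commonly used distributed computing models on CPU-intensive and I/O-intensive workloads, so that developers can choose the model better suited to a given workload and improve the computational efficiency of financial models. The results, however, did not match the original expectations, so the study examines how to revise the preliminary process so that the revised process identifies the causes of poor efficiency more quickly.
The revised preliminary process proposed in this study does identify the causes of poor efficiency more quickly. It is divided into seven stages. Step 1: select suitable financial analysis models according to their CPU-intensive and I/O-intensive characteristics. Step 2: choose programming languages suited to developing large, complex computations; this study uses SAS and MATLAB. Step 3: perform a preliminary efficiency check of the complex computations by comparing the initial efficiency of the programs that need to be converted. Step 4: to keep the model code consistent, convert the SAS programs into MATLAB programs while ensuring that program complexity does not increase. Step 5: write the distributed computing programs so that both the programs and the data are processed in a distributed manner. Step 6: run the experiments. Step 7: analyze and discuss the efficiency of the distributed computing models.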

Abstract (English)

In the global financial field, many studies apply existing financial analysis models, but they run into technical problems. For example, computing complex financial analysis models and handling big data take a great deal of computing time. The literature on distributed computing for CPU-intensive and I/O-intensive problems offers no fair standard or development process, so developers cannot tell which kind of distributed computing suits which type of intensive workload, or how to carry out distributed computing in a cost-effective manner.
This study addresses the CPU-intensive and I/O-intensive problems faced in the financial field. It uses distributed computing models to improve the computational efficiency of financial analysis models and big-data processing and thereby reduce the long computing times, and it compares the efficiency of three common distributed computing models on CPU-intensive and I/O-intensive workloads, so that developers can choose the model better suited to a given workload and improve the efficiency of financial models. Because the results differed from what was originally expected, the study also examines how to amend the original process so that the causes of inefficiency can be identified quickly.
The study presents a revised preliminary process that does identify the causes of poor efficiency more quickly. It is divided into seven steps. First, select suitable financial calculation models according to their CPU-intensive and I/O-intensive properties. Second, determine the programming languages best suited to developing complex computations; this study uses SAS and MATLAB. Third, compare the initial efficiency of the programs to be converted. Fourth, convert the SAS programs into MATLAB programs while ensuring that program complexity does not increase. Fifth, develop the distributed computing programs. Sixth, run the experiments. Finally, analyze and discuss the efficiency of the distributed computing models.
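The kind of efficiency comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's SAS, Hadoop, or Java RMI code: the workload (a sum of squares), every function name, and the speedup and efficiency formulas (serial time divided by parallel time, then divided by the worker count) are assumptions chosen for illustration, and a local thread pool merely stands in for remote worker nodes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(bounds):
    """Hypothetical CPU-bound stand-in task: sum i*i for i in [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def split_range(n, parts):
    """Split [0, n) into `parts` contiguous, nearly equal chunks."""
    step, extra = divmod(n, parts)
    chunks, lo = [], 0
    for k in range(parts):
        hi = lo + step + (1 if k < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def run_serial(n):
    """Baseline: compute the whole range in a single call."""
    return partial_sum_of_squares((0, n))


def run_partitioned(n, workers):
    """Partition the range, compute each chunk on a worker, combine the results.

    Assumption: threads stand in for worker nodes. For pure-Python CPU-bound
    work they will not show a real speedup; only the structure is illustrated.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, split_range(n, workers)))


def timed(fn, *args):
    """Return (result, elapsed seconds) for one call of fn(*args)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def speedup_and_efficiency(t_serial, t_parallel, workers):
    """Speedup = t_serial / t_parallel; efficiency = speedup / workers."""
    speedup = t_serial / t_parallel
    return speedup, speedup / workers
```

A comparison run would call `timed(run_serial, n)` and `timed(run_partitioned, n, workers)`, check that both return the same result, and pass the two elapsed times to `speedup_and_efficiency`. Checking result equality before comparing times mirrors the abstract's concern that the converted programs stay equivalent to the originals.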

Keywords (Chinese)

★ Distributed Computing (分散式運算)
★ Hadoop
★ SAS
★ MATLAB
★ JAVA RMI

Keywords (English)

★ Distributed Computing
★ Hadoop
★ SAS
★ MATLAB
★ JAVA RMI

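Table of Contents

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
1.4 Research Process
Chapter 2 Literature Review
2.1 Distributed Computing
2.2 Hadoop
2.3 MapReduce
2.4 HDFS
2.5 Programming Languages for Financial Statistics
2.5.1 The SAS Programming Language
2.5.2 The MATLAB Programming Language
2.6 Distributed Computing Models
2.6.1 SAS Distributed Computing Model
2.6.2 MATLAB Distributed Computing Model
2.7 JAVA RMI (Remote Method Invocation)
Chapter 3 Research Design
3.1 Preliminary Research Design
3.1.1 Program Conversion Rules
3.1.2 SAS Distributed Computing Model Workflow
3.1.3 Hadoop Distributed Computing Model Workflow
3.1.4 JAVA RMI Distributed Computing Model Workflow
3.2 Methods and Principles for Comparing Distributed Computing Models
Chapter 4 Research Analysis
4.1 Verification of the Preliminary Process
4.1.1 Hardware and Software Specifications of the Research Environment
4.1.2 Conceptual Diagram of the Computing Environment
4.1.3 Research Questions
4.1.4 Research Results
4.1.4.1 CPU Intensive Experiment: Comparison of SAS and JAVA RMI
4.1.4.2 CPU Intensive Experiment: Comparison of JAVA RMI and Hadoop
4.2 Revising the Preliminary Process
4.2.1 Reasons the Preliminary Process Failed
4.2.2 Verifying the Revised Process
4.2.2.1 Verifying CPU Intensive
4.2.2.2 Verifying I/O Intensive
Chapter 5 Conclusion
References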

Advisors

許智誠 (Kevin Chihcheng Hsu), 賴弘能 (Hung Neng Lai)

Approval Date

2016-7-14
