Abstract (English) |
In the global financial field, many studies apply existing financial analysis models, but technical problems remain. For example, computing complex financial analysis models and processing big data takes a great deal of computing time. For these CPU-intensive and I/O-intensive problems, the distributed computing literature offers no fair evaluation standard or development process. As a result, developers cannot tell which distributed computing model suits which type of intensive workload, or how to adopt distributed computing in a cost-effective manner.
This study addresses the CPU-intensive and I/O-intensive problems faced in the financial field. It uses distributed computing models to improve the efficiency of financial analysis models and big data operations, thereby reducing the long computing times. It also compares the operating efficiency of three common distributed computing models on CPU-intensive and I/O-intensive workloads, so that developers can choose the model better suited to the characteristics of their workload and improve the running efficiency of financial models. However, the results differed from those originally expected, so the study also investigates how to revise the original process so that the cause of poor efficiency can be identified quickly.
The study presents a preliminary revised process that can quickly identify the causes of poor efficiency. The process is divided into seven phases: (1) select financial calculation models suited to the CPU-intensive and I/O-intensive properties; (2) determine which programming language is better suited to developing complex computations (this study uses SAS and MATLAB); (3) compare the initial efficiency of the programs to be converted; (4) convert the SAS program to a MATLAB program while ensuring that the conversion does not increase the program's complexity; (5) develop the distributed computing programs; (6) run the experiments; and (7) analyze and discuss the efficiency of the distributed computing models. |