Thesis/Dissertation 975202031: Detailed Record




Name  Sheng-Hao Liu (劉勝豪)   Department  Computer Science and Information Engineering
Thesis title  A Profiling and Monitoring Framework for Cloud Applications on Hadoop System
(基於Hadoop系統的雲端應用程式特徵擷取與計算監測架構)
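  1. The author has granted immediate open access to this electronic thesis.
  2. The open-access electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
  3. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.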

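Abstract (Chinese)  With the rapid development of cloud computing technology in recent years, more and more fields have adopted it for their computations, and many kinds of applications have been developed as a result. These applications consume large amounts of computing resources, so suitable monitoring software is needed to provide system information that the cloud system can use to adjust resource usage, improving overall system performance and user satisfaction. In addition, a cloud system that charges for usage must provide accurate accounts of each user's resource usage and bill accordingly. However, most current monitoring systems only monitor hardware resource usage, such as CPU utilization, memory usage, and network transfer rate, which is not enough to meet the operational needs of a cloud system. We therefore introduce the concept of application awareness and, building on existing monitoring systems, develop an application profiling and monitoring framework on the Hadoop system. The system hides the complex mapping relations among computing jobs, so administrators can keep track of applications in a simpler way.
The system consists of three main components: the Application-Aware Profiling Agent, the Profiling Database, and the Filter.
An Application-Aware Profiling Agent is installed on every computing node to record the execution status of each cloud application (for example, execution time and CPU usage). The Filter extracts the mapping relations from this monitoring information and sends the data to the Profiling Database for storage. With this system, we can keep track of the execution status of every cloud application. The system also provides a profiling function that classifies computing jobs by their attributes.
Under the principles of pay-per-use and quality of service, users who compute on more stable computing resources should pay more, and the cloud system should in turn offer them stronger guarantees. The main contribution of this research is therefore to raise the level of monitoring: it takes applications as the monitored objects and provides a mechanism for monitoring them, which distinguishes service levels among applications and users so that the cloud system can design ways to protect users at each level. The processed data can also be fed back to administrators and to the system's resource adjustment mechanism, supporting dynamic system adjustment in a dynamic cloud computing environment and guaranteeing the share of computing resources each user is entitled to.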
Abstract (English)  The emerging cloud computing technology provides on-demand, powerful computing platforms for many complex scientific and industrial applications. These applications usually consume large amounts of computing resources and execute concurrently on a cloud platform. A cloud system therefore needs a good monitoring and profiling framework to keep track of users’ applications and to use the observed information for system management purposes, such as process deployment, application optimization, and load balancing. A pay-per-use cloud system can also charge its customers according to the observed application usage. However, existing monitoring systems focus on hardware monitoring, such as CPU usage, memory usage, and network bandwidth usage, and have no knowledge of how users’ applications utilize the system resources. We therefore introduce the concept of application-aware monitoring to improve existing cloud monitoring systems, and develop an application profiling and monitoring framework on a cloud system, Hadoop.
The proposed framework does not present low-level views of jobs, tasks, and local processes. Instead, it provides a more integrated, abstract view of cloud applications. The proposed architecture comprises three components: the application-aware profiling agents, the filters, and the profiling database. The application-aware profiling agents are installed on every computing node to record the execution status of users’ applications. The observed information is then sent to the filters for preliminary processing. The filters extract the mapping relations, save the results as intermediate files, and deliver the files to the profiling database. In addition, our system provides a classification service that uses the profiling data to classify cloud applications, which helps users and administrators optimize their applications. The major difference between our system and other existing systems is that ours is application-oriented, while others are mostly hardware-oriented. The major contribution of our system is that it integrates information about users, applications, jobs, processes, and resources. When problems arise in a cloud system, applications with a high performance guarantee can be identified easily and served in time. Cloud service providers can also use our system to develop billing strategies, to create different service-level agreements, and to protect the rights of customers who pay different amounts of money. Furthermore, the processed data can be sent to the load-balancing service of a cloud system to support dynamic system reconfiguration and improve the resource utilization rate.
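The data flow described in the abstract (agents record per-process data, filters extract the mapping relations, and the profiling database stores application-level results) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the record fields, the `task_to_job` and `job_to_app` lookup tables, and the `ProfilingDatabase` class are all hypothetical names chosen for illustration, and an in-memory list stands in for the real database and intermediate files.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessSample:
    """One raw record from a (hypothetical) per-node profiling agent."""
    node: str
    pid: int
    task_id: str        # identifier of the task the local process belongs to
    cpu_seconds: float
    mem_mb: float


def filter_samples(samples, task_to_job, job_to_app):
    """Filter step: resolve process -> task -> job -> (user, application)."""
    rows = []
    for s in samples:
        job_id = task_to_job.get(s.task_id)
        if job_id is None:
            continue  # process does not belong to a known job; drop it
        user, app = job_to_app[job_id]
        rows.append({
            "user": user, "app": app, "job": job_id, "node": s.node,
            "cpu_seconds": s.cpu_seconds, "mem_mb": s.mem_mb,
        })
    return rows


class ProfilingDatabase:
    """In-memory stand-in for the profiling database (assumption)."""

    def __init__(self):
        self.rows = []

    def store(self, rows):
        self.rows.extend(rows)

    def usage_by_application(self):
        """Aggregate stored rows into one summary per (user, application)."""
        totals = defaultdict(lambda: {"cpu_seconds": 0.0, "peak_mem_mb": 0.0})
        for r in self.rows:
            t = totals[(r["user"], r["app"])]
            t["cpu_seconds"] += r["cpu_seconds"]
            t["peak_mem_mb"] = max(t["peak_mem_mb"], r["mem_mb"])
        return dict(totals)


# Illustrative data only; none of these values come from the thesis.
samples = [
    ProcessSample("node1", 101, "task_a1", 12.0, 256.0),
    ProcessSample("node2", 202, "task_a2", 8.0, 512.0),
    ProcessSample("node2", 203, "task_b1", 5.0, 128.0),
    ProcessSample("node1", 999, "unknown", 1.0, 16.0),
]
task_to_job = {"task_a1": "job_A", "task_a2": "job_A", "task_b1": "job_B"}
job_to_app = {"job_A": ("alice", "wordcount"), "job_B": ("bob", "sort")}

db = ProfilingDatabase()
db.store(filter_samples(samples, task_to_job, job_to_app))
usage = db.usage_by_application()
```

Keying the summary by user and application rather than by node or process is what makes the view application-oriented; a billing or classification step could then read from `usage` instead of from raw hardware counters.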
Keywords (Chinese) ★ Profiling
★ Monitoring service
★ Cloud computing
Keywords (English) ★ Profiling
★ Monitoring
★ Hadoop
★ Cloud Computing
Table of contents  Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Motivation
1-2 Objectives
1-3 Main Contributions
1-4 Organization of the Thesis
Chapter 2 Related Work
2-1 Monitoring Systems
2-2 Introduction to Hadoop
Chapter 3 System Architecture
Chapter 4 System Components
4-1 Application-Aware Profiling Agent
4-2 Profiling Database
4-3 Filters
4-4 Web Interface
Chapter 5 Implementation of System Components and Environment
5-1 Testbed
5-2 Application-Aware Profiling Agent
5-3 Profiling Database
5-4 Filters
5-5 Web Interface
Chapter 6 Future Work
6-1 Fault Tolerance
6-2 Scalability
6-3 Adaptive Control
6-4 Web Interface Improvements
Chapter 7 Conclusion
References
Advisor  Wei-Jen Wang (王尉任)   Approval date  2010-7-21

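For questions about this thesis, please contact the Extension Services Division (推廣服務組) of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.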