dc.description.abstract | Cloud computing is an emerging technology that provides powerful, on-demand computing platforms for many complex scientific and industrial applications. These applications usually consume large amounts of computing resources and execute concurrently on a cloud platform. A cloud system therefore needs a good monitoring and profiling framework that keeps track of users’ applications and uses the observed information for system management purposes such as process deployment, application optimization, and load balancing. A pay-per-use cloud system can also charge its customers according to the observed application usage. However, existing monitoring systems focus on hardware metrics such as CPU usage, memory usage, and network bandwidth usage; they do not show how users’ applications utilize the system resources. We therefore introduce the concept of application-aware monitoring to improve existing cloud monitoring systems, and we develop an application profiling and monitoring framework on top of Hadoop, a cloud system.
The proposed framework does not present low-level views of jobs, tasks, and local processes. Instead, it provides a more integrated, abstract view of cloud applications. The proposed architecture comprises three components: the application-aware profiling agents, the filters, and the profiling database. The application-aware profiling agents are installed on every computing node to record the execution status of users’ applications. The observed information is then sent to the filters for preliminary processing. The filters extract the mapping relations, save the results as intermediate files, and deliver the files to the profiling database. In addition, our system provides a classification service that uses the profiling data to classify cloud applications, which helps users and administrators optimize their applications. The major difference between our system and existing systems is that ours is application-oriented, while the others are mostly hardware-oriented. The major contribution of our system is that it integrates information about users, applications, jobs, processes, and resources. When problems arise in a cloud system, applications with high performance guarantees can be identified easily and served promptly. Cloud service providers can also use our system to develop billing strategies, to create different service-level agreements, and to protect the rights of customers who pay different amounts. Furthermore, the processed data can be sent to the load-balancing service of a cloud system to support dynamic system reconfiguration and improve resource utilization.
| en_US |