Abstract: | In recent years, cloud computing has become increasingly popular and mature. At the same time, the downtime of cloud services has drawn growing attention, and the cost it causes has risen year by year. Virtual machines (VMs) are the foundation of most cloud services, and VMs must be restarted during cloud system recovery. However, the time needed to restart a VM varies with the situation. The more accurately VM boot time can be predicted, the easier it is to find the VM placement that starts the service in the least time, which shortens recovery and in turn reduces cloud service downtime. VM boot time has rarely been studied because it is usually assumed to be constant, but previous work shows that this assumption does not hold. Lee proposed five models for predicting VM boot time and evaluated them in an environment with four hosts, in which the VMs ran no background programs that would increase host CPU loading. 
The results showed that the Random Forest (RF) model was the most accurate of the five, but the amount of data it requires grows exponentially with the number of hosts, so it was recommended only for small-scale cloud environments. However, Lee did not verify whether the ML-based models maintain their accuracy when the number of hosts increases, or when host CPU loading increases. This study addresses these two questions. The results show that the accuracy of the RF model does not decrease as the number of hosts increases, because the model can adapt to a more complex environment; when host CPU loading increases, however, its accuracy drops significantly. In addition, collecting training data for the ML-based models is costly in time. This study therefore suggests using YLL's rule-based model in cloud environments with 10 or more hosts. Its advantage is that it needs only a small amount of data, which takes far less time to collect than the data required by the ML-based models. |