In recent years, as the information industry has grown and enterprises have become more dependent on information technology, research on cloud computing has advanced rapidly worldwide, with the aim of providing enterprises with cloud services and improving productivity. A complete cloud service that is worth an enterprise's investment must, however, offer high availability. Adding a fault tolerance mechanism lets a cloud service achieve this: when the primary machine providing the service fails, a backup machine quickly takes over the service so that normal operation continues. Building on NCU M-FTVM, this research adds a Heartbeat mechanism to the original fault detection architecture. The primary machine periodically sends Heartbeat packets to the backup machine over a newly added network channel, giving the backup machine a basis for confirming whether the primary machine is alive. The research then analyzes every failure scenario that can occur in NCU M-FTVM once the Heartbeat mechanism is in place and examines these scenarios through fault injection. In addition, fault detection in the original NCU M-FTVM and its potential for entering a Split-Brain state are discussed.
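To make the role of the additional Heartbeat channel concrete, the following is a minimal sketch of how a backup machine might combine two liveness signals. It is not the NCU M-FTVM implementation: the class `BackupMonitor`, the `Verdict` values, the timeout parameter, and the decision rule (declare the primary failed only when both channels are silent) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PRIMARY_ALIVE = "primary_alive"
    LINK_SUSPECT = "link_suspect"      # one channel silent: likely a link fault
    PRIMARY_FAILED = "primary_failed"  # both channels silent: take over


@dataclass
class BackupMonitor:
    """Hypothetical backup-side monitor; all names here are illustrative."""
    timeout: float                 # seconds of silence tolerated per channel
    last_original: float = 0.0     # last message seen on the original channel
    last_heartbeat: float = 0.0    # last Heartbeat packet seen on the new channel

    def on_original(self, now: float) -> None:
        self.last_original = now

    def on_heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def verdict(self, now: float) -> Verdict:
        original_silent = now - self.last_original > self.timeout
        heartbeat_silent = now - self.last_heartbeat > self.timeout
        # Assumed rule: take over only if BOTH channels have gone silent,
        # so a single broken link does not leave two active primaries.
        if original_silent and heartbeat_silent:
            return Verdict.PRIMARY_FAILED
        if original_silent or heartbeat_silent:
            return Verdict.LINK_SUSPECT
        return Verdict.PRIMARY_ALIVE


monitor = BackupMonitor(timeout=3.0)
monitor.on_original(10.0)
monitor.on_heartbeat(10.0)
print(monitor.verdict(11.0))   # both channels fresh
monitor.on_heartbeat(14.0)
print(monitor.verdict(14.5))   # original channel silent, Heartbeat still arriving
print(monitor.verdict(20.0))   # both channels silent
```

Timestamps are passed in explicitly rather than read from a clock so that individual failure cases, such as a fault on only one link, can be replayed deterministically in the manner of a fault injection experiment.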