Abstract (English) |
Automatic fault tolerance for virtual machines usually uses a backup virtual machine to protect the primary virtual machine, which provides services to end users. The primary and backup virtual machines have to run on different physical machines and keep their states synchronized. When the primary virtual machine fails, the backup virtual machine has to take control and keep the services on the virtual machine alive. Based on our study, the existing open-source projects for virtual machine fault tolerance, Kemari and Micro-Checkpointing, do not perform well when hosting a network service. Therefore, we have proposed a novel design of a fault-tolerant virtual machine based on KVM, namely M-FTVM. We have also implemented a prototype of the proposed fault-tolerant virtual machine, and keep working on improving its performance and correctness. This paper focuses on the virtual machine life cycle in the original M-FTVM: the backup virtual machine may make the primary virtual machine unstable and thereby affect the life cycle of the virtual machine. Therefore, we propose a new design for the virtual machine life cycle of M-FTVM and have modified the implementation of M-FTVM accordingly. The experimental results show that the new design does not introduce much overhead compared with the original M-FTVM. |
References |
[1] G. J. Popek and R. P. Goldberg, "Formal requirements for virtualizable third generation architectures," Commun. ACM, vol. 17, pp. 412-421, 1974.
[2] R. P. Goldberg, "Survey of virtual machine research," Computer, vol. 7, pp. 34-45, 1974.
[3] Ponemon. (2016). Cost of Data Center Outages. Available: http://datacenterfrontier.com/cost-of-data-center-outages/
[4] J. Gray and D. P. Siewiorek, "High-availability computer systems," Computer, vol. 24, pp. 39-48, 1991.
[5] D. J. Scales, M. Nelson, and G. Venkitachalam, "The design and evaluation of a practical system for fault-tolerant virtual machines," Technical Report VMWare-RT-2010-001, VMWare, 2010.
[6] C.-H. Chen, "基於KVM的網路服務高可靠性容錯同步架構 [A KVM-based high-reliability fault-tolerant synchronization architecture for network services]," Master's thesis, National Central University, 2014.
[7] KVM – Kernel-based Virtual Machine | Red Hat [Online]. Available: https://www.linux-kvm.org
[8] M. Zabaljauregui, "Hardware assisted virtualization intel virtualization technology," 2008.
[9] AMD. (2007). Putting Server Virtualization to Work. Available: https://www.redhat.com/f/pdf/virtualization/amd_Virtualization_WP.pdf
[10] F. Bellard, "QEMU, a fast and portable dynamic translator," in USENIX Annual Technical Conference, FREENIX Track, 2005, pp. 41-46.
[11] B. Cully, G. Lefebvre, D. Meyer, M. Feeley, N. Hutchinson, and A. Warfield, "Remus: High availability via asynchronous virtual machine replication," in Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, 2008, pp. 161-174.
[12] T. C. Bressoud and F. B. Schneider, "Hypervisor-based fault tolerance," ACM Transactions on Computer Systems (TOCS), vol. 14, pp. 80-107, 1996.
[13] Y. Tamura, K. Sato, S. Kihara, and S. Moriai, "Kemari: Virtual machine synchronization for fault tolerance," 2008.
[14] Features/MicroCheckpointing - QEMU. Available: http://wiki.qemu.org/Features/MicroCheckpointing
[15] M. Xu, V. Malyugin, J. Sheldon, G. Venkitachalam, and B. Weissman, "Retrace: Collecting execution trace with virtual machine deterministic replay," in Proceedings of the 3rd Annual Workshop on Modeling, Benchmarking and Simulation, MoBS, 2007.
[16] R. Jhawar, V. Piuri, and M. Santambrogio, "Fault tolerance management in cloud computing: A system-level perspective," IEEE Systems Journal, vol. 7, pp. 288-297, 2013.
[17] DVD Store. Available: http://www.delltechcenter.com/page/DVD+store
[18] Yahoo! Cloud Serving Benchmark. Available: https://github.com/brianfrankcooper/YCSB/wiki |