Using Non-Deterministic Event Log and Replay to Support Virtual Machine Fault Tolerance of Kernel-based Virtual Machine


Name

商晉瑋 (Jin-Wei Shang)

Department

Department of Computer Science and Information Engineering

Thesis Title

Using Non-Deterministic Event Log and Replay to Support Virtual Machine Fault Tolerance of Kernel-based Virtual Machine
(original title: 利用不定性事件記錄與重播之技術實現KVM虛擬機器自動容錯之研究)

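Related Theses

★ A Splay-Tree-Based Android Binder Driver
★ A Study on Applying Incremental Learning to the Interpretation of Multiple Crops
★ Detecting Novel Land Parcels in Aerial Image Sheets Using Classification-Reconstruction Learning
★ A Size Estimation System for Assisting Industrial Part Recognition
★ Fine-Grained Classification of Real Industrial Part Images Using Textureless 3D CAD Industrial Part Models Combined with Length Measurement
★ A Dynamic Global Computing Platform Built on a Parallel Job System
★ Active Object Garbage Collection Using a Weighted Reference Counting Algorithm
★ A Dynamically Load-Balanced Computing Framework for Maximum Likelihood Estimation
★ A Study of Strategies for Dynamic P2P System Reconfiguration Using Multiple System Load Metrics
★ A Hadoop-Based Framework for Cloud Application Feature Extraction and Computation Monitoring
★ An Adaptive Computing Model for Large-Scale Dynamic Distributed Systems
★ A Cloud Service Platform Providing Elastic Virtual Data Centers
★ A Resource Control Center for an Elastic Cloud Virtual Data Center Service Platform
★ A Dynamically Adaptive Computing Framework for Auto-Provisioning Cloud Systems
★ Heuristic Scheduling Strategies for Linearly Dependent and Independent Tasks
★ An Efficient Distributed Hierarchical Clustering Algorithm for Large Data Sets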


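The author has authorized immediate open access to this electronic thesis.
Open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast this work without authorization.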

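Abstract (Chinese)

Many modern cloud services are built on virtual machines. When a virtual machine stops running because of a fault, the services and applications on it stop as well, causing losses for both service providers and customers. Improving the availability of virtual machine systems has therefore become an important issue. Fault tolerance is one technique for providing high availability for virtual machines: when a virtual machine fails, a backup physical machine takes over the failed virtual machine so that it keeps running without interruption (continuous execution). There are generally two ways to implement virtual machine fault tolerance. The first uses checkpointing to synchronize the state of the virtual machine with that of the backup virtual machine; we call this memory-level state synchronization fault tolerance. The second records the instructions executed by the primary virtual machine and reproduces them on the backup virtual machine so that the two reach the same state; this is instruction-level state synchronization fault tolerance. This research targets instruction-level synchronization fault tolerance on the KVM virtual machine system. The mechanism monitors the non-deterministic events that occur on the primary virtual machine, computes the logical time of each event, records the event's parameters, and then sends them to the backup virtual machine for replay. The initial state of the backup virtual machine must match that of the primary: it runs the same program instructions and holds the same memory contents, but remains paused. When the backup virtual machine receives an event log, it sets the instruction breakpoint corresponding to the event and starts executing; when the breakpoint is hit, it injects the logged event data and replays the event, which keeps the state of the two virtual machines synchronized. Finally, we use this non-deterministic event logging and replay technique to design and implement a failure handling and recovery mechanism that achieves automatic fault tolerance for virtual machines.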

Abstract (English)

Virtual machine fault tolerance (VMFT) is a technology that enables continuous execution in the presence of hardware or software failures, and it can therefore be used to protect critical virtualized software services. There are two ways to implement VMFT. The first uses a continuous-checkpointing strategy, in which a backup virtual machine (VM) keeps receiving the latest checkpoint from the protected VM. The other uses a log-and-replay strategy, in which all events in the protected VM are recorded and the recorded events are turned into deterministic events for replay in the backup VM. Once the protected VM fails, the backup VM immediately takes over the role of the protected VM to minimize service downtime. This research aims to provide a log-and-replay-based mechanism for VMFT on the Kernel-based Virtual Machine (KVM). Before entering the VMFT phase, the proposed mechanism creates the backup VM by live cloning the protected VM. The two VMs then enter the fault tolerance phase, in which they synchronize periodically. In each synchronization epoch, the proposed mechanism monitors the non-deterministic events that occur on the protected VM and identifies the logical time and parameters of each event. It then transfers the logged data to the backup VM for event replay. Upon receiving the data, the backup VM sets instruction breakpoints at the corresponding locations and starts execution. It injects each logged event when it reaches the corresponding breakpoint. The backup VM signals the protected VM when it finishes. If the protected VM fails during the fault tolerance phase, the backup VM is responsible for detecting the failure and taking over the role of the protected VM.
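The log-and-replay idea described in the abstract can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the thesis's KVM implementation: `ToyVM`, `Event`, `run_primary`, and `replay_on_backup` are hypothetical names, an instruction counter stands in for the logical time (the abstract does not say how logical time is computed), and an XOR into the state stands in for delivering an event's parameters.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    logical_time: int  # instruction count at which the event was delivered
    payload: int       # event parameter (stand-in for device data, etc.)


@dataclass
class ToyVM:
    state: int = 0
    icount: int = 0    # instructions executed so far; used as logical time

    def step(self) -> None:
        # A deterministic "instruction": same state in, same state out.
        self.state = (self.state * 31 + 7) % 1_000_003
        self.icount += 1

    def inject(self, payload: int) -> None:
        # A non-deterministic input perturbs the state at the current time.
        self.state ^= payload


def run_primary(vm: ToyVM, n_steps: int, rng: random.Random) -> list[Event]:
    """Run one epoch on the primary and log every non-deterministic event."""
    log: list[Event] = []
    for _ in range(n_steps):
        if rng.random() < 0.1:  # an external event arrives before this step
            ev = Event(vm.icount, rng.randrange(1 << 16))
            vm.inject(ev.payload)
            log.append(ev)
        vm.step()
    return log


def replay_on_backup(vm: ToyVM, log: list[Event], n_steps: int) -> None:
    """Replay one epoch: run to each logged time, then inject the event."""
    end = vm.icount + n_steps
    for ev in log:
        while vm.icount < ev.logical_time:  # run up to the "breakpoint"
            vm.step()
        vm.inject(ev.payload)
    while vm.icount < end:  # finish the rest of the epoch
        vm.step()


if __name__ == "__main__":
    rng = random.Random(42)
    primary, backup = ToyVM(), ToyVM()  # identical initial state
    for _ in range(3):                  # three synchronization epochs
        epoch_log = run_primary(primary, 1000, rng)
        replay_on_backup(backup, epoch_log, 1000)
        assert backup == primary        # states match after each epoch
```

Because every instruction in the toy machine is deterministic, injecting the same payloads at the same logical times is enough to bring the backup to the same state as the primary at the end of each epoch. A real hypervisor would additionally need a reliable way to stop the guest at exactly the logged point, which this sketch does not model.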

Keywords (Chinese)

★ KVM
★ Virtual Machine
★ Fault Tolerance
★ Log and Replay

Keywords (English)

★ KVM
★ Virtual Machine
★ Fault Tolerance
★ Log and Replay

Table of Contents

Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Preface
1-2 Motivation
1-3 Contributions
1-4 Thesis Organization
Chapter 2 Background
2-1 Background Knowledge
2-1-1 KVM (Kernel-based Virtual Machine) and QEMU
2-1-2 Fault Tolerance
2-1-3 Log and Replay
2-2 Related Technologies
2-2-1 Kemari & Micro-Checkpointing
2-2-2 VMware vSphere
Chapter 3 System Design
3-1 System Architecture
3-2 System Operation Details
3-2-1 Primary flow
3-2-2 Backup flow
3-3 Non-deterministic Log and Deterministic Replay implementation
3-3-1 Synchronization
3-3-2 Log
3-3-3 Timestamp
3-3-4 Replay
3-4 Failure Handling
3-4-1 Scenarios
3-4-2 Failure Models
3-4-3 Correctness Argument
Chapter 4 Preliminary Results
Chapter 5 Conclusions and Future Work
References


Advisor

王尉任 (Wei-Jen Wang)

Approval Date

2016-8-5
