References
[1] Flexera. "RightScale 2019 state of the cloud report from Flexera." https://resources.flexera.com/web/media/documents/rightscale-2019-state-of-the-cloud-report-from-flexera.pdf (accessed 17 Jan., 2022).
[2] Veeam. "2019 Veeam Cloud Data Management Report." https://www.sysgroup.com/resources/ebooks/veeam-cloud-data-management-report-2019 (accessed 17 Jan., 2022).
[3] M. Nabi, M. Toeroe, and F. Khendek, "Availability in the cloud: State of the art," Journal of Network and Computer Applications, vol. 60, pp. 54-67, 2016.
[4] B. Mohammed, M. Kiran, I.-U. Awan, and K. M. Maiyama, "Optimising fault tolerance in real-time cloud computing IaaS environment," in 2016 IEEE 4th International Conference on Future Internet of Things and Cloud (FiCloud), Vienna, Austria, 2016: IEEE, pp. 363-370.
[5] P. Zhang, S. Shu, and M. Zhou, "Adaptively Adjusting Dynamic Detection Cycle for Fault Detection in Clouds," in 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 2018: IEEE, pp. 4047-4052.
[6] T. Wang, W. Zhang, C. Ye, J. Wei, H. Zhong, and T. Huang, "FD4C: Automatic fault diagnosis framework for Web applications in cloud computing," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 1, pp. 61-75, 2015.
[7] S. F. Ghoreishi and M. Imani, "Offline fault detection in gene regulatory networks using next-generation sequencing data," in 2019 53rd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 2019: IEEE, pp. 1344-1348.
[8] VMware. "vSphere Availability." https://docs.vmware.com/en/VMware-vSphere/6.5/vsphere-esxi-vcenter-server-65-availability-guide.pdf (accessed 17 Jan., 2022).
[9] W.-J. Wang, H.-L. Huang, S.-H. Chuang, S.-J. Chen, C. H. Kao, and D. Liang, "Virtual machines of high availability using hardware-assisted failure detection," in 2015 International Carnahan Conference on Security Technology (ICCST), Taipei, Taiwan, 2015: IEEE, pp. 1-6.
[10] M. S. Rahman, M. Y. S. Uddin, T. Hasan, M. S. Rahman, and M. Kaykobad, "Using adaptive heartbeat rate on long-lived TCP connections," IEEE/ACM Transactions on Networking, vol. 26, no. 1, pp. 203-216, 2017.
[11] K. Razavi, G. Van Der Kolk, and T. Kielmann, "Prebaked µvms: Scalable, instant vm startup for iaas clouds," in 2015 IEEE 35th International Conference on Distributed Computing Systems, Columbus, OH, USA, 2015: IEEE, pp. 245-255.
[12] T. L. Nguyen, R. Nou, and A. Lebre, "YOLO: Speeding up VM Boot Time by reducing I/O operations," Inria, 2019. [Online]. Available: https://hal.inria.fr/hal-01983626
[13] OpenStack. "OpenStack Documentation." https://docs.openstack.org/queens/index.html (accessed 17 Jan., 2022).
[14] J. Gao and G. Tang, "Virtual machine placement strategy research," in 2013 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, Beijing, China, 2013: IEEE, pp. 294-297.
[15] S. Crago et al., "Heterogeneous cloud computing," in 2011 IEEE International Conference on Cluster Computing, Austin, TX, USA, 2011: IEEE, pp. 378-385.
[16] Intel, Hewlett-Packard, NEC, and Dell. "Intelligent Platform Management Interface Specification Second Generation v2.0." https://www.intel.com.tw/content/www/tw/zh/products/docs/servers/ipmi/ipmi-second-gen-interface-spec-v2-rev1-1.html (accessed 17 Jan., 2022).
[17] A. Avizienis, "Fault-tolerant systems," IEEE Transactions on Computers, vol. 100, no. 12, pp. 1304-1312, 1976.
[18] Z. Amin, H. Singh, and N. Sethi, "Review on fault tolerance techniques in cloud computing," International Journal of Computer Applications, vol. 116, no. 18, 2015.
[19] A. G. de Moraes Rossetto et al., "A new unreliable failure detector for self-healing in ubiquitous environments," in 2015 IEEE 29th International Conference on Advanced Information Networking and Applications, Gwangju, Korea (South), 2015: IEEE, pp. 316-323.
[20] J. Beauquier, S. Delaët, S. Dolev, and S. Tixeuil, "Transient fault detectors," Distributed Computing, vol. 20, no. 1, pp. 39-51, 2007.
[21] M. K. Gokhroo, M. C. Govil, and E. S. Pilli, "Detecting and mitigating faults in cloud computing environment," in 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 2017: IEEE, pp. 1-9.
[22] J. Villamayor, D. Rexachs, E. Luque, and D. Lugones, "Raas: Resilience as a service," in 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Washington, DC, USA, 2018: IEEE, pp. 356-359.
[23] R. Yadav and A. S. Sidhu, "Fault tolerant algorithm for replication management in distributed cloud system," in 2015 IEEE 3rd International Conference on MOOCs, Innovation and Technology in Education (MITE), Amritsar, India, 2015: IEEE, pp. 78-83.
[24] D. Contractor, D. Patel, and S. Patel, "Trusted heartbeat framework for cloud computing," Journal of Information Security, vol. 7, no. 03, pp. 103-111, 2016.
[25] J. Liu, Z. Wu, J. Wu, J. Dong, Y. Zhao, and D. Wen, "A Weibull distribution accrual failure detector for cloud computing," PLOS ONE, vol. 12, no. 3, pp. 1-16, 2017.
[26] F. Zhao, X. Koutsoukos, H. Haussecker, J. Reich, and P. Cheung, "Monitoring and fault diagnosis of hybrid systems," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 35, no. 6, pp. 1225-1240, 2005.
[27] F.-w. Wang, J.-y. Shi, and L. Wang, "Method of diagnostic tree design for system-level faults based on dependency matrix and fault tree," in 2011 IEEE 18th International Conference on Industrial Engineering and Engineering Management, Changchun, China, 2011: IEEE, pp. 1113-1117.
[28] W. D. Shambroom, "Use of protocol validation and verification techniques in the design of a fault-tolerant computer architecture," in FTCS-23 The Twenty-Third International Symposium on Fault-Tolerant Computing, Toulouse, France, 1993: IEEE, pp. 636-640.
[29] C. M. Dobre, F. Pop, A. Costan, M. I. Andreica, and V. Cristea, "Robust failure detection architecture for large scale distributed systems," in Proceedings of the 17th International Conference on Control Systems and Computer Science, Bucharest, Romania, 2009, vol. 1, pp. 433-440.
[30] X. Zhang, L. Luan, L. Han, and Z. Lu, "Research and Improvement on Failure Detection Algorithm," in 2008 Third International Conference on Pervasive Computing and Applications, Alexandria, 2008, vol. 1: IEEE, pp. 532-536.
[31] S. Costache, N. Parlavantzas, C. Morin, and S. Kortas, "On the use of a proportional-share market for application slo support in clouds," in European Conference on Parallel Processing, Berlin, Heidelberg, 2013: Springer, pp. 341-352.
[32] M. Mao and M. Humphrey, "A performance study on the vm startup time in the cloud," in 2012 IEEE Fifth International Conference on Cloud Computing, Honolulu, HI, USA, 2012: IEEE, pp. 423-430.
[33] T. L. Nguyen and A. Lebre, "Virtual machine boot time model," in 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), St. Petersburg, Russia, 2017: IEEE, pp. 430-437.
[34] D. Trihinas, G. Pallis, and M. D. Dikaiakos, "Monitoring elastically adaptive multi-cloud services," IEEE Transactions on Cloud Computing, vol. 6, no. 3, pp. 800-814, 2015.
[35] Y. Wu, Y. Yuan, G. Yang, and W. Zheng, "An adaptive task-level fault-tolerant approach to Grid," The Journal of Supercomputing, vol. 51, no. 2, pp. 97-114, 2010.
[36] M. A. Rodriguez and R. Buyya, "Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds," IEEE Transactions on Cloud Computing, vol. 2, no. 2, pp. 222-235, 2014.
[37] A. Singh. "Cloudsim Tutorials." https://www.cloudsimtutorials.online/ (accessed 18 Jan., 2022).
[38] S. A. Ali. "Re: How to get ready time of virtual machine in cloudsim?" https://www.researchgate.net/post/How_to_get_ready_time_of_virtual_machine_in_cloudsim/5af286c8c1c6b1aab443337e/citation/download (accessed 18 Jan., 2022).
[39] F. Lopez-Pires and B. Baran, "Virtual machine placement literature review," arXiv preprint arXiv:1506.01509, 2015.
[40] A. Alashaikh, E. Alanazi, and A. Al-Fuqaha, "A Survey on the Use of Preferences for Virtual Machine Placement in Cloud Data Centers," ACM Computing Surveys (CSUR), vol. 54, no. 5, pp. 1-39, 2021.
[41] S. Taherizadeh and V. Stankovski, "Dynamic multi-level auto-scaling rules for containerized applications," The Computer Journal, vol. 62, no. 2, pp. 174-197, 2019.
[42] K. Razavi, L. M. Razorea, and T. Kielmann, "Reducing vm startup time and storage costs by vm image content consolidation," in European Conference on Parallel Processing, Aachen, Germany, 2013: Springer, pp. 75-84.
[43] M. Schmidt, N. Fallenbeck, M. Smith, and B. Freisleben, "Efficient distribution of virtual machines for cloud computing," in 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, Pisa, Italy, 2010: IEEE, pp. 567-574.
[44] OpenStack. "Launch an instance from a volume." https://docs.openstack.org/nova/queens/user/launch-instance-from-volume.html (accessed 23 Jan., 2022).
[45] Y. Govindaraju, H. A. Duran-Limon, and E. Mezura-Montes, "A regression tree predictive model for virtual machine startup time in IaaS clouds," Cluster Computing, vol. 24, no. 2, pp. 1217-1233, 2021.
[46] C.-d. Lu, "Scalable diskless checkpointing for large parallel systems," 2005.
[47] W. Tarreau. "HAProxy Documentation." https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#timeout%20server (accessed 26 Jan., 2022).
[48] S. Levine. "Configuring the Red Hat high availability add-on with Pacemaker - additional fencing configuration options." https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/configuring_the_red_hat_high_availability_add-on_with_pacemaker/s1-fencedevicesadditional-haar (accessed 26 Jan., 2022).
[49] M. Zahran, "Heterogeneous computing: Here to stay," Communications of the ACM, vol. 60, no. 3, pp. 42-45, 2017.
[50] H. Wu et al., "A reference model for virtual machine launching overhead," IEEE Transactions on Cloud Computing, vol. 4, no. 3, pp. 250-264, 2014.
[51] M. Bolte, M. Sievers, G. Birkenheuer, O. Niehörster, and A. Brinkmann, "Non-intrusive virtualization management using libvirt," in 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010), 2010: IEEE, pp. 574-579.
[52] D. Both, "Linux Boot and Startup," in Using and Administering Linux: Volume 1: Springer, 2020, pp. 451-490.
[53] Y.-L. Lee. Repository for experimental data related to average VM boot time. [Online]. Available: https://github.com/Ncu-software-research-center/NCU-VMDataset.git
[54] Q. Fu and L. White, "The impact of background traffic on TCP performance over indirect and direct routing," in The 8th International Conference on Communication Systems, 2002. ICCS 2002., 2002, vol. 1: IEEE, pp. 594-598.
[55] A. Munir et al., "Minimizing flow completion times in data centers," in 2013 Proceedings IEEE INFOCOM, 2013: IEEE, pp. 2157-2165.
[56] J. Álvarez Horcajo, D. López Pajares, I. Martinez Yelmo, J. A. Carral Pelayo, and J. M. Arco Rodríguez, "Improving multipath routing of TCP flows by network exploration," IEEE Access, vol. 7, pp. 13608-13621, 2019.
[57] A. Braccini, A. Del Bimbo, and E. Vicario, "Interprocess communication dependency on network load," IEEE Transactions on Software Engineering, vol. 17, no. 4, p. 357, 1991.
[58] P. Mell and T. Grance, "The NIST definition of cloud computing," in "National Institute of Standards and Technology Special Publication 800-145," 2011.
[59] U. Parui and V. Sanil, "Introduction to Microsoft Azure," in Pro SQL Server Always On Availability Groups: Springer, 2016, pp. 277-281.
[60] Red Hat. "Cloud Computing - What is PaaS?" https://www.redhat.com/en/topics/cloud-computing/what-is-paas (accessed 21 Feb., 2022).
[61] Z. Gao, C. Cecati, and S. X. Ding, "A survey of fault diagnosis and fault-tolerant techniques—Part I: Fault diagnosis with model-based and signal-based approaches," IEEE Transactions on Industrial Electronics, vol. 62, no. 6, pp. 3757-3767, 2015.
[62] Z. Gao, C. Cecati, and S. X. Ding, "A survey of fault diagnosis and fault-tolerant techniques—Part II: Fault diagnosis with knowledge-based and hybrid/active approaches," IEEE Transactions on Industrial Electronics, vol. 62, no. 6, pp. 3768-3774, 2015.
[63] C. G. Bezerra, B. S. J. Costa, L. A. Guedes, and P. P. Angelov, "An evolving approach to unsupervised and real-time fault detection in industrial processes," Expert Systems with Applications, vol. 63, pp. 134-144, 2016.
[64] X. Zhang, L. Tang, and J. Decastro, "Robust fault diagnosis of aircraft engines: A nonlinear adaptive estimation-based approach," IEEE Transactions on Control Systems Technology, vol. 21, no. 3, pp. 861-868, 2012.
[65] M. Yu and D. Wang, "Model-based health monitoring for a vehicle steering system with multiple faults of unknown types," IEEE Transactions on Industrial Electronics, vol. 61, no. 7, pp. 3574-3586, 2013.
[66] Sematext. "Sematext Monitoring." https://sematext.com/docs/monitoring/#setting-up-monitoring-agents (accessed 21 Feb., 2022).
[67] AppDynamics. "Overview of End User Monitoring." https://docs.appdynamics.com/4.5.x/en/end-user-monitoring/overview-of-end-user-monitoring (accessed 21 Feb., 2022).
[68] Datadog. "Synthetic Monitoring." https://docs.datadoghq.com/synthetics/ (accessed 21 Feb., 2022).
[69] R. da Rosa Righi, V. F. Rodrigues, C. A. Da Costa, G. Galante, L. C. E. De Bona, and T. Ferreto, "Autoelastic: Automatic resource elasticity for high performance applications in the cloud," IEEE Transactions on Cloud Computing, vol. 4, no. 1, pp. 6-19, 2015.
[70] M. Ghobaei-Arani, A. Souri, T. Baker, and A. Hussien, "ControCity: an autonomous approach for controlling elasticity using buffer Management in Cloud Computing Environment," IEEE Access, vol. 7, pp. 106912-106924, 2019.
[71] C. Ferdinand and R. Heckmann, "aiT: Worst-case execution time prediction by static program analysis," in Building the Information Society: Springer, 2004, pp. 377-383.
[72] V. Nitu et al., "Swift birth and quick death: Enabling fast parallel guest boot and destruction in the xen hypervisor," ACM SIGPLAN Notices, vol. 52, no. 7, pp. 1-14, 2017.
[73] H. Alwi, C. Edwards, and C. P. Tan, "Fault tolerant control and fault detection and isolation," in Fault Detection and Fault-Tolerant Control Using Sliding Modes: Springer, 2011, pp. 7-27.
[74] S. Senbel, "Teaching Self-Balancing Trees Using a Beauty Contest," in Proceedings of the 2019 ACM Conference on Innovation and Technology in Computer Science Education, 2019, pp. 245-246.
[75] S. U. Jan, Y. D. Lee, and I. S. Koo, "A distributed sensor-fault detection and diagnosis framework using machine learning," Information Sciences, vol. 547, pp. 777-796, 2021.
[76] D. Mohapatra, B. Subudhi, and R. Daniel, "Real-time sensor fault detection in tokamak using different machine learning algorithms," Fusion Engineering and Design, vol. 151, p. 111401, 2020.
[77] R. Luo, M. Misra, and D. M. Himmelblau, "Sensor fault detection via multiscale analysis and dynamic PCA," Industrial & Engineering Chemistry Research, vol. 38, no. 4, pp. 1489-1495, 1999.