Document Type: Review Article

Authors

Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran

Abstract

Cloud computing has emerged as a new development in high-performance distributed computing, in which on-demand access to resources over the Internet is served by distributed servers that scale dynamically. Fault tolerance is an important research issue that must be addressed to achieve efficient performance. Fault tolerance is a way of detecting faults and failures in a system, and predicting and reducing errors plays an important role in increasing the performance and popularity of cloud computing. This study presents an adaptive workflow scheduling approach to increase fault tolerance in cloud computing. The approach calculates a probability of failure for each resource from the execution time of tasks on that resource. A deadline is set for each task. If a task is not completed within its deadline, the failure probability of the resource that ran it increases, and subsequent tasks are not sent to that resource. Simulation results show that the proposed method can work well on workflows and improve quality-of-service factors.
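
The scheduling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives no formula for the failure probability, so the additive `PENALTY`, the exclusion `THRESHOLD`, the `speed` attribute, and all function names are hypothetical assumptions chosen only to show the mechanism: a missed deadline raises a resource's failure estimate, and a resource whose estimate is too high receives no further tasks.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    speed: float                 # hypothetical: work units processed per second
    failure_prob: float = 0.0    # estimated probability of failure


# Assumed parameters; the abstract does not give concrete values.
PENALTY = 0.25     # increase in the failure estimate after a missed deadline
THRESHOLD = 0.5    # resources at or above this estimate receive no new tasks


def pick_resource(resources):
    """Return the eligible resource with the lowest failure estimate, or None."""
    eligible = [r for r in resources if r.failure_prob < THRESHOLD]
    return min(eligible, key=lambda r: r.failure_prob) if eligible else None


def run_task(resource, work, deadline):
    """Simulate one task; raise the failure estimate if the deadline is missed."""
    exec_time = work / resource.speed
    met = exec_time <= deadline
    if not met:
        resource.failure_prob = min(1.0, resource.failure_prob + PENALTY)
    return met


def schedule(tasks, resources):
    """Assign (work, deadline) pairs in order; return (resource name, met) pairs."""
    log = []
    for work, deadline in tasks:
        resource = pick_resource(resources)
        if resource is None:
            # No resource is trusted enough to receive the task.
            log.append((None, False))
            continue
        log.append((resource.name, run_task(resource, work, deadline)))
    return log
```

In this sketch, calling `schedule` with a slow and a fast resource sends the first task to whichever resource is listed first. Once the slow resource misses a deadline, its estimate rises and later tasks go to the resource with the lower estimate. A real scheduler would also need to handle task dependencies within the workflow and the rescheduling of failed tasks, which this sketch leaves out.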

Keywords
