Fault-Tolerance Techniques in embedded system

Building Embedded Fault-Tolerant Systems for Critical Applications: An Experimental Study

IFIP WG10.3 Publications, 2002

An increasing range of industries have a growing dependence on embedded software systems, many of which are safety-critical, real-time applications that require extremely high dependability. Two fundamental approaches, fault avoidance and fault tolerance, have been proposed to increase the overall dependability of such systems. However, the increased cost of using the fault tolerance approach may mean

Comparative Analysis Of Fault-Tolerance Techniques For Space Applications

2014

A fault-tolerance technique enables a system or application to continue working even if a fault or error occurs in the system. It is therefore vital to choose the fault-tolerant technique best suited to the application. For real-time embedded systems in a space project, such techniques become even more critical. In space applications there is little or no possibility of maintenance, and the occurrence of faults may lead to serious consequences in terms of partial or complete mission failure. This paper compares various fault-tolerant techniques for space applications and suggests the suitability of these techniques for particular scenarios. The study gives due importance to fault tolerance techniques relevant to real-time embedded systems and on-board space applications (satellites). It not only summarizes fault-tolerant techniques but also describes their strengths. The paper describes the future trends of fault-tolerance t...

Operating System Fault Tolerance Support for Real-Time Embedded Applications

2009

Fault tolerance is a means of achieving high dependability for critical and high-availability systems. Despite the efforts to prevent and remove faults during the development of these systems, the application of fault tolerance is usually required because the hardware may fail during system operation and software faults are very hard to eliminate completely. One of the difficulties in implementing fault tolerance techniques is the lack of support from operating systems and middleware. In most fault-tolerant projects, the programmer has to develop a fault tolerance implementation for each application. This strong customization makes fault-tolerant software costly and difficult to implement and maintain. In particular, for small-scale embedded systems, the introduction of fault tolerance techniques may also have an impact on their restricted resources, such as processing power and memory size.

A Survey of Fault-Tolerance Techniques for Embedded Systems From the Perspective of Power, Energy, and Thermal Issues

IEEE Access, 2022

The relentless technology scaling has provided a significant increase in processor performance, but on the other hand, it has led to adverse impacts on system reliability. In particular, technology scaling increases the processor susceptibility to radiation-induced transient faults. Moreover, technology scaling with the discontinuation of Dennard scaling increases the power densities, and thereby the temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. To assure reliable system operation despite these potential reliability concerns, fault-tolerance techniques have emerged. Specifically, fault-tolerance techniques employ some form of redundancy to satisfy specific reliability requirements. However, the integration of fault-tolerance techniques into real-time embedded systems complicates preserving timing constraints. As a remedy, many task mapping/scheduling policies have been proposed to consider the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques aim additionally at minimizing power and energy while at the same time satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge, which is the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim at satisfying temperature constraints besides timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems' design perspective. In particular, the task mapping/scheduling policies for fault-tolerant real-time embedded systems are reviewed and classified according to their considered goals and constraints. Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones. Index Terms: fault tolerance, embedded systems, real-time computing, scheduling, power/energy minimization, thermal-aware design.

Evaluation of Software-Based Fault-Tolerant Techniques on Embedded OS's Components

2014

Software-based fault-tolerant techniques at the operating system level are an effective way to enhance the reliability of safety-critical embedded applications. This paper provides an analysis and comparison of five well-known recovery techniques, i.e., micro rebooting, recovery block, N-Version Programming (NVP), micro extension, and transactional extension, for an embedded operating system's components, from a performance point of view. These techniques are applied without any modification to the main architecture of the operating system. The techniques are implemented on a virtual ARM Integrator board which is emulated by the QEMU software (2.0.0) under the control of the Embedded Linux operating system (3.9.0). A total of 5000 software errors are injected using a simulation environment. The results show that the recovery time overhead varies between 0.17% and 0.67%, and the performance overhead varies between 5.81% and 218.65%, depending on the technique. Keywords-embedded operating ...
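As a rough illustration of one technique named in this abstract, a recovery block runs a primary variant, checks the result with an acceptance test, and falls back to alternate variants from a saved checkpoint when the test fails. The sketch below shows the generic textbook form in Python, not the paper's operating-system-level implementation. All names (`recovery_block`, `primary_sqrt`, `alternate_sqrt`, `acceptable`) are hypothetical, and the faulty primary is contrived for the example.

```python
import copy


def recovery_block(state, variants, acceptance_test):
    """Try each variant in order, each on a fresh copy of the checkpoint.

    Returns the first result that passes the acceptance test.
    Raises RuntimeError if every variant crashes or is rejected.
    """
    checkpoint = copy.deepcopy(state)  # saved state used for rollback
    for variant in variants:
        trial_state = copy.deepcopy(checkpoint)  # roll back before each attempt
        try:
            result = variant(trial_state)
        except Exception:
            continue  # a crash counts as a failed variant
        if acceptance_test(checkpoint, result):
            return result
    raise RuntimeError("all variants failed the acceptance test")


# Hypothetical example: integer square root with a deliberately faulty primary.
def primary_sqrt(state):
    return state["x"] // 2  # simulated software fault: wrong for most inputs


def alternate_sqrt(state):
    r = 0
    while (r + 1) * (r + 1) <= state["x"]:
        r += 1
    return r


def acceptable(state, r):
    # r is the integer square root of x iff r*r <= x < (r+1)*(r+1)
    return r >= 0 and r * r <= state["x"] < (r + 1) * (r + 1)


result = recovery_block({"x": 17}, [primary_sqrt, alternate_sqrt], acceptable)
```

Here the primary returns 8, which the acceptance test rejects, so the alternate runs on the restored state. In a real embedded setting the checkpoint would more likely cover a component's memory or registers than a Python dictionary.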

A Mathematical Tool for Support of Fault-Tolerant Embedded Systems Design

2007

Designers of fault-tolerant computer systems need a methodological and software framework that supports their efforts in the analysis and optimization of new design solutions based on new and forthcoming hardware and software technologies, embedded systems in particular. These new and advanced technologies (high-performance and self-reconfigurable systems, nanotechnologies) lead to unprecedented challenges.

Waves of Faults in Embedded System and Way out through Fault Tolerance

International Journal of …, 2011

Software has rapidly become an important and indispensable element in many aspects of our daily lives. If such an element does not run as needed, we have to deal with the resulting problems. The paper first focuses on the different types of faults, their impact, and fault classification. Fault handling is subdivided into different activities such as fault prediction, fault detection, fault prevention, and fault correction. Here we study faults in the context of a boiler system. Faults are classified by criteria such as external, location, duration, and effect, and as permanent, temporary, and many more. Any fault arising within the system can be avoided, prevented, or removed. We then propose different fault tolerance techniques to deal with different faults.

Application-Level Fault Tolerance in Real-Time Embedded Systems

Critical real-time embedded systems need to make use of fault tolerance techniques to cope with operation-time errors, either in hardware or software. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies in software, and may also employ some form of diversity by using different variants or versions of the same processing. This work proposes and evaluates a fault tolerance framework for supporting the development of dependable applications. This framework is built upon basic operating system services and middleware communications and brings flexible and transparent support for application threads. A case study involving radar filtering is described, and the framework's advantages and drawbacks are discussed.
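The combination of redundancy and diversity described in this abstract is often realised by running several independently written versions on the same input and voting on their outputs. The following is a minimal Python sketch of a strict-majority voter under that assumption. It is not the framework proposed in the paper; `majority_vote`, `run_redundant`, and the three `version_*` functions are hypothetical names, and the filter logic is invented for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas.

    Outputs must be hashable. Raises ValueError if no value has more
    than half of the votes.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no majority among replica outputs")
    return value


def run_redundant(variants, sample):
    """Run every variant on the same input and vote on the results."""
    return majority_vote([variant(sample) for variant in variants])


# Three hypothetical versions of the same threshold filter; one is faulty.
def version_a(sample):
    return sample > 10


def version_b(sample):
    return not (sample <= 10)


def version_c(sample):
    return sample > 100  # simulated design fault


decision = run_redundant([version_a, version_b, version_c], 42)
```

With three versions the voter masks one faulty output. This sketch does not handle a version that crashes or misses its deadline, which a real-time system would also need to treat as a failed vote.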

Experimental Evaluation of Hardware/Software Fault Tolerance

IFAC Proceedings Volumes, 2000

The paper deals with the effectiveness of fault-tolerant techniques aimed at high reliability and safety. It concentrates mostly on software redundancy techniques that also use hardware error detectors. These techniques have been checked by inserting faults into the analyzed systems and observing their behaviour. Special tools have been developed for this purpose. Some experimental results illustrate the feasibility of the proposed methods.