Soft-error detection through software fault-tolerance techniques

SEDSR: Soft Error Detection Using Software Redundancy

Journal of Software Engineering and Applications, 2012

This paper presents a new method for soft error detection using software redundancy (SEDSR) that is able to detect transient faults. Soft errors damage the control flow and data of programs, and designers usually use hardware-based solutions to handle them. Software-based techniques for soft error detection impose less cost and delay on systems and do not change their configuration. These methods are therefore appropriate alternatives to hardware-based techniques. SEDSR has two separate parts, one for data error detection and one for control flow error detection. Fault injection is used to compare SEDSR with previous methods in this field based on a new parameter, the "Evaluation Factor", which takes into account fault coverage, memory overhead and performance overhead. These parameters are important in real-time safety-critical applications. Experimental results on SPEC2000 and some traditional benchmarks of this field show that SEDSR's evaluation factor is about 50% better than that of other methods. These results show that SEDSR successfully balances the existing tradeoff between fault coverage, performance overhead and memory overhead.

Improving error detection with selective redundancy in software-based techniques

2013 14th Latin American Test Workshop - LATW, 2013

This paper presents an analysis of the impact of selective software-based techniques to detect faults in microprocessor systems. A set of algorithms is implemented and compiled for a microprocessor, and selected variables of the code are hardened with software-based techniques. Seven different methods for choosing which variables are hardened are introduced and compared. The system is implemented on a miniMIPS microprocessor and a fault injection campaign is performed to verify the feasibility and effectiveness of each selective fault tolerance approach. The results can help designers choose more wisely which variables of the code should be hardened, considering detection rates and hardening overheads.

Hybrid Technique for Soft Error Detection in Dependable Embedded Software: a First Experiment

2019 IEEE XXVIII International Scientific Conference Electronics (ET), 2019

Embedded systems’ hardware can be impacted by soft errors, which cause either a data flow error or a control flow error in the systems’ software. To counter such errors, numerous software-implemented techniques have been proposed, each detecting one of the two error types. However, few techniques are designed to detect both. This paper aims to fill that gap by proposing a software-implemented technique designed to detect both data flow and control flow errors, called Data and Control Flow Error Detection (DCFED). We verified the technique using a fault injection campaign and compared the measured results to those of a similar technique, called Software Implemented Error Detection (SIED). The results show that DCFED achieves a higher error detection ratio.

An Improved Data Error Detection Technique for Dependable Embedded Software

2018 IEEE 23rd Pacific Rim International Symposium on Dependable Computing (PRDC), 2018

This paper presents a new software-implemented data error detection technique called Full Duplication and Selective Comparison. Our technique combines the ideas of existing techniques in order to increase the fault detection ratio and decrease the imposed code size and execution time overhead. As the name suggests, we opt to duplicate the entire code base and place comparison instructions in critical basic blocks only. The critical basic blocks are the blocks with two or more incoming edges. We evaluate our technique by implementing it for several case studies and by performing fault injection experiments. We then compare the obtained results to the parameters of three established techniques: Error Detection by Diverse Data and Duplicated Instructions, Critical Block Duplication and Software Implemented Fault Tolerance. The results show an average increase of 20.5% in fault detection ratio and an average decrease in code size and execution time overhead of 12.6% and 0.5%, respectiv...
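The definition of a critical basic block given in this abstract (a block with two or more incoming edges) can be illustrated with a minimal sketch. The control-flow graph representation (a dictionary mapping each block name to its successor list), the function name `critical_blocks` and the toy graph are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def critical_blocks(cfg):
    """Return the basic blocks that have two or more incoming edges.

    `cfg` maps each block name to the list of its successor blocks.
    This representation is a hypothetical choice for this sketch.
    """
    incoming = Counter()
    for successors in cfg.values():
        for succ in successors:
            incoming[succ] += 1  # count every edge that enters `succ`
    return sorted(block for block, count in incoming.items() if count >= 2)


# Toy if/else diamond: "join" is reached from both branches.
cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}

print(critical_blocks(cfg))  # prints ['join']
```

Under this reading, comparison instructions would be placed only in `join`, while the duplicated computation runs in every block.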

A methodology for the generation of efficient error detection mechanisms

2011

A dependable software system must contain error detection mechanisms and error recovery mechanisms. Software components for the detection of errors are typically designed based on a system specification or the experience of software engineers, with their efficiency typically being measured using fault injection and metrics such as coverage and latency. In this paper, we introduce a methodology for the design of highly efficient error detection mechanisms. The proposed methodology combines fault injection analysis and data mining techniques in order to generate predicates for efficient error detection mechanisms. The results presented demonstrate the viability of the methodology as an approach for the development of efficient error detection mechanisms, as the predicates generated yield a true positive rate of almost 100% and a false positive rate very close to 0% for the detection of failure-inducing states. The main advantage of the proposed methodology over current state-of-the-art approaches is that efficient detectors are obtained by design, rather than by using specification-based detector design or the experience of software engineers.
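The two metrics quoted in this abstract, true positive rate and false positive rate, can be computed for any candidate detector predicate as sketched below. The state layout, the example predicate and the labeled samples are invented for illustration only; the abstract does not describe the form of the generated predicates.

```python
def detection_rates(predicate, samples):
    """Compute (true positive rate, false positive rate) for a detector.

    `samples` is an iterable of (state, is_failure_inducing) pairs, as one
    might label them after a fault injection run. `predicate(state)` returns
    True when the detector flags the state. Both are hypothetical inputs.
    """
    tp = fp = fn = tn = 0
    for state, is_failure_inducing in samples:
        flagged = predicate(state)
        if is_failure_inducing and flagged:
            tp += 1
        elif is_failure_inducing:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Invented example: flag any state whose "speed" variable exceeds 100.
samples = [
    ({"speed": 120}, True),
    ({"speed": 130}, True),
    ({"speed": 90}, True),
    ({"speed": 50}, False),
    ({"speed": 110}, False),
    ({"speed": 40}, False),
    ({"speed": 30}, False),
]
tpr, fpr = detection_rates(lambda s: s["speed"] > 100, samples)
```

A predicate of the quality reported in the abstract would give a `tpr` close to 1.0 and an `fpr` close to 0.0 on the failure-inducing states observed during fault injection.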

Detecting Soft Errors by a Purely Software Approach: Method, Tools and Experimental Results

Embedded Software for SoC, 2004

This paper describes a software technique for detecting soft errors occurring in processor-based digital architectures. The detection mechanism is based on a set of rules that transform the target application into a new one with the same functionality but able to identify bit-flips arising in memory areas as well as those perturbing the processor's internal registers. Experimental results from fault injection sessions and preliminary radiation test campaigns performed on a complex DSP processor provide objective figures about the efficiency of the proposed error detection technique. Table 3: Detection efficiency and error rate for both program versions (CMA Original and CMA Hardened).
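This abstract does not list the transformation rules themselves. As an illustration only, the sketch below assumes one rule that is common in software-implemented hardening: every variable is stored twice, every write updates both copies, and every read checks that the copies agree. The class names and the `inject_bit_flip` helper are hypothetical and exist only to simulate a fault.

```python
class SoftErrorDetected(Exception):
    """Raised when the two copies of a variable disagree."""


class DuplicatedInt:
    """Integer stored twice so that a bit-flip in one copy is detectable."""

    def __init__(self, value):
        self._primary = value
        self._shadow = value

    def write(self, value):
        # Assumed rule: every write updates both copies.
        self._primary = value
        self._shadow = value

    def read(self):
        # Assumed rule: every read checks the copies for consistency.
        if self._primary != self._shadow:
            raise SoftErrorDetected("copies of the variable differ")
        return self._primary

    def inject_bit_flip(self, bit):
        # Test helper only: simulate a soft error in the primary copy.
        self._primary ^= 1 << bit


counter = DuplicatedInt(10)
counter.write(counter.read() + 1)   # normal operation, value is now 11
counter.inject_bit_flip(3)          # simulated fault flips bit 3 of one copy
try:
    counter.read()
    detected = False
except SoftErrorDetected:
    detected = True
```

This sketch only detects a mismatch; it cannot tell which copy is corrupted, so recovery would need additional redundancy.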

Effectiveness and limitations of various software techniques for "soft error" detection: a comparative study

Proceedings Seventh International On-Line Testing Workshop, 2001

Politecnico di Torino. This paper deals with different software-based strategies for the on-line detection of bit-flip errors arising in microprocessor-based digital architectures as a consequence of interaction with radiation. Fault injection experiments highlight the detection capabilities and the limitations of each of the studied techniques.

Proposing an Efficient Software-based Method to Enhance Reliability of Computer Systems against Soft Errors

2017

In recent years, along with rapid developments in technology, computer systems have become increasingly integrated and modular. The reliability and efficiency of computer systems are therefore of high significance, and the quantitative evaluation and optimization of reliability indexes in computer systems is a crucial issue. Enhancing the reliability of computer systems against electromagnetic and radiation interference is a critical requirement in industry. Enhancing reliability can prevent system failures and faults and avoid financial and human losses. Software can play an important role in reducing the number of errors in programs and can thereby enhance the reliability of computer systems. In this paper, an efficient software-based method was proposed to enhance the reliability of computer systems against soft errors. The results of the experiments revealed that the proposed method had lower overhead, higher eff...

Enhancing Software Reliability against Soft-Error using Minimum Redundancy on Critical Data

2017

Software systems play a major role in human life, and software has become an indispensable part of modern society. Establishing and maintaining software reliability is therefore essential so that errors, failures and disasters can be prevented. Errors in a program should be detected and identified, and software reliability should be measured and investigated, so as to prevent the spread of errors. To this end, different methods have been proposed in the literature on software reliability; however, the majority of the proposed methods are inefficient and undesirable due to their high overhead, vulnerability, excessive redundancy and high data replication. The method introduced in this paper identifies vulnerable data of the program and uses a class diagram and the proposed formula. Also, by applying minimum redundancy and duplication on 70% of the critical data o...