Performance and Dependability Validation of Highly Parallel Fault-Tolerant Systems