Common Framework For Automated Diagnosis on Distributed Information System
Many information systems suffer from a common problem: when a system fails to function properly, it is often difficult to determine which part of the system is the source of the problem. Detecting and locating the source can take days, even though the fault itself is usually repaired quickly once found. This delays recovery from failure, so the system has low availability and becomes unreliable. This paper studies several approaches to diagnosing the root cause of a system failure without detailed knowledge of the system's structure, and proposes a common framework for diagnosis based on these approaches. We deploy sensors inside the system to monitor its behavior and runtime properties, which yields information that helps diagnose faults. By combining this information with the success or failure observed at the application layer and applying statistical analysis, the user can infer which sensors report the data most relevant to the failure. This enables fast fault detection, which in turn speeds up system recovery.
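The abstract does not say which statistical analysis is used, so the following is only a minimal sketch of the general idea, under stated assumptions: each request yields one numeric reading per sensor plus a success/failure flag from the application layer, and sensors are ranked by the absolute Pearson correlation between their readings and the failure flag. The sensor names, the sample data, and the functions `pearson` and `rank_sensors` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no information about the outcome.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_sensors(readings, failed):
    """Rank sensors by how strongly their readings track application-layer failure.

    readings: dict mapping sensor name -> list of numeric readings, one per request
    failed:   list of booleans, True where the request failed
    Returns a list of (sensor name, score) pairs, highest score first.
    """
    outcome = [1.0 if f else 0.0 for f in failed]
    scores = {
        name: abs(pearson(values, outcome))
        for name, values in readings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sensor data for six requests (assumed, for illustration only).
readings = {
    "queue_depth": [3, 4, 50, 5, 48, 4],
    "cpu_percent": [40, 42, 41, 39, 43, 40],
    "open_sockets": [10, 10, 10, 10, 10, 10],
}
failed = [False, False, True, False, True, False]

ranking = rank_sensors(readings, failed)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the sensor whose readings change most consistently with failures is listed first, which is the kind of hint the framework is meant to give an operator. Correlation is a deliberately simple stand-in; it does not establish causation, and a real deployment would need far more observations than six requests.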