A Perspective on the Future of the Magnetic Hard Disk Drive (HDD) Technology
Related papers
Failure Trends in a Large Disk Drive Population
2007
It is estimated that over 90% of all new information produced in the world is being stored on magnetic media, most of it on hard disk drives. Despite their importance, there is relatively little published work on the failure patterns of disk drives and the key factors that affect their lifetime. Most available data are based either on extrapolation from accelerated aging experiments or on relatively modest-sized field studies. Moreover, larger population studies rarely have the infrastructure in place to collect health signals from components in operation, which is critical information for detailed failure analysis.
Improved disk-drive failure warnings
IEEE Transactions on Reliability, 2002
Improved methods are proposed for disk drive failure prediction. The SMART (Self Monitoring and Reporting Technology) failure prediction system is currently implemented in disk drives. Its purpose is to predict the near-term failure of an individual hard disk drive and issue a backup warning to prevent data loss. Two experimental tests of SMART showed only moderate accuracy at low false alarm rates. (A false alarm rate of 0.2% of total drives per year implies that roughly 20% of drive returns would be good drives, relative to the ≈1% annual failure rate of drives.) This requirement for very low false alarm rates is well known in medical diagnostic tests for rare diseases, and the methodology used there suggests ways to improve SMART.
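The base-rate arithmetic in the parenthetical above can be sketched as follows (an illustration, not code from the paper; the function name and the simplifying assumption that every failing drive is flagged are mine):

```python
# Why low false-alarm rates matter when the underlying failure rate is low.
# Values match the abstract's example: ~1% annual failure rate, 0.2% false-alarm rate.

def good_drive_fraction(failure_rate, false_alarm_rate, detection_rate=1.0):
    """Fraction of returned drives that are actually healthy.

    Returned drives = correctly flagged failing drives + false alarms.
    """
    true_positives = failure_rate * detection_rate
    false_positives = false_alarm_rate  # healthy drives wrongly flagged
    return false_positives / (true_positives + false_positives)

frac = good_drive_fraction(failure_rate=0.01, false_alarm_rate=0.002)
print(f"{frac:.0%} of returns would be good drives")  # roughly one in six
```

With these inputs the fraction comes out near one sixth, in line with the abstract's "about 20%" figure.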
Assessment of current health of hard disk drives
2009
After investigating several different degradation signatures that can potentially characterize aging and failure of computer hard disk drives (HDDs), we identified that the reported uncorrectable, hardware ECC recovered, and read/write error rate parameters can provide good degradation signatures for assessing the condition and remaining useful life of HDDs. Using these signatures as inputs, we developed a neural network model to assess the current health of an HDD. We collected extensive data by conducting experiments on 13 HDDs in an accelerated degradation mode; these experiments generated several hundred data points over the drives' operating life. We used two thirds of these data points for computing the neural network parameters and the rest for evaluating the accuracy of model predictions. The results indicate that the trained neural network is able to assess the health of an HDD correctly in 88 out of 100 instances.
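A minimal sketch of the evaluation protocol described above (a two-thirds/one-third split with test-set accuracy). The synthetic data and the threshold "model" below are made-up stand-ins for the paper's measured signatures and neural network:

```python
# Hypothetical illustration of the train/test protocol, not the paper's model.
import random

random.seed(0)
# each record: (reported_uncorrectable, ecc_recovered, rw_error_rate, healthy?)
data = [(random.random(), random.random(), random.random(),
         random.random() > 0.3) for _ in range(300)]
random.shuffle(data)

split = 2 * len(data) // 3          # two thirds for fitting parameters
train, test = data[:split], data[split:]

# stand-in "model": predict healthy iff the mean signature is below a
# threshold fitted on the training set (the paper uses a neural network)
threshold = sum(sum(r[:3]) / 3 for r in train) / len(train)

def predict(record):
    return sum(record[:3]) / 3 < threshold

accuracy = sum(predict(r) == r[3] for r in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Because the labels here are random, the reported accuracy is meaningless; the point is only the split-and-score structure of the evaluation.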
Data Loss Prevention From Magnetic Hard Disk Drive
This research paper presents my work on preventing data loss from hard drives, particularly loss caused by hardware issues, which are a very common problem for magnetic hard disk drives. Several technologies exist to improve this situation, and other techniques can reduce hardware failures of hard disk drives and improve the efficiency of the total system, including its data loss prevention. Apart from addressing man-made error, more robust electrical parts and better hardware optimization can help prevent data loss.
Are disks the dominant contributor for storage failures?
ACM Transactions on Storage, 2008
Building reliable storage systems becomes increasingly challenging as the complexity of modern storage systems continues to grow. Understanding storage failure characteristics is crucially important for designing and building a reliable storage system. While several recent studies have been conducted on understanding storage failures, almost all of them focus on the failure characteristics of one component, disks, and do not study other storage component failures. This paper analyzes the failure characteristics of storage subsystems. More specifically, we analyzed the storage logs collected from about 39,000 storage systems commercially deployed at various customer sites. The data set covers a period of 44 months and includes about 1,800,000 disks hosted in about 155,000 storage shelf enclosures. Our study reveals many interesting findings, providing useful guidelines for designing reliable storage systems. Some of our major findings include: (1) In addition to disk failures, which contribute 20-55% of storage subsystem failures, other components such as physical interconnects and protocol stacks also account for significant percentages of storage subsystem failures.
LOGI: an empirical model of heat-induced disk drive data loss and its implications for data recovery
Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering
Disk storage continues to be an important medium for data recording in software engineering, and recovering data from a failed storage disk can be expensive and time-consuming. Unfortunately, while physical damage instances are well documented, existing studies of data loss are limited, often only predicting times between failures. We present an empirical measurement of patterns of heat damage on indicative, low-cost commodity hard drives. Because damaged hard drives require many hours to read, we propose an efficient, accurate sampling algorithm. Using our empirical measurements, we develop LOGI, a formal mathematical model that, on average, predicts sector damage with precision, recall, F-measure, and accuracy values of over 0.95. We also present a case study on the usage of LOGI and discuss its implications for file carver software. We hope that this model is used by other researchers to simulate damage and bootstrap further study of disk failures, helping engineers make informed decisions about data storage for software systems.
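The four figures quoted above are standard binary-classification metrics computed over per-sector predictions. A small illustration with made-up labels (1 = damaged sector, 0 = intact), not data from the paper:

```python
# Standard precision/recall/F-measure/accuracy from a confusion matrix.
def metrics(y_true, y_pred):
    tp = sum(t and p for t, p in zip(y_true, y_pred))            # hit
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))      # false alarm
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))      # miss
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, f_measure, accuracy

# toy example: 1 = damaged sector, 0 = intact
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
p, r, f, a = metrics(y_true, y_pred)
print(p, r, f, a)  # 0.75 each for this toy example
```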
A Stochastic Analysis of Hard Disk Drives
International Journal of Stochastic Analysis
We provide a stochastic analysis of hard disk performance, including a closed form solution for the average access time of a memory request. The model we use covers a wide range of types and applications of disks, and in particular it captures modern innovations like zone bit recording. The derivation is based on an analytical technique we call "shuffling", which greatly simplifies the analysis relative to previous work and provides a simple, easy-to-use formula for the average access time. Our analysis can predict performance of single disks for a wide range of disk types and workloads. Furthermore, it can predict the performance benefits of several optimizations, including short-stroking and mirroring, which are common in disk arrays.
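As a rough companion to the model described above (not the paper's "shuffling" derivation), the expected access time of a single random request is commonly decomposed into average seek time, average rotational latency, and transfer time; the drive parameters below are illustrative, not from the paper:

```python
# Back-of-envelope expected access time for one random request.
def avg_access_time_ms(avg_seek_ms, rpm, request_kb, transfer_mb_s):
    rotation_ms = 60_000 / rpm          # one full revolution, in ms
    avg_latency_ms = rotation_ms / 2    # expected wait: half a turn
    transfer_ms = request_kb / 1024 / transfer_mb_s * 1000
    return avg_seek_ms + avg_latency_ms + transfer_ms

# e.g. a 7200 RPM drive: 8.5 ms average seek, 4 KB request, 150 MB/s media rate
t = avg_access_time_ms(avg_seek_ms=8.5, rpm=7200, request_kb=4, transfer_mb_s=150)
print(f"{t:.2f} ms")
```

For these numbers, seek and rotational latency dominate; transfer time for a small request is negligible, which is why optimizations like short-stroking (restricting seeks to a narrow band of tracks) pay off.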
Improving Energy Efficiency and Reliability of Disk Storage Systems
Numerous energy saving techniques have been developed to aggressively reduce energy dissipation in parallel disks. However, many existing energy conservation schemes have substantial adverse impacts on disk reliability. To remedy this deficiency, in this paper we address the problem of making tradeoffs between energy efficiency and reliability in parallel disk systems. Among the several factors affecting disk reliability, the two most important, disk utilization and age, are the focus of this study. We built a mathematical reliability model to quantify the impacts of disk age and utilization on the failure probabilities of mirrored disk systems. In light of the reliability model, we proposed a novel concept of a safe utilization zone, within which energy dissipation in disks can be reduced without degrading reliability. We developed two approaches to improving both the reliability and energy efficiency of disk systems through disk mirroring and utilization control, forcing disk drives to operate within safe utilization zones. Our utilization-based control schemes seamlessly integrate reliability with energy saving techniques in the context of fault-tolerant systems. Experimental results show that our approaches can significantly improve reliability while achieving high energy efficiency for disk systems under a wide range of workload situations.
Evaluating the Reliability of Storage Systems
Modern storage systems are often large, complex distributed systems. Current techniques for evaluating their reliability function require the solution of a system of differential equations. We present a more elementary, intuitive approach that focuses on the steady-state behavior of each storage organization as it goes through repeated cycles of failures followed by repairs. As a result, our approach immediately provides a purely algebraic method for computing both the average failure rate and the mean time to failure. We show how to apply our technique to model the high infant mortality of disk drives and the behavior of the so-called S.M.A.R.T. drives, which can warn users of impending disk failures.
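The cycle-based quantities such an analysis works with can be illustrated with textbook formulas (illustrative numbers only; the mirrored-pair expression is the classic Markov-model result, not this paper's derivation):

```python
# Steady-state quantities for a repairable disk with constant failure
# rate (lambda) and repair rate (mu), the building blocks of cycle-based
# "failure then repair" reliability analyses.
HOURS_PER_YEAR = 8766

failure_rate = 1 / 1_000_000   # lambda: failures per hour (MTTF = 10^6 h)
repair_rate = 1 / 24           # mu: repairs per hour (MTTR = 24 h)

mttf = 1 / failure_rate
mttr = 1 / repair_rate
availability = mttf / (mttf + mttr)  # fraction of time in the "up" state

# classic result for a mirrored pair with repair: data is lost only when
# the second disk fails during the first one's repair window
pair_mttdl = (3 * failure_rate + repair_rate) / (2 * failure_rate ** 2)

print(f"availability: {availability:.6f}")
print(f"pair MTTDL: {pair_mttdl / HOURS_PER_YEAR:.0f} years")
```

Note how the mirrored pair's mean time to data loss exceeds a single disk's MTTF by several orders of magnitude: fast repair shrinks the window in which a second failure can cause data loss.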