Long Term Preservation of Digital Data: Background Research
Related papers
Critique of Architectures for Long-Term Digital Preservation
2009
Trusted Digital Repositories (TDRs) and Trustworthy Digital Objects (TDOs) seem to be the only generic digital preservation methodologies proposed. Before any preservation method is recommended for wide use, it should be exposed to searching analysis. Evolving technology and fading human memory threaten the long-term intelligibility of many kinds of documents. Furthermore, some records are susceptible to improper alterations that make them untrustworthy. We argue that the TDR approach has shortfalls as a method for long-term digital preservation of sensitive information. For specificity, we discuss a particular implementation. TDO methodology addresses these needs, providing for making digital documents durably intelligible. It uses EDP standards for a few file formats and XML structures for text documents. For other information formats, intelligibility is assured by using a virtual computer. To protect sensitive information, that is, content whose inappropriate alteration might mislead its readers, the integrity and authenticity of each TDO is made testable by embedded public-key cryptographic message digests and signatures. The authenticity of the keys is protected recursively in a social hierarchy grounded by publishing keys of well-known institutions. A TDO is a specific kind of OAIS Archival Information Package convenient for sharing among repositories. The content and metadata of properly constructed TDOs are sufficient for creating the usual kinds of catalog records and search indices during repository ingestion. Comparison of TDR and TDO methodologies suggests differentiating near-term preservation measures from what is needed for the long term. The proper focus for long-term preservation technology is signed packages that each combine a record collection with its metadata and that also bind context: Trustworthy Digital Objects.
If all that stuff was worth creating, surely some of it is worth saving!
© 2009, H.M. Gladney
... more expensive to correct than they are today. This examination should seek opportunities to reduce complexity that might mislead readers. Technology for near-term preservation needs flexibility for software improvements. In contrast, technology for long-term preservation needs to be insensitive to changing technology and infrastructure. It therefore proves helpful to distinguish near-term preservation from long-term preservation.
What Is the Challenge?
What is the meaning of preservation? Does the meaning change when it is applied to electronic rather than paper-based records? ... Will current strategies for preserving electronic records ensure longevity and authenticity? ... Have effective cost models been developed? [2]
The notion of a digital preservation theory [3,4] is recent, being mentioned earlier than 2007 only in comments about shortfalls. What do people expect of a theory to think it useful? To be most helpful for engineering, a theory would exhibit at least the following characteristics.
• It would be based on broad fundamental theory that is widely accepted as germane and successful.
• It would differentiate its topic from nearby topics, particularly topics that already have good theories.
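The abstract above says that each TDO makes its integrity testable through embedded message digests and signatures. The following is a minimal sketch of the digest half of that idea only, assuming a SHA-256 digest computed over canonicalized metadata plus content. The function names `build_package` and `verify_package` and the package layout are hypothetical illustrations, not the TDO format described in the paper. A digest alone does not establish authenticity, since anyone who can alter the content can also recompute the digest; the public-key signature the abstract mentions would cover that and is omitted here.

```python
import hashlib
import json


def _digest(content: bytes, metadata: dict) -> str:
    # Serialize metadata with sorted keys so the same metadata always
    # produces the same bytes, then hash metadata and content together.
    canonical_metadata = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical_metadata + b"\x00" + content).hexdigest()


def build_package(content: bytes, metadata: dict) -> dict:
    """Bundle content with its metadata and an embedded SHA-256 digest.

    Hypothetical layout for illustration; not the paper's TDO structure.
    """
    return {
        "metadata": metadata,
        "content": content,
        "sha256": _digest(content, metadata),
    }


def verify_package(package: dict) -> bool:
    """Return True if the embedded digest still matches content and metadata."""
    return _digest(package["content"], package["metadata"]) == package["sha256"]


if __name__ == "__main__":
    package = build_package(b"minutes of the 2009 meeting", {"title": "Minutes"})
    print(verify_package(package))            # intact package
    tampered = dict(package, content=b"minutes of the 2008 meeting")
    print(verify_package(tampered))           # altered content
```

Binding the metadata into the same digest as the content reflects the abstract's point that a package should combine a record with its metadata, so that neither can be changed silently without the other.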
Long-term archiving of digital data
2010
E-government applications have to archive data or documents for long retention periods of 100 years or more. This requires storing digital data on stable media and ensuring that the file formats can be read by available software. Both applications and media technology have only short life spans. Thus, data has to be migrated at frequent intervals onto new data carriers and to new file formats. However, original file versions usually need to be retained permanently. In terms of cost, stability and technology independence, microfilm storage offers a promising solution for off-line storage. This paper reports on a feasibility study analysing encoding techniques that allow digital data to be saved onto microfilm, testing data recovery as well as cost issues.
The Long Term Data Preservation (LTDP) functional user requirements and system requirements are presented in this paper, showing the most appropriate architectures in relation to the main scenarios identified by the ESA FIRST and LAST activities. Following a classical engineering process, functional user requirements concerned with long term data preservation of scientific data (Earth Science) are captured and analysed, leading to the definition of the System Requirements and consequently to the most appropriate system designs and architectures. In this context, the users of the system have different roles, mainly classified as consumers and data holders of EO archives. Initially, the functional user requirements involving consumers were identified in the context of the FIRST activity, including an analysis of business needs requiring and justifying long term preservation and performing an analysis of typical scientific mission phases, along with a preliminary contribution to the iden...
The need for preservation aware storage
ACM SIGOPS Operating Systems Review, 2007
Digital Preservation deals with ensuring that digital data stored today can be read and interpreted tens or hundreds of years from now. At the heart of any solution to the preservation problem lies a storage component. This paper characterizes the requirements for such a component, defines its desirable properties and presents the need for preservation-aware storage systems. Our research is conducted as part of CASPAR, a new European Union (EU) integrated project on the preservation of data for very long periods of time. The position presented was developed while designing the storage foundation for the CASPAR software framework.
An Overview of the Digital Preservation Storage Criteria and Usage Guide
2019
The Digital Preservation Storage Criteria (or “Criteria”) resulted from a community discussion at iPres 2015 on providing guidance to organizations that either use or provide digital preservation storage. First developed in 2016, they have been refined in iterative versions over the last three years based on feedback gathered at conference sessions and through a survey. The Criteria are intended to help with developing requirements for, or evaluations of, preservation storage solutions; to seed discussions about preservation storage; or to use within digital preservation instructional material. The latest version of the Criteria contains sixty-one criteria grouped into eight categories: content integrity, cost considerations, flexibility, information security, resilience, scalability & performance, support, and transparency. The key new development since the Criteria was presented at the iPRES 2018 workshop is a usage guide, developed to accompany the Criteria. It includes sections ...
A FRAMEWORK FOR IDENTIFYING UNCERTAINTIES IN LONG-TERM DIGITAL PRESERVATION
With the current expansion in digital information comes an increasing need to preserve such assets. The ENSURE (Enabling kNowledge Sustainability, Usability and Recovery for Economic value) project, a research project under the European Community's Seventh Framework Programme, is the parent project to this research area, and its aim is to conduct advanced research to address the challenges of Long Term Digital Preservation (LTDP) to ensure the successful preservation, availability and accessibility of preserved data in the future. Focusing on identifying uncertainties in the LTDP activities and their impact on cost and economic performance of digital preservation systems, this paper discusses a framework to identify uncertainties in LTDP for interested business sectors.
Long-term Inactive Data Retention through Tape Storage Technology
INFuture2009 - Digital Resources and Knowledge Sharing, 2009
Increasingly, the need to retain digital documents indefinitely for legal, administrative or historical purposes is simply leading to a “save everything forever” approach. The authors argue that, for technological reasons, it is much easier to preserve large numbers of documents in electronic form than on paper. Thus the selection procedures tend to be less restrictive than they used to be. Nevertheless, for most organizations it would be impossible to sustain this data growth forever. Archives, libraries, museums, institutions holding cultural heritage, as well as other companies and firms, are implementing solutions for creating digital archives, digital libraries, digital repositories and other types of storage systems aiming at long-term preservation of digital materials. Most of the data held in such systems are inactive for a long time, i.e. only a small set of data is frequently retrieved. Therefore, because every organization has specific needs, the storage planning process and the technology used for storage and long-term preservation require an individual approach. The focus of this paper is on the retention of long-term inactive data through tape storage technology. The authors discuss current state-of-the-art tape storage capabilities, and their advantages and disadvantages as a long-term storage and preservation solution.
Evolving Domains, Problems and Solutions for Long Term Digital Preservation
We present, compare and contrast new directions in long term digital preservation as covered by the four large European Community funded research projects that started in 2011. The new projects widen the domain of digital preservation from the traditional purview of memory institutions preserving documents to include scenarios such as health-care, data with direct commercial value, and web-based data. Some of these projects consider not only how to preserve the programs needed to interpret the data but also how to manage and preserve the related workflows. Considerations such as risk analysis and cost estimation are built into some of them, and more than one of these efforts is examining the use of cloud-based technologies. All projects look into programmatic solutions, while emphasizing different aspects such as data collection, scalability, reconfigurability, and full lifecycle management. These new directions will make digital preservation applicable to a wider domain of users and will give better tools to assist in the process.
A System for Long-Term Document Preservation
Archiving Conference
This paper analyzes the requirements and describes a system designed for retaining records and ensuring their legibility, interpretability, availability, and provable authenticity over long periods of time. In general, information preservation is accomplished not by any one single technique, but by avoiding all of the many possible events that might cause loss. The focus of the system is on preservation in the 10 to 100 year time span: a long enough period such that many difficult problems are known and can be addressed, but not unimaginable in terms of the longevity of computer systems and technology. The general approach focuses on eliminating single points of failure (single elements whose failure would cause information loss), combined with active detection and repair in the event of failure. Techniques employed include secret sharing, aggressive "preemptive" format conversion, metadata acquisition, active monitoring, and using standard Internet storage services in a novel way.
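The general approach in this abstract, avoiding single points of failure and pairing that with active detection and repair, can be illustrated with a small sketch. It assumes several replicas of one record held in memory, treats the digest shared by a strict majority as correct, and overwrites any disagreeing copy. The function name `audit_and_repair` and the majority-vote rule are assumptions made for illustration; the abstract does not say how the described system detects or repairs damage.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Compare replicas by digest and overwrite the ones that disagree.

    `replicas` maps a storage location name to the bytes held there.
    Returns the names of the replicas that were repaired. Raises
    ValueError when no strict majority agrees, because in that case
    the sketch cannot tell which copy is the good one.
    """
    if not replicas:
        raise ValueError("no replicas to audit")
    digests = {name: sha256_hex(data) for name, data in replicas.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(replicas) / 2:
        raise ValueError("no majority among replicas; cannot repair safely")
    # Pick any copy whose digest matches the majority as the repair source.
    good_copy = next(
        data for name, data in replicas.items() if digests[name] == majority_digest
    )
    repaired = []
    for name in replicas:
        if digests[name] != majority_digest:
            replicas[name] = good_copy
            repaired.append(name)
    return repaired


if __name__ == "__main__":
    stores = {"site_a": b"record v1", "site_b": b"record v1", "site_c": b"recorX v1"}
    print(audit_and_repair(stores))   # names of replicas that were overwritten
    print(stores["site_c"])           # content after repair
```

With three or more independent copies, the loss or corruption of any single one can be detected and undone, which is the sense in which replication removes a single point of failure. Running the audit on a schedule would correspond to the active monitoring the abstract lists.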