The need for preservation aware storage

Preservation DataStores: Architecture for Preservation Aware Storage

24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007), 2007

The volumes of digital information are growing continuously and most of today's information is "born digital". Alongside this trend, business, scientific, artistic and cultural needs require much of this information to be kept for decades, centuries or longer. The convergence of these two trends implies the need for storage systems that support very long term preservation for digital information. We describe Preservation DataStores, a novel storage architecture to support digital preservation. It is a layered architecture that builds upon open standards, along with the OAIS, XAM and OSD standards. This new architecture transforms the logical information-object, a basic concept in preservation systems, into a physical storage object. The transformation allows more robust and optimized implementations for preservation aware storage. The architecture of Preservation DataStores is being developed as an infrastructure component of the CASPAR project* and will be tested in the context of this project using scientific, cultural, and artistic data.

* Work partially supported by European Community under the Information Society Technologies (IST) program of the 6th FP for RTD, project CASPAR, contract IST-033572. The authors are solely responsible for the content of this paper. It does not represent the opinion of the European Community, and the European Community is not responsible for any use that might be made of data appearing therein.
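The idea of turning a logical information-object into a physical storage object can be illustrated with a minimal Python sketch. It is not the Preservation DataStores implementation: the class `InformationObject`, its fields, and the JSON envelope layout are all assumptions made for illustration, loosely following the OAIS notion of packaging content together with representation information, provenance, and fixity.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class InformationObject:
    """Hypothetical logical information-object, loosely modeled on an OAIS package."""
    content: bytes                  # the raw data being preserved
    representation_info: str        # how to interpret the content (assumed: a format label)
    provenance: list = field(default_factory=list)  # history of events (assumed shape)

    def fixity(self) -> str:
        """SHA-256 digest of the content, used here as fixity information."""
        return hashlib.sha256(self.content).hexdigest()

    def to_storage_object(self) -> bytes:
        """Serialize content and metadata together into one physical object."""
        envelope = {
            "representation_info": self.representation_info,
            "provenance": self.provenance,
            "fixity_sha256": self.fixity(),
            "content_hex": self.content.hex(),
        }
        return json.dumps(envelope, sort_keys=True).encode("utf-8")


def verify_storage_object(blob: bytes) -> bool:
    """Recompute the digest of the stored content and compare it with the stored fixity."""
    envelope = json.loads(blob.decode("utf-8"))
    content = bytes.fromhex(envelope["content_hex"])
    return hashlib.sha256(content).hexdigest() == envelope["fixity_sha256"]
```

Because the metadata travels inside the same object as the content, a storage layer that understands the envelope could in principle check fixity without consulting a separate catalog, which is one reading of what "preservation aware" storage makes possible.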

Critique of Architectures for Long-Term Digital Preservation

2009

Trusted Digital Repositories (TDRs) and Trustworthy Digital Objects (TDOs) seem to be the only generic digital preservation methodologies proposed. Before any preservation method is recommended for wide use, it should be exposed to searching analysis. Evolving technology and fading human memory threaten the long-term intelligibility of many kinds of documents. Furthermore, some records are susceptible to improper alterations that make them untrustworthy. We argue that the TDR approach has shortfalls as a method for long-term digital preservation of sensitive information. For specificity, we discuss a particular implementation. TDO methodology addresses these needs, providing for making digital documents durably intelligible. It uses EDP standards for a few file formats and XML structures for text documents. For other information formats, intelligibility is assured by using a virtual computer. To protect sensitive information-content whose inappropriate alteration might mislead its readers, the integrity and authenticity of each TDO is made testable by embedded public-key cryptographic message digests and signatures. The authenticity of the keys is protected recursively in a social hierarchy grounded by publishing keys of well-known institutions. A TDO is a specific kind of OAIS Archival Information Package convenient for sharing among repositories. The content and metadata of properly constructed TDOs are sufficient for creating the usual kinds of catalog records and search indices during repository ingestion. Comparison of TDR and TDO methodologies suggests differentiating near-term preservation measures from what is needed for the long term. The proper focus for long-term preservation technology is signed packages that each combine a record collection with its metadata and that also bind context: Trustworthy Digital Objects.

If all that stuff was worth creating, surely some of it is worth saving!

© 2009, H.M. Gladney

... more expensive to correct than they are today. This examination should seek opportunities to reduce complexity that might mislead readers. Technology for near-term preservation needs flexibility for software improvements. In contrast, technology for long-term preservation needs to be insensitive to changing technology and infrastructure. It therefore proves helpful to distinguish near-term preservation from long-term preservation.

What Is the Challenge?

What is the meaning of preservation? Does the meaning change when it is applied to electronic rather than paper-based records? ... Will current strategies for preserving electronic records ensure longevity and authenticity? ... Have effective cost models been developed? [2]

The notion of a digital preservation theory [3,4] is recent, being mentioned earlier than 2007 only in comments about shortfalls. What do people expect of a theory to think it useful? To be most helpful for engineering, a theory would exhibit at least the following characteristics.

• It would be based on broad fundamental theory that is widely accepted as germane and successful.
• It would differentiate its topic from nearby topics, particularly topics that already have good theories.

Long Term Preservation of Digital Data—Background Research

2012

The main part of this report describes the outcome of our questionnaire study on LTP systems that was performed during the second half of 2011. The study discusses types of systems deployed in memory institutions and their main features.

Evolving Domains, Problems and Solutions for Long Term Digital Preservation

We present, compare and contrast new directions in long term digital preservation as covered by the four large European Community funded research projects that started in 2011. The new projects widen the domain of digital preservation from the traditional purview of memory institutions preserving documents to include scenarios such as health-care, data with direct commercial value, and web-based data. Some of these projects consider not only how to preserve the programs needed to interpret the data but also how to manage and preserve the related workflows. Considerations such as risk analysis and cost estimation are built into some of them, and more than one of these efforts is examining the use of cloud-based technologies. All projects look into programmatic solutions, while emphasizing different aspects such as data collection, scalability, reconfigurability, and full lifecycle management. These new directions will make digital preservation applicable to a wider domain of users and will give better tools to assist in the process.

PDS Cloud: Long Term Digital Preservation in the Cloud

2013 IEEE International Conference on Cloud Engineering (IC2E), 2013

The emergence of the cloud and advanced object-based storage services provides opportunities to support novel models for long term preservation of digital assets. Among the benefits of this approach is leveraging the cloud's inherent scalability and redundancy to dynamically adapt to evolving needs of digital preservation. Preservation DataStores in the Cloud (PDS Cloud) is an OAIS-based preservation-aware storage service employing multiple heterogeneous cloud providers. It materializes the logical concept of a preservation information-object into physical cloud storage objects. Preserved information can be interpreted by deploying virtual appliances in the compute cloud, built from readily available components and provisioned with data objects together with their designated rendering software. PDS Cloud has a hierarchical data model and resource naming structure, supporting independent tenants whose assets are organized in multiple aggregations based on content and value. Each aggregation has a separate preservation profile that is reconfigurable as requirements keep changing over the long term. Continuous changes to data objects, life-cycle activities, virtual appliances and cloud providers are applied in a manner transparent to the client. PDS Cloud is being developed as an infrastructure component of the European Union ENSURE project, where it is used for preservation of medical and financial data.
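The hierarchical data model described in this abstract (tenants, aggregations, and a separate reconfigurable preservation profile per aggregation) can be pictured with a small Python sketch. This is an illustration under assumptions, not the PDS Cloud data model: the class names, the profile fields `replicas` and `fixity_check_days`, and the `tenant/aggregation/object` naming form are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PreservationProfile:
    """Hypothetical per-aggregation settings; field names are illustrative only."""
    replicas: int = 2
    fixity_check_days: int = 90


@dataclass
class Aggregation:
    """A named group of objects that shares one preservation profile."""
    name: str
    profile: PreservationProfile = field(default_factory=PreservationProfile)
    objects: dict = field(default_factory=dict)  # object id -> payload


@dataclass
class Tenant:
    """An independent tenant owning any number of aggregations."""
    name: str
    aggregations: dict = field(default_factory=dict)  # aggregation name -> Aggregation

    def aggregation(self, name: str) -> Aggregation:
        """Return the named aggregation, creating it on first use."""
        return self.aggregations.setdefault(name, Aggregation(name))


def resource_name(tenant: Tenant, aggregation: str, object_id: str) -> str:
    """Build a hierarchical name of the assumed form tenant/aggregation/object."""
    return f"{tenant.name}/{aggregation}/{object_id}"
```

In this sketch, changing `tenant.aggregation("mri-scans").profile.replicas` affects only that aggregation, which mirrors the abstract's point that each aggregation's profile can be reconfigured independently as requirements change.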

Digital Data Preservation: The Millennium CD and Graceful Degradation

Digital information is presently stored on optical disks, magnetic disks, and solid-state memory chips. The expected lifetimes of these media are poor for archival purposes: recordable optical disks, 7-15 years; magnetic disks, 30-50 years; solid-state memory, 10-12 years. Each of these numbers is for media stored under controlled conditions; the numbers get much worse if the media are stored under ordinary use conditions, which include being transported and handled.

An Overview of the Digital Preservation Storage Criteria and Usage Guide

2019

The Digital Preservation Storage Criteria (or “Criteria”) resulted from a community discussion at iPRES 2015 on providing guidance to organizations that either use or provide digital preservation storage. First developed in 2016, they have been refined in iterative versions over the last three years based on feedback gathered at conference sessions and through a survey. The Criteria are intended to help with developing requirements for, or evaluations of, preservation storage solutions; to seed discussions about preservation storage; or to use within digital preservation instructional material. The latest version of the Criteria contains sixty-one criteria grouped into eight categories: content integrity, cost considerations, flexibility, information security, resilience, scalability & performance, support, and transparency. The key new development since the Criteria were presented at the iPRES 2018 workshop is a usage guide, developed to accompany the Criteria. It includes sections ...

Towards a Theory of Digital Preservation

International Journal of Digital Curation, 2008

A preservation environment manages communication from the past while communicating with the future. Information generated in the past is sent into the future by the current preservation environment. The proof that the preservation environment preserves authenticity and integrity while performing the communication constitutes a theory of digital preservation. We examine the representation information that is needed about the preservation environment for a theory of digital preservation. The representation information includes descriptions of the preservation management policies, the preservation processes, and the state information that is needed to verify the correct working behavior of the system. We demonstrate rule-based data grids that can verify that prior policies correctly enforced preservation properties, while sending into the future descriptions of the current preservation management policies.
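The idea of verifying preservation properties by checking recorded state information against policy rules can be sketched in a few lines of Python. This is only an illustration, not the rule-based data grid software the abstract refers to: the `StateRecord` fields, the rule names, and the two-replica threshold are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class StateRecord:
    """Hypothetical state information recorded for one preserved object."""
    object_id: str
    replica_count: int
    stored_digest: str      # digest recorded at ingest
    recomputed_digest: str  # digest recomputed during the latest audit


# A rule maps a state record to True (property held) or False (violated).
Rule = Callable[[StateRecord], bool]

# Illustrative policy: rule names and thresholds are assumptions.
POLICY: Dict[str, Rule] = {
    "integrity": lambda s: s.stored_digest == s.recomputed_digest,
    "min_two_replicas": lambda s: s.replica_count >= 2,
}


def audit(records: List[StateRecord], policy: Dict[str, Rule]) -> List[Tuple[str, str]]:
    """Return (object_id, rule_name) for every rule that a record violates."""
    return [
        (rec.object_id, name)
        for rec in records
        for name, rule in policy.items()
        if not rule(rec)
    ]
```

An empty result from `audit` means every recorded state satisfied every rule. Keeping the policy as data alongside the state records is one simple way to describe, for later readers, which rules were in force when the audit ran.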