Learning from failure: The case of the disappearing web site
Related papers
Approaches To The Preservation Of Web Sites
Web sites have a tendency to disappear, which can result in the loss of valuable scholarly and cultural resources. Although there is a wide range of tools available for mirroring Web resources, there is still no clear agreement on the strategies which should be adopted for the preservation of Web sites, and the various approaches are fraught with technical and resourcing difficulties, as well as legal and copyright issues. This paper describes various approaches to the preservation of Web sites and outlines some of the technical challenges. Recommendations on best practices for Web site managers and funding bodies are given.
Why web sites are lost (and how they're sometimes found)
Communications of the ACM, 2009
We have surveyed 52 individuals who have "lost" their own personal website (through a hard drive crash, bankrupt ISP, etc.) or tried to recover a lost website that once belonged to someone else. Our survey investigates why websites are lost and how successful individuals have been at recovering them using a variety of methods, including the use of search engine caches and web archives. The findings suggest that personal and third-party loss of digital data is likely to continue as methods for backing up data are overlooked or performed incorrectly, and individual behavior is unlikely to change because of the perception that losing digital data is very uncommon and the responsibility of others.
It Takes A Village To Save The Web: The End Of Term Web Archive
Documents to the People, 2012
The goal of the project team was to execute a comprehensive harvest of the federal government domains (.gov, .mil, .org, etc.) in the final months of the Bush administration, and to document changes in the federal government websites as agencies transitioned to the Obama administration. This collaborative effort was prompted by the announcement that the National Archives and Records Administration (NARA), which had conducted harvests of prior administration transitions, would not be archiving agency websites during the 2008 transition. This announcement prompted considerable debate about the role of NARA in web archiving and the value of archiving websites in their totality. It also came just as the International Internet Preservation Consortium (IIPC) held its 2008 General Assembly. All five project partners are IIPC members, and were able to convene an immediate meeting to discuss what actions should be taken. With little time and no funding, the five End of Term (EOT) Project organizations responded together with the range of skills and resources needed to build the archive. The End of Term Web Archive (eotarchive.cdlib.org) includes federal government websites in the legislative, executive, and judicial branches of government. It holds over 160 million documents harvested from 3,300 websites, and represents sixteen terabytes of data. This article
An Exploratory Study of Advantages and Disadvantages of Website Preservation
Record and Library Journal, 2021
Record and Library Journal is a scientific journal that encompasses library science, records, information, and documentation. It is a medium for researchers, academicians, professionals, practitioners, and students who are interested in the world of librarianship and records. The journal also facilitates knowledge sharing from the results of studies, case studies, book reviews, and literature reviews.
Web Archiving in the UK: Current Developments and Reflections for the Future
2017
This work presents a brief overview of the history of Web archiving projects in some English-speaking countries, paying particular attention to the development of, and main problems faced by, the UK Web Archive Consortium (UKWAC) and the UK Web Archive partnership in Britain. It highlights, in particular, the changeable nature of Web pages through constant content removal and/or alteration and the technological innovations recently brought by Web 2.0 applications, discussing how these factors affect Web archiving projects. It also examines different collecting approaches, the limitations of harvesting software, and how the current copyright and deposit regulations in the UK covering digital content are failing to support Web archive projects in the country. From the perspective of user access, this dissertation offers an analysis of UK Web archive interfaces, identifying their main drawbacks and suggesting how these could be improved in order to better respond to users' information needs and their access to archived Web content.
Maintaining the Web: Web Archiving, Labour and the Internet Archive
2017
Web archives – including social media archives – have become a critical resource for accessing historical snapshots of the Web. Beyond the increased promotion of web archives as a source for scholarly research, a growing number of examples point towards the use of web archives as tools for political accountability (e.g. Politwoops); to provide temporary access points during times of restricted web access (e.g. during the 2013 US government shutdown); and to reconstruct deleted domains (e.g. Ben-David’s (2016) work on the former Yugoslav top-level domain) – to name a few. Growing concerns and public debates over the trustworthiness of online media have positioned both web archiving and web archives as necessary and legitimate sources in the face of an ever-shifting ‘ephemeral Web,’ political unrest and algorithmically-generated access to web-based information.
The evolution of web archiving
International Journal on Digital Libraries, 2016
Web archives preserve information published on the web or digitized from printed publications. Much of this information is unique and historically valuable. However, the lack of knowledge about the global status of web archiving initiatives hampers their improvement and collaboration. To overcome this problem, we conducted two surveys, in 2010 and 2014, which provide a comprehensive characterization of web archiving initiatives and their evolution. We identified several patterns and trends that highlight challenges and opportunities. We discuss these patterns and trends, which make it possible to define strategies, estimate resources, and provide guidelines for research and development of better technology. Our results show that in recent years there was significant growth in the number of initiatives and of countries hosting them, in the volume of data, and in the number of contents preserved. While this indicates that the web archiving community is dedicating a growing effort to preserving digital information, other results presented throughout the paper raise concerns, such as the small amount of archived data in comparison with the amount of data that is being published online.
Preservation of Web Resources: The JISC PoWR Project
2008
This paper describes the work of the JISC-funded PoWR (Preservation Of Web Resources) project which is developing a handbook on best practices and advice aimed at UK higher and further educational institutions for the preservation of Web sites and Web resources.
A Systematic Approach Towards Web Preservation
Information Technology and Libraries, 2019
The main purpose of the article is to divide the web preservation process into small explicable stages and design a step-by-step web preservation process that leads to creating a well-organized web archive. A number of research articles about web preservation projects and web archives were studied, and a step-by-step systematic approach for web preservation was designed. The proposed comprehensive web preservation process describes and combines the strengths of different techniques observed during the study for preserving digital web contents in a digital web archive. For each web preservation step, different approaches and possible implementation techniques have been identified that can be adopted in digital archiving. The potential value of the proposed model is to guide archivists, related personnel, and organizations to effectively preserve their intellectual digital contents for future use. Moreover, the model can help to initiate a web preservation process and create a well-organ...