Page 80 - AC/E Digital Culture Annual Report 2014
THEME 7

The Web Archiving Life Cycle Model

by Kristine Hanna
Director of Archiving Services at the Internet Archive
https://archive.org/about/bios.php

INTRODUCTION

The technological tools for archiving the web have been evolving steadily for more than a decade. However, best practices and a common model of web archiving have yet to emerge. The Web Archiving Life Cycle Model is an attempt to incorporate the technological and programmatic arms of web archiving into a framework that will be relevant to any organization seeking to archive the web. Archive-It, the leading web archiving service in the community, developed the model based on its work with memory institutions around the world.

The Internet Archive has been archiving the web since 1996. In 2002, the Internet Archive released Heritrix, the open source web crawler. In 2009, the Heritrix crawler’s file output, the WARC file, was adopted as an ISO standard for web archiving, demonstrating both the prevalence of active web archiving programs and the importance of the web crawler itself. In early 2006, the Internet Archive launched the Archive-It web archiving service (www.archive-it.org) with 13 pilot partner institutions. Archive-It is a subscription web archiving service that helps partner organizations harvest, build, and manage born-digital collections. The partner base has expanded steadily since launch, reaching 237 partners in 46 U.S. states and 15 countries as of January 2013.

Despite growth in the number of web archiving programs, many institutions still struggle with developing best practices and methodologies to accomplish their goals. This difficulty partially stems from constantly evolving web technology, which can make it difficult to archive certain types of content effectively.
Conflicting and evolving policy decisions from various stakeholders, as well as shifting organizational structures and job responsibilities, pose further obstacles to establishing best practices. Additionally, some organizational stakeholders have not fully adopted the belief that web archiving is crucial to their digital preservation activities, and as a result, funding remains limited or nonexistent.

In order to address the lack of standard best practices and to increase awareness of the importance of web archiving as fundamental to digital preservation, the Archive-It team developed the Web Archiving Life Cycle Model (WALCM). This model is based on the team’s experiences as well as lessons learned by countless partner institutions, including in-depth case studies from six of those institutions. The WALCM is an attempt to represent common workflows and create a measurable model for organizations to reference in order to create or improve their web archiving programs.

Archive-It is a web archive service that follows the Web Archiving Life Cycle Model

WHERE WE ARE HEADING: DIGITAL TRENDS IN THE WORLD OF CULTURE