Page 92 - AC/E Digital Culture Annual Report 2014

THE INNER CIRCLE ‐ APPRAISAL AND SELECTION

archive social media feeds generated by state agencies on Facebook, Twitter and Flickr, because they see these feeds as extensions of the official web-based records. This policy decision is further described in the risk management section of this paper.

Universities that archive the web sometimes take a different approach to site appraisal. They tend to archive the university web presence or create collections based on specific themes. For example, the major topic areas of the Columbia University and University of Alberta web archive collections include human rights issues and Canadian industry and culture, respectively. Translating the institution’s major objectives into a list of sites to crawl is the goal of the appraisal and selection process. To do so, the University of Alberta, for instance, works with subject liaisons to choose URLs. Appraisal and Selection is an evolving area and one we hope to learn more about from our partners.

3b. Scoping

After choosing what sites to archive, institutions must decide if they want to archive entire websites or portions thereof. This can be done before the first page is captured or after content is harvested as part of the overall collection quality review. This part of the lifecycle can be quite technical depending on the tools an institution uses.

The Archive‐It service gives institutions several ways to adjust the scoping of their crawls. First, partners can limit what they crawl by listing only part of a website as the starting point for the crawl instead of the entire website. For example, an institution could choose to archive http://www.ncgov.com/government/index.aspx instead of http://www.ncgov.com and would only capture pages nested under that URL. Archive‐It also includes other tools that can limit how much of a site is crawled.
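
The effect of choosing a narrower starting point can be illustrated with a small sketch. It is not Archive‐It's actual scoping logic, which the report does not describe; it only assumes that "nested under that URL" means "on the same host and inside the same directory as the seed". The function names `scope_prefix` and `in_scope` are hypothetical, and every URL other than the two seeds quoted above is made up for illustration.

```python
from urllib.parse import urlsplit

# The two seeds quoted in the text above.
NARROW_SEED = "http://www.ncgov.com/government/index.aspx"
WHOLE_SITE_SEED = "http://www.ncgov.com"


def scope_prefix(seed: str) -> str:
    """Return the directory part of the seed's path, always ending in '/'.

    Simplification (an assumption of this sketch): the last path segment
    is treated as a file name, so '/government/index.aspx' yields
    '/government/' and an empty path yields '/'.
    """
    path = urlsplit(seed).path
    return path.rsplit("/", 1)[0] + "/"


def in_scope(url: str, seed: str) -> bool:
    """True if url is on the seed's host and nested under its directory."""
    target, start = urlsplit(url), urlsplit(seed)
    if target.hostname != start.hostname:
        return False
    # An empty path (bare host) is treated as the site root.
    return (target.path or "/").startswith(scope_prefix(seed))


if __name__ == "__main__":
    # Hypothetical candidate pages a crawler might discover.
    candidates = [
        "http://www.ncgov.com/government/agencies.aspx",
        "http://www.ncgov.com/tourism/index.aspx",
        "http://example.org/government/index.aspx",
    ]
    for url in candidates:
        print(url, in_scope(url, NARROW_SEED), in_scope(url, WHOLE_SITE_SEED))
```

Under these assumptions, the narrow seed admits only pages below `/government/`, while the whole-site seed admits every page on the same host.
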
Recent survey results show that 73% of respondents report that they use a host‐constraining tool at least sometimes. This tool allows partners to block specific hosts, or sub‐sections of a site, from being archived. For example, an institution may not want to collect third-party images that may be hosted on a target website. Limiting the duration of a crawl through