
UK Web Archive

From Wikipedia, the free encyclopedia

The UK Web Archive is a consortium of the six UK legal deposit libraries which aims to collect all UK websites at least once each year.[1]

UK Web Archive
Established: 2005
Reference to legal mandate: Yes, provided in law by:
Website: Official website
Libraries providing access to the archive (map of the United Kingdom): George IV Bridge, National Library of Scotland; Moving Image Archive; near Boston Spa; St. Pancras, London; Aberystwyth; NLW Reading Room at Cardiff University Library; Cambridge University Library; Weston Library; Trinity College Library.

History

In 2005, the British Library, The National Archives, Wellcome Trust, National Library of Scotland, National Library of Wales and JISC formed the UK Web Archiving Consortium, a project to archive websites.[3]

UKWAC archived selected websites by licence or permission, using PANDAS software developed by the National Library of Australia. During the project its members collected sites relevant to their interests: the Wellcome Library collected medical sites, and the national libraries collected sites that reflect life in contemporary Wales or Scotland. The British Library worked with a broad policy of collecting sites of cultural, historical and political importance to the UK.[4]

The Consortium wound up in 2010. The Archiving and Preservation Working Group took over UKWAC's role of co-ordinating web archiving in the UK. The Digital Preservation Coalition hosts the working group.[5]

Web Archiving

The archive undertakes an annual crawl of .uk and other UK geographic Top Level Domains such as .scot, .cymru or .london.
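
As a rough illustration of what scoping a crawl by top-level domain could look like, the sketch below checks whether a URL's host ends in one of the domains named above. The function name `in_uk_scope`, the suffix list and the example URLs are assumptions made for illustration; the archive's actual scoping rules are not described here and may be broader.

```python
from urllib.parse import urlsplit

# Suffixes named in the text above; a real crawl scope may include more.
UK_TLDS = (".uk", ".scot", ".cymru", ".london")


def in_uk_scope(url: str) -> bool:
    """Return True if the URL's host ends in one of the listed UK domains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return host.endswith(UK_TLDS)


# Hypothetical candidate URLs, not real crawl seeds.
urls = [
    "https://www.example.co.uk/about",
    "http://example.scot/",
    "https://example.com/uk",
    "https://news.example.london/today",
]
in_scope = [u for u in urls if in_uk_scope(u)]
```

The check looks only at the host name, so a path such as `/uk` on a `.com` site does not bring it into scope.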

Figure: A graph showing a small part of a UK Web Archive website crawl. Every circle is a different website, and every line represents a link that was followed between websites. The size of the circle represents how many pages were visited from that site, and the width of the line represents the number of links followed. (UKWA Crawls: one hour in one minute)
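
The quantities behind such a graph can be sketched as a simple aggregation: pages visited per site give the circle size, and links followed between two sites give the line width. The crawl log below is invented for illustration, and the names `followed_links`, `pages_by_site` and `links_between` are assumptions, not part of any UK Web Archive tool.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical crawl log: (page the link was found on, page it led to).
followed_links = [
    ("http://a.example.uk/", "http://a.example.uk/news"),
    ("http://a.example.uk/news", "http://b.example.uk/"),
    ("http://a.example.uk/", "http://b.example.uk/about"),
    ("http://b.example.uk/", "http://c.example.scot/"),
]


def host(url: str) -> str:
    """Return the host name of a URL, or an empty string if it has none."""
    return urlsplit(url).hostname or ""


pages_by_site = {}         # site -> distinct pages visited (circle size)
links_between = Counter()  # (source site, target site) -> links followed (line width)

for source, target in followed_links:
    for url in (source, target):
        pages_by_site.setdefault(host(url), set()).add(url)
    # Only links that cross from one site to another become lines in the graph.
    if host(source) != host(target):
        links_between[(host(source), host(target))] += 1

circle_size = {site: len(pages) for site, pages in pages_by_site.items()}
```

Links within a single site are counted towards that site's pages but are left out of `links_between`, which matches a graph whose lines join different websites.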

The crawl is archived in a shared infrastructure called the Digital Library System. Members of the public can nominate sites for preservation there through the UKWA website. The whole web archive is available to registered readers on library premises; where permission has been given, or licence conditions can be met, copies are also accessible through the website.[6]

The archive gathers sites in response to events, building collections; these have preserved writing and imagery recording natural disasters, election campaigns since 2005 and the UK's blogosphere for research, among more than a hundred others.[7]

SHINE

A graph showing the percentage of the dataset in which the phrases "millenium bug" or "y2k issue" occur, between the years of 1996 and 2013. Both trends rise to a maximum in 1999, followed by a decline, following much the same shape.
SHINE graph showing how often different phrases for "year 2000 problem" appear between the years of 1996 and 2013 on archived .uk webpages.

The UK Web Archive holds a collection of all the .uk websites that were archived by the Internet Archive until the end of March 2013.[8] SHINE is a web interface which can be used to create repeatable lists of results from historical .uk pages. Its trends view charts occurrences of keywords on .uk pages in the data set over that time, and uses concordance to show keywords in context.[9]
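
The two ideas in this paragraph, a trend as the share of pages per year containing a phrase and a concordance showing the phrase in context, can be sketched in a few lines. This is not SHINE's implementation; the records, the function names `trend` and `concordance`, and the `width` parameter are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (crawl year, page text). Not real archive data.
pages = [
    (1998, "Is your business ready for the millennium bug?"),
    (1998, "Local football results"),
    (1999, "Millennium bug helpline opens"),
    (1999, "Fix the Y2K issue before December"),
    (2001, "Holiday cottages in Wales"),
]


def trend(records, phrase):
    """Percentage of pages per year whose text contains the phrase."""
    totals, hits = defaultdict(int), defaultdict(int)
    needle = phrase.lower()
    for year, text in records:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    return {year: 100 * hits[year] / totals[year] for year in sorted(totals)}


def concordance(records, phrase, width=15):
    """Keyword-in-context lines: the phrase with up to `width` characters either side."""
    needle = phrase.lower()
    lines = []
    for year, text in records:
        start = text.lower().find(needle)
        if start != -1:
            end = start + len(needle)
            lines.append((year, text[max(0, start - width):end + width]))
    return lines


bug_trend = trend(pages, "millennium bug")
bug_context = concordance(pages, "millennium bug")
```

Expressing the trend as a percentage of pages per year, rather than a raw count, keeps years with larger crawls from dominating the chart.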

Mementos

Memento is a name for prior versions of web pages coined by the Memento Project. The UK Web Archive Memento interface allows Mementos to be found across web archives.[10] The interface can be used to find a Memento by its date in a snapshot table, or see how often a site appears across public web archives.
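
A minimal sketch of the two lookups described above, assuming a snapshot table is already in hand: picking the Memento nearest to a wanted date, and counting how often a site appears in each archive. The table contents and archive names are hypothetical. The Memento protocol (RFC 7089) lets a client ask for a date with an `Accept-Datetime` request header, so the sketch also formats such a header; it makes no network request and does not reflect how the UK Web Archive interface is built.

```python
from collections import Counter
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical snapshot table: (capture time, archive holding the copy).
snapshots = [
    (datetime(2013, 4, 2, tzinfo=timezone.utc), "archive-a"),
    (datetime(2016, 9, 30, tzinfo=timezone.utc), "archive-b"),
    (datetime(2020, 1, 15, tzinfo=timezone.utc), "archive-a"),
]


def closest_memento(table, wanted):
    """Return the snapshot whose capture time is nearest to `wanted`."""
    return min(table, key=lambda row: abs(row[0] - wanted))


def accept_datetime(wanted):
    """Format a datetime as an HTTP date in GMT for an Accept-Datetime header."""
    return format_datetime(wanted.astimezone(timezone.utc), usegmt=True)


wanted = datetime(2017, 1, 1, tzinfo=timezone.utc)
best = closest_memento(snapshots, wanted)
header = {"Accept-Datetime": accept_datetime(wanted)}

# How often the site appears in each archive.
per_archive = Counter(name for _, name in snapshots)
```

Choosing the nearest capture in either direction is one possible policy; a researcher who needs the page as it stood on a given day might instead prefer the latest capture before that date.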

Researching the archive

Research into the web as a reflection of society has helped develop access to the archive.[11] Libraries have developed guides to the research skills needed to use web archives. These include using big data to see patterns or trends,[12] or writing citations for archived copies of websites.[13]

GLAM Workbench

GLAM Workbench is a project which looks at how researchers can use data preserved by galleries, libraries, archives and museums.[14] It includes a collection of Jupyter notebooks which draw on Mementos and index data.[15] The notebooks mix description and editable code to help researchers find evidence in web archives.

Where the whole archive can be accessed, by library:
Bodleian Libraries; British Library; Cambridge University Libraries; National Library of Scotland; National Library of Wales; Trinity College Dublin

See also

References

  1. "UKWA Home". www.webarchive.org.uk. Retrieved 2020-10-13.
  2. "The Legal Deposit Libraries (Non-Print Works) Regulations 2013". legislation.gov.uk. Retrieved February 21, 2022.
  3. "15 Years of the UK Web Archive - The Early Years - UK Web Archive blog". blogs.bl.uk. Archived from the original on 8 March 2020. Retrieved 2020-10-13.
  4. "UK Web Archiving Consortium: Evaluation Report". Digital Preservation Coalition. April 2006. Archived from the original on 9 January 2017. Retrieved 17 March 2014.
  5. "Web Archiving & Preservation Working Group - Digital Preservation Coalition". www.dpconline.org. Archived from the original on 31 July 2020. Retrieved 2020-10-13.
  6. "What is the UK Web Archive?". UK Web Archive. Archived from the original on 5 December 2019. Retrieved 17 March 2014.
  7. "15 Years of UKWA - Looking back at our first collections - UK Web Archive blog". blogs.bl.uk. Archived from the original on 29 July 2020. Retrieved 2020-10-19.
  8. www.webarchive.org.uk. "JISC UK Web Domain Dataset (1996-2013)". data.webarchive.org.uk. Retrieved 2020-10-16.
  9. "Trend results 1996-2013 for "big data" :: SHINE". www.webarchive.org.uk. Retrieved 2020-10-13.
  10. "Mementos - Archived history of www.webarchive.org.uk". Mementos - Finding historical archives across the world wide web. Retrieved 2020-10-09.
  11. Blaney, Jonathan (19 April 2016). "More project case studies available". Big UK Domain Data for the Arts and Humanities. Archived from the original on 16 February 2017. Retrieved 2020-10-09.
  12. McNally, Anna. "LibGuides: Finding and Using Digital Archives during COVID-19: Web archives". libguides.westminster.ac.uk. Retrieved 2020-10-14.
  13. Thomas, Susan. "Oxford LibGuides: Web Archives: Home". ox.libguides.com. Retrieved 2020-10-14.
  14. "Welcome to the GLAM Workbench - GLAM Workbench". glam-workbench.github.io. Retrieved 2020-10-13.
  15. Sherratt, Tim; Jackson, Andrew (2020-06-15). "GLAM-Workbench/web-archives". Zenodo. Bibcode:2020zndo...3894079S. doi:10.5281/zenodo.3894079.
  16. Team, National Records of Scotland Web (2013-05-31). "NRS Web Continuity Service". National Records of Scotland. Archived from the original on 18 January 2020. Retrieved 2020-10-13.
  17. "Search the PRONI Web Archive". nidirect. 2015-12-09. Archived from the original on 27 Aug 2020. Retrieved 2020-10-13.
  18. "MirrorWeb - UK Parliament Web Archive". webarchive.parliament.uk. Retrieved 2020-10-13.

External links