In 2019, Getty began a website redesign project, changing the technology stack and updating the way we interact with our communities online. The legacy website contained more than 19,000 web pages, and we knew many were no longer useful or relevant and should be retired, possibly after being archived. This led us to leverage the content we’d captured using the Internet Archive’s Archive-It service.
We’d been crawling our site since 2017 but had treated the results more as a record of institutional change over time than as an archival resource to be consulted after deletion of a page. We needed to direct traffic to our Wayback Machine captures, ensuring that deleted pages remain accessible when a user requests a deprecated URL. We decided to dynamically display a link to the archived page from our site’s 404 error “Page not found” page.
Getty.edu 404 error “Page not found” message including the dynamically generated instructions and Internet Archive page link.
The project to audit all existing pages required us to educate content owners across the institution about web archiving practices and purpose. We developed processes for completing human reviews of large amounts of captured content. This work is described in more detail in a 2021 Digital Preservation Coalition blog post that mentions the Web Archives Collecting Policy we developed.
In this blog post we’ll discuss the work required to use the Internet Archive’s data API to add the necessary link on our 404 pages pointing to the most recent Wayback Machine capture of a deleted page.
Technical Underpinnings
Implementation of our Wayback Machine integration was very straightforward from a technical point of view. The first example in the Wayback Machine APIs documentation page provided the technical guidance needed for our use case: displaying a link to the most recent capture of any page deleted from our website. With no requirements for authentication, management of keys, or platform-specific software development kit (SDK) dependencies, our development process was simplified. We chose to incorporate the Wayback API using Nuxt.js, the web framework used to build the new Getty.edu site.
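To make the lookup concrete, here is a minimal TypeScript sketch of the kind of call involved. It is not the code running on Getty.edu. It assumes the availability endpoint and JSON response shape shown in the Wayback Machine APIs documentation, and the helper names (buildAvailabilityUrl, extractSnapshotUrl, findArchivedCopy) are hypothetical, chosen only for illustration. When no timestamp is passed, the endpoint reports the most recent available capture.

```typescript
// Response shape of the availability endpoint, per the Wayback Machine
// APIs documentation. Only the fields used below are typed.
interface WaybackSnapshot {
  available: boolean;
  url: string;       // link to the capture on web.archive.org
  timestamp: string; // capture time, formatted YYYYMMDDhhmmss
  status: string;    // HTTP status recorded for the capture
}

interface WaybackAvailability {
  archived_snapshots: { closest?: WaybackSnapshot };
}

const AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available";

// Hypothetical helper: build the lookup URL for a page that has been
// removed from the live site. No API key or authentication is required.
function buildAvailabilityUrl(pageUrl: string): string {
  return `${AVAILABILITY_ENDPOINT}?url=${encodeURIComponent(pageUrl)}`;
}

// Hypothetical helper: return the capture link, or null when the
// response contains no usable snapshot (archived_snapshots is empty).
function extractSnapshotUrl(data: WaybackAvailability): string | null {
  const closest = data.archived_snapshots?.closest;
  return closest && closest.available ? closest.url : null;
}

// Hypothetical helper a 404 page could call with the requested URL.
// Any failure resolves to null so the page can show a plain 404 message.
async function findArchivedCopy(pageUrl: string): Promise<string | null> {
  try {
    const response = await fetch(buildAvailabilityUrl(pageUrl));
    if (!response.ok) return null;
    const data = (await response.json()) as WaybackAvailability;
    return extractSnapshotUrl(data);
  } catch {
    return null;
  }
}
```

A 404 page component could call findArchivedCopy with the URL the visitor requested and render the archive link only when the result is not null. Keeping the URL building and response parsing in separate pure functions means both can be checked without a network connection.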