I love the Wayback Machine, but it has some bizarre and crippling flaws that keep it from reliably preserving the web’s content. In fact, the last 5 or 6 times I went to recover old content via the Wayback Machine, the Internet Archive had lost all of the content it had previously saved.
This can happen in two ways. I already wrote about one of them: “Wayback Machine Error: Page cannot be displayed due to robots.txt”. The other is when a website is 301 redirected.
How this happens
The Wayback Machine may keep a site’s content for years, even after the site goes offline or is shut down. But if the site is later redirected to a new site, Wayback somehow magically “loses” all of the old content. I wonder if the content is still there on their servers and just inaccessible from the web interface. Hmm…
This is an example of some old, deleted Examiner.com content. So in this case, I went ahead and clicked on “Save this URL in the Wayback Machine” even though the URL was NOT on the web. I just wanted to see what would happen. And what happened is exactly what happens any time a site is 301 redirected.
Wayback changed the URL to AXS.com.
So the old, original article, which was about “Occupy Orlando”, is now lost, and the archived URL now just points to the AXS.com home page.
This is bizarre.
I looked and was unable to find anyone at the Internet Archive to reach out to. I’d like to make them aware of this problem. It must be a mistake! Isn’t the entire point of the Internet Archive to … you know … archive the Internet?
Did you lose content from the Wayback Machine?
It happens over and over, with everything from small sites to larger publishers that go away. In 2016 Examiner.com shut down, and more recently the Internet lost LAist.com, SFist.com and DCist.com. Hundreds of thousands of pages that the Internet Archive DID have saved keep coming up missing because of this “flaw”.
I can’t imagine it was designed to work this way.
If you lost your content or know a solution to this problem, please comment below. Hundreds of people reached out to me when Examiner.com closed, and they will thank you for it.
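One thing that may be worth trying in the meantime is asking the Wayback Machine’s CDX server directly for captures that returned a 200 status, instead of relying on the calendar view, which follows the redirect. The following is only a minimal Python sketch, not a confirmed fix: I can’t promise it surfaces anything for a redirected site. The function names are my own, and the URL in the usage note is a made-up placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url, limit=20):
    """Build a CDX query for captures of page_url that returned HTTP 200."""
    params = {
        "url": page_url,
        "output": "json",
        "fl": "timestamp,original,statuscode",
        # Skip captures that recorded the 301 redirect itself.
        "filter": "statuscode:200",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_links(cdx_rows):
    """Turn CDX JSON rows (the first row is the header) into snapshot URLs."""
    if not cdx_rows:
        return []
    header, *rows = cdx_rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{row[ts]}/{row[orig]}" for row in rows]


def list_captures(page_url, limit=20):
    """Fetch the capture list over the network and return snapshot URLs."""
    with urlopen(build_cdx_query(page_url, limit), timeout=30) as resp:
        body = resp.read().decode("utf-8")
    # An empty body means the server returned no matching captures.
    return snapshot_links(json.loads(body) if body.strip() else [])
```

Calling `list_captures("example.com/some-old-article")` (a hypothetical URL) returns a list of direct snapshot links, one per capture that came back with a 200 status. If that list is empty for a page you know was archived, that at least tells you the captures are not exposed through this index either.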