A little over a year ago, I pointed out that, as far as I could tell, the Internet Archive’s Wayback Machine appeared to be somewhat moribund. At the time, I couldn’t find any major web site whose archived copy had been updated within the previous year and a half.
Well, as my friend Tracy Seneca pointed out a while back, the Wayback Machine appears to have received an update to both data and interface. She called it a “spiffy” new interface, and I’d have to agree. It seems intuitive, informative, and useful. For example, take a look at the result for http://whitehouse.gov/ . Although there is clearly a lag in data of about six months, that is typical and shouldn’t be much of a problem for what is supposed to be an historical record rather than an up-to-the-minute one.
That’s the good news. The bad news is that web sites in the backwaters of the Interwebs may not be crawled as much or perhaps even at all. Take my own web site as an example. It hasn’t been crawled since June 2009, which by my reckoning is close to 2 years ago. Not that I blame them, mind you, as who is all that interested in the minutiae of my life? But that means that any claims to be “archiving the web” should be taken with a grain of salt. Maybe say “archiving the parts of the web that matter” or “ignoring what doesn’t matter so much”. You get the drift.
But in the end this is yet another mea culpa moment. I’m happy to have been wrong in suggesting that the Internet Archive was not maintaining the Wayback Machine, and I apologize for casting aspersions on their ability to keep the service alive. It’s there, and being updated, even if coverage is spotty in places.