
Archive site preserves earliest Web pages

Like a time machine through cyberspace, the Wayback Machine includes links to how Web pages at popular sites appeared last month, last year or even five years ago.
By John Yaukey
Gannett News Service

Nostalgic for the Web of yesteryear — those golden oldie sites that have been severed from their vital hyperlinks or erased from servers?

Then surf over to the Wayback Machine, a project aimed at preserving the rapidly changing Web.

"The average life of a Web page is about 100 days," said Brewster Kahle, director of the San Francisco-based Internet Archive, which is administering the Wayback project. "Within that time, most Web pages are either changed, pulled or they're forgotten about and they fade away."

Alas, the ravages of what's known as linkrot.

But the very qualities that make the Web so valuable, its immediacy, vastness and lack of any central controlling authority, also make it difficult to preserve.

"The Web makes up a substantial part of our digital culture and a lot of it has been lost," Kahle said.

But a lot has also been saved.

The Wayback archive contains 10 billion preserved pages (a whopping 100 terabytes, or trillions of bytes, of data), and it's now growing by about 1 billion pages a month.

By comparison, the Library of Congress, the world's largest collection of books, contains 26 million volumes.

The Wayback archive contains not only Web sites, but also many of the old USENET message boards where so many geeks spent countless hours before the Web was created. For those too young to remember, USENET is a worldwide bulletin board system that contains thousands of newsgroups on virtually every conceivable subject.

While the Wayback archive can be great fun — taking you back to Yahoo!'s homepage in 1996 or tech news site ZDNet's 1997 homepage — it's also proving to be an invaluable tool for researchers.

"Look at our fascination now with TV and its effects," said Lee Rainie, director of the Washington, D.C.-based Pew Internet and American Life Project. "But we don't have much data on the early years of TV. With the Wayback Machine we're retaining some of the early development of the Web so we can ask questions such as was the Internet a socially isolating phenomenon or was it a connecting agent?"

Using the Wayback Machine is as easy as using a search engine.

Just go to the site and click the "Take Me Back!" button.