Brewster Kahle, perhaps best known as the developer of the Wide Area Information Server (WAIS), presented the opening plenary address at the 2001 ASIST Annual Meeting on Sunday, November 4, in Washington, DC.
The concept of providing universal access to human knowledge overwhelms many of the people who think about it. But that doesn't mean it's not a worthwhile goal. Kahle says he's striving to do just that. In the process, he's created the Internet Archive (www.archive.org) – a collection taking up more electronic space than the Library of Congress.
Kahle says the last people to attempt to provide universal access to human knowledge were the ancient Greeks, with their concept of the encyclopedia and their library at Alexandria. Experts say the Greeks managed to provide access to about half their world's knowledge. Today, we're continually adding information to our archives, but we're losing ground. Still, now that digital technology is more widespread, we're in a position to address the goal again. Kahle notes that many people today are concerned about the legal issues that accompany such a goal. But what is our civic responsibility in this matter? He says legislators all want libraries; they just don't know how libraries fit in now.
Some critics say publishers should do all the archiving. In Kahle's view, this approach has some problems. For one, libraries' role as third-party checks would go unserved. For example, he downloaded the Adobe eBook version of Project Gutenberg's copy of Alice in Wonderland, only to find that its rules say the eBook cannot be copied, printed, lent or read aloud. Having publishers keep the only copy of an item just won't work, he believes.
How can the people archiving the Web deal with the rights issues? What does it mean to "lend" digital materials? What implications does that have for interlibrary loan? Every time someone reads something, is it copied? When the Archive began collecting in 1996, observers predicted attorneys would descend on it. So far, Kahle says, that hasn't happened. He says the head of the Copyright Office told him that while people can preemptively sue someone for copyright violations, they usually send a "cease and desist" letter first. An earlier Web system, DejaNews, adopted a "post and purge the complainers" strategy: it built a search interface to materials that had been publicly distributed for free in the past and gave the original posters the ability to opt out. This is the same system the search engines use. Some organizations, such as the New York Times, opt out parts of their sites. Kahle said, "The newspaper of record refuses to be recorded."
Kahle's Alexa Web collection (he took the name to honor the great library at Alexandria) now comprises more than 100 terabytes. It covers 16 million sites, and more than 10 billion pages have been added over the past five years. Alexa has more text than the Library of Congress, and the digital storage costs only $300,000. While not everything saved is of the highest value, you can find real gems in the collection if you have the proper search tools.
The free Alexa service shows which organization originally hosted a Web page and when. There's subject indexing, too, bringing users to related and competitive links built by using path and link analysis. So far, there are 80 million "catalog entries" to places on the Web served by the Alexa service.
In 1996, the Internet Archive cooperated with the Smithsonian Institution to collect information on that year's election. In 2000, the Library of Congress commissioned the Archive to create an archive of the presidential election. The election collection runs between two and three terabytes.
And just this fall, the Internet Archive put up the Wayback Machine (the title taken from the old Peabody and Sherman cartoons) to archive out-of-print Web pages. Kahle showed his audience the way the Yahoo home page looked in December 1996. He also visited the White House site from the same period and brought up President Clinton's remarks from September 10, 1996, on efforts to combat terrorism. In some ways, things haven't changed that much. "If you don't have a memory, society really loses something," Kahle says.
"Help people remember, learn and create," Kahle says. "What a great thing to wake up and do every morning!" The ideal, according to Kahle, is for anyone to be able to walk into any library and gain access to the world's
collections. He told the audience to imagine a shoeless, HIV-positive child in Uganda who walks a day to get to a library and can then access the latest medical information. The idea is profound, he says, adding that we can do this only if ASIST members and others help.
The traditional approach to library borrowing is for a library to buy a copy of a book and lend it to its patrons. Video streaming represents a new variation on that theme. The TelevisionArchive (www.tvarchive.org) is making available copies of TV news broadcasts. For the events of September 11, there's a collection of television coverage from around the world. Kahle says the Archive recorded all Web sites and 20 TV channels for a one-week period, from September 11 through September 18. The archive is a way to gauge world reaction – were there really people cheering in the streets about the terrorist attacks? This way, you can look at TV programs from Islamic nations instead of just watching American reporting.
Kahle concluded by noting the goal is to provide "universal access to human knowledge, one page at a time, one patron at a time."
But he can't do it all – the job requires many people trained to do it. It's up to us as a community to provide access to these materials.
For Further Reading
Kahle, B., Prelinger, R. & Jackson, M.E. (2001). Public access to digital material. D-Lib Magazine, [Online], 7. Available: