I've called my talk "The Universe is Expanding." The title is a reference to one of the early scenes in
Woody Allen's 1977 movie, Annie Hall. Little Woody Allen (called Alvy in the movie) is sitting in the doctor's office with his mother. What seems to be the problem? "Well," his mother says with evident
frustration, "he's depressed. It's something he read." "The universe is expanding," Alvy explains morosely. "The universe is everything, and if it's expanding, someday it will break apart and that will be the end of
everything." "He's even stopped doing his homework," his mother continues, with a look of total disgust. "What's the point?" Alvy counters. "What has the universe got to do with it?" his mother explodes. "You're here in
Brooklyn. Brooklyn is not expanding!"
What makes this little scene so funny, like so much of Woody Allen's humor, is the juxtaposition of the mundane and seemingly trivial with the cosmic and existential. It has the
quality of a Kafka parable or perhaps a Zen koan. Is the universe expanding or isn't it? In a sense they're both right, little Alvy and his mother. The mother can see concretely that Brooklyn isn't expanding. Its
streets, tenements, shops and the rhythm of daily life are just the same as they've been. But Alvy is right too. Whether or not the universe is really expanding (after all, it's a scientific theory), the anxiety he's feeling
is real enough. And that anxiety can have a very direct effect on his ability to function - to do his homework, or anything else for that matter.
But today Alvy is right in another sense as well - and in a way his
mother couldn't have foreseen in 1952, or whenever this scene is supposed to have taken place. Brooklyn is expanding. The Brooklyn Botanic Garden is on the Web, where you can take a virtual tour of its gardens from just
about any spot on the planet. The Brooklyn Museum is on the Web, too, as is the Brooklyn Academy of Music. There is understandably a great deal of excitement about these developments. But there is anxiety, too. Many of
us, along with Alvy, are somewhat disoriented and anxious about the changes that are sweeping through our lives.
Not long ago I came across an interesting expression of this kind of anxious confusion. In
Wired magazine, I saw a short piece called "What's a Document?" [Wired
4.08 (August 1996), p. 112; http://www.wired.com/wired/archive/4.08/document.html]. "Have you noticed that the word document doesn't mean much these days?" it begins. "It didn't used to be like this," the writer, David Weinberger of the Open Text Corporation, goes on to say. In simpler times, documents were things on paper which had some official or institutional role to play. But now, alas, the term has been co-opted by the computer manufacturers, who use it to refer to spreadsheets and Web pages and multimedia presentations, as well as to text-only files. The result is a meaningless grab bag playing havoc with a once coherent notion. The fact that we can't say what a document is anymore, the writer concludes, is a clear indication of the profound changes now taking place.
What is a document? Is it true that we can no longer say what a document is? Is it true that recent technological developments have somehow broken the mold, leaving us without the security of once useful and familiar
notions? These are the challenges I propose to take up now. With apologies to Mr. Weinberger, I believe we can say what a document is, and doing this can help us make sense of what's going on around us.
What Are Documents?
One obvious place to start is with the dictionary. A quick look at a couple of dictionaries confirms Mr. Weinberger's understanding. The Random House Dictionary
defines a document as "a written or printed paper furnishing information or evidence, as a passport, deed, bill of sale, bill of lading, etc.; a legal or official paper." The American Heritage Dictionary
calls a document "a written or printed paper that bears the original, official or legal form of something and can be used to furnish decisive evidence or information." For dictionaries, the word document
evidently has something to do with writing, paper and evidence.
Is this the end of the story - case closed? Well, not quite. The work of the lexicographer is to spell out the meaning of words, to describe
how they are used. My own interest is less in the word document
than in (what I believe to be) an underlying cultural category. Can we say something insightful about the nature of written forms without tying them necessarily to particular technologies? It's worth noting that others have traveled this route before. In a recent article, "What is a 'Document'?" [
Journal of the American Society for Information Science, 1997, 48(9), pp. 804-809], Michael Buckland explores how various pioneers in information science earlier in this century, including Paul Otlet and Suzanne
Briet, grappled with the question of what a document is. Each seems to have had an intuition that there was a more basic way of looking at documents, which didn't tie them inherently to paper. Suzanne Briet, for
example, went so far as to conclude that under certain circumstances even an antelope could be a document.
So what are documents? My answer is that they are bits of the material world -- clay, stone, animal skin,
plant fiber, sand -- that we've imbued with the ability to speak. One of the earliest characterizations of documents comes from Genesis, and curiously, it is a description of human beings, not of written forms: "God
formed Adam from the dust of the earth, and blew into his nostrils the breath of life, and Adam became a living soul." The parallel between this mythic event and the creation of actual documents is strikingly close. For
indeed, what we do when we make documents is to take the dust of the earth and breathe our breath, our voice, into it.
This way of looking at documents is hardly new. In fact, it has quite ancient roots.
It may be subtly embedded in Genesis, but it is explicitly stated in Plato's Phaedrus. Toward the end of this Socratic dialogue, Socrates and Phaedrus are discussing the nature of writing. Socrates has this to say:
You know, Phaedrus, that's the strange thing about writing, which makes it truly analogous to painting. The painter's products stand before us as though they were alive: but if you question them, they maintain
a most majestic silence. It is the same with written words: they seem to talk to you as though they were intelligent, but if you ask them anything about what they say, from a desire to be instructed, they go on
telling you the same thing again and again.
Clearly for Socrates (as for Plato) written forms are pale shadows of their human counterparts. They may speak, but they are incapable of dialogue, the Socratic path to wisdom. This is true enough. But it
fails to get at what is most extraordinary about written forms. For it is exactly in their ability to ensure the repeatability of their talk that they are most powerful. The brilliance of writing -- of creating
communicative symbols -- is the discovery of a way to make things talk, coupled with the ability to ensure the repeatability of that talk. (Plato's formulation of repeatability, that something goes on "telling you the
same thing again and again," can lead one to think that documents must preserve their talk forever. Not only is this impossible, it does an injustice to the way documents actually work. All documents are fixed and
fluid, as I have argued elsewhere [Levy, D.M. Fixed or fluid? Document stability and new media. In Proceedings of the European Conference on Hypertext Technology '94. 1994. Edinburgh, Scotland: ACM].)
What is useful about this perspective is the way it takes the focus off the technology per se. Any technologies or media that ensure repeatability will do. For many centuries our technical means have revolved around fixing
marks (symbols) in a two-dimensional substrate. But just in the last hundred years, we've figured out how to record activity (sounds and images) via film, audio and videotape. Here it can't be a question of holding
marks fixed on a surface, since activity, by definition, involves change over time. What these newer technologies do is to allow us to replay -- to repeat -- patterns of sound and image. These new communicative forms,
to use Plato's words, "go on telling you the same thing again and again." A different technological means is being used to achieve the same end.
This way of looking at documents also sets up a strong parallel between
documents and people. Each, in its own way, is a talking thing. This is hardly an accidental parallel. Documents are exactly those things we create to speak for us, on our behalf and in our absence. And in speaking for
us, they take on work, they do jobs for us. And not unlike people, they often wear uniforms which broadcast the roles they're intended to play. A newspaper, a cash register receipt, a greeting card, a detective novel --
each of these has a distinctive look that's meant to signal what it's for and the kind of content it's meant to carry. What I'm talking about, of course, is genre, a term heavily used in literary and communication
theory. Each document genre is essentially a specialized form of talk, tailored to operate in particular circumstances. It's the specialization of form and content to do a certain kind of work in the world.
The larger point is that documents are social actors. They have a social life, to use Brown and Duguid's phrase from their article "The Social Life of Documents" [First Monday, 1996, 1(1)
http://firstmonday.dk/issues/issue1/documents/]. They participate in our world, the human lifeworld, where they talk for us and do work for us. If we only focus on the things themselves (their form and content) or on
the technologies out of which they're constructed (paper, ink, printing presses, computers), we'll miss where the real action lies, literally and figuratively. To do justice to documents, to make sense of them, we need
to appreciate their socio-technical nature. This means seeing how particular technologies are marshaled to serve particular social purposes.
The Power of Documents: Documents and Social Order
Implicit in what I've been saying is that documents are powerful. They are power objects. To see this, let's first notice that speech itself is an exercise of power. We talk about free speech as an essential democratic
right. In phrases like "speaking out," "having a voice" and "giving voice to," it's clear that what's at issue is the political dimension of life, the exercise of power.
Documents speak out, and by fixing their talk
or otherwise making it repeatable, they make it possible for many people to hear what they have to say. Any document can be analyzed in terms of the power it exerts: whose interests it is serving, what work it is trying
to do, who it is trying to convince, cajole or influence. But to get a feel for the sheer magnitude of the power documents exercise, we need to look at them collectively rather than individually. We need to see them in
aggregate. To begin, let's notice how pervasive documents are. They are basically everywhere - in all corners of our lives.
But we might cast this observation slightly differently by noticing the crucial role
documents play in all our major cultural institutions. Science, law and government, religion, education and the arts, commerce and administration all rely on the stabilizing power of documents to accomplish their ends.
In the form of books and journal articles, documents are carriers of scientific knowledge. As sacred scripture they are the central artifacts around which religious traditions have been organized. As written statutes,
charters and contracts they play a crucial role in constructing and regulating lawful behavior. As works of literature, paintings and drawings, they are the tangible products of artistic practice. As textbooks and
student notes, they are crucial instruments around which learning practices are organized. As receipts and accounts, memos and forms, they are critical ingredients in the way commerce, and indeed all bureaucratic
conduct, is organized. In each of these cases the ability to hold talk fixed -- to provide communicative stability -- is crucial.
These institutions are essentially the cultural mechanism by which we create and
maintain a meaningful and orderly social world. Science and religion, each in its own way, are quests for meaning, order and intelligibility. Media, the arts and entertainment are also means by which we tell ourselves
(and continually reinforce) stories about who we are and why we're here. Education is concerned with socializing our young -- bringing them into the social order we've constructed and training them to carry the
meaning-making and order-making project forward. Government, law, commerce and administration are all about regulating human conduct -- the exchange of goods and services, the orderly procession of human affairs.
Through their extensive role in all these institutions, documents therefore play a crucial role in supporting -- in making and maintaining -- the social order. To do this is indeed to exert a great deal of power and
But documents not only support the social order, they themselves are part of it. They themselves need to be tended and taken care of, just like everything else in our world. Without physical maintenance,
documents will decay. Without constant organizing, they will become inaccessible. And without organized practices for accomplishing these aims the work would be daunting and unmanageable. This is where libraries come
in. Libraries have had the responsibility for keeping certain classes of our documents in line and in order. I say "certain classes" because the work of the modern library (from the second half of the 19th century to
the present) has been centered around the book. Its primary order-making practices -- including cataloging and classification, reference services and conservation -- were developed in support of the codex (the bound
book) and its derivatives, such as newspapers, magazines and other periodicals. These practices don't appear to accommodate digital materials readily. It is all very disorienting and confusing.
Talk about confusion: the technology itself can be remarkably disorienting. Operating systems, application software, hardware configurations, service providers -- it's quite a tangle of products,
standards and practices. Anyone wanting to venture online these days has to master, or at least to become conversant with, a whole new technical language. And it's so easy to get swept up in the maelstrom and fail to
see what's going on. But when you clear away the surface clutter, you discover that the technology is actually quite simple.
The computer, at least when used as a writing tool, is basically a souped-up printing press.
The printing press works by separating the printing plate from the printed images which are produced from it. This is a powerful idea: one plate can be used to produce a large number of identical images. Before the
invention of movable type, the plate might be a single block of wood or some other durable material in which the desired image had been carved. Movable type was a further innovation on this idea, making it possible to
create composite plates out of reusable component images (the type).
This simple but profound split -- between a template and the artifacts produced from it -- is how digital materials are structured. (See figure.) In
the digital case, the template is a digital representation: a sequence of ASCII character codes, a bitmap or some other more complex representation. This is the material (the bits) you find on a floppy disk, a hard
drive, a fileserver or some other storage medium. But while the digital representation, the digital template, is necessary, it isn't sufficient. Much like the printing plate, the digital template has a purpose outside
itself: to produce marks or images on screens or on paper (or to produce sound in the airwaves).
Figure: The split between template and artifact.
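For readers who like to see such things spelled out, the split between template and artifact can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names template, render_text and render_codes are hypothetical, and a five-byte sequence of ASCII codes stands in for whatever is actually stored on the disk.

```python
# Hypothetical illustration: the digital "template" is just a sequence of bytes.
# These five values are the ASCII character codes for the word "Hello".
template = bytes([72, 101, 108, 108, 111])


def render_text(data: bytes) -> str:
    """Produce a readable artifact by interpreting the bytes as ASCII characters."""
    return data.decode("ascii")


def render_codes(data: bytes) -> str:
    """Produce a different artifact from the same template: the numeric codes."""
    return " ".join(str(b) for b in data)


# One template can "stamp out" any number of identical artifacts.
copies = [render_text(template) for _ in range(3)]
print(copies)

# The same template can also yield a quite different artifact.
print(render_codes(template))
```

The bytes themselves are never what a person reads. What is read is one of the artifacts, and each rendering of the same kind yields the same result, which is the repeatability discussed earlier.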
The computer is therefore a generalized printing press, able to
"stamp out" text, images and sounds in a variety of media. Digital developments aren't actually as revolutionary as some would have us think. Instead, they are part of a long and continuous process of social and
technological innovation. By creating digital representations which can be easily manipulated and shipped around, we've bought ourselves the ability to do more, faster, further, cheaper, in greater quantity (and
These are considerable achievements, but they are not without cost. The split between digital template and product leads to problems that are as yet ill understood -- and certainly far from solved.
Digital representations require a complex of highly sophisticated and temperamental hardware and software to produce things that people can actually use. Digital materials are therefore vulnerable to the idiosyncratic
and only partly controllable details of their immediate environment. If you've ever tried to print a document outside your usual locale, you know exactly what I'm talking about. (And anyone who thinks that HTML or XML
or some other standard will make this go away has not been paying attention.) How will we come to terms with this new vulnerability? The problem extends beyond immediate printing and viewing to preservation. What
exactly must we try to preserve? If we just preserve the digital representation, there is no guarantee that appropriate hardware and software will be available at some indefinite future time. But even if we find ways to
preserve (or to reproduce on demand) some version of the technical context, the relevant hardware and software, this still doesn't guarantee that what is tangibly produced will be adequate. And this is hardly the only
technical challenge waiting to be resolved.
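A small sketch in Python may make this vulnerability concrete. It assumes a deliberately simple case: one stored byte sequence, read back under two different character encodings that stand in for two different technical environments. The variable names are hypothetical, and real printing and display problems involve far more than character encodings.

```python
# Hypothetical illustration: one stored template, two rendering environments.
# The word "café" is saved as bytes using the UTF-8 encoding.
template = "café".encode("utf-8")

# An environment that assumes UTF-8 reproduces what the writer intended.
as_utf8 = template.decode("utf-8")

# An environment that assumes Latin-1 reads the very same bytes differently:
# the two bytes that encode the accented letter become two separate characters.
as_latin1 = template.decode("latin-1")

print(as_utf8)
print(as_latin1)
print(as_utf8 == as_latin1)
```

Nothing in the stored bytes has changed between the two readings. Only the environment differs, and that is enough to change what a reader actually sees.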
More Than Technology
But we will miss the point if we focus only on the technology. As I suggested earlier, documents are socio-technical - they are
technologies taken up in the service of social purposes. To look at the technologies alone, or at the artifacts made with them, is to miss where the action is. At the moment, we are working out how to make and use new
kinds of talking things - digital talking things. This means more than just working out the technical details of editing, distribution, preservation and so on. It means working out the specialized forms of talk, the new
genres, in relation to the institutional structures and social practices in which these forms will operate.
It's no wonder that David Weinberger is confused. We've had a long time to work out paper-based
genres and their associated work practices. But their digital counterparts are in their infancy: digital genres are still emerging (what are home pages?), they are fluid and their relation to practice is still being
worked out. He is right in thinking that something new is afoot and right in noting that the new forms violate the dictionary definition of document. But he is wrong in thinking that current developments violate
a deeper sense of what documents are all about.
Nowhere can this turmoil be seen more clearly than with respect to library practices. As I noted earlier, libraries have had the job of ordering and stabilizing certain
classes of documents - primarily books and certain other specific forms on paper. But libraries use documents to maintain documents. The emergence of digital materials introduces uncertainty into both document domains:
the collections they maintain and the internal documents they use to maintain them. Libraries are therefore in the position of needing to figure out what to do about materials on the Web (how to catalog Web pages, which
ones to catalog, how and when to provide reference services to online materials, etc.), and also how best to make use of these same kinds of materials (Web-based catalogs, for example) to support their own internal
practices. It isn't even clear whether libraries as we now know them will survive.
No wonder it is an anxious time for libraries and for librarians. But it isn't just librarians who are feeling this way.
All of us, I believe, at some level can identify with little Woody Allen's concerns in Annie Hall.
The universe is expanding - technologically, at least. We are anxious in part because we can't yet see what
these changes will mean for our lives -- for our careers, for our children's futures, for our sense of order and well-being. At such a time, it is crucial that we move beyond a limited focus on technology to the social
questions, the questions of order and meaning in which our written forms play such an important part. What kind of lives do we want to live? What kind of social order do we want to be part of? To what extent are the
technological choices we are now making consistent with the character and quality of life we hope to maintain or to achieve?
David M. Levy is a member of the research staff at Xerox Palo Alto Research Center. He can be reached by mail at 3333 Coyote Road, Palo Alto, CA 94303; by phone at 650/812-4376; or by e-mail at