Organizing resources on the Internet is both an old and a radically new problem. Describing and organizing information resources for retrieval has a long history of professional practice with a wide repertory of tools and experience to back it up. But the Internet is overwhelming in the variety, transience and sheer numbers of resources presented. The various communities of producers, users and value-added agencies, such as libraries, are scrambling to cope with this phenomenon by any means available.
In its initial years the Internet has relied heavily on tools and methods, such as Web Crawlers, that require little or no human intervention or systematization. But it has long (in Internet Time) been apparent that an approach based only on full-text indexing of the contents of Internet sites is not a fully adequate solution for providing access to these resources. We need means to augment and enrich the "self-description" of materials and to encourage creators and third-party agencies to engage in this task. Adding information about a resource, or "metadata," is an essential basis for better organization of resources. Metadata can enhance the probability that a pertinent resource will be retrieved, provide a clearer overview of a subject area and improve the user's ability to discriminate among similar sources.
Metadata is used to document information about resources, such as Web sites, and often provides an "index" or "directory" to the resource. It may reside as a header to a resource or be linked to it by other means. It provides a user (human or machine) with a means to discover that the resource exists and how it might be obtained or accessed. It can cover many aspects, such as subject content, creators, publishers, quality, structure, history, access rights and restrictions, relationship to other works or appropriate audience.
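As a minimal sketch of the idea in the preceding paragraph, the Python fragment below holds a metadata record as simple name/value pairs, renders it as HTML header tags that could reside with the resource, and checks one field for a search term. The field names, the sample values and the helper functions `to_meta_tags` and `matches` are illustrative assumptions, not drawn from any particular standard or system discussed in this section.

```python
from html import escape

# Hypothetical record: field names echo the aspects listed above
# (subject, creator, publisher, rights, audience) and are illustrative only.
record = {
    "subject": "Metadata; Internet resources",
    "creator": "Example Author",
    "publisher": "Example Press",
    "rights": "Freely available",
    "audience": "Librarians",
}


def to_meta_tags(rec):
    """Render each field as an HTML <meta> tag suitable for a page header."""
    return [
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in rec.items()
    ]


def matches(rec, field, term):
    """Return True if the term occurs in the named field (case-insensitive)."""
    return term.lower() in rec.get(field, "").lower()


if __name__ == "__main__":
    for tag in to_meta_tags(record):
        print(tag)
    print(matches(record, "subject", "internet"))
```

A description held this way can be read by a person or by a program, which is the sense in which metadata serves both human and machine users.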
But such an undertaking raises many problems. What is worth cataloging? Who will provide the descriptions? How can the needs of different communities for different kinds of metadata be accommodated? Can or should the extraordinarily heterogeneous resources themselves be placed within a single framework? At what level, both of detail and structure, should such descriptions be standardized? When and by whom? How can we ensure that resources, once described, can be located throughout their lives? How do we deal with the dynamic contents of many of these resources?
During the past two years, the Internet and library communities have explored these issues intensively and arrived at some answers. This issue reports a sample of these activities from many different perspectives -- with an emphasis on practice and understanding of current developments.
In the first article, Erik Jul summarizes the history of standard library cataloging of Internet resources. He discusses various issues that confront this approach: which Internet resources are worth cataloging, whether current standards are sufficient and how to handle the problems engendered by transient Uniform Resource Locators (URLs).
Stuart Weibel introduces the Dublin Core, a set of 15 metadata elements developed in an international, cross-disciplinary effort. He describes its current role in resource description for WWW documents as well as its potential role as a core set of descriptors -- a meta-metadata element set. He also briefly introduces the Warwick Framework that resulted from the second Metadata Workshop in Warwick, UK, in 1996, and the Resource Description Framework (RDF) being developed by the World Wide Web Consortium (W3C).
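The notion of a small core set of descriptors can be sketched in a few lines of Python. The example below lists the 15 Dublin Core elements under the short names used in later published versions of the element set (an assumption about labeling, since element names were still being refined) and separates a record's core elements from any community-specific additions. The function `split_core` and the sample field `gradeLevel` are hypothetical and serve only to illustrate the idea.

```python
# The 15 Dublin Core elements, using short lowercase names
# (assumed labels; early drafts used longer names for some elements).
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def split_core(rec):
    """Split a record into (core elements, other elements)."""
    core = {k: v for k, v in rec.items() if k.lower() in DUBLIN_CORE}
    extra = {k: v for k, v in rec.items() if k.lower() not in DUBLIN_CORE}
    return core, extra


if __name__ == "__main__":
    # Hypothetical record with one field outside the core set.
    sample = {
        "Title": "Sample lesson plan",
        "Creator": "Example Teacher",
        "gradeLevel": "5",
    }
    core, extra = split_core(sample)
    print("core:", core)
    print("extra:", extra)
```

Keeping the core elements distinct from local additions lets different communities exchange the shared part of a description while retaining their own extensions.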
Ray Schwartz reviews current efforts to create stable identifiers, Uniform Resource Names (URNs) and Uniform Resource Characteristics (URCs) for World Wide Web resources.
Sherry Vellucci addresses the coexistence of various types of metadata (e.g., MARC and Dublin Core) in the electronic environment, including local library catalogs and other electronic "catalogs," such as InterCat. She foresees "metacatalogs" that will be able to handle records and documents coded in a wide variety of metadata formats.
Carmel Maguire provides an informative presentation of the very active work on metadata in Australia.
Stuart Sutton and Sam Oh describe the use of metadata in the Gateway to Educational Materials (GEM) project sponsored by the National Library of Education and the Department of Education. GEM will provide the nation's teachers with "one-stop" access to lesson plans, curricula and other Internet-based educational resources. The paper discusses the development of the GEM metadata and the project's use and extension of the Dublin Core Element Set and the Warwick Framework.
From OCLC, Diane Vizine-Goetz reports on ongoing work to exploit the content and structure of standard cataloging tools, such as the Library of Congress Subject Headings and the Dewey Decimal Classification, to organize Internet resources. This work employs both human-constructed and automatically generated descriptions and is linked to the creation of innovative searching and indexing systems such as NetFirst, Dewey ExTended Concept (ETC) Trees and WordSmith. This special section concludes with Keith Shafer's more detailed look at one particular OCLC project, Scorpion.