Web Exclusive

Coverage of ASIS 1997 Annual Meeting

Digital Object Identifiers (DOI) -- Future Access Tool Now Being Designed


ASIS Annual Meeting Invited Session, November 2, 1997

Speakers:

Larry Lannom (Corporation for National Research Initiatives (CNRI)): Underpinning the DOI: Handle Systems Overview
Craig Van Dyck (John Wiley & Sons, Chair AAP Enabling Technology Committee): History and Policy
Moderator:
Clifford Lynch, Coalition for Networked Information

Session Abstract: DOI are being designed to present information about any digitized object -- a document, or part of a document, an executable program, sound or video, even a collection of these, each one of which might have its own DOI. When the DOI infrastructure is implemented, one will be able to retrieve either a digital object itself or information about where and under what conditions the object can be retrieved, depending on the owner's wishes. The information presented through a DOI could be a document itself, publisher information, copyright or usage information, or whatever is desired/considered necessary by the object's owner.

The design of the DOI system was initiated by the Association of American Publishers (AAP) to meet their complex needs of protecting and disseminating digital information, especially electronic journals. The technological backbone of the system is the CNRI Handle System designed by Robert Kahn and others at the Corporation for National Research Initiatives. Technologically, the system is probably sufficiently flexible to incorporate most needs, uses and users. Several trial implementations are in place today, but policies on usage are now being developed for larger scale implementations. (From the Final Program)

Session Report: In his introduction to the session, Clifford Lynch stressed that identifiers of all types must be thought of as systems, not tags, and that they emerge to meet specific needs and requirements in a particular context. Extending their use to different contexts "often gets us in trouble."

Looking at identifiers as systems, Lynch pointed out that, in addition to all of the rules or standards that may be associated with the syntax or semantics of tags or naming conventions, there must be an infrastructure that manages, assigns and maintains them. Policies and procedures must also be in place regarding, for instance, who may apply identifiers to what sorts of objects. Standards by themselves "don't amount to much."

DOI are "one of the first well-developed examples of a community's thinking about what an identifier within the Universal Resource Name (URN) framework might look like and how it might work." (Note: URNs are discussed by Lynch elsewhere in this issue and by Ray Schwartz in the October/November 1997 issue of the Bulletin.) Lynch and others stressed the need to separate the DOI as a standard and an infrastructure from the applications that DOIs might facilitate. As Lynch said, "They are neither an insidious plot to convert the web into a 'pay-per-click' environment, nor are they a means of automatically extending copyright branding into the Internet 'wild west.'" The session focused on presenting a detailed and fundamental review of what DOI are and are not.

Larry Lannom of CNRI discussed the development of the CNRI Handle System, the technology underlying the current implementation of DOI. Handles address the problem of naming objects on networks in a way that is not location-specific, in contrast to the Uniform Resource Locators (URLs) that are now the mainstay of Internet identification. Such names also need to be long-lasting. For instance, copyright protection extends for the life of the author plus 70 years. The general approach is to provide "one level of indirection" -- that is, the system assigns a permanent name that is "resolved" to provide one or more current physical addresses for the object.
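The "one level of indirection" idea can be illustrated with a minimal sketch. This is not the Handle System's implementation or API; the registry, the function names resolve and relocate, and all names and addresses are invented for illustration. It only shows that a permanent name stays fixed while the addresses it resolves to can change.

```python
# Hypothetical registry: permanent name -> list of current locations.
# All names and URLs here are invented for illustration only.
registry = {
    "example-authority/report-42": ["http://host-a.example/reports/42.ps"],
}


def resolve(name):
    """Return a copy of the current locations registered for a name."""
    return list(registry.get(name, []))


def relocate(name, new_locations):
    """Record new locations for an object; the name itself never changes."""
    registry[name] = list(new_locations)


name = "example-authority/report-42"
before = resolve(name)

# The object moves and gains a mirror; citations of the name stay valid.
relocate(name, ["http://host-b.example/archive/42.ps",
                "http://mirror.example/42.ps"])
after = resolve(name)
```

Anyone holding the name sees the new addresses on the next resolution, which is the property that a location-specific URL cannot offer.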

Handles first came to light in a digital library project -- the DARPA-supported Computer Science Technical Reports (CS-TR) project, a precursor to the NSF/DARPA/NASA-sponsored Digital Libraries Initiative (DLI). As part of the latter project, CNRI developed the Handle System advanced prototype in conjunction with the Library of Congress, the AAP, DTIC, and other agencies. It was implemented to be:

Handles have a two-part format: the authority name (or "prefix") and a unique item identifier assigned by the authority (the "suffix"). The prefix is assigned by the Handle System administration, but the unique item identifier can be in any format the naming authority wishes to use. The Handles are linked, in turn, to whatever information is useful for the application. In the case of DOIs, it is usually the Internet location of the object. This "Handle data" is again in two parts -- a data type (such as a URL) and the actual Handle location data itself.
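As a rough sketch of the two-part format and the typed Handle data described above, the following assumes that the prefix and suffix are separated by the first "/" in the string (the report does not state the separator). The prefix "10.9999", the class names and the sample values are hypothetical and are not taken from the Handle System specifications.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    prefix: str  # naming authority, assigned by the system administration
    suffix: str  # item identifier, in any format the authority chooses


class HandleValue(NamedTuple):
    data_type: str  # for example "URL"
    data: str       # the location data itself


def parse_handle(text: str) -> Handle:
    """Split a handle string at the first '/' (an assumed separator)."""
    prefix, sep, suffix = text.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a two-part handle: {text!r}")
    return Handle(prefix, suffix)


# The suffix is opaque to the system, so it may itself contain slashes.
handle = parse_handle("10.9999/journal/v1/p23")
value = HandleValue("URL", "http://publisher.example/journal/v1/p23.html")
```

Splitting only at the first separator keeps the suffix free-form, which matches the statement that the item identifier can be in any format the naming authority wishes.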

The resolution service is made available through "Handle servers," which may be distributed and/or replicated to improve performance. Users access the server through a resolution client on their workstation or, the designers hope, through an extension to their Web browsers. However, the system is not part of the Web and will not be confined to Web protocols. The Handle System can be used with the Web, but, like the Domain Name System, has an existence apart from it. "Proxy servers" will translate requests from the user's protocol to the Handle server and reformulate Handle data for transmission back to users.
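The proxy role can be sketched as follows, under stated assumptions: the user's protocol is taken to be a Web-style request path, the Handle server is replaced by an in-memory table, and the answer is returned as a redirect-style status and location. The table contents, the function name proxy_lookup and the status codes chosen are illustrative and do not describe the actual proxy servers.

```python
# Hypothetical stand-in for a Handle server: name -> list of (type, data).
HANDLE_DATA = {
    "10.9999/abc": [
        ("EMAIL", "rights@publisher.example"),
        ("URL", "http://publisher.example/abc.html"),
    ],
}


def proxy_lookup(request_path):
    """Translate a Web-style path into a handle lookup.

    Returns (status, location): 302 with the first URL-typed value,
    or 404 with None when the handle is unknown or has no URL value.
    """
    name = request_path.lstrip("/")
    for data_type, data in HANDLE_DATA.get(name, []):
        if data_type == "URL":
            return 302, data
    return 404, None


found = proxy_lookup("/10.9999/abc")
missing = proxy_lookup("/10.9999/unknown")
```

Because the translation lives in the proxy, the resolution service behind it stays independent of Web protocols, as the speakers emphasized.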

In addition, there is a separate administration server that registers authorities and maintains the Handle database.

The system is open: specifications and APIs (application programming interfaces) for the Handle and administration servers are open and available at www.doi.org or www.handle.net. Multiple resolution systems are possible, and local servers may be developed.

Next, Van Dyck outlined the history of the DOI project. In the Fall and Winter of 1994, the Enabling Technology Committee of the AAP felt the need to "define a posture" with respect to the emerging Web technology and other developments. Their goals were to protect intellectual property, develop infrastructure, and develop the market for electronic publications.

In 1995-1996, they carried on discussion with vendors through a Technology Roundtable. On the basis of these discussions, they decided that it was necessary to exercise leadership and actively develop consensus. They needed to be proactive and develop a system that would meet the needs of publishers. A "wait and see" or "ride-along" approach was not sufficient. There were many efforts worldwide that were often redundant and even counter-productive. They also decided that they should focus on creating an infrastructure for object identification that would support applications such as rights management or authentication, but which would not include any of them.

In consequence, in March 1996, they issued a Request for Proposal for a system that would:

In the Fall of 1996, CNRI with its Handle System technology was chosen to develop a prototype.

From September 1996 until February 1997, the system was emerging and policies were being established. For instance, the AAP and its partners decided that the publisher (authority) code assigned by the system would be a "dumb" number, not carrying any semantic content. The item numbers could be anything the publisher wished. The AAP worked to make publishers comfortable with the proposals and to extend support outside the United States. The system was unveiled at the Professional and Scholarly Publishers Meeting in February 1997. The International Publishers Association (IPA) and the International Association of Scientific, Technical and Medical Publishers (STM) then formed the Information Identifier Committee (IIC), which reviewed the DOI Project and officially endorsed it in April and May 1997.

In April of this year, the Phase I prototype was initiated. During Phase I, 12 publishers deposited 250,000 DOIs. Many publishers are using Serial Item and Contribution Identifiers (SICIs) as the suffix to their DOIs.

General availability of the Phase II prototype was announced in New York on September 22, 1997, and at the Frankfurt Book Fair in October. The system is owned and administered by the International DOI Foundation, which has open membership. The Foundation has offices in the United States and in Geneva. The Board of Directors has members from the European scientific and technical publishing community, as well as the US. Membership fees are being established. The cost to register a DOI will be $0.01 (US) per object. Access will be at no cost to the user. Members will sign an agreement that will foster a high quality system. Only a copyright holder or someone who has been assigned authority by the rights holder may register or change a DOI. Information on who owns the DOI is maintained in the system. Still to come is linkage to more extensive metadata systems, such as Dublin Core records, that will allow users to move from the description of an object to its DOI.

There are many important policy and procedural issues to be resolved during the prototype, such as the following:

The Foundation expects the market for DOIs to surge in about a year.

The question period elicited some further information and clarifications. One question dealt with whether the content of objects assigned DOIs can change. In discussion and in subsequent reviews of this report, Lannom pointed out that this is a policy question on which the technology is neutral. There is a clear advantage to assigning Handles to things that change constantly but benefit from a stable identifier, e.g., today's weather forecast. The most that can be said at the moment, he believes, is that the relationship between a given DOI and variations in the content or format of the identified resource is a complex topic that falls completely in the policy area. Lynch also pointed out that "unique" (as in "unique identifier") is a "tricky" term. For instance, where do formats play into the definition -- should the same content have a different DOI depending on whether it is in ASCII or PDF? Such questions are very difficult to answer outside of a given application context.

There were also questions about material not under copyright and/or "gray" literature of the sort collected by many international agencies from their worldwide constituencies. Van Dyck said that secondary publishers or collections might register such material for the copyright holder with permission. Where no copyright exists, the same material may be assigned multiple DOIs by multiple agencies.

With respect to installing resolution clients as browser extensions, CNRI has talked to both Microsoft and Netscape, but there are no firm arrangements. The speakers characterized such integration as "not a deal breaker," but nice to have.


Reporter: Irene L. Travis, ASIS Bulletin Editor