Metadata in Australia

by Carmel Maguire

There is plenty of interest and activity in metadata in Australia. Librarians, archivists, records managers and museum curators, as well as network managers and IT researchers, are all into metadata. So are the land information community, the geologists and the government information specialists and policy analysts, not to mention the management gurus who are trying to achieve knowledge management in their organizations. Nearly all this effort has the common goal of making electronic data do really useful, cost-effective things for people, but it is not easy to find common ground for discussion and cooperation.

As Neil McLean, Macquarie University librarian, points out, "There is no conceptual framework for metadata and there is a complete mix-up of semantics." Renato Iannella, director of the Resource Discovery Project at the Distributed Systems Technology Centre (DSTC) at the University of Queensland, talks about "the fundamental area of ontologies" wherein "people often use different terms to refer to the same concept and they use the same term to refer to different concepts."

http://www.ariadne.ac.uk/issue8/resource-discovery/

Whatever the ontological shortcomings in their understanding, Australian information science practitioners in libraries, information centers and computer firms are certainly eager to come to grips with metadata. More than 75 attended a recent one-day seminar on metadata in Sydney sponsored by the Information Science Section of the Australian Library and Information Association (ALIA/ISS is the nearest Australian equivalent to ASIS). In mid-August, Standards Australia (the Australian equivalent of NISO) sponsored a well-attended seminar on information exchange applications at which metadata was an important topic.

There seems to have been a sharp increase in interest since the 4th Dublin Core Workshop was held at the National Library of Australia in March this year. An official report of that meeting may be found at http://www.dstc.edu.au/DC4/, and a lighthearted account by two of the British delegates, Paul Miller and Tony Gill, is at http://www.ariadne.ac.uk/issue8/canberra-metadata/. Insiders report that there was an interesting stoush (Australian word for lively argument) between the minimalists and the structuralists at the meeting. Not the least interesting aspect was that the more experience the participants had with reasonably complex metadata, such as MARC, the less prone they were to endorse any complications of the Dublin Core. However, the "Canberra qualifiers" did emerge with qualified approval.

Warwick Cathro, assistant director general for services at the National Library, suggests that the Dublin Core element Coverage, which describes the spatial and temporal characteristics of the information resource, might be used to connect all the different varieties of data. The proposed standard for Coverage may be consulted at

http://alexandria.sdc.ucsb.edu/public-documents/metadata/dc_coverage.html

At the same time Warwick also stresses that "it is important to remember that Dublin Core is really a 'core,' and does not displace specialized metadata sets."

Building a Conceptual Framework

In his opening address -- Metadata: The Search for a New Order -- at the ALIA/ISS seminar, Neil McLean offered a lively and helpful guide to the maze in which metadata is enmeshed, and he encapsulated his overall concept in a diagram entitled Taxonomy of Information Infrastructure. (See Figure 1.)

The categories of complexities which lie between the Information Resources/Suppliers on the far left of the diagram and the User on the far right are the Mirrors, Caches and Archives. These elements provide Distributed Digital Access and can be accessed through a variety of Search/Discovery services bewildering in their number and structure. In the Interfaces some amelioration of this variety is offered through Z39.50, the Search Engines and the Web protocols. The diagram places nearest to the User the Value-added Subject Gateways that are tipped by many to play an increasingly important role in electronic information resource management. The Generic Gateway Mechanisms, as Neil calls them, form the last of the layers through which the User gets to see what's in the electronic stores. Neil brought a great laugh from the audience when he suggested that, while we talk about the new service paradigm in which the user is in control, it is painfully obvious that the user is totally out of control in the environment created by today's information infrastructure.

The importance of standards to metadata was underlined at the ALIA/ISS seminar when Warwick Cathro, in his paper on The Dublin Core: Simplicity or Complexity, pointed to how much metadata work consists of embedding standards within standards. There are lots of examples -- PURLs (Persistent Uniform Resource Locators) in Dublin Core, Dublin Core into HTML (Hypertext Markup Language), PICS (Platform for Internet Content Selection) in its new version to be based on XML (Extensible Markup Language) and so on. So metadata brings with it all the problems of achieving wide adoption of voluntary standards. These problems, of course, are only exceeded by those which can arise when standards are made mandatory, especially by governments. Tony Barry, an Australian information scientist, is nervous about governments specifying mandatory elements of metadata, especially the possibility that Australian governments -- federal and state -- may demand that PICS be used to label every Internet document according to a narrow range of ratings.
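The "Dublin Core into HTML" embedding mentioned above is straightforward in practice: each Dublin Core element becomes a META tag in the document head, named with a "DC." prefix. The short sketch below generates such tags; the sample record is invented for illustration, and the rendering function is hypothetical, not part of any of the tools described here.

```python
# Minimal sketch of embedding Dublin Core elements in an HTML head,
# using the "DC.<element>" META tag convention. The record below is
# an invented example, not a real catalog entry.
from html import escape

def dc_meta_tags(record):
    """Render a dict of Dublin Core elements as HTML META tags."""
    tags = []
    for element, value in record.items():
        tags.append('<meta name="DC.%s" content="%s">'
                    % (element, escape(value, quote=True)))
    return "\n".join(tags)

record = {
    "Title": "Metadata in Australia",
    "Creator": "Carmel Maguire",
    "Date": "1997",
    "Coverage": "Australia",
}

print(dc_meta_tags(record))
```

Because the metadata travels inside the ordinary HTML head, any harvester or search engine that understands the convention can pick it up without a separate protocol.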

Lively inputs to metadata discussions in Australia come also from the IT R&D sector. Liddy Nevile is director of the Sunrise Research Enterprise at the Royal Melbourne Institute of Technology. Her hope is that there will soon be an Australian host for W3C -- the World Wide Web Consortium founded by the Web's creator, Tim Berners-Lee. W3C occupies itself with technical and social challenges involving the interoperability of software systems, making the Web secure for commercial and personal data, the internationalization of the Web and its accessibility. Liddy, who regards metadata as "only data at a different level of granularity," spoke at the ALIA/ISS seminar and opened up the possibility that in the next 10 years our computers will be made smart enough to be trustworthy guides in moving us from a vanished world of static order into one where we can deal with dynamic chaos. Liddy urged Australian individuals and institutions in the information business to participate in the work of W3C.

Building Resources Across Information Sectors

In applications of metadata to electronic document resources, the National Library is well away, not only with the Dublin Core work and other cooperative national ventures, but with its own activities. Notable among these is PANDORA (Preserving and Accessing Networked Documentary Resources of Australia), which is an electronic archive designed to provide long term access to significant Australian online publications.

A review of progress to June 1997, available at
http://www.nla.gov.au/policy/pandje97.html
reports that so far 180 of the 1800 electronic publications considered have been selected for retention in the archive and made accessible on the National Library's OPAC through the Web-based interface, WEBPAC. Inclusion of documents in this electronic archive depends on obtaining the publishers' permission. So far there have been no refusals.

Among PANDORA's objectives is to "implement and publicize a system of describing documents based on the Dublin Core attributes, to make online searching for information more efficient." Andrew Wells, director of technical services at the National Library, describes PANDORA as "mundane but important -- innovative in that it aims to develop model approaches to managing and preserving Internet resources." Immersion in it has brought home to the Library's catalogers and systems people the reality of the need for persistent names across machines, across time and across new protocols and of the desirability of distributed PURL servers.

Australian state and local public libraries are working to secure the resources for the dramatic upgrade in quality and accessibility which electronic resources make possible. Collaboration among museums, libraries and art galleries is also being actively encouraged through the Australian Cultural Network (ACN) project. ACN is a national program, funded for three years from 1997, which is to provide a "virtual gallery" for exhibitions, a gateway for access to cultural networks and a forum for the exchange of ideas. State and local public libraries also play an important part in making government information available to people in remote as well as large communities. At the ALIA/ISS seminar Maxine Brodie, director of information technology at the State Library of New South Wales, reported, "It is likely that the whole of government initiatives in information management will mandate use of an Australian adaptation of the GILS (Government Information Locator Service) standard in the future." She added, "This will be useful for the State Library to identify its human expertise and its vast paper-based collections as well as its electronic resources."

http://www.slnsw.gov.au/staffpaper/beyond04.htm

GILS and Other Applications

In mid-1996 an expert group representing the Australian Archives, the National Library, major federal agencies and national information management research projects prepared a report on Architecture for Access to Government Information

http://www.adfa.oz.au/DOD/imsc/imsctg/imsctg1a.htm

Recommendations included the adoption of HTML/HTTP as "the universal client access for Australian Government information" and of the GILS Core Element Set "as the metadata standard for describing Australian government information holdings at the collection level," to be known as "AusGILS." This report also recommended that "agencies use the HTML META tag to include metadata in all HTML documents, using AusGILS, the Dublin Core or an appropriate specialized metadata set." Recognizing both the realities of established metadata standards within government agencies and the lack of power in any other agency to enforce new standards, the expert group pointed out that "[t]hough there are retrieval advantages in agencies adopting a standard methodology of description at item level, the report is not prescriptive on this matter and offers advice on suitable metadata alternatives for agency use."

GILS has already been applied in ERIN (the Environmental Resources Information Network), a useful resource maintained by the Australian Department of the Environment, Sport and Territories

http://www.erin.gov.au

ERIN also provides a listserv to discuss metadata issues with the community at large. (To subscribe send an e-mail containing the message subscribe ozmeta-l to majordomo@erin.gov.au.) Another network, EdNA (the Educational Network of Australia), has adopted its own EdNA Metadata Standard, described as an extension of the Dublin Core metadata system. Version 0.2 of this standard has been released for comment and experimentation. Also available on the EdNA site (http://www.edna.edu.au) is a program in which content for Dublin Core fields can be entered and metadata tags displayed in the syntax that EdNA will support.

Another subject gateway is provided by the Australian and New Zealand Land Information Council (ANZLIC). ANZLIC is evidence of the pressing need to achieve national and international coordination of land information management. In particular it is trying to develop a set of coordination arrangements to ensure that Australia and New Zealand make effective use of their investments in spatial data. ANZLIC's work on metadata is well-advanced. In their report on Core Data Elements for Land and Geographic Directories in Australia and New Zealand

http://www.anzlic.org.au/metaelem.htm

they have adopted an approach which is as far as possible consistent with the guidelines on Digital Geospatial Metadata produced by the U.S. Federal Geographic Data Committee and also with the Australia New Zealand Standard on Spatial Data Transfer, AS/NZS 4270, which many organizations in the two countries already use.

The Australian Geological Survey Organization (AGSO) has also developed a metadatabase system in order to manage the many geoscientific datasets that it has collected. The designers of the system presented an interesting paper at the First IEEE Metadata Conference in Silver Spring, Maryland, in April 1996 (Callahan, S.D., B.D. Johnson and E.P. Shelley, Dataset Publishing -- A Means to Motivate Metadata Entry).

http://www.spirit.net.au/earthware/Papers/IEEE96/IEEE.html

This paper is relevant to a wider audience because it opens up the general problem of how to motivate the authors and publishers of datasets to supply the metadata in the form required.

Cooperative Research

The Resource Discovery Project (RDP) is one of the major research units of the Distributed Systems Technology Centre (DSTC), a cooperative research center with government and private funding. The Centre functions as a not-for-profit company and has more than 25 participating organizations. Metadata is a significant area of research in the RDP, which is working on many aspects of use of Dublin Core and its interweaving with other systems. The work has focussed on making flexible mechanisms for building Z39.50 access to various database formats, such as Harvest. Experiments are in train with an information presentation tool called the HyperIndex Browser (HIB).

RDP has also developed a meta-searcher called HotOIL that can currently access both HTTP and Z39.50 servers. HotOIL goes to each repository, translates the user's request into the required format for each one, merges the retrieved results and displays a summary. The HIB is used as a seamless front end, and using HotOIL looks and feels like using a single database. To try an application of HotOIL, go to

http://www.dstc.edu.au/cgi-bin/RDU/hotOIL/hotOIL
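The meta-search pattern HotOIL embodies -- translate the one query for each repository, normalise each repository's results into a common record format, then merge -- can be sketched in miniature. The two "backends" below are invented stand-ins, not real HTTP or Z39.50 clients, and the field names are illustrative only.

```python
# Toy meta-searcher: fan one query out to heterogeneous repositories,
# map each native result format onto a common Dublin Core-style record,
# and merge into a single summary list. All backends are fictitious.

def search_webish(query):
    # Stand-in for a Web search engine: returns (title, url) pairs.
    return [("PANDORA Archive", "http://www.nla.gov.au/pandora")]

def search_z3950ish(query):
    # Stand-in for a Z39.50 target: returns MARC-like field dicts.
    return [{"245": "Dublin Core Workshop Report",
             "856": "http://www.dstc.edu.au/DC4/"}]

def meta_search(query):
    merged = []
    for title, url in search_webish(query):
        merged.append({"Title": title, "Identifier": url})
    for rec in search_z3950ish(query):
        merged.append({"Title": rec["245"], "Identifier": rec["856"]})
    return merged

for hit in meta_search("metadata"):
    print(hit["Title"], "->", hit["Identifier"])
```

The point of the pattern is that the user sees only the merged summary: the per-repository query syntax and record formats are hidden behind the common record, which is why using such a tool "looks and feels like using a single database."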

In his very readable account of the RDP work

http://www.ariadne.ac.uk/issue8/resource-discovery/

Renato Iannella explains that "[i]nternally, HotOIL uses URNs and metadata to describe the search engines that it accesses. It also uses the Dublin Core metadata set to describe the resources returned from each search engine."

RDP is also developing other systems which will enhance knowledge management within organizations by encouraging more sharing of the same data and by repeating searches so that new data on the Web relevant to earlier unsuccessful searches is delivered to the original inquirer. In a new venture, the Australian Vice Chancellors Committee has funded a project that aims to develop and disseminate metadata tools. This MetaWeb project

http://www.dstc.edu.au/RDU/MetaWeb/

is a joint project of the DSTC, the Australian Defence Force Academy, Charles Sturt University and the National Library of Australia.

And to Come?

Obviously, no slackening of interest in metadata is anticipated in the near future, in Australia or anywhere else for that matter. It is indeed a global professional village that information specialists inhabit these days. And the globalization of communication made possible by IT today can only enhance the personal and professional development of members of the information science community in smaller economies like Australia while allowing them to contribute to world thought and action as never before.


Carmel Maguire is honorary visiting professor in the School of Information, Library and Archive Studies at the University of New South Wales, Sydney, 2052 Australia. She can be reached by e-mail at c.maguire@unsw.edu.au or by telephone at +612 93853444.