Carl Taylor
2DK Multimedia
Covent Garden
London WC2E 8PS, UK
cdtaylor@dkmm.co.uk
tel: 0171-8365411
Previously we have described the similarity measures and how they are used in navigation tools (Tudhope et al 1995a, 1995b). A series of research prototypes of social history museum hypermedia systems (in Lisp/HyperCard) have been implemented, and demonstrated to curators and historians, but not subjected to extended usability testing. The dataset is a small collection of historical photographs of Pontypridd from the archives of the Pontypridd Historical and Cultural Centre. They were indexed by a standard subject classification for social history, the Social History and Industrial Classification (SHIC 1983). SHIC has five hierarchical levels, of which we have implemented four. Two other fundamental dimensions of indexing were space and time.
This paper reports on current work on an experimental implementation of a unified similarity coefficient and an associated navigation tool. Integrating multiple dimensions allows the strengths of different indexing techniques to be combined. We first categorise different types of similarity measures by the kinds of index terms they employ. We then go on to describe the unified similarity coefficient and a navigation tool based on it, followed by a brief description of the underlying similarity measures.
Hypermedia (multimedia) applications employ a wider range of types of index descriptors than has been usual in information retrieval. For example in content-based indexing and retrieval, image processing may be used to derive statistical characterisations of images or video frames based on colour, texture, or automatic feature extraction. This would allow a user to request an image database for similar images to a given drawing or photograph (eg the 'media-based navigation' of Hirata et al 1993). Figure 1 summarises a sample of similarity coefficient applications in hypermedia with respect to different types of index attribute, and whether index terms are predefined or automatically calculated.
| Terms | Predefined | Automatic |
|---|---|---|
| Dichotomous | HyperSet, BookHouse, Apple/Grolier, Strathtutor | IR Vector Space |
| Quantitative | Talaria,spatial Euclidean distance | Content-based Image/Video, IR Vector Space |
| Semantic | Thesaurus Search Aids | Automatic Classification |
Predefined terms may be standard keywords, classifications or thesauri, or specially defined for the application. Automatic calculation of terms might occur in free text retrieval methods, or as feature vectors resulting from automatic feature extraction to characterise components of a digital image.
Dichotomous terms are the most common type in hypermedia applications (the terminology is taken from Gower's characterisation of terms, see the unified similarity coefficient below). Here the descriptor vector simply indicates the presence/absence of (boolean) terms in the information item or query. Frequently vector space applications use weighted vectors of real terms to reflect the relative importance of terms in a document, or relevance to a particular query, and Table 1 lists them as both dichotomous and quantitative. Gower also distinguishes 'qualitative terms' which are multi-level terms (eg, red, green, blue) that do not form an ordered set, and with no semantic relationships between terms. The Talaria hypermedia healthcare training tool (Madigan et al 1994) extends the Strathtutor approach by assigning information units a position in a high-dimensional 'repertory hypergrid'. Terms can have six levels representing the degree of match between the information unit and one of 29 traits. Identification of traits and ranking of information units was the result of a prior knowledge elicitation exercise. Information units can be automatically linked by a Euclidean distance measure since terms are multi-level. Quantitative terms could also represent the cartographic data used in multimedia geographic information systems, and can represent the real coordinate systems used to express the results of the feature extraction algorithms used (say) in the Miyabi system (Hirata et al 1993) to index digital images, or in video (Zhang et al 1995) - the automatic content-based indexing of video (television news programs) to allow direct navigation to news segments. In museum applications, information items are frequently indexed by a standard classification system or thesaurus, where semantic relationships may exist between a set of index terms, for example hierarchic subclass/superclass relationships, or related terms. A semantic closeness measure allows a measure of similarity to be asserted between non-identical terms. Thus semantically structured index spaces could be considered another basic type.
Certain classes of hypermedia applications, including museum systems, require a multi-dimensional classification of information. Different aspects of an information space are indexed by independent dimensions, which may require different data types. Wu et al (1995) have implemented a content-based retrieval engine for multi-dimensional multimedia objects in image databases that includes a similarity coefficient combining different levels of indexing and similarity measures. They have developed applications in computer-aided facial image (mugshot) retrieval and a trademark archive. In social history museum information systems space, time, and subject classification are fundamental categories of information. Our previous work treated each dimension separately and intersected the results. Here we have been exploring the potential that a unified similarity coefficient can offer.

Sijk is the similarity between the two items on the kth dimension of indexing. Wijk is a user-defined weight for each dimension, but is set to zero when no valid comparison is possible between two items on that dimension. Thus, the coefficient handles missing data (for example, dates might not be available or indexing might be incomplete). Each dimension returns an individual similarity measure between 0 and 1. Gower distinguished between quantitative, dichotomous, and qualitative terms, as described above. (Additionally, he distinguished 'alternative' terms from dichotomous where two absences count as a match.) We have extended the coefficient with semantic closeness measures where semantic relationships exist between terms, and in the time dimension included temporal periods. The aim is to incorporate:
A media item can have anything up to five keyword terms associated with it - an arbitrary limit. In contrast to controlled vocabularies, in museum archives there is frequently a short free text passage associated with an information item, perhaps hand writing on the back of a photograph, or entered by a curator to describe how the object was acquired. Such information may not be easy to reconcile with controlled vocabulary classification systems. However it is amenable to automatic term extraction and free text vector space techniques, for example. Incorporating such terms into a unified similarity coefficient offers the possibility of combining the strengths of both controlled vocabulary and free text information retrieval. The keyword terms for these artificial examples were keyed in (some were manually extracted from a text in the system produced by a former worker at the Brown Lenox Chainworks, commenting on the photographs). However, similar keywords or stems could be automatically extracted from any relevant documents using standard techniques.
Both structured, controlled vocabulary (thesauri, classifications) and free text approaches have strengths and weaknesses (Aitchison and Gilchrist 1987) in terms of retrieval. Here we try to illustrate the potential for combining the two. For another approach to the problem, with different levels of indexing for free text terms and semantic concepts, see Agosti et al (1995). The Cheshire II advanced online catalog (Larson et al 1995) combines boolean and free text queries with probabilistic ranking to exploit (in a working library environment) the advantages of both types of search strategy. Romer (1996) sees a need for combining text-based and automatic content-based methods for accessing image databases.
The figures show illustrative scenarios of how the tool might be used. Figure 1 shows the similarity tool. In the scenario, the user has previously used the hierarchical SHIC browser to view media items classified by the third level term Manufacture of Metal Goods. From the Media Viewer's list of matching media items, MI93 (Chainmaking: left photograph in Figure 1) was selected as the basis of a Find Similar operation. Realising that it would be useless to further use SHIC to refine the query, the user assigns weights to Keywords and Npeople in the media similarity tool and ignores the other dimensions. The results of the search over the media items previously found by the SHIC browser are shown in the Media Viewer in Figure 1. Some items have been discarded, and those that match reasonably well over the keywords and have fairly similar numbers of people are retained. In fact, our practice in the original SHIC indexing of the dataset was somewhat uneven; the set of items relating to the Brown Lenox chainworks were only indexed by a single SHIC term, unlike some other parts of the dataset, where several SHIC terms were employed in finer-grained classifications. However, we have encountered museum collections which include sections created by different curators with different practices in collection management; some may favour systematic indexing, while others may favour rich, unstructured textual descriptions at the expense of classification.
Figure 2 shows the results of a second navigation using four of the index dimensions: SHIC, Geography, Keywords, and Npeople. The weightings can be seen in the top left field of the media viewer window. The Figure shows two media items, the original media item (left), MI39, and one similar photograph (right). Having come across an interesting street scene of coronation celebrations, the user asks for similar items fairly generally. The photograph on the right is the same scene at night and scores highly here for its identical geographical and SHIC classifications (Events in Community Life, Food Retailing). In fact, plausible, but non-matching keyword terms (eg, butcher Vs shop) were artificially assigned to these photographs to illustrate one potential advantage of including a structured classification dimension (in this case SHIC). The semantic closeness measure over SHIC permits expression of similarity between photographs with differing numbers of terms and between close, but not exactly marching terms, in the SHIC hierarchical classification (in this scenario, Events in Community Life Vs Community Life, or Food Retailing Vs Retail Distribution). The geographical dimension also helps to return several photographs of the outdoor Market in the same general area. The media item, MI42, returned by the search is an interesting case, in that it does not possess classifications in each dimension. Due to incomplete information, no geographical classification could be assigned to it, and no comparison is possible on that dimension. In the unified similarity coefficient dimensions over which it is impossible to compute similarity are simply discounted from the calculation, something not possible in our previous work.

We are exploring criteria for choosing the cost factor associated with each traversal in the semantic closeness measures, and what control can be passed up to the user interface. The question of usability of the tool needs its own investigation. The present experimental demonstrator employs a slow, brute force method when searching for similar information items; more work is needed on efficient versions of the unified similarity algorithm.
Agosti M., Melucci M, Crestani F. (1995). Automatic authoring and construction of hypermedia for information retrieval. Multimedia Systems, 3(1), 15-24.
Beynon-Davies P., Tudhope D., Taylor C., Jones C. (1994). A Semantic Database Approach to Intelligent Hypermedia Systems. Information and Software Technology, 36 (6), 323-329.
Gower, J. (1971). A General Coefficient of Similarity and Some of its Properties. Biometrics 27, 857-74.
Hirata K., Hara Y., Shibata N. (1993). Media-based Navigation for Hypermedia Systems. Proceedings ACM Conference on Hypertext, Seattle, 159-173.
Kibby M.R., Mayes J.T. (1989). Towards Intelligent Hypertext. In R. McCaleese (Ed.), Hypertext: Theory into Practice (pp 164-172). Oxford: Intellect.
Larson R., Moon R., McDonough J., Kuntz L., O'Leary P. (1995). Cheshire II: Design and evaluation of a next-generation online catalog system. Proceedings of the 58th Annual Meeting of the American Society for Information Science (ASIS'95), Chicago, 215-225.
Madigan, D., Chapman, C., Gavrin, J., Villumsen O., Boose J. (1994). ÔRepertory Hypergrids: An application to Clinical Practices Guidelines. Proceedings ACM European Conference on Hypermedia Technology, Edinburgh, 117-125.
Pejtersen A. (1989). A library system for information retrieval based on a cognitive task analysis and supported by an icon-based interface. Proceedings ACM Conference on Information Retrieval, 40-47.
Parunak, H. (1993). Hypercubes grow on trees (and other observations from the land of Hypersets). Proceedings ACM Conference on Hypertext, Seattle. 73-81.
Salomon G., Oren T., Kreitman K. (1989). Using Guides to Explore Multimedia Databases, Proceedings 22nd Hawaii International Conference on System Sciences, 3-12.
Salton G. (1989). Automatic Text Processing. Reading,MA: Addison-Wesley.
Salton G. and Allan J. (1993). Selective Text Utilization and Text Traversal. Proceedings ACM Conference on Hypertext, Seattle. 131-144.
Romer D. (1996). Image and Multimedia Retrieval. Getty Art History Information Program Research Agenda Project Report, http://www.ahip.getty.edu/agenda/image.html.
SHIC Working Party. (1983). Social History and Industrial Classification: A Subject Classification for Museum Collections. University of Sheffield, UK: Centre for English Cultural Tradition and Language.
Sneath P., Sokal R. (1973). Numerical Taxonomy: The Principles and Practice of Numerical Classification. San Francisco: W. H. Freeman and Company.
Taylor C., Tudhope D., Beynon-Davies P. (1994). Representation and Manipulation of Conceptual, Temporal and Geographical Knowledge in a Museum Hypermedia System. Proceedings ACM European Conference on Hypermedia Technology. Edinburgh, 239-244.
Taylor C. (1996). Semantic Modelling and Navigation in a Museum Hypermedia System. PhD Thesis. University of Glamorgan.
Tudhope D., Taylor C., Beynon-Davies P. (1995a). Semantic Navigation Tools for Museum Hypermedia. Computers and the History of Art, 5(2), 51-63.
Tudhope D., Taylor C., Beynon-Davies P. (1995b). Classification and Hypermedia. Proceedings 6th ASIS Sig/CR Classification Research Workshop, Chicago, 173-192.
Wu J., Narasimhalu A., Mehtre B., Lam C., Gao Y. (1995). CORE: a content-based retrieval engine for multimedia information systems. Multimedia Systems, 3(1), 25-41.
Zhang H., Tan S., Smoliar S., Yihong G. (1995). Automatic parsing and indexing of news video. Multimedia Systems 2(6), 256-266.
© 1996, American Society for Information Science. Permission to copy and distribute this document is hereby granted provided that this copyright notice is retained on all copies and that copies are not altered.