

An Approach to Document Relevance Based on Clustering

Amanda Spink - aspink@sis.pitt.edu
Monica Desai - mdesai@cse.psu.edu

Presented at ASIST 2004 Annual Meeting; "Managing and Enhancing Information: Cultures and Conflicts" (ASIST AM 04), Providence, Rhode Island, November 13 - 18, 2004


Abstract

Search engines do not clearly distinguish between items of varying relevance when presenting search results. Instead, they leave it to the user to judge which items are relevant, partially relevant, or not relevant. This hinders access to relevant and partially relevant documents, particularly when the result set is large and non-relevant documents are scattered throughout it. In this paper, we present the results of a clustering scheme that groups the documents retrieved for a given search into relevant, partially relevant, and not relevant clusters. A ranking algorithm performs the clustering. Data were collected from end-users, who issued categorical, interval, and descriptive relevance judgments. For each clustered region, we measured the degree of overlap between the users' judgments and the system's assignments. The results indicate that clustering Web documents by regions of relevance is feasible.
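
As an illustration only, the idea of splitting a ranked result list into three relevance regions and then measuring user-system overlap per region might be sketched as follows. The abstract does not specify the ranking algorithm, the region boundaries, or the overlap measure, so everything here is an assumption: the function names, the rank-position cutoffs (0.3 and 0.7), and the definition of overlap as the fraction of documents in a system region that the user placed in the same category are all hypothetical.

```python
from collections import defaultdict

REGIONS = ("relevant", "partially relevant", "not relevant")


def cluster_by_rank(ranked_docs, cutoffs=(0.3, 0.7)):
    """Assign each document in a ranked list to one of three regions.

    `cutoffs` are hypothetical fractions of the list length; they are
    NOT taken from the paper.
    """
    n = len(ranked_docs)
    first = round(n * cutoffs[0])   # end of the "relevant" region
    second = round(n * cutoffs[1])  # end of the "partially relevant" region
    regions = {}
    for position, doc in enumerate(ranked_docs):
        if position < first:
            regions[doc] = REGIONS[0]
        elif position < second:
            regions[doc] = REGIONS[1]
        else:
            regions[doc] = REGIONS[2]
    return regions


def overlap_by_region(system, user):
    """Per region, the fraction of system-assigned documents that the
    user judged to be in the same category (an assumed overlap measure).
    Documents without a user judgment are ignored.
    """
    agree = defaultdict(int)
    total = defaultdict(int)
    for doc, region in system.items():
        if doc in user:
            total[region] += 1
            if user[doc] == region:
                agree[region] += 1
    return {r: agree[r] / total[r] for r in REGIONS if total[r]}


if __name__ == "__main__":
    # Made-up example data: ten ranked documents and one user's
    # categorical judgments.
    ranked = [f"d{i}" for i in range(1, 11)]
    judgments = {
        "d1": "relevant", "d2": "relevant", "d3": "partially relevant",
        "d4": "partially relevant", "d5": "partially relevant",
        "d6": "not relevant", "d7": "partially relevant",
        "d8": "not relevant", "d9": "not relevant", "d10": "not relevant",
    }
    system = cluster_by_rank(ranked)
    for region, score in overlap_by_region(system, judgments).items():
        print(f"{region}: {score:.2f}")
```

Fixed rank-position cutoffs are the simplest possible choice; a score-based threshold or a learned boundary would fit the same interface, and interval or descriptive judgments would first need to be mapped onto the three categories.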


  