AM07 START Conference Manager    

The Benefits of Skimming in Data Fusion

Anselm Spoerri

(Submission #61)


Summary

Data fusion methods commonly use and compare all the documents returned by multiple retrieval systems to create a new result list. On the one hand, as documents further down in the result lists are considered, a document’s probability of being relevant decreases significantly and a major source of noise is introduced. On the other hand, retrieval systems tend to find similar relevant documents when searching the same database, but they do not find them at the same rank positions, so data fusion methods still need to consider all of the documents returned by the retrieval systems. Using TREC 3, 6, 7, 8, 12 and 13 data, this paper examines how “skimming”, where the number of documents examined in the result lists is gradually increased, can help to identify relevant documents. The results show that “gradual skimming”, together with what can be learned as the list depth is increased, can help to improve the retrieval effectiveness of data fusion methods.
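
As a rough illustration of the skimming idea described above, the following Python sketch examines the result lists at gradually increasing depths and counts, at each depth, how many systems have returned each document. The function name `fuse_by_skimming`, the depth schedule, and the scoring rule (one point per system per depth at which the document is visible) are assumptions made for this example only; the summary does not specify the paper's actual fusion method.

```python
from collections import defaultdict


def fuse_by_skimming(result_lists, depths):
    """Hypothetical skimming-based fusion (illustrative assumption, not the paper's method).

    result_lists: one ranked list of document ids per retrieval system,
                  each assumed to contain no duplicate ids.
    depths:       increasing list depths to examine, e.g. [1, 2, 4].

    A document earns one point for every (depth, system) pair in which it
    appears within the top `depth` results of that system. Documents found
    early by several systems therefore accumulate the highest scores.
    """
    scores = defaultdict(int)
    for depth in depths:
        for ranking in result_lists:
            # Skim only the top `depth` documents of this system's list.
            for doc in ranking[:depth]:
                scores[doc] += 1
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


# Toy result lists for three hypothetical systems.
runs = [
    ["d1", "d2", "d3", "d4"],
    ["d2", "d1", "d5", "d3"],
    ["d2", "d6", "d1", "d7"],
]
fused = fuse_by_skimming(runs, depths=[1, 2, 4])
print(fused)
```

In this toy input, the documents that several systems place near the top of their lists rise to the top of the fused list, while documents seen only deep in a single list contribute little. This mirrors the trade-off between overlap and noise that the summary describes.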


  