|AM08 2008||START Conference Manager|
The initial data show that about 46% of the words in the titles appear literally in the tags, and about 25% of the words in the tags appear literally in the titles. Similarly, about 52% of the words from the titles are found in the descriptions, and about 27% of the words from the tags are found in the descriptions. Note that the counting is a strict word-by-word comparison that makes no allowance for variations such as articles (a, an, the), apostrophes, plurals (-s or -es, for example), or tenses (-ed, for example).
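As a rough illustration of what a strict word-by-word comparison might look like, the following sketch computes the share of words in one field that occur literally in another. It is not the study's actual procedure: the function name `word_overlap`, the whitespace tokenization, the case-sensitive matching, and the sample title and tags are all assumptions made for this example.

```python
def word_overlap(source: str, target: str) -> float:
    """Fraction of words in `source` that occur literally in `target`.

    Assumption: words are split on whitespace and compared exactly
    (case-sensitive, no handling of articles, plurals, or tenses).
    """
    source_words = source.split()
    target_words = set(target.split())
    if not source_words:
        return 0.0
    hits = sum(1 for word in source_words if word in target_words)
    return hits / len(source_words)


# Hypothetical record, invented for illustration only.
title = "The cats of Rome"
tags = "cats Rome travel photo italy"

title_in_tags = word_overlap(title, tags)   # 2 of 4 title words match
tags_in_title = word_overlap(tags, title)   # 2 of 5 tag words match
print(title_in_tags, tags_in_title)
```

The measure is asymmetric, which is why the percentage of title words found in the tags can differ from the percentage of tag words found in the titles.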
The initial analysis, based on strict word-by-word comparison, confirmed significant overlap between the words used in tags and titles and between the words used in tags and descriptions. With more aggressive word counting that accounts for variations such as plurals and tenses, the overlap percentages are expected to be much higher. Further analysis will apply more refined counting to a much larger data set to test this prediction.
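A more aggressive count could normalize each word before comparing. The sketch below is one crude way to do so, under stated assumptions: the names `normalize` and `normalized_overlap`, the lowercasing step, and the simple suffix-stripping rules are hypothetical and are not taken from the study. A real analysis would more likely use an established stemmer or lemmatizer.

```python
import re

ARTICLES = {"a", "an", "the"}


def normalize(word: str) -> str:
    """Crude normalization: lowercase, drop a trailing apostrophe or 's,
    then strip one of the suffixes -es, -ed, -s (assumed rules)."""
    w = re.sub(r"'s?$", "", word.lower())
    for suffix in ("es", "ed", "s"):
        # Keep at least three characters so short words stay intact.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def normalized_overlap(source: str, target: str) -> float:
    """Fraction of non-article words in `source` whose normalized form
    also appears among the normalized words of `target`."""
    source_words = [normalize(w) for w in source.split() if w.lower() not in ARTICLES]
    target_words = {normalize(w) for w in target.split() if w.lower() not in ARTICLES}
    if not source_words:
        return 0.0
    hits = sum(1 for word in source_words if word in target_words)
    return hits / len(source_words)


# Hypothetical record: strict literal matching finds no shared words here,
# while the normalized comparison matches "painted"/"paint" and "boxes"/"box".
print(normalized_overlap("The painted boxes", "paint box cat"))
```

Suffix stripping of this kind over-matches and under-matches in places (for example, "places" and "place" normalize differently), so it only indicates the direction in which the overlap figures would move.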
Using an empirical data set, this study reveals significant redundancy between tags and already established access points such as title and description. Unlike the majority of current research on tagging, which supports tagging's potential, this study questions its effectiveness for theoretical and practical reasons.