2013 Annual Meeting
Montréal, Québec, Canada | November 1-5, 2013
Jacques Savoy, University of Neuchâtel
This paper presents and evaluates a collaborative attribution scheme that combines six authorship attribution schemes representing the two main paradigms used in authorship studies. The classical paradigm uses very frequent words as features and computes an intertextual distance between the disputed text and each author profile (the concatenation of that author's writings). The second paradigm applies machine learning schemes such as naïve Bayes and support vector machines (SVM). As an evaluation corpus, we used the Federalist Papers, a well-known collection in authorship attribution. In our evaluation, we tried to follow the recommendations and best practices for assessing attribution schemes. The evaluation shows that effective attribution schemes can be found in both paradigms. However, when the individual results are combined using a vote aggregation method, the final collaborative decision is always correct and robust. Moreover, to indicate the degree of belief attached to the combined attribution, we can consider the percentage of votes obtained by each possible assignment. When analyzing the output of the individual attribution schemes, we also found that the information they provide is difficult for the end-user to interpret.
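To make the vote aggregation step concrete, the following is a minimal sketch of how the attributions returned by several schemes could be combined and how the percentage of votes could serve as a degree of belief. It assumes simple plurality voting with one vote per scheme; the abstract does not specify the exact aggregation rule, so this is an illustration rather than the method used in the paper. The function name `aggregate_votes` and the six example votes are hypothetical.

```python
from collections import Counter


def aggregate_votes(votes):
    """Combine one author label per attribution scheme by plurality vote.

    Returns the winning author and the percentage of votes per author.
    Assumption: every scheme casts exactly one, equally weighted vote.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    total = len(votes)
    # Percentage of votes obtained by each possible assignment.
    shares = {author: 100.0 * n / total for author, n in counts.items()}
    # Ties are resolved in favour of the author seen first (an arbitrary choice).
    winner = counts.most_common(1)[0][0]
    return winner, shares


# Hypothetical output of six schemes for one disputed text.
votes = ["Madison", "Madison", "Hamilton", "Madison", "Madison", "Madison"]
winner, shares = aggregate_votes(votes)
print(winner)
for author, pct in shares.items():
    print(f"{author}: {pct:.1f}% of votes")
```

In this hypothetical case five of the six schemes agree, so the combined attribution carries a vote share of roughly 83%, which is easier for an end-user to read than six separate distance values or classifier scores.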