
Identification of Effective Predictive Variables for Document Qualities

Kwong Bor Ng (Queens College, CUNY), Paul Kantor (Rutgers), Rong Tang (SUNY Albany), Robert Rittman (Rutgers), Sharon Small (SUNY Albany), Peng Song (Rutgers), Tomek Strzalkowski (SUNY Albany), Ying Sun (Rutgers), and Nina Wacholder (Rutgers)

Presented at ASIST 2003 Annual Meeting -- Humanizing Information Technology: From Ideas to Bits and Back (ASIST AM 03 2003), Westin Long Beach, Long Beach, California, October 20 - 23, 2003


We analyzed textual properties of documents, using statistical and linguistic methods, to identify variables that predict various document qualities. We created a collection of 1000 documents, each of which was judged on nine document qualities (accuracy, reliability, objectivity, depth, author/producer credibility, readability, verbosity and conciseness, grammatical correctness, and one-sided or multi-view). Using statistical analyses, we considered a form of linear combination and asked (1) whether textual features can be combined linearly to predict document qualities; (2) which textual features have good predictive power; and (3) which textual features are minimally required for prediction with a detection rate much better than the false alarm rate. We present several promising results, indicating that with a small number of textual features we can predict various document qualities much better than chance.
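
As an illustration only, the idea of scoring a document by a linear combination of textual features and then comparing the detection rate with the false alarm rate can be sketched in Python as follows. The abstract does not name the features, weights, or thresholds used in the study, so the two features (average sentence length, pronoun proportion), the weights, the quality labels, and the threshold below are all invented for this sketch and do not reproduce the authors' method or data.

```python
from typing import Sequence, Tuple


def linear_score(features: Sequence[float],
                 weights: Sequence[float],
                 bias: float = 0.0) -> float:
    """Return the weighted sum of the features (a linear combination)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def detection_and_false_alarm(scores: Sequence[float],
                              labels: Sequence[bool],
                              threshold: float) -> Tuple[float, float]:
    """Treat 'score >= threshold' as a positive prediction.

    Detection rate   = fraction of truly positive documents predicted positive.
    False alarm rate = fraction of truly negative documents predicted positive.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    detection = sum(s >= threshold for s in positives) / len(positives)
    false_alarm = sum(s >= threshold for s in negatives) / len(negatives)
    return detection, false_alarm


# Hypothetical toy data, invented for this sketch:
# each row is (average sentence length, proportion of pronouns).
docs = [
    (25.0, 0.02),
    (22.0, 0.03),
    (18.0, 0.05),
    (12.0, 0.08),
    (10.0, 0.09),
    (20.0, 0.07),
]
# Hypothetical human judgments for one quality (e.g. "high depth").
labels = [True, True, True, False, False, False]
# Hypothetical weights; a real study would estimate these from judged data.
weights = (0.1, -10.0)

scores = [linear_score(d, weights) for d in docs]
detection, false_alarm = detection_and_false_alarm(scores, labels, threshold=1.2)
print(f"detection rate = {detection:.2f}, false alarm rate = {false_alarm:.2f}")
```

In this toy setting a predictor is useful when the detection rate is clearly higher than the false alarm rate; a predictor no better than chance would show roughly equal values for the two.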
