A Comprehensive and Systematic Model of User Evaluation of Web Search Engines: I. Theory and Background
Louise T. Su
Published online 7 July 2003

In this issue Su presents an extensive literature review of web search engine evaluation from 1995 to 2000, concluding that the extensive progress made does not extend to knowledge of end user motives, backgrounds, information needs, strategies, success rates, or judgements concerning engine effectiveness. An evaluation model is suggested which includes lists of performance measures based upon relevance, efficiency, utility, user satisfaction, and the number of good links provided, and also participant measures based upon background, experience, needs, and search behavior. The steps for a test of the model are then outlined.











A Comprehensive and Systematic Model of User Evaluation of Web Search Engines: II. An Evaluation by Undergraduates
Louise T. Su
Published online 10 July 2003

In her second paper she tests her model on 36 volunteer junior and senior students at the University of Pittsburgh, each of whom had an information need and some online search experience. AltaVista, Excite, Infoseek, and Lycos were run under Netscape 4.0, with each subject searching on all four engines and each engine searched in all four possible order positions. Relevance judgements were made in a second session, with the five most user-relevant drops ranked. Both online questionnaires and post-search interviews were utilized, and a log program recorded times, terms, and search results. ANOVA tests were run to find the effect of engine and participant discipline, while system and user rankings were tested for correlation, and non-parametric tests were run on nominal and ordinal data. Disciplines are significantly different as to their requirement for comprehensiveness. Engine effect is significant for precision and relative recall, with the ranking for all measures being AltaVista, Excite, Infoseek, Lycos. The ranking provided by Lycos was closest to the participants' (Pearson's r = .28), with AltaVista and Infoseek following closely. Infoseek had the lowest mean search times and participants used between 3 and 5 queries on each engine, but efficiency measures did not vary significantly. User satisfaction ratings vary depending upon the measure utilized, but valuation of results as a whole finds both AltaVista and Excite significantly better than Lycos. Content analysis of interview data indicates four user criteria for satisfaction: interaction, value, precision, and overall performance.










A Summarization System for Chinese News from Multiple Sources
Hsin-Hsi Chen, June-Jei Kuo, Sheng-Jie Huang, Chuan-Jie Lin, and Hung-Chia Wung
Published online 29 July 2003

Chen, et alia, receive online news from six Chinese online newspapers, cluster the stories together based first upon predefined topics and then named entities extracted from the text, partition this text into meaningful units, link the meaningful units which denote the same event using noun and verb similarity measures, and finally display the results by selecting only the longest sentence from a set of similar meaningful units ordered by their original position. Presentation should be improved by moving meaningful units to the fore that have the most informative words. These are words of both high document and high term frequency. Nine events occurring over a one-month period were selected as a test corpus. Using a baseline of similarity measures computed with thesaurus assistance, with each term matched only once and order not considered, several matching strategies were compared with small variations. The presentation techniques were tested by evaluators answering questions with various designs while degree of reduction, precision, and interaction times were recorded. Use of informative words did not increase performance and removal of lightly covered stories did not reduce performance. A larger scale test without human users indicates informative words may in fact improve performance.









Interdisciplinarity in Science: A Tentative Typology of Disciplines and Research Areas
Fernanda Morillo, María Bordons, and Isabel Gómez
Published online 8 July 2003

Morillo, Bordons, and Gómez make use of ISI's practice of multi-assignment of journals to topical categories to indicate the existence of cognitive links among disciplines. The categories, excluding Multi-disciplinary science and Education/Scientific disciplines, were grouped into nine general research areas. They then determined the percentage of multi-assigned journals per category, the number of such links for a category within a research area and also external to its assigned research area, and the number of different links in a category normalized by category size in journals. The strength of the relationship between two categories was also measured by dividing the number of journals they have in common by the square root of the product of the number of journals in each category. On average 53% of journals in each category were multi-assigned, but categories varied from no multi-assigned journals to 100%. Bio-medicine and Technology appear to be highly multi-disciplinary while Humanities is far less so. New disciplines tend to be highly interdisciplinary and show considerable linkage with external research areas.
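
The strength measure described above (shared journals divided by the square root of the product of the two category sizes) and the multi-assignment percentage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the journal names, category assignments, and function names are hypothetical and are not taken from the paper's data.

```python
from math import sqrt

# Hypothetical toy assignment of journals to categories (not real ISI data).
journals = {
    "J1": {"Biochemistry", "Biophysics"},
    "J2": {"Biochemistry"},
    "J3": {"Biochemistry"},
    "J4": {"Biochemistry"},
    "J5": {"Optics"},
}

def members(journals, category):
    """Return the set of journals assigned to a category."""
    return {name for name, cats in journals.items() if category in cats}

def link_strength(journals, cat_a, cat_b):
    """Shared journals divided by sqrt(size of A * size of B)."""
    in_a, in_b = members(journals, cat_a), members(journals, cat_b)
    if not in_a or not in_b:
        return 0.0  # assumption: an empty category has no links
    return len(in_a & in_b) / sqrt(len(in_a) * len(in_b))

def multi_assigned_percentage(journals, category):
    """Percentage of a category's journals that sit in more than one category."""
    in_cat = members(journals, category)
    if not in_cat:
        return 0.0
    multi = sum(1 for name in in_cat if len(journals[name]) > 1)
    return 100.0 * multi / len(in_cat)

strength = link_strength(journals, "Biochemistry", "Biophysics")
share = multi_assigned_percentage(journals, "Biochemistry")
```

In this toy example one journal is shared between a four-journal category and a one-journal category, so the strength is 1 / sqrt(4 * 1) = 0.5, and one of the four Biochemistry journals is multi-assigned.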








Author Cocitation Analysis and Pearson's r
Howard D. White
Published online 18 July 2003

White responds to a previously published criticism of the use of Pearson's r as a similarity measure in author co-citation analysis, which suggested that r over-responds to dissimilarity when a second group of authors with minimal co-citation to an initial group is combined with that group. Cosine and chi square were suggested as replacements. The criticism appears to focus on the simultaneous study of disjoint literatures, which seems an unlikely circumstance. Large blocks of cells with zero co-citation will destabilize Pearson's r, but such blocks have not appeared in actual data and are likely to do so only when author-pairs are chosen for lack of co-citation or a less than cohesive set of authors has been chosen rather than a literature. Using the disjoint data with Pearson's r, the cosine measure, and chi square, multidimensional scaling and hierarchical clustering routines yield maps that are all very similar.
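
The sensitivity to zero blocks at issue here can be illustrated with a small sketch. The co-citation counts below are invented toy numbers, not data from White or from the criticism he answers, and the function names are hypothetical. The sketch only shows that appending cells in which both authors have zero co-citation changes Pearson's r while leaving the cosine untouched.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cosine(x, y):
    """Cosine of the angle between two co-citation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / sqrt(sum(a * a for a in x) * sum(b * b for b in y))

# Toy co-citation profiles of two authors over four other authors.
author_a = [1, 2, 3, 4]
author_b = [4, 3, 2, 1]

# The same profiles after adding four authors with whom neither is co-cited.
zero_block = [0, 0, 0, 0]
author_a_ext = author_a + zero_block
author_b_ext = author_b + zero_block

r_before = pearson_r(author_a, author_b)
r_after = pearson_r(author_a_ext, author_b_ext)
cos_before = cosine(author_a, author_b)
cos_after = cosine(author_a_ext, author_b_ext)
```

With these toy profiles r moves from -1 to roughly 0.43 once the zero block is added, while the cosine stays at about 0.67. Whether such blocks arise in real author sets is exactly the point White disputes.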


"Type/Token-Taken" Informetrics: Some Comments and Further Examples
Quentin L. Burrell
Published online 8 July 2003

Finally, in a brief communication, Burrell develops Egghe's Type/Token-Taken model of sources generating items as a discrete rather than continuous formulation and finds some results simpler and more clear-cut. He also illustrates the development of the log normal and negative binomial distributions in these terms.


Metadata Fundamentals for All Librarians, by Priscilla Caplan
Wallace Koehler
Published online 21 July 2003


