Journal of the Association for Information Science



Bert R. Boyce




Cyril W. Cleverdon
Stephen Robertson




Simon's Generating Mechanism: Consequences and Their Correspondence to
Empirical Facts
Vesna Oluić-Vuković
Oluić-Vuković finds that viewing the Bradford distribution as a stochastic process, as Simon did, gives a good correspondence with empirical data when one-year intervals are used, but that for longer time intervals the model over-predicts high production and under-predicts low production.




Natural Language versus Controlled Vocabulary in Information Retrieval: A Case Study in Soil Mechanics
Manikya Rao Muddamalle

Two studies look at particular cases of index language performance. Rao, in a very traditional study conducted in an operational system, compared indexing from an expansion of the Microthesaurus of Soil Mechanics Terms to natural language queries. The results are inconclusive.




An Association-Based Method for Automatic Indexing with a Controlled
Vocabulary
Christian Plaunt and Barbara A. Norgard

Using a sample of 4,626 INSPEC records, Plaunt and Norgard associated assigned subject headings with lexical units extracted from the author, title, and abstract fields, creating an association dictionary of (word, subject heading) pairs, each with a strength-of-association value. Lexical units extracted from a reserved 10% of the sample were then passed against the dictionary, and the matching subject headings and their weights were used to create document vectors for these new documents. Where multiple units are associated with the same subject heading, the weights are summed, and an arbitrary number of the highest-ranking headings are then chosen to represent the document.
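
The dictionary-building and heading-selection steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy records, and the use of a raw co-occurrence count as the strength of association are all assumptions made for the example.

```python
from collections import defaultdict


def build_association_dictionary(training_records):
    """Map each lexical unit to {subject heading: association strength}.

    ASSUMPTION: strength is a plain co-occurrence count, standing in for
    whatever association measure is actually used.
    """
    dictionary = defaultdict(lambda: defaultdict(float))
    for units, headings in training_records:
        for unit in set(units):
            for heading in set(headings):
                dictionary[unit][heading] += 1.0
    return dictionary


def suggest_headings(units, dictionary, top_n=3):
    """Sum the weights per heading over all matching units, keep the top_n."""
    scores = defaultdict(float)
    for unit in set(units):
        for heading, weight in dictionary.get(unit, {}).items():
            scores[heading] += weight
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical toy records: (extracted lexical units, assigned headings).
training = [
    (["neural", "network", "training"], ["Neural nets", "Learning systems"]),
    (["neural", "image", "recognition"], ["Neural nets", "Image recognition"]),
    (["image", "compression"], ["Image coding"]),
]
dictionary = build_association_dictionary(training)
suggested = suggest_headings(["neural", "image"], dictionary, top_n=2)
```

Here `top_n` plays the role of the "arbitrary number" of headings retained; varying it corresponds to testing at different indexing depths.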

Performance was tested at various depths using indexing consistency measures, along with precision and recall measures in which relevance to a set of lexical units is determined by the subject headings assigned by human indexers. Consistency meets or exceeds that shown in human tests, and supplementing title extraction with abstract or author units improves performance.

To determine the probability of association when the data do not follow a normal distribution, the "likelihood ratio" treats the co-occurrence as a binomial counting problem. The resultant measure shows superior performance to chi-square.
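
One common way to treat co-occurrence as a binomial counting problem is the log-likelihood ratio statistic for two binomial samples. The sketch below assumes that formulation; the summary does not give the formula actually used, and the function and parameter names are hypothetical. Here `k1` of `n1` documents containing a lexical unit carry a given heading, and `k2` of `n2` documents without the unit carry it.

```python
import math


def _log_l(k, n, p):
    """Binomial log-likelihood of k successes in n trials (constant omitted)."""
    # Treat 0 * log(0) as 0 so boundary probabilities do not raise errors.
    total = 0.0
    if k > 0:
        total += k * math.log(p)
    if n - k > 0:
        total += (n - k) * math.log(1.0 - p)
    return total


def log_likelihood_ratio(k1, n1, k2, n2):
    """Return -2 log(lambda) comparing separate rates with one pooled rate.

    Requires n1 > 0 and n2 > 0. Larger values indicate stronger evidence
    that the heading rate differs between the two groups of documents.
    """
    p1 = k1 / n1
    p2 = k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    return 2.0 * (
        _log_l(k1, n1, p1) + _log_l(k2, n2, p2)
        - _log_l(k1, n1, pooled) - _log_l(k2, n2, pooled)
    )


# Hypothetical counts: a strongly associated pair and a weakly associated one.
strong = log_likelihood_ratio(30, 40, 5, 960)
weak = log_likelihood_ratio(12, 40, 240, 960)
```

Because the statistic works from the binomial likelihood directly, it does not rely on the normal approximation, which is the usual argument for preferring it when counts are small.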




Design Considerations in Instrumenting and Monitoring Web-Based Information Retrieval Systems
Michael D. Cooper

After an exposition of the structure of software for web servers and browser clients, Cooper suggests a need for monitoring software that would analyze and classify user behavior and provide suggestions for improved user satisfaction and improved system design and performance, without identifying individual users with their transactions. A multiple server model is suggested to collect useful data in a web-based client-server environment.

Browser logging may well need to go beyond the messages sent and received by the server, although even this requires some method of isolating client interactions from one another, either by cookie or by the use of an assigned session ID number. Current servers keep essentially "content free" logs, which are not satisfactory for information retrieval analysis.
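
As a rough illustration of isolating client interactions by an assigned session ID, the following sketch groups log records into per-session request sequences. The record fields (`session_id`, `timestamp`, `request`) and the sample values are hypothetical and do not correspond to any particular server's log format; the session ID is an opaque token, so no user identity is involved.

```python
from collections import defaultdict


def group_by_session(log_records):
    """Return {session_id: [requests in time order]} from flat log records."""
    sessions = defaultdict(list)
    for record in log_records:
        sessions[record["session_id"]].append(
            (record["timestamp"], record["request"])
        )
    # Order each session's events by timestamp and keep only the requests.
    return {
        session_id: [request for _, request in sorted(events)]
        for session_id, events in sessions.items()
    }


# Hypothetical interleaved records from two clients.
records = [
    {"session_id": "a1", "timestamp": 3, "request": "view doc 17"},
    {"session_id": "b2", "timestamp": 2, "request": "search soil"},
    {"session_id": "a1", "timestamp": 1, "request": "search bradford"},
]
sessions = group_by_session(records)
```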




Document Representation and Retrieval Using Empirical Facts: Evaluation of a Pilot System
Sam G. Oh

Oh tests an "Empirical Facts Retrieval System" (EFRS) against a conventional retrieval system, using a collection of documents that contain statistical relationships among empirical variables. The EFRS outperforms the conventional system in terms of precision, user satisfaction, and search effort. EFRS queries can specify the relationship between specified variables, and its direction, so a linking mechanism exists that improves specificity. The index language is just the variable names in use. The technique is limited to documents reporting statistical relationships among variables and requires some means of identifying these variables at the time of indexing.




Standardizing Relative Impacts: Estimating the Quality of Research from Citation Counts
G. Van Hooydonk

Van Hooydonk notes that a large part of the observed impact factor of a journal, or impact of a discipline, is dependent on the number of publications therein. If we use the number of publications on a topic to predict an expected impact factor, the ratio of the observed factor to the expected factor can be used to provide a topically relative impact measure. In the 55 sub-disciplines of science in the Institute for Scientific Information database, about 80% of the impact factor is correlated with disciplinary publication level. The removal of disciplinary differences leads to changed journal rankings, which should better indicate quality of research.
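
The observed-to-expected ratio can be sketched as below. The summary does not say how the expected impact factor is predicted, so an ordinary least-squares line fitted across disciplines is assumed here purely for illustration; the function names and the toy figures are hypothetical, not data from the article.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_impact(observed, publications, slope, intercept):
    """Ratio of the observed impact factor to the one expected from volume."""
    expected = intercept + slope * publications
    return observed / expected


# Hypothetical discipline-level data: publication counts and impact factors.
discipline_publications = [100, 200, 300, 400]
discipline_impact = [1.0, 2.0, 3.0, 4.0]
slope, intercept = fit_line(discipline_publications, discipline_impact)

# A journal with impact factor 3.0 in a discipline with 200 publications.
ratio = relative_impact(3.0, 200, slope, intercept)
```

A ratio above 1 indicates impact higher than the discipline's publication volume alone would predict, and a ratio below 1 the reverse.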




The Emergence of Distributed Library Services: A European Perspective
Lorcan Dempsey, Rosemary Russell, and Robin Murray

The Dempsey, Russell, and Murray article is the most recent European Research Letter and discusses attempts in Europe to provide cooperative standards-based library service, particularly those based upon the Z39.50 standard. These efforts are so far concentrated in limited domains without single point access.




Expert Systems: Introduction to First and Second Generation and Hybrid
Knowledge Based Systems
by Chris Nikolopoulos
reviewed by Sait Dogru




Automated Information Retrieval: Theory and Methods
by Valery I. Frants, Jacob Shapiro, and Vladimir G. Voiskunskii
reviewed by Geoffrey Z. Liu




Web Security & Commerce
by Simson Garfinkel
reviewed by Melanie J. Norton


