Journal of the Association for Information Science



Bert R. Boyce




Extending Theory for User-Centered Information Services: Diagnosing and Learning from Error in Complex Statistical Data
Alice Robbin and Lee Frost-Kumpf

According to Robbin and Frost-Kumpf, data production and utilization are best understood as social processes. Error is socially produced in data production and use and will always be present to some degree. Information can be organized to inform the researcher on how to avoid error. To design modern information systems that will reduce error, each system should incorporate a permanent repository of conversations about error. Systems should also include their rules for data production and use, and should employ prototyping.




Integrating Structured Data and Text: A Relational Approach
David A. Grossman, Ophir Frieder, David O. Holmes, and David C. Roberts

Grossman et al. demonstrate that standard relational database software can be used for information retrieval, with unstructured text and structured fields mixed in SQL queries to provide Boolean, proximity, and weighted ranked searches. Performance improves when queries use only the terms that occur least frequently across the collection, chosen by sorting the query terms and keeping a controlled percentage of the original query set. The SQL server outperforms Lotus Notes below a 50% query term reduction threshold. The storage size required for the SQL server files is considerably larger.
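
The idea of mixing text and structured fields in SQL can be illustrated with a small sketch. It is not the authors' schema or software: the table names (doc, posting), the year field, the term-frequency scoring, and the helper names are all assumptions made for illustration, and SQLite stands in for the SQL server. Proximity search is not shown.

```python
import sqlite3

def build_index(docs):
    """Load {doc_id: (year, text)} into a hypothetical relational inverted index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc (doc_id INTEGER PRIMARY KEY, year INTEGER)")
    conn.execute("CREATE TABLE posting (doc_id INTEGER, term TEXT, tf INTEGER)")
    for doc_id, (year, text) in docs.items():
        conn.execute("INSERT INTO doc VALUES (?, ?)", (doc_id, year))
        counts = {}
        for term in text.lower().split():
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO posting VALUES (?, ?, ?)",
            [(doc_id, term, tf) for term, tf in counts.items()],
        )
    return conn

def boolean_and(conn, terms, min_year):
    """Boolean AND over text terms, combined with a structured-field filter."""
    marks = ",".join("?" for _ in terms)
    sql = (
        "SELECT p.doc_id FROM posting p JOIN doc d ON d.doc_id = p.doc_id "
        f"WHERE p.term IN ({marks}) AND d.year >= ? "
        "GROUP BY p.doc_id HAVING COUNT(DISTINCT p.term) = ? "
        "ORDER BY p.doc_id"
    )
    params = (*terms, min_year, len(set(terms)))
    return [row[0] for row in conn.execute(sql, params)]

def ranked(conn, terms):
    """Weighted ranking; summed term frequency is an assumed, simplistic weight."""
    marks = ",".join("?" for _ in terms)
    sql = (
        f"SELECT doc_id, SUM(tf) AS score FROM posting WHERE term IN ({marks}) "
        "GROUP BY doc_id ORDER BY score DESC, doc_id"
    )
    return conn.execute(sql, tuple(terms)).fetchall()

def rarest_terms(conn, terms, keep_fraction):
    """Sort query terms by document frequency and keep the rarest fraction."""
    df = {}
    for term in set(terms):
        row = conn.execute(
            "SELECT COUNT(*) FROM posting WHERE term = ?", (term,)
        ).fetchone()
        df[term] = row[0]
    ordered = sorted(df, key=lambda t: (df[t], t))
    keep = max(1, int(len(ordered) * keep_fraction))
    return ordered[:keep]
```

A reduced query would be run by passing the output of rarest_terms to ranked; the fraction to keep corresponds to the controlled percentage mentioned above.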




Evaluation of Search Results: A New Approach
Vladimir G. Voiskunskii

Voiskunskii believes that no single-valued measure is justified pragmatically for the evaluation of search results in all circumstances. In practice, the square root of the product of precision and recall is an acceptable measure for contemporary retrieval systems. The easily obtained square of the number of relevant documents in the retrieved set, divided by the number of documents in the retrieved set, provides an adequate substitute in the sense that, with both measures, the order of the rankings is unaffected by the number of relevant documents.
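
The relationship between the two measures can be checked with a short sketch. Writing a for the number of relevant documents retrieved, n for the size of the retrieved set, and N for the total number of relevant documents (symbols and function names are assumptions introduced here, not the author's notation), precision is a/n and recall is a/N, so the square root of their product equals the square root of (a*a/n)/N. For a fixed N the two measures therefore order search results identically.

```python
import math

def sqrt_precision_recall(rel_retrieved, retrieved, relevant_total):
    """Square root of precision times recall for one search result."""
    precision = rel_retrieved / retrieved
    recall = rel_retrieved / relevant_total
    return math.sqrt(precision * recall)

def substitute_measure(rel_retrieved, retrieved):
    """Square of relevant-retrieved count over retrieved-set size; needs no N."""
    return rel_retrieved ** 2 / retrieved

def rank_by_sqrt_pr(results, relevant_total):
    """Order (rel_retrieved, retrieved) pairs from best to worst by sqrt(P*R)."""
    return sorted(
        results,
        key=lambda r: sqrt_precision_recall(r[0], r[1], relevant_total),
        reverse=True,
    )

def rank_by_substitute(results):
    """Order the same pairs from best to worst by the substitute measure."""
    return sorted(results, key=lambda r: substitute_measure(*r), reverse=True)
```

Because the substitute does not use N at all, it can be computed from the retrieved set alone, which is what makes it easy to obtain.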




Comparing Boolean and Probabilistic Information Retrieval Systems across Queries and Disciplines
Robert M. Losee

Losee presents a general model for predicting the performance of Boolean and probabilistic retrieval systems, which could suggest the most suitable search system in a situation where choices are available. The model indicates that accounting for term dependence, rather than assuming independence, will positively affect performance. Situations based upon individual and joint term probabilities can produce an indication of which Boolean operator would be most effective and whether a probabilistic search might improve performance.




A Graphical, Self-Organizing Approach to Classifying Electronic Meeting Output
Richard E. Orwig, Hsinchun Chen, and Jay F. Nunamaker, Jr.

Kohonen's Self-Organizing Map (SOM) is a neural network in which mapping nodes, initialized with random numbers, are compared with input nodes to identify the smallest Euclidean distance between the mapping and input vectors. Orwig, Chen, and Nunamaker adjust the closest mapping node and its neighbors to reduce their distance to the input, and repeat the process until the input nodes are exhausted and a clustering has taken place. Training inputs were used to form classes that then group the messages in an electronic meeting system, where group members exchange ideas in order to address a problem. The method organizes more quickly than a human facilitator would, but less quickly than a Hopfield algorithm. No significant difference was found in recall performance, but the human facilitator list outperformed the Kohonen list in precision. Both human and Kohonen outscored Hopfield on term association capability. Given the reduced time and effort required by Kohonen, it appears to be a viable option.
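
The general SOM training loop can be sketched as follows. This is a minimal, generic illustration rather than the authors' implementation: the one-dimensional row of nodes, the repeated passes (epochs), the linearly decaying learning rate, the fixed neighborhood radius, and all function and parameter names are assumptions.

```python
import math
import random

def best_match(nodes, vec):
    """Index of the mapping node at the smallest Euclidean distance from vec."""
    return min(range(len(nodes)), key=lambda j: math.dist(nodes[j], vec))

def train_som(inputs, n_nodes, epochs=20, rate=0.5, radius=1, seed=0):
    """Train a one-dimensional map of n_nodes weight vectors on the inputs."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    # Mapping nodes start as random vectors.
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        # Assumed schedule: the learning rate shrinks linearly toward zero.
        lr = rate * (1 - epoch / epochs)
        for vec in inputs:
            win = best_match(nodes, vec)
            lo = max(0, win - radius)
            hi = min(n_nodes, win + radius + 1)
            # Move the winning node and its map neighbors toward the input.
            for j in range(lo, hi):
                nodes[j] = [w + lr * (x - w) for w, x in zip(nodes[j], vec)]
    return nodes

def classify(nodes, vec):
    """Assign an input to the class (node index) of its closest mapping node."""
    return best_match(nodes, vec)
```

After training, classify can group new inputs, such as term vectors built from meeting messages, by the node they fall closest to.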




Science-Technology Coupling: The Case of Mathematical Logic and Computer Science
Roland Wagner-Döbler

Wagner-Döbler compared a bibliography of mathematical logic from 1874 to 1990 with references from the first 37 volumes of the Journal of the ACM. Over 15% of the references were present in the bibliography. One hundred papers in the JACM were also in the bibliography, indicating the presence of hybrid scholars. The logic code from Mathematical Reviews occurs most often on the same document as computer science. A considerable time lag (over 48 years) occurs between publication in the logic literature and citation in the JACM.




Describing Technological Paradigm Transitions: A Methodological Exploration
Danny P. Wallace and Connie Van Fleet

Wallace and Van Fleet have finally provided a clear explanation of what is meant by nonquantitative research, and a description of a methodology for those who find historical and ethnographic methods as overly restrictive as those endorsed a century ago by a researcher whose notoriety without doubt is due to his unreasonable views on methodology. "When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science, whatever the matter may be" (Lord Kelvin, Popular Lectures and Addresses, 1889-1894, Volume 1, p. 73). The Sessio Taurino is a method for the rest of us.



Top: Rolodex index cards, a more primitive (however handy) form of data storage and retrieval. Bottom: Art Resource, NY. Scroll showing the family tree of Scottish kings and queens and their descendants. 14th century. British Library, London, UK. A "tree" is used as a metaphor for seeking information. Who came first and who is related to whom: the tree takes us backward and forward in time.

by Adrienne Weiss, Designer

