Journal of the Association for Information Science



Bert R. Boyce




Topological Aspects of Information Retrieval
Leo Egghe and Ronald Rousseau

We begin with two articles suggesting the possible separation of document and query vector spaces. Viewing information retrieval as a topology on a document space, determined by a similarity function between queries and documents, gives what Egghe and Rousseau call a retrieval topology. Such a topology might use a pseudometric that measures the distance between documents independently of the query space, or it might be chosen so that all similarity functions between documents and queries are continuous, here called the similarity topology. The topological model allows the introduction of Boolean operators. The inner product is suggested as producing a more powerful model than the cosine measure.
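The difference between the two measures can be illustrated with a minimal Python sketch. It is not taken from Egghe and Rousseau's article, and the function names and example vectors are hypothetical. It shows only that the cosine measure discards vector length while the inner product retains it.

```python
import math


def inner_product(query, document):
    """Sum of the products of matching term weights."""
    return sum(q * d for q, d in zip(query, document))


def cosine(query, document):
    """Inner product divided by the product of the two vector lengths."""
    norm = math.sqrt(inner_product(query, query) * inner_product(document, document))
    return inner_product(query, document) / norm if norm else 0.0


# Hypothetical term-weight vectors over a three-term vocabulary.
query = [1, 1, 0]
short_document = [1, 1, 0]
long_document = [2, 2, 0]  # same direction as short_document, twice the weight

for name, document in [("short", short_document), ("long", long_document)]:
    print(name, inner_product(query, document), cosine(query, document))
```

Under the cosine measure the two documents are indistinguishable, since they point in the same direction; the inner product ranks the more heavily weighted document higher.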




On the Necessity of Term Dependence in a Query Space for Weighted Retrieval
Peter Bollmann-Sdorra and Vijay V. Raghavan

Bollmann-Sdorra and Raghavan show that if query term weights are to be useful in retrieval, term independence is an undesirable property in a query space. Independence remains desirable in document space. It would appear that the assumptions that documents and queries are elements of the same space, and that term independence is required, are not warranted.




Optimizing a Library's Loan Policy: An Integer Programming Approach
Hesham K. Al-Fares

Al-Fares presents a new loan policy model that incorporates a decision variable for the maximum number of books that may be borrowed, along with the traditional loan period, and adds user satisfaction with policies to the usual book availability satisfaction indicator. Each indicator is defined as the ratio of satisfied demand to total demand. The number of renewals, duplications, demand, and reservations are each considered to have a very small effect.




On the Fusion of Documents from Multiple Collection Information Retrieval
Ronald R. Yager and Alexander Rybalov

Yager and Rybalov assume m retrieval systems without file overlap, each providing a ranked list of texts in response to a common query according to its own ranking criteria. They define fusion as the construction of a single ordered list of the n most relevant texts over all m system responses. This requires determining the potential of each system to provide relevant answers to the query.

A previous fusion method, which is empirically effective but yields different orderings on different runs, uses random selection biased toward the length of the contributing list. Alternatively, one might take each choice from the longest remaining list, or take equally from each collection until the shortest is exhausted, then continue until the next shortest is exhausted, and so on until all are exhausted. A centralized fusion scheme computes a value for each collection based upon the number of documents in its list and the number already removed. The value is recomputed for each collection after each removal of the top element from the collection with the highest value. Another possibility is a proportional approach, where the list value is its remaining number of elements less one divided by the original number. A value can also be assigned to each individual document: the number of elements in its list less its position in the list, divided by the number of elements in the list.
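Two of the deterministic schemes described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Yager and Rybalov's implementation: the function names and document labels are hypothetical, positions are counted from 1, and ties are broken in favor of the earlier list, none of which the summary specifies.

```python
def fuse_longest_remaining(ranked_lists, n):
    """Repeatedly take the top document from the list with the most items left."""
    remaining = [list(ranked) for ranked in ranked_lists]
    fused = []
    while len(fused) < n and any(remaining):
        # Assumption: on a tie in length, the earliest list is chosen.
        longest = max(range(len(remaining)), key=lambda i: len(remaining[i]))
        fused.append(remaining[longest].pop(0))
    return fused


def fuse_by_document_value(ranked_lists, n):
    """Order documents by (list size - position) / list size, highest first."""
    scored = []
    for list_index, ranked in enumerate(ranked_lists):
        size = len(ranked)
        # Assumption: positions start at 1, so the top document scores (size - 1) / size.
        for position, document in enumerate(ranked, start=1):
            value = (size - position) / size
            scored.append((-value, list_index, position, document))
    scored.sort()  # ties fall back to list order, then position
    return [document for _, _, _, document in scored[:n]]


# Hypothetical ranked lists returned by two systems for the same query.
system_a = ["a1", "a2", "a3", "a4"]
system_b = ["b1", "b2"]

print(fuse_longest_remaining([system_a, system_b], 6))
print(fuse_by_document_value([system_a, system_b], 6))
```

Both functions always produce the same ordering for the same input, unlike the random selection method, and both favor longer lists early in the fused ranking.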




Indexing and Access for Digital Libraries and the Internet: Human, Database, and Domain Factors
Marcia J. Bates

Bates reviews what we know and do not know about indexing and access that will apply to large digital document files. In particular, she emphasizes that statistical regularities exist in the subject representation of files and should influence design, that subject domain should affect system design, and that what we know of human linguistic and searching behavior must be taken into account in an optimal information retrieval system.




Software Engineering as Seen through Its Research Literature: A Study in Co-Word Analysis
Neal Coulter, Ira Monarch, and Suresh Konda

Coulter, Monarch, and Konda collected the indexing for 16,691 documents from 1982 to 1994 that were assigned at least one term from the software engineering category, and carried out a co-occurrence study to determine how software engineering areas of study interact over time. The association measure was the square of the number of co-occurrences of two terms divided by the product of their individual occurrence counts. The threshold value was varied with the size of the data sets, but the number of links and nodes was fixed at twenty-four and twenty, respectively. For the period 1982-1986, 15 networks were generated; for 1987-1990, 16; and for 1991-1994, 11. The networks exhibit considerable change over time, although some consistent themes, like software development and user interfaces, persist.
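The association measure can be written as c_ij^2 / (c_i * c_j), where c_ij is the number of documents indexed with both terms and c_i and c_j are the numbers indexed with each term. A minimal Python sketch follows; the function name and the sample index terms are hypothetical and are not the authors' data.

```python
from collections import Counter
from itertools import combinations


def association_strengths(indexed_documents):
    """Return c_ij**2 / (c_i * c_j) for every pair of co-occurring terms."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in indexed_documents:
        unique_terms = sorted(set(terms))  # count each term once per document
        term_counts.update(unique_terms)
        pair_counts.update(combinations(unique_terms, 2))
    return {
        pair: count ** 2 / (term_counts[pair[0]] * term_counts[pair[1]])
        for pair, count in pair_counts.items()
    }


# Hypothetical index terms assigned to four documents.
documents = [
    ["software development", "user interfaces"],
    ["software development", "user interfaces"],
    ["software development"],
    ["user interfaces", "testing"],
]

for pair, strength in sorted(association_strengths(documents).items()):
    print(pair, round(strength, 3))
```

The measure ranges from 0 to 1, reaching 1 only when two terms always appear together. Pairs whose strength exceeds the chosen threshold would become links in a network.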




Information Aspects of New Organizational Designs: Exploring the Non-Traditional Organization
Bob Travica

To address the role of information technology in non-traditional organizations, Travica treats IT as the level of use of several specific technologies, and non-traditional structure as the level of organization structure, plus other selected variables. Data came from surveys of a stratified random sample of employees at twelve local accounting offices and an interview with the local manager. Information technology correlates with non-traditional structure; it correlates negatively with formalization and centralization, and positively with cross-boundary communication. Spatial dispersion is negatively associated with trust sharing.



© 1998, Association for Information Science
Last update: November 06, 1998