Journal of the Association for Information Science and Technology

 EDITORIAL

 

In This Issue
    Bert R. Boyce
 

443

 

 RESEARCH

 

Assessment of the Effects of User Characteristics on Mental Models of Information Retrieval Systems
    Xiangmin Zhang and Mark Chignell
    Published online 15 February 2001

In this issue we begin with Zhang and Chignell, who use the Repertory Grid Technique (RGT) to extract users' mental models of information retrieval systems in order to study the effects on these models of four characteristics: educational and professional status, first language, academic discipline, and computer experience. Each of 64 subjects rated nine retrieval system concepts on three attributes (form/process, targeted/not targeted, and specific to IR system/applicable to all information systems), yielding 27 variables for analysis. A factor analysis yielded nine factors with eigenvalues greater than one, which together accounted for 68% of the variation in the original ratings. The first factor appeared to be concerned with the purposefulness of querying; the second, applicability of data organization; the third, the function of querying; the fourth, applicability of querying; the fifth, applicability of browsing; the sixth, function of data structure; the seventh, purposefulness of browsing; the eighth, function of the document; and the ninth factor, the purposefulness of data structure. Analysis of variance and Tukey tests were applied to the subjects' factor scores. Educational and professional background, discipline, and computer experience all had significant effects on the factor scores representing the mental models; language did not. Student and information professional scores differed widely on factors 1 and 3. Graduates differed from other students on factors 2 and 6. The user's discipline showed significant differences on factors 1, 2, 3, and 7, and computer experience on factors 1, 2, and 7. Overall, information professionals and students have strikingly different models. Science students see browsing as a targeted activity but humanities students do not. Language does not seem to affect mental models of information retrieval systems.
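The "eigenvalue greater than one" retention rule mentioned above can be illustrated with a minimal sketch. It assumes the factor analysis was run on a correlation matrix, in which case the eigenvalues sum to the number of variables (27 in the study, so 68% of the variation would correspond to retained eigenvalues summing to roughly 18.4). The function name and the toy eigenvalues below are made up for illustration and are not the study's data.

```python
def kaiser_retain(eigenvalues):
    """Return (retained eigenvalues, share of total variance they explain).

    Assumes the eigenvalues come from a correlation matrix, so their sum
    equals the number of variables.
    """
    total = sum(eigenvalues)
    retained = [ev for ev in eigenvalues if ev > 1.0]  # keep factors with eigenvalue > 1
    return retained, sum(retained) / total


# Made-up eigenvalues for a toy 6-variable problem (NOT the study's values).
toy_eigenvalues = [2.4, 1.5, 0.9, 0.6, 0.4, 0.2]
retained, share = kaiser_retain(toy_eigenvalues)
print(len(retained), round(share, 2))
```

In this toy case two factors are retained and they explain about 65% of the variance.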

445

Modeling the Retrieval Process for an Information Retrieval System Using an Ordinal Fuzzy Linguistic Approach
    E. Herrera-Viedma
    Published online 15 February 2001

Herrera-Viedma believes that quantitative weights computed from term occurrence are appropriate for the characterization of documents, but not for queries or for the estimated relevance levels used to rank retrieved documents, where human understanding argues for qualitative expression. Searchers rank query terms in seven symmetric ordinal classes, by an importance weight, or by a weight indicating how many documents should be returned for that term. A retrieval status value (RSV) is computed for each document for each ordered representation of the query. These are then aggregated by the search system for final evaluation of documents. The aggregation is carried out by linguistic implication functions, which provide varied definitions of disjunction and conjunction depending upon the relative importance of the logical sub-expressions of the query. Users will need to determine which, or how many, of the ordering schemes to use.
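As a rough illustration of working with ordinal labels rather than numbers, the sketch below evaluates a Boolean query over a seven-label symmetric scale. It is a deliberate simplification: conjunction is taken as the minimum label and disjunction as the maximum, whereas the article's aggregation relies on linguistic implication functions and importance weights that are not reproduced here. The label names, term scores, and function names are all assumptions made for the example.

```python
# Seven symmetric ordinal labels (hypothetical names), lowest to highest.
LABELS = ["none", "very low", "low", "medium", "high", "very high", "total"]
RANK = {label: i for i, label in enumerate(LABELS)}


def negate(label):
    """Mirror a label around the middle of the scale (e.g. low <-> high)."""
    return LABELS[len(LABELS) - 1 - RANK[label]]


def conj(*labels):
    """Simplified conjunction: the lowest label among the operands."""
    return min(labels, key=RANK.__getitem__)


def disj(*labels):
    """Simplified disjunction: the highest label among the operands."""
    return max(labels, key=RANK.__getitem__)


# Hypothetical per-term linguistic scores for a single document.
doc_scores = {"fuzzy": "high", "retrieval": "very high", "image": "low"}

# Query: (fuzzy AND retrieval) OR image
rsv = disj(conj(doc_scores["fuzzy"], doc_scores["retrieval"]), doc_scores["image"])
print(rsv)  # high
```

Because only the ordering of the labels is used, no numeric meaning has to be assigned to "high" or "low", which is the appeal of an ordinal approach.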

460

Discovering Term Occurrence Structure in Text
    Abraham Bookstein and T. Raita
    Published online 15 February 2001

Bookstein and Raita observe that term occurrences tend to clump in texts. That is to say, if a term's occurrences are observed in adjacent text segments, the number of clumps will exceed the number expected at random. Strongly clumped terms have retrieval value, and if text is partitioned to minimize clumping strength, such stretches of text are likely to be content homogeneous. Linear clumping strength is measured by the ratio of the expected number of clumps formed to the observed number. The standard deviation will express the degree of non-randomness or clumping. Condensation clumping views the problem as a distribution of terms (balls) into text segments (urns) and takes the ratio of the expected number of segments containing the term to the observed number as the clumping measure. The common retrieval measure, inverse document frequency, can be rewritten in these terms, with little difference between the two when the probability that a segment contains the term is small. The standard deviation of the condensation clumping measure will allow an expression of the degree of non-randomness, but is complex to compute. The use of an approximate value at least as large as the standard deviation simplifies the process. The two measures diverge as segments are merged together, with linear clumping decreasing and condensation clumping increasing. Using the same general model, a measure is constructed using the gaps between segments with term occurrence, where the text is considered to be wrapped in a circular fashion. More generality is achieved, but it appears that performance is very similar to the previous measures.
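The condensation measure described above can be sketched under a simple balls-into-urns assumption: if n occurrences fall independently and uniformly into D equal-sized segments, the expected number of occupied segments is D(1 - (1 - 1/D)^n). The equal-segment, uniform-placement model and the function names are assumptions of this sketch, which may differ in detail from the authors' formulation.

```python
def expected_occupied_segments(num_segments, num_occurrences):
    """Expected number of segments holding at least one occurrence when
    occurrences are dropped uniformly at random into equal-sized segments."""
    return num_segments * (1.0 - (1.0 - 1.0 / num_segments) ** num_occurrences)


def condensation_clumping(segment_counts):
    """Ratio of expected to observed occupied segments.

    Values above 1 suggest the term is condensed into fewer segments
    than chance placement would produce.
    """
    num_segments = len(segment_counts)
    num_occurrences = sum(segment_counts)
    observed = sum(1 for count in segment_counts if count > 0)
    if observed == 0:
        raise ValueError("term does not occur in any segment")
    return expected_occupied_segments(num_segments, num_occurrences) / observed


# Toy term-occurrence counts per segment (illustrative only).
clumped = [0, 0, 4, 3, 0, 0, 0, 1, 0, 0]  # 8 occurrences in 3 of 10 segments
spread = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]   # 8 occurrences in 8 of 10 segments

print(condensation_clumping(clumped) > 1.0)  # True
print(condensation_clumping(spread) > 1.0)   # False
```

With 8 occurrences over 10 segments about 5.7 segments are expected to be occupied, so the first term scores well above 1 and the second below it.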

476

Optimal Query Expansion (QE) Processing Methods with Semantically Encoded Structured Thesauri Terminology
    Jane Greenberg
    Published online 22 February 2001

Greenberg looks at the automatic expansion of queries using thesaurus terms in varying relationships with entry terms, based on a binary relevance evaluation of the initial return by end users, as opposed to interactive expansion, where the system provides a list of possibilities based on the initial return and the user chooses expansion terms. Using ten queries collected from MBA students, the ProQuest Controlled Vocabulary, and the ABI/Inform database on DIALOG, she mapped each query to the thesaurus terms as a base, and created four expansions: synonyms, narrower terms, related terms, and broader terms. Relevance judgements were made on the basis of topical matching (aboutness) by the contributors of the queries, who reviewed the union set of the responses to the query forms, where each retrieved list was limited to 15 or fewer citations. The automatic expansions separately took all synonyms, all narrower terms, all broader terms, and all related terms. For interactive expansion, users chose from an alphabetized union list of the terms in thesaurus records for query terms. These selections were then incorporated in the query expansion by the searcher. Users chose from all groups but took over half of the suggested synonyms and broader terms, and over a quarter of the narrower and related terms. Synonyms and narrower terms augmented recall without a significant loss in precision in both automated and interactive searching, which argues for their use in automated expansion since less effort is required. Broader and related terms improved recall the most but would not be useful in automatic expansion if high precision is a goal. However, they, and particularly related terms, are seen as excellent candidates for use in interactive expansion.
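The mechanics of automatic expansion by relationship type, and the precision and recall trade-off it produces, can be sketched as follows. The thesaurus record, document identifiers, and function names are invented for the example; they are not drawn from the ProQuest Controlled Vocabulary or from the study's data.

```python
# Hypothetical thesaurus record keyed by entry term (invented terms).
THESAURUS = {
    "advertising": {
        "synonyms": ["promotion"],
        "narrower": ["direct mail advertising"],
        "broader": ["marketing"],
        "related": ["public relations"],
    },
}


def expand(query_terms, relation, thesaurus=THESAURUS):
    """Append every thesaurus term standing in `relation` to each query term."""
    expanded = list(query_terms)
    for term in query_terms:
        for candidate in thesaurus.get(term, {}).get(relation, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def precision_recall(retrieved, relevant):
    """Set-based precision and recall under binary relevance judgements."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(expand(["advertising"], "broader"))  # ['advertising', 'marketing']

# Toy result sets: expansion retrieves more relevant items but also some noise.
relevant = {"d1", "d2", "d3", "d4"}
base = precision_recall({"d1", "d2"}, relevant)
expanded = precision_recall({"d1", "d2", "d3", "d5"}, relevant)
print(base, expanded)  # (1.0, 0.5) (0.75, 0.75)
```

In the toy figures recall rises from 0.5 to 0.75 while precision falls from 1.0 to 0.75, the kind of trade-off the study weighs for each relationship type.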

487

Evaluating Internet Resources: Identity, Affiliation, and Cognitive Authority in a Networked World
    John W. Fritch and Robert L. Cromwell
    Published online 8 March 2001

The filters in print media that provide authority are not available on the Internet, so authorship, and thus accountability, is uncertain. Determining true authorship and affiliation is likely to be the most significant need in establishing the cognitive authority of a site. Fritch and Cromwell suggest assessing documents, authors, institutions, and affiliations separately, then integrating the results while indicating confidence in the decisions on a separate scale. In their example, confirming the connection of the domain name to the assumed sponsor via a Whois search is a first step. Looking for author statements and affiliations to other sites is the second. The identification of overt and covert links may disclose bias.

499

 BOOK REVIEWS

 

Electronic Expectations: Science Journals on the Web, by Tony Stankus
    Michael Fosmire
    Published online 16 February 2001

508

Snap to Grid: A User's Guide to Digital Arts, Media, and Cultures, by Peter Lunenfeld
    G. Benoit
    Published online 16 February 2001

509

From Web to Workplace: Designing Open Hypermedia Systems, by Kaj Grønbæk and Randall H. Trigg
    Ina Fourie
    Published online 21 February 2001

510

Organizing Audiovisual and Electronic Resources for Access: A Cataloging Guide, by Ingrid Hsieh-Yee
    Karen Spern
    Published online 21 February 2001

512

CALL FOR PAPERS

514


Association for Information Science and Technology
8555 16th Street, Suite 850, Silver Spring, Maryland 20910, USA
Tel. 301-495-0900, Fax: 301-495-0810 | E-mail: asis@asis.org

Copyright 2001, Association for Information Science and Technology