
Journal of the Association for Information Science and Technology



In This Issue
Bert R. Boyce




State Digital Library Usability: Contributing Organizational Factors
Hong (Iris) Xie and Dietmar Wolfram
Published online 18 September 2002

In this issue Xie and Wolfram study the Wisconsin state digital library BadgerLink to determine the organizational factors that lead to different use requirements, the degree to which these requirements are met, and the impact on physical libraries. To this end, usage data from EBSCOhost and ProQuest logs for BadgerLink were analyzed, and 313 Wisconsin libraries of all types were surveyed (76% response rate); the survey results were analyzed along with 81 responses to a voluntary web survey of end users. Heaviest users were K-12 schools and institutions of higher education. Heaviest use sites were the two largest state universities and the state's largest public library. Small libraries were infrequent users. Web survey respondents were mature working professionals. Sixty percent searched for specific information, but 46% reported browsing in subject areas. Libraries with dedicated Internet access reported more frequent usage than those with dial-up connections. Those who accessed from libraries reported more frequent use than those at work or at home. Libraries that trained end users reported more use, but the majority of the web survey respondents reported themselves as self-taught. Logs confirm reported subject interests. Three surrogates were requested for every full text document, but full text availability is reported as the reason for use by 30% of users. Availability has led to the cancellation of subscriptions in many libraries that are important promoters of the service. A model will need to include interactions based upon the influence of each involved participant on the others. It will also need to include the extension of the activities of one participant to other participant organizations and the communication among these organizations.












Unfounded Attribution of the "Half-Life" Index-Number of Literature Obsolescence to Burton and Kebler: A Literature Science Study
Endre Szava-Kovats
Published online 21 August 2002

Szava-Kovats demonstrates that the common attribution of the origin of the concept of half-life in subject-oriented journal literatures to the 1960 Burton and Kebler article in American Documentation is not correct. The first use appears to be in C. R. Gosnell's 1944 paper in College and Research Libraries. It was later discussed by J. D. Bernal at the 1958 International Conference on Scientific Information in Washington, DC. While Burton and Kebler do solve some of the theoretical problems by redefining half-life, they do not express confidence in the use of half-life in this milieu, and Burton later advocates in 1961 the term "median age" which was introduced by Broadus in this context in 1953.








Is the Relationship Between Numbers of References and Paper Lengths the Same for All Sciences?
Helmut A. Abt and Eugene Garfield
Published online 19 September 2002

It has been shown in the physical sciences that a paper's length is related to its number of references in a linear manner. Abt and Garfield here look at the life and social sciences with the thought that, if the relation holds, the citation counts will provide a measure of relative importance across these disciplines. In the life sciences 200 research papers from 1999-2000 were scanned in each of 10 journals to produce counts of 1000-word normalized pages. In the social sciences an average of 70 research papers in nine journals were scanned for the two-year period. Papers of average length in the various sciences have the same average number of references within plus or minus 17%. A look at the 30 to 60 papers over the two years in 18 review journals indicates twice the references of research papers of the same length.









Algorithmic Procedure for Finding Semantically Related Journals
Alexander I. Pudovkin and Eugene Garfield
Published online 3 September 2002

Journal Citation Reports provides a classification of journals most heavily cited by a given journal and which most heavily cite that journal, but size variation is not taken into account. Pudovkin and Garfield suggest a procedure for meeting this difficulty. The relatedness of journal i to journal j is determined by the number of citations from journal i to journal j in a given year, normalized by the product of the number of papers published in journal j in that year and the number of references cited in journal i in that year. A multiplier of ten to the sixth is suggested to bring the values into an easily perceptible range. While citations received depend upon the overall cumulative number of papers published by a journal, the current year is utilized since that data is available in JCR. Citations to current year papers would be quite low in most fields and thus not included. To produce the final index, the maximum of the A citing B value and the B citing A value is chosen and used to indicate the closeness of the journals. The procedure is illustrated for the journal Genetics.
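
As a rough illustration of the index described above, the following Python sketch computes the relatedness of one journal to another and then takes the larger of the two directional values. The function names, parameter names, and citation counts are hypothetical, chosen only for this example; they are not taken from JCR or from the article.

```python
def relatedness(citations_i_to_j, papers_j, references_i, multiplier=1_000_000):
    """Relatedness of citing journal i to cited journal j for one year.

    Citations from i to j are normalized by (papers published in j) times
    (references cited in i), then scaled by the suggested 10**6 multiplier.
    """
    if papers_j <= 0 or references_i <= 0:
        raise ValueError("paper and reference counts must be positive")
    return citations_i_to_j * multiplier / (papers_j * references_i)


def relatedness_index(cit_a_to_b, cit_b_to_a, papers_a, papers_b, refs_a, refs_b):
    """Final index: the larger of the A-citing-B and B-citing-A values."""
    a_cites_b = relatedness(cit_a_to_b, papers_b, refs_a)
    b_cites_a = relatedness(cit_b_to_a, papers_a, refs_b)
    return max(a_cites_b, b_cites_a)


# Invented counts for two hypothetical journals A and B.
example = relatedness_index(
    cit_a_to_b=500, cit_b_to_a=200,
    papers_a=400, papers_b=250,
    refs_a=20_000, refs_b=10_000,
)
```

Taking the maximum means a small journal that cites a large one heavily is still recognized as close to it, even when citations in the other direction are sparse.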








Using Graded Relevance Assessments in IR Evaluation
Jaana Kekalainen and Kalervo Jarvelin
Published online 3 September 2002

Kekalainen and Jarvelin use what they term generalized, nonbinary recall and precision measures, where the relevance scores are real numbers between zero and one. Recall is the sum of the relevance scores of the retrieved documents divided by the sum of the relevance scores of all documents in the database, and precision is the sum of the relevance scores of the retrieved documents divided by the number of documents retrieved. The test used the InQuery system and a text database of 53,893 newspaper articles, with 30 queries selected from those for which four relevance categories were available to provide recall measures; search results were evaluated by four judges. Searches were done by average key term weight, Boolean expression, and by average term weight where the terms are grouped by a synonym operator, and for each case with and without expansion of the original terms. Use of higher standards of relevance appears to increase the superiority of the best method. Some methods do a better job of getting the highly relevant documents but do not increase retrieval of marginal ones. There is evidence that generalized precision provides more equitable results, while binary precision provides undeserved merit to some methods. Generally graded relevance measures seem to provide additional insight into IR evaluation.
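
The two generalized measures can be sketched in a few lines of Python. This is a minimal illustration, assuming relevance scores have already been mapped to the range zero to one and that precision is averaged over the retrieved documents; the function names and the sample scores are invented for the example.

```python
def _check_scores(scores):
    """Reject scores outside the graded-relevance range [0, 1]."""
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"relevance score out of range: {score}")


def generalized_precision(retrieved_scores):
    """Sum of relevance scores of retrieved documents / number retrieved."""
    _check_scores(retrieved_scores)
    if not retrieved_scores:
        return 0.0
    return sum(retrieved_scores) / len(retrieved_scores)


def generalized_recall(retrieved_scores, collection_scores):
    """Sum of relevance scores of retrieved documents / sum over the collection."""
    _check_scores(retrieved_scores)
    _check_scores(collection_scores)
    total = sum(collection_scores)
    if total == 0:
        return 0.0
    return sum(retrieved_scores) / total


# Hypothetical collection of six documents; four of them were retrieved.
collection = [1.0, 0.5, 0.5, 0.0, 0.0, 1.0]
retrieved = [1.0, 0.5, 0.0, 0.0]

g_precision = generalized_precision(retrieved)
g_recall = generalized_recall(retrieved, collection)
```

With binary scores (only 0 or 1) both functions reduce to the familiar precision and recall, which is what makes the graded versions a generalization.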











Knowledge Management: Hype, Hope, or Help?
David C. Blair
Published online 26 July 2002

David Blair's article takes a comprehensive view of Knowledge Management, following its relationship to data or information management and its still promising possibilities.





Knowledge Integration in Virtual Teams: The Potential Role of KMS
Maryam Alavi and Amrit Tiwana
Published online 19 July 2002

Maryam Alavi and Amrit Tiwana identify four challenges to knowledge integration in virtual team environments and propose knowledge management system (KMS) approaches to meet these challenges.





Mundane Knowledge Management and Microlevel Organizational Learning: An Ethological Approach
Elisabeth Davenport
Published online 25 July 2002

Elisabeth Davenport explores the concepts of mundane knowledge management and organizational ethology in a case study of a project to promote virtual enterprise formation.






Automatic Thesaurus Generation for Chinese Documents
Yuen-Hsien Tseng
Published online 19 September 2002

Tseng constructs a word co-occurrence based thesaurus by means of the automatic analysis of Chinese text. Words are identified by a longest dictionary match supplemented by a key word extraction algorithm that merges back nearby tokens and accepts shorter strings of characters if they occur more often than the longest string. Single character auxiliary words are a major source of error but this can be greatly reduced with the use of a 70-character 2680 word stop list.
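
The longest-dictionary-match step can be illustrated with a greedy forward segmenter. The sketch below is a generic version of that technique, not Tseng's implementation: it omits the keyword extraction and token merging steps, and the three-entry dictionary is a toy assumption.

```python
def longest_match_segment(text, dictionary, max_len=None):
    """Greedy forward segmentation: at each position take the longest
    dictionary entry that matches, falling back to a single character."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in dictionary:
                tokens.append(piece)
                i += length
                break
    return tokens


# Toy dictionary (an assumption for this example only).
toy_dictionary = {"中国", "中国人", "人民"}
segments = longest_match_segment("中国人民", toy_dictionary)
```

Here the greedy match consumes 中国人 and strands the single character 民, the kind of outcome that accepting a shorter, more frequent string is meant to correct.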

Extracted terms with their associated document weights are sorted by decreasing frequency and the top of this list is associated using a Dice coefficient modified to account for longer documents on the weights of term pairs. Co-occurrence is not in the document as a whole but in paragraph or sentence size sections in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles, and judges were asked to review the top 50 terms associated with each of 30 single word query terms. They determined 69% to be relevant.
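
For orientation, the sketch below computes the plain, unweighted Dice coefficient over term co-occurrence within small text windows. It is deliberately not the modified, weight-based version the abstract describes, whose exact form is not given here; the function name and the English stand-in tokens are assumptions made for readability.

```python
from collections import Counter
from itertools import combinations


def dice_associations(windows):
    """Return {(term_a, term_b): dice} for every pair of terms that
    co-occurs in at least one window (a sentence or short span)."""
    term_counts = Counter()
    pair_counts = Counter()
    for window in windows:
        terms = sorted(set(window))  # count each term once per window
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return {
        pair: 2 * together / (term_counts[pair[0]] + term_counts[pair[1]])
        for pair, together in pair_counts.items()
    }


# Three hypothetical windows of already-extracted terms.
toy_windows = [
    ["bank", "loan"],
    ["bank", "loan", "rate"],
    ["bank", "river"],
]
scores = dice_associations(toy_windows)
```

Restricting co-occurrence to windows rather than whole documents keeps the number of candidate pairs small, which matches the computation-time motivation given above.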










On Bidirectional English-Arabic Search
M. Aljlayl, O. Frieder, and D. Grossman
Published online 19 September 2002

Aljlayl, Frieder, and Grossman review methodologies for the machine translation of queries and apply them to English-Arabic/Arabic-English Cross-Language Information Retrieval. In the dictionary method, replacement of each term with all possible equivalents in the target language results in considerable ambiguity, while taking the first term in the dictionary list reduces the ambiguity but may fail to capture the meaning. A Two-Phase method takes all possible equivalents and translates them back, retaining only those that generate the original term. It results in an average query length of six terms in TREC7 and 12 in TREC9. Arabic to English translations consistently performed below the original English queries, and the Two-Phase method consistently performed at the highest level and significantly better than the Every-Match method.
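
The Two-Phase filter can be sketched as a round trip through two bilingual dictionaries. The code below is a simplified illustration, not the authors' system: the dictionaries are toy mappings, and the target-language entries are placeholder strings (prefixed `tgt_`) rather than real Arabic terms.

```python
def every_match_translate(term, forward):
    """Every-Match: keep all target-language equivalents of the term."""
    return list(forward.get(term, []))


def two_phase_translate(term, forward, backward):
    """Two-Phase: keep only equivalents whose back-translation
    regenerates the original source term."""
    return [
        candidate
        for candidate in forward.get(term, [])
        if term in backward.get(candidate, [])
    ]


def translate_query(terms, forward, backward):
    """Translate a whole query term by term with the Two-Phase filter."""
    translated = []
    for term in terms:
        translated.extend(two_phase_translate(term, forward, backward))
    return translated


# Hypothetical source-to-target and target-to-source dictionaries.
forward = {"bank": ["tgt_finance_house", "tgt_river_edge", "tgt_bench"]}
backward = {
    "tgt_finance_house": ["bank"],
    "tgt_river_edge": ["shore", "bank"],
    "tgt_bench": ["bench", "seat"],
}

all_equivalents = every_match_translate("bank", forward)
filtered = two_phase_translate("bank", forward, backward)
```

In this toy case the filter drops the one candidate that does not translate back to the source term, so the translated query is shorter than the Every-Match expansion while retaining both plausible senses.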

Machine translation using other techniques is economical for queries but not likely so for documents. Using ALKAFI, a commercial translation system from Arabic to English, and the Al-Mutarjim Al-Arabey system for English to Arabic, nearly 60% of monolingual retrievals were generated going from Arabic to English. Smaller numbers of terms in the source query improve performance, and these systems require syntactically well-formed queries for good performance.











The Influence of Mental Models and Goals on Search Patterns During Web Interaction
Debra J. Slone
Published online 19 September 2002

Thirty-one patrons, who were selected by Slone to provide a range of age and experience, agreed when approached while using the catalog of the Wake County library system to try searching via the Internet. Fifteen searched the Wake County online catalog in this manner and 16 searched the World Wide Web, including that catalog. They were subjected to brief pre-structured taped interviews before and after their searches and observed during the searching process, resulting in a log of behaviors, comments, pages accessed, and time spent. Data were analyzed across participants and categories. Web searches were characterized as linking, URL, search engine, within a site domain, and searching a web catalog; and participants by the number of these techniques used. Four used only one, 13 used two, 11 used three, two used four, and one used all five.

Participant experience was characterized as never used, used search engines, browsing experience, email experience, URL experience, catalog experience, and finally chat room/newsgroup experience. Sixteen percent of the participants had never used the Internet, 71% had used search engines, 65% had browsed, 58% had used email, 39% had used URLs, 39% had used online catalogs, and 32% had used chat rooms. The catalog was normally consulted before the web, where both were used, and experience with an online catalog assists in web use. Scrolling was found to be unpopular and practiced halfheartedly.





Children's Use of the Yahooligans! Web Search Engine. III. Cognitive and Physical Behaviors on Fully Self-Generated Search Tasks
Dania Bilal
Published online 19 September 2002

Bilal, in this third part of her Yahooligans! study, looks at children's performance with self-generated search tasks, as compared to previously assigned search tasks, looking for differences in success, cognitive behavior, physical behavior, and task preference. Lotus ScreenCam was used to record interactions and post-search interviews to record impressions. The subjects, the same 22 seventh grade children in the previous studies, generated topics of interest that were mediated with the researcher into more specific topics where necessary. Fifteen usable sessions form the basis of the study. Eleven children were successful in finding information, a rate of 73% compared to 69% in assigned research questions, and 50% in assigned fact-finding questions.

Eighty-seven percent began using one or two keyword searches. Spelling was a problem. Successful children made fewer keyword searches and the number of search moves averaged 5.5 as compared to 2.4 on the research-oriented task and 3.49 on the factual. Backtracking and looping were common. The self-generated task was preferred by 47% of the subjects.











Usability Testing for Library Web Sites: A Hands-On Guide, by Elaina Norlin and CM Winters
Matt Jones
Published online 7 August 2002




Accessing and Browsing Information and Communication, by Ronald E. Rice, Maureen McCreadie, and Shan-Ju L. Chang
Robert J. Sandusky
Published online 29 August 2002




Strategies for Electronic Commerce and the Internet, by Henry C. Lucas, Jr.
Roisin Faherty
Published online 22 August 2002







Association for Information Science and Technology
8555 16th Street, Suite 850, Silver Spring, Maryland 20910, USA
Tel. 301-495-0900, Fax: 301-495-0810

Copyright © 2001, Association for Information Science and Technology