Information Retrieval from Annotated Texts Aviezri S. Fraenkel and Shmuel T. Klein
We begin with a discussion of annotations to text, similar to hypertext
linked pages, as a source of index entries for expanded access. However, the use of terms extracted from annotations, marked as if they occurred in
the main text, will confound proximity searches in that text. Fraenkel and Klein suggest a file structure and distance evaluation algorithm which will
permit selective retrieval from main text, particular classes of annotations, and combinations of these.
Designer Selves: Construction of Technologically Mediated Identity
within Graphical, Multiuser Virtual Environments Jerome P. McDonough
We then move to the virtual environment of computer-mediated
communication systems. McDonough believes the designers of such systems influence the projected identities of their systems' users, whose personas have properties that are created and maintained by social interaction. In the two
systems McDonough examined, designers are overwhelmingly White, two-to-one male, with middle-class backgrounds and at least some college education,
and have extraordinary control over product design. Without benefit of user data, the designers seem to see their users as similar to themselves
demographically, but, while highly computer literate, not programmers. Users are believed to desire virtual personalities different from reality.
First 20 Precision among World Wide Web Search Services (Search Engines) H. Vernon Leighton and Jaideep Srivastava
Leighton and Srivastava compare the retrieval performance of five search
engines using precision in the first 20 sites retrieved as the basic measure. Retrieved pages were downloaded and presented to evaluators for
topical relevance judgements without indication of search engine used. The number of relevant links among the first three retrieved was multiplied by
20, the number in the next seven by 17, and the number in the final ten by 10. These weighted values were summed and then divided by the maximum of 279, less 10 for each
link fewer than 20 returned. Alta Vista, Excite, and Infoseek demonstrate superior rankings to HotBot and Lycos, with their relative positions changing with the strength of the relevance relationship required.
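The weighting described above can be sketched in a few lines of Python. This is a minimal illustration based only on the summary given here, not the authors' own code; the function name `first20_precision` and the representation of judgements as a list of booleans in rank order are assumptions made for the example.

```python
def first20_precision(relevant):
    """Weighted first-20 precision as described in the summary above.

    `relevant` is a hypothetical input format: a list of up to 20 booleans
    in rank order, True where the evaluator judged the link relevant.
    """
    if len(relevant) > 20:
        raise ValueError("only the first 20 results are scored")

    # Weights per rank group: ranks 1-3 count 20, ranks 4-10 count 17,
    # ranks 11-20 count 10.
    weights = [20] * 3 + [17] * 7 + [10] * 10

    score = sum(w for w, hit in zip(weights, relevant) if hit)

    # Maximum possible score is 3*20 + 7*17 + 10*10 = 279, reduced by 10
    # for each link short of 20 that the service returned.
    max_score = 279 - 10 * (20 - len(relevant))

    return score / max_score


# A service that returns 20 links, all relevant, scores 1.0.
perfect = first20_precision([True] * 20)

# A service whose only relevant links are the top three.
top_three_only = first20_precision([True] * 3 + [False] * 17)
```

Because the weights fall with rank, a relevant link in the first three positions contributes twice as much as one in positions 11 to 20, so the measure rewards services that place relevant pages early.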
Measuring Search-Engine Quality and Query Difficulty: Ranking with Target and Freestyle Robert M. Losee and Lee Anne H. Paris
Losee and Paris suggest that a measure of query difficulty and a measure of the probability that a search engine will produce an optimal ranking are
more relevant to the evaluation of retrieval systems than traditional measures. Five sets of document rankings of an evaluated MEDLINE collection (Boolean, original natural language on LEXIS-NEXIS Freestyle and
on DIALOG Target, terms on Freestyle and on Target with adjacency, and a full natural language form of query) are used despite the varying system
limits on set size. The parameters of average search length that minimize errors in its estimation (the probability that a ranking is optimal, and
the average search length re-scaled to a 0 to 1 range) are estimated by regression. The E measure (1 minus the harmonic mean of precision and recall)
strongly correlates with the re-scaled average search length, as does the number of terms in the query. Best-case Boolean and Freestyle search engines rank only slightly higher than Target.
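The E measure mentioned above, in the balanced form the summary describes (1 minus the harmonic mean of precision and recall), can be sketched as follows. The function name `e_measure` and the convention of returning 1.0 when both inputs are zero are assumptions made for this illustration; the general E measure also allows unequal weighting of precision and recall, which is not shown here.

```python
def e_measure(precision, recall):
    """Balanced E measure: 1 minus the harmonic mean of precision and recall.

    Both arguments are expected to lie in the range 0 to 1. Lower values
    indicate better retrieval performance.
    """
    if precision + recall == 0:
        # Assumed convention: with no relevant documents retrieved the
        # harmonic mean is taken as 0, so E is at its worst value of 1.
        return 1.0

    harmonic_mean = 2 * precision * recall / (precision + recall)
    return 1.0 - harmonic_mean


# Example: precision 0.5 and recall 0.5 give a harmonic mean of 0.5.
e_balanced = e_measure(0.5, 0.5)
```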
Scholarly Communication and the Continuum of Electronic Publishing Rob Kling and Geoffrey McKim
Kling and McKim define three electronic journal types: the e-journal, a
package of articles primarily distributed and accessed in electronic form; the p-e-journal, a package of peer-reviewed articles available electronically but distributed primarily in paper form; and the e-p-journal, a
package of peer-reviewed articles primarily distributed electronically, but with limited paper distribution. They add sources of electronic working
articles, which are electronically available scholarly communications that are not peer reviewed. Growth appears to be taking place primarily in the
p-e-journal area. Different publishers have strikingly different policies on whether or not web appearance constitutes publishing. There is a difference between availability on the web and access by the learned
community, whose members are unlikely to search widely to bring together such material. Long-term stable accessibility is a perceived problem for e-journals, as is a lack of trustworthiness and ability to provide publicity.
Conversation and Community: The Potential of Electronic Conferences for Creating Intellectual Proximity in Distributed Learning Environments Judith Weedman
Weedman's review of the literature of computer-mediated course work activity finds that student characteristics have more effect on outcome than mode of instruction. Most, but not all, studies show positive evidence of
reflective thinking. Community appears to develop in many studies, but at least two find very little student interest in using electronic tools to
develop a learning community. Research in socialization indicates that attitudes and values do not change in the professional school experience,
that students learn to be students rather than to be professionals, and professional identity does not fully form until after graduation. Surveys
collected in the 1980s from an electronic conference show nine uses per day, with one-half the students reporting they signed on at least once a week. Forty-five percent saw the medium as similar to face-to-face
interaction. Most users reported they would recognize by sight fewer than 50% of those with whom they communicated in the conference, and just under
half believed they interacted differently with those they already knew.
Information Seeking Behavior of Scientists in the Electronic Information Age: Astronomers, Chemists, Mathematicians, and Physicists
Cecelia M. Brown
Brown surveyed 80 science faculty at the University of Oklahoma via e-mail. Forty-nine respondents averaged 6 hours per week reading journal
articles. The scanning of current journals, conference attendance, and personal communication were the leading methods of remaining current. Current awareness services were little used. Awareness of retrospective
literature was reported to come primarily through citations and, to a lesser extent, use of indexing tools. Actual use of such tools was mostly
by chemists. Most obtained needed journal articles by photocopying the library copy. Chemists used CARL UnCover to some extent, but others relied
on interlibrary loan or reprint requests. Sixty-five percent preferred print to electronic format.
A Stemming Procedure and Stopword List for General French Corpora Jacques Savoy
Using average precision values over standard recall points, Savoy investigates the improvements that can be made by processing the retrieved set from a Boolean search, using existing non-Boolean retrieval
models to rank these results. There appear to be meaningful improvements, and in the process a good review of current retrieval models is provided.