Bert R. Boyce |
1157 |
RESEARCH |
| |
Web-Based Analyses of E-journal Impact: Approaches, Problems, and Issues Stephen P. Harter and Charlotte E. Ford Published online 5 September 2000
We begin with a look by Harter and Ford at the similarities and differences between citation in scholarly papers and hyperlinking in scholarly electronic journal articles. Using the 39 e-journals of Harter's previous study
of impact of e-journals, less those that required subscription or were defunct, impact was measured through back-links. E-journals exist at more than one location, in multiple formats, and have multiple URLs. There is
no clear way to gather all possibilities. Thus one link search per e-journal was conducted using only the http:// format, and choosing the URL listed in the most directories. Because of the normal hierarchical directory
structure of the sites, a single truncated URL brings in all links to the home page, the articles, and perhaps associated files. In the cases where a hierarchical directory structure did not occur, a second search at the
chief articles site was carried out. Three engines provided the link search capability: AltaVista, which was very inconsistent in day-to-day figures; HotBot, which did not provide the needed truncation capability; and
Infoseek, which produced only half the hits of the other two. However, the three ranked the journals in a very similar fashion, with high correlation, and so Infoseek was chosen for its consistency. Saved search
results were concatenated into a file for each of the 39 e-journals with up to 500 URLs in each. Using Grab-a-Site, the web pages associated with these URLs were collected. Perl programs computed the total number of
back-links, the number to different parts of the e-journal site, and the numbers generated internally and externally. Self-links are quite high at about 50%; only one in 20 links is to external e-journal articles.
Total external back-links correlate strongly with back-links to external articles. There appears to be no correlation between citation ranking of e-journals and back-link ranking. File types linked to e-journals are
very diverse. |
1159
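The internal versus external tally described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' Perl code. It assumes a back-link counts as a self-link when the linking page is on the same host as the e-journal, and the function name and example URLs are hypothetical.

```python
from urllib.parse import urlparse


def classify_backlinks(journal_url, linking_pages):
    """Count back-links by whether the linking page shares the journal's host.

    Assumption: "self-link" means same host name as the e-journal URL.
    """
    journal_host = urlparse(journal_url).netloc.lower()
    counts = {"total": 0, "self": 0, "external": 0}
    for page in linking_pages:
        host = urlparse(page).netloc.lower()
        counts["total"] += 1
        if host == journal_host:
            counts["self"] += 1
        else:
            counts["external"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    pages = [
        "http://journal.example.org/ej/v1/a.html",
        "http://other.example.com/links.html",
    ]
    print(classify_backlinks("http://journal.example.org/ej/", pages))
```

A host comparison is only one possible rule; an e-journal mirrored at several locations, as the abstract notes, would need a list of hosts instead.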
|
| |
Predicting the Effectiveness of Naive Data Fusion on the Basis of System Characteristics Kwong Bor Ng and Paul B. Kantor Published online 5 September 2000 In system level data fusion, the retrieval status values assigned by multiple
systems are combined to improve overall performance. Ng and Kantor test fusion against the standard of an "oracle" choice of system made before search. The measure used, r, is based upon p100, which is the cumulated
number of relevant documents retrieved prior to reaching the one-hundred-and-first position in a ranked list, divided by 100. The measure r is the p100 of the poorer scheme over the p100 of the better scheme. Retrieval
scheme similarities are characterized by a measure z based on the number of pairs of documents placed in different order by each of two schemes. Measuring the effectiveness of a procedure for predicting the
effectiveness of data fusion requires the use of the Receiver Operating Characteristic (ROC), a plot of the correctly predicted effective cases as a function of ineffective cases predicted to be
effective. Output lists for TREC4 were used for training and TREC5 for testing. The ordering of the fused list is determined by the sum of the normalized relevance scores. When fusion gives
better performance the cases are generally above the z + r = 1 line and concentrated on the right side indicating that dissimilar outputs with comparable performance lead to effective fusion. Curves generated by
logistic regression were used to generate classification scores to create ROC curves. With a detection rate below 75%, predictive power is far better than random. A non-parametric method ranking the data after splitting
it into 100 bins yields a more powerful ROC curve on the training data, but has less power on the test data. |
1177 |
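The quantities named in the abstract above (p100, r, z, and fusion by summed normalized scores) can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The abstract says only that z is "based on" the number of differently ordered pairs, so dividing by the total number of pairs is an assumption, as is the use of min-max normalization for the scores. All function names are hypothetical.

```python
from itertools import combinations


def p_at_100(ranked_docs, relevant):
    """Relevant documents in the first 100 positions, divided by 100."""
    return sum(1 for d in ranked_docs[:100] if d in relevant) / 100


def r_ratio(p_a, p_b):
    """p100 of the poorer scheme over p100 of the better scheme."""
    better = max(p_a, p_b)
    return min(p_a, p_b) / better if better > 0 else 0.0


def z_dissimilarity(rank_a, rank_b):
    """Fraction of shared document pairs that the two schemes order differently.

    Assumption: normalizing by the number of pairs; the source does not say.
    """
    pos_a = {d: i for i, d in enumerate(rank_a)}
    pos_b = {d: i for i, d in enumerate(rank_b)}
    common = [d for d in rank_a if d in pos_b]
    pairs = list(combinations(common, 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


def fuse(scores_a, scores_b):
    """Rank documents by the sum of normalized scores from two schemes.

    Assumption: min-max normalization; both score dicts must be non-empty.
    """
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 0.0 for d, s in scores.items()}

    na, nb = normalize(scores_a), normalize(scores_b)
    fused = {d: na.get(d, 0.0) + nb.get(d, 0.0) for d in set(na) | set(nb)}
    # Ties are broken by document identifier so the output is deterministic.
    return sorted(fused, key=lambda d: (-fused[d], d))
```

With these definitions, two schemes with nearly equal p100 give r close to 1, and two schemes that order most pairs differently give z close to 1, which is the region the abstract associates with effective fusion.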
| |
Bibliometric Information Retrieval System (BIRS): A Web Search Interface Utilizing Bibliometric Research Results Ying Ding, Gobinda G. Chowdhury, Schubert Foo, and Weizhong Qian Published online 8 September 2000 BIRS (Bibliometric Information
Retrieval System) provides Web-based co-author, co-citation, and similar keyword maps which can be used to generate query terms for ten search engines accessible through a common interface. The maps, created by Ding et
alia, are structured from a ten year database of library and information science literature and layered as to level of detail. Thirty-five students chose one of six topics provided and searched in their choice of search
engine. The top 20 hits were then classed as relevant or not relevant. The subjects then used BIRS to expand their query information and searched the same engine again. They were then asked to compare the results and
comment on BIRS. Eighty percent reported an improved understanding of the subject area, and seventy-seven percent agreed that BIRS was a help in query construction, with 91% using the keyword facility. Actual variations in
relevant and retrieved documents are not reported. |
1190
|
| |
Shape Recovery: A Visual Method for Evaluation of Information Retrieval Experiments Mark Rorvig and Steven Fitzpatrick Published online 7 September 2000 Rorvig
and Fitzpatrick form a document similarity matrix and use multidimensional scaling to create a set of Cartesian points for visual evaluation of retrieval performance. The distance from the centroid document in each
cluster to each document, up to one standard deviation of the mean of all these distances, is then computed, for correlation with control clusters, and the test and control clusters are displayed. Using full text from
five topic document sets from NIST TREC as control, and 50 and 200 term vectors from a local dictionary with and without stemming as the four treatments, both visual and correlation comparisons are made. High apparent
shape distortion agrees with low correlation and vice versa. Stemming has the biggest positive effect when the most distortion is apparent. The application of categories moves far more non-relevant documents to the
extremities of the visual field than it does relevant documents. Stemming brings the visual display back closer to the control but brings back many non-relevant documents. |
1205
|
| |
Empirical Studies of End-User Information Searching A.G. Sutcliffe, M. Ennis, and S.J. Watkinson Published online 8 September 2000
Sutcliffe et alia, using 17 medical students as subjects, studied searches on 4 topics on MEDLINE using WinSPIRS. Subjects' notes, search strategies, and search history were recorded, and their actions and spoken thoughts were subjected
to video and audio recording. Recall made use of a standard relevant set, chosen by experts from a union of subject outputs; precision was defined as both subject relevant and independent judge relevant over subject
relevant documents. Average recall was 14%. Novices significantly out-performed more experienced searchers on one question but other differences were not significant. More experienced searchers had significantly
similar ranking orders of the queries for recall; novices seemed to find all questions equally difficult. No differences were apparent for precision. There were no significant differences in retrieval times or
evaluation times overall but some questions indicated differences. Evaluation time was positively correlated with query complexity. More experienced searchers used more query iterations and used broadening and
narrowing strategies while novices favored trial and error. Novice searchers used only the AND operator. These results are seen as indicating the failure of current user interfaces to assist the searcher.
|
1211 |
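The recall and precision definitions in the abstract above can be written out as a small Python sketch. The precision function follows the stated definition directly. The recall function assumes the conventional ratio of retrieved documents found in the experts' standard relevant set to the size of that set, which the abstract implies but does not spell out. Function names and document identifiers are hypothetical.

```python
def recall(retrieved, standard_relevant):
    """Share of the experts' standard relevant set that the subject retrieved.

    Assumption: the conventional recall ratio against the standard set.
    """
    standard = set(standard_relevant)
    if not standard:
        return 0.0
    return len(set(retrieved) & standard) / len(standard)


def precision(subject_relevant, judge_relevant):
    """Documents judged relevant by both subject and independent judge,
    over the documents the subject judged relevant."""
    subject = set(subject_relevant)
    if not subject:
        return 0.0
    return len(subject & set(judge_relevant)) / len(subject)


if __name__ == "__main__":
    # Hypothetical document identifiers, for illustration only.
    print(recall(["d1", "d2"], ["d1", "d3", "d4", "d5"]))
    print(precision(["d1", "d2"], ["d1"]))
```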
| |
Success, a Structured Search Strategy: Rationale, Principles, and Implications Chaim Zins Published online 11 September 2000 Zins evaluates a procedure
which he has given the name "Success," and which involves determining the problem, locating the resources to search, defining the search terms, and executing the search. Three rounds of structured
questionnaires were sent to 15 information specialists in a typical Delphi approach in an attempt to analyze the strategy's principles and rationale, review its guidelines, forms and tables, and discuss its implications
for user instruction. There was disagreement on the need for subject expertise, but agreement that both systematic thinking and creativity were required. A need for a fifth phase, evaluation, came forward, as did the need
for a methodology selection guideline and a post evaluation reiteration guideline. The five phases were considered indispensable, but sometimes performed using remembered information and thus not observable. |
1232 |
| |
|
|
BOOK REVIEWS |
| |
Books, Bytes, and Bridges: Libraries and Computer Centers in Academic Institutions, edited by Larry Hardesty P. Scott Lapinski |
1248 |
| |
CALL FOR PAPERS
1250 |