Volume 54, Issue 3


Bert Boyce
A Measure for the Cohesion of Weighted Networks
Leo Egghe, Ronald Rousseau
Published Online: 8 Jan 2003

Measuring the degree of interconnectedness in graph-like networks of hyperlinks or citations can indicate the existence of research fields and assist in the comparative evaluation of research efforts. In this issue we begin with Egghe and Rousseau, who review compactness measures and investigate the compactness of a network as a weighted graph with dissimilarity values characterizing the arcs between nodes. They make use of a generalization of the Botafogo, Rivlin, Shneiderman (BRS) compactness measure, which treats the distance between unreachable nodes not as infinity but rather as the number of nodes in the network. The dissimilarity values are determined by summing the reciprocals of the weights of the arcs in the shortest chain between two nodes, where no weight is smaller than one. The BRS measure is then the maximum value for the sum of the dissimilarity measures less the actual sum, divided by the difference between the maximum and minimum. The Wiener index, the sum of all elements in the dissimilarity matrix divided by two, is then computed for Small's particle physics co-citation data, as are the BRS measure, the dissimilarity values, and the shortest paths. The compactness measure for the weighted network is smaller than for the unweighted one. When the bibliographic coupling network is used, it is shown to be less compact than the co-citation network, which indicates that the new measure produces results that conform to an obvious case.
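
The quantities described above can be illustrated with a minimal Python sketch. It is not Egghe and Rousseau's implementation: the function names are hypothetical, the graph is assumed to be undirected, and the maximum and minimum sums are passed in by the caller rather than derived, since the summary does not state how the bounds are defined for the weighted case. The usage lines assume the bounds usually quoted for the unweighted BRS measure, N(N-1)N and N(N-1).

```python
from itertools import product


def dissimilarity_matrix(n, weighted_edges):
    """All-pairs dissimilarities for an undirected weighted graph.

    Each edge (u, v, w) with w >= 1 contributes 1/w to the length of a
    chain; unreachable pairs receive the value n (the number of nodes).
    """
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        if w < 1:
            raise ValueError("weights are assumed to be at least 1")
        d[u][v] = d[v][u] = min(d[u][v], 1.0 / w)
    # Floyd-Warshall; product() varies its first index (k) slowest,
    # so k is the outermost loop as the algorithm requires.
    for k, i, j in product(range(n), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    return [[float(n) if x == inf else x for x in row] for row in d]


def wiener_index(d):
    """Sum of all elements of the dissimilarity matrix divided by two."""
    return sum(map(sum, d)) / 2


def brs_compactness(d, max_sum, min_sum):
    """(maximum sum - actual sum) / (maximum sum - minimum sum)."""
    return (max_sum - sum(map(sum, d))) / (max_sum - min_sum)


# Unweighted three-node path 0 - 1 - 2 (all weights equal to 1).
n = 3
d = dissimilarity_matrix(n, [(0, 1, 1), (1, 2, 1)])
print(wiener_index(d))
print(brs_compactness(d, max_sum=n * (n - 1) * n, min_sum=n * (n - 1)))
```

Keeping the bounds as explicit arguments leaves the choice of normalization to the reader, who should take it from the paper itself when working with weighted data.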










Methods for Identifying Versioned and Plagiarized Documents
Timothy C. Hoad, Justin Zobel
Published Online: 16 Jan 2003

Hoad and Zobel term documents that originate from the same source, whether versions or plagiarisms, co-derivatives. Co-derivatives are normally identified either by a technique called fingerprinting, which uses hashing to generate surrogates in the form of integer strings derived from substrings of text for comparison purposes, or by ranking with a similarity measure, as in information retrieval. Hoad and Zobel derive several variants of what they term an identity measure, in which documents with similar numbers of occurrences of words benefit and those with dissimilar numbers are penalized, for use in a ranking technique. They then review fingerprinting strategies and characterize them by the substring size used (granularity), the character of the hashing function, the size of the document fingerprint (resolution), and the substring selection strategy. In their experiments, highest false match (HFM), the highest percentage score given to an incorrect result, and separation, the difference between the lowest correct result and HFM, were the measures used on two collections, one of 3,300 documents and the other of 80,000, with 53 query documents. The new identity measure demonstrates superior performance to the alternatives. Only one fingerprinting strategy, the anchor strategy, was able to identify all of the similar documents identified by humans. The key parameter in fingerprinting appears to be granularity, with three to five words producing the best results.
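
As a rough illustration of the ideas in this summary, the Python sketch below builds a word-level fingerprint and computes HFM and separation from a set of scores. It rests on several assumptions that are not taken from the paper: the function names are hypothetical, CRC-32 stands in for the hashing function, every substring is kept (so there is no selection strategy and no cap on resolution), and the overlap score is a simple containment ratio rather than the authors' identity measure.

```python
import zlib


def fingerprint(text, granularity=3):
    """Hash every run of `granularity` consecutive words to an integer."""
    words = text.lower().split()
    grams = (
        " ".join(words[i:i + granularity])
        for i in range(len(words) - granularity + 1)
    )
    # CRC-32 is an assumed stand-in hash; it is deterministic across runs.
    return {zlib.crc32(g.encode("utf-8")) for g in grams}


def overlap(query_fp, doc_fp):
    """Fraction of the query's hashes that also occur in the document."""
    return len(query_fp & doc_fp) / len(query_fp) if query_fp else 0.0


def hfm_and_separation(scores, correct):
    """Return (HFM, separation) for one query.

    scores:  dict mapping document id to a percentage score
    correct: set of document ids judged to be true co-derivatives
    """
    hfm = max(s for doc, s in scores.items() if doc not in correct)
    lowest_correct = min(s for doc, s in scores.items() if doc in correct)
    return hfm, lowest_correct - hfm


query = fingerprint("the quick brown fox jumps over the lazy dog")
version = fingerprint("the quick brown fox leaps over the lazy dog")
print(round(100 * overlap(query, version), 1))

scores = {"a": 90.0, "b": 60.0, "c": 40.0, "d": 10.0}
print(hfm_and_separation(scores, correct={"a", "b"}))
```

A positive separation means every correct document outscored every incorrect one for that query; a lower HFM means incorrect documents were scored less confidently.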











Individual Differences in Exploration Using Desktop VR
David Modjeska, Mark Chignell
Published Online: 8 Jan 2003

Modjeska and Chignell attempt to determine whether desktop virtual reality environments have adverse effects on people with low spatial ability and difficulty in learning information structures. Both three-dimensional visualization and representation by a series of two-dimensional bird's-eye snapshots were tested, using the CityScape algorithm and VRML files viewed with a CosmoPlayer browser plug-in, on twenty student subjects with web browsing experience. Subjects were divided by their scores on the Minnesota Paper Form Board test for spatial ability, the number of trials they attempted, their score on a local structure test, and their self-reported sense of presence, ease of use, efficiency, and enjoyment. Navigational logs provided distance traveled as a percentage of virtual world radius, average circle of proximity to target, and number of exits from the correct zone of proximity. Presence, ease of use, and efficiency correlate with enjoyment, and the objective measures correlate with each other but not with the subjective measures. Perceived efficiency correlated with number of errors. A factor analysis produced an objective measure factor, which was affected both by level of spatial ability and by world design, and a subjective measure factor, which was not so affected. Structural learning ability, as measured by the ability to retain a hierarchical information structure after browsing it, significantly affected performance, but no performance advantage could be shown for either visualization scheme.









Multimodal Geographic Information Systems: Adding Haptic and Auditory Display
Wooseob Jeong, Myke Gluck
Published Online: 8 Jan 2003

Jeong and Gluck test the efficacy of haptic perception, realized as vibro-tactile stimulation, along with auditory stimulus in transferring choropleth information (showing different magnitudes of variables at locations) from maps in a geographic information system to 51 usable subjects. User performance is measured by task completion times for 36 tasks, and user satisfaction by asking users for their assessments of the various modalities. The mode of the trial (haptic, auditory, or combined), the presence or absence of a map legend, the classed or unclassed nature of the map, and the task (identifying the highest or middle value of a set of data) constituted the independent variables. Participants were asked to identify the number of dogs, cats, male nurses, et cetera, shown on nine state maps. The force feedback mouse provides a vibration proportional to the level of data for that location, and a sound is played at one of nine different volume settings. Haptic displays produced faster and more accurate performance than auditory or combined displays, although the participants expressed a preference for the combined display.









Stereotype-based Versus Personal-based Filtering Rules in Information Filtering Systems
Tsvi Kuflik, Bracha Shapira, Peretz Shoval
Published Online: 8 Jan 2003

Kuflik et al. test whether an e-mail filter based on personally designed rules will be as effective as one whose rules are designed to reflect the average user in a specified group of users. Using a prototype filtering system, ten subjects were interviewed to construct their own personal rules and were also assigned to one of four predefined rule sets generated by cluster analysis from 40 interviews using the same instrument with like subjects. Assignment was based upon social parameters in the data gathered, such as education, profession, and computer knowledge level. The rules led to the assignment of a relevance number in the range 1 to 7 to each message, based upon the participant-chosen values of the goal, length, and history parameters of the message. A set of e-mail messages was then supplied to the ten subjects, who ranked them as to relevance. Pearson coefficients between personal rule ranks and user ranks are consistently lower than the correlations between user ranks and the stereotype ranks, but in only three cases significantly so.










Self-citation and Self-reference: Credibility and Promotion in Academic Publication
Ken Hyland
Published Online: 8 Jan 2003

Hyland examines self-referencing practices by analyzing their textual uses in 240 randomly chosen research papers and 800 abstracts across 80 expert-selected journals from 1997 and 1998 in eight disciplines, as a key to their authors' assumptions as to their own role in the research process and to the practices of their disciplines. Scanned texts produced a corpus of nearly 1.5 million words, which was searched using WordPilot for first person pronouns and all mentions of an author's previous work. There were 6,689 instances of self-reference in the papers and 459 in the abstracts; on average there were 28 cases per paper, 17% of which were self-citations. There was one self-mention in every two abstracts. Nearly 70% of self-reference and self-mention occurred in humanities and social science papers, but biologists employed the most self-citation overall, and 12% of hard science citations were found to be self-citations. Interviews indicated that self-citation was deemed important in establishing authority by fitting oneself into the research framework. Self-mention arises in four main contexts: stating the goal or the structure of the paper, explaining a procedure, stating results or a claim, and elaborating an argument.












From Here to Obscurity?: Media Substitution Theory and Traditional Media in an On-line World
Barbara K. Kaye, Thomas J. Johnson
Published Online: 8 Jan 2003

Kaye and Johnson are interested in the effect of interaction with the Internet on time spent with more traditional media by persons with a strong interest in politics, a topic on which previous research has provided conflicting results. They posted a survey on the Web requesting respondents from within the United States; it was advertised in politically oriented newsgroups, political chat rooms, and politically oriented web sites, was posted on 40 search services, and collected data from convenience samples of 442 respondents in 2000 and 307 in 1996. Respondents were asked about change in their time spent with traditional media, as well as their trust in government, self-efficacy, interest in politics, reliance on traditional sources, reliance on the Web and other Internet-based electronic information sources, and their demographic characteristics. Over the time period the proportion of female respondents increased from one quarter to one third. The average age increased by 10.8 years, and those reporting a high level of trust in government increased from 11.9% to 35.2%. Reported high self-efficacy rose from 44.7% to 74%. Internet users in the second survey spend significantly less time reading news magazines, and while the difference in television viewing is not significant, the decrease in radio listening for political information is. Respondents also report spending significantly less time talking about politics in 2000. About one half of the respondents report spending less time with traditional media, while the other half claim the Internet has not affected their use of these sources.


