|
EDITORIAL |
| |
In This Issue Bert R. Boyce |
1087
|
RESEARCH |
| |
A New Method for Analyzing Scientific Productivity John C. Huber Published online 14 September 2001
In this issue's examination of author productivity, Huber treats publication date as a continuous variable, estimating a paper's position within the year from its page number relative to the maximum page number
for the journal's publication year. The data come from nine-year samples of the Journal of Applied Physics and The Journal of Experimental Biology. These are supplemented by authors whose names begin with ``Ba''
from ten years of the PsycInfo database (where a random number is used to supplement the publication year), a combined set of previously collected samples of 19th century physics papers, 8,641 patents issued to
inventors employed by General Electric in New York State, composers with new works performed by professional orchestras over a 44-year period, and 16 years of papers on mathematical logic. The combined distribution
of author productivity fits the exponential distribution, and the samples each have a mean productivity of about .05 papers per year, although that of inventors is nearly twice that and that of composers about half.
Author career length is also exponentially distributed, with most authors having short publishing careers and very few having long ones. Empirically, the samples conform to random production, Poisson distributions
over time, and exponential distributions for productivity and longevity. They do not demonstrate cumulative advantage, and a very large number of authors produce at a constant rate.
|
1089
|
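Huber's idea of treating publication date as a continuous variable can be illustrated with a minimal sketch. The summary says only that position is estimated from a paper's page number relative to the maximum page number for the journal's publication year; adding that fraction to the calendar year, and the function and parameter names below, are assumptions made for illustration rather than the author's stated formula.

```python
def fractional_publication_date(year: int, page: int, max_page: int) -> float:
    """Estimate a continuous publication date for a paper.

    Assumption: the paper's position within the year is approximated by
    page / max_page, where max_page is the highest page number the journal
    reached in that publication year.
    """
    if max_page <= 0:
        raise ValueError("max_page must be positive")
    if not 0 <= page <= max_page:
        raise ValueError("page must lie between 0 and max_page")
    return year + page / max_page


# Hypothetical example: a paper starting on page 500 of a 1000-page volume year.
estimate = fractional_publication_date(1990, 500, 1000)
```

With a date of this kind, the interval between two papers by the same author can be measured in fractions of a year instead of whole years.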
| |
The Non-Gaussian Nature of Bibliometric and Scientometric Distributions: A New Approach to Interpretation Ludmila E. Ivancheva Published online 14 September 2001
In a second bibliometric paper, Ivancheva draws on the work of Stankov, who claims to have discovered that all natural phenomena follow one general regularity, the Universal Law, which declares energy exchange to be
directly proportional to absolute time and inversely proportional to space, in order to explain the skewed nature of bibliometric distributions. On this view the hyperbolic distribution is seen as a wave process of
energy information exchange. For example, author productivity would be interpreted as the energy of a spherical wave whose amplitude corresponds to the number of papers produced by an author in a year. The
``nucleus'' is small because it emits significant energy, while the low productivity space is large because the energy output is low.
|
1100 |
| |
Ask-an-Expert Services Analysis Joseph Janes, Chrystie Hill, and Alex Rolfe Published online 10 September 2001
Janes, Hill, and Rolfe develop and execute a methodology for the evaluation of Web-based services that permit users to question experts for needed information. Ten commercial and ten non-commercial sites were
asked 240 questions in ten subject areas believed to be typical of such inquiries. The commercial sample excluded sites that would not accept questions without a charge, chat-based services, and those with limited
subject coverage. Each commercial site was asked ten ``fact,'' ten ``source,'' and one ``out of scope'' question. The non-commercial sample sites were subject specific and smaller, and were asked one question of
each type. Sites were characterized, questions were submitted by researchers using pseudonymous identities, and response times and requests for clarification were recorded. Average time to present a question was
4.75 minutes over three entry methods: web form, e-mail, and bulletin board. The overall response rate was 70%, and commercial sites were significantly more responsive. Fact questions had a significantly shorter
response time. Average response time was two days, seven hours, and 45 minutes. Three sites answered the question asked 90% of the time, two around 70%, and the rest between 40% and 60%. |
1106
|
| |
Information Technology and Interests in Scholarly Communication: A Discourse Analysis Neil Jacobs Published online 14 September 2001
Jacobs views both information technology and scholarly communication from the viewpoint of the Social Construction of Technology, which stresses the instability and constructed nature of both. He thus questions
both technological determinism in scholarly communication and the study of such communication in terms of its artifacts. Instead, the proper focus of research is seen as the interests of the social groups involved.
Scholarly communication includes informal networks as well as journals and citations, and also the meta-communication that takes place as research on the topic. This can be addressed by way of discourse analysis.
Three analyses are presented from a series of semi-structured interviews held with academic researchers, librarians, and document suppliers as part of the FIDDO project's investigation of UK document delivery
options. Category membership was likely to be relevant to responses to questions on technology and scholarly communication. The category ``researcher'' was used by the librarian and the document provider as an
explanatory resource, in that supporting the researchers' interests was a constituent of their own categories. All participants claimed membership in a category, spoke so as to maintain the integrity of that
category, and offered accounts that would be accepted as answers.
|
1122
|
| |
MetaSpider: Meta-Searching and Categorization on the Web Hsinchun Chen, Haiyan Fan, Michael Chau, and Daniel Zeng Published online 19 September 2001
MetaSpider performs post-retrieval document clustering and display after performing the traditional meta-search functions of collating the highly ranked subsets of multiple other search engines while eliminating
duplicate and non-functional pages. Chen, Fan, Chau, and Zeng report that MetaSpider returns are displayed in merged engine ranks, with only those containing the exact phrases actually retrieved for processing. Noun
phrases are extracted and displayed with their frequency of occurrence. Documents associated with these phrases may be selected, and if a phrase is not deselected it is used to form a self-organizing map of
clustered pages, where block size indicates term depth and proximity indicates concept relatedness. An evaluation used six topics from TREC-6 in a comparison with MetaCrawler and Northern Light. Thirty students each
did three searches, one on each system. Searchers described the pages returned by composing themes, short phrases describing the topics of the pages. These themes were then compared with those produced earlier by
judges to create recall and precision measures. Session time, number of documents browsed, and number of switches between lists and documents were also recorded. There was no significant difference in switching,
documents browsed, session time, or recall. MetaSpider performed significantly better than Northern Light in the precision measure.
|
1134 |
| |
Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling Ronald N. Kostoff, J. Antonio del Rio, James A. Humenik, Esther Ofilia Garcia, and Ana Maria Ramirez
Published online 19 September 2001
In order to determine the impact of specific published research on varied disciplines and to determine research user characteristics, Kostoff et al. test the viability of analyzing the free text fields of about 300
Science Citation Index records of papers that cited a fundamental paper on sand pile vibration. Abstracts were collected, and a taxonomy of phrases and terms was created by manually analyzing the single words, word
pairs, and word triples extracted from the records. The same phrases were automatically clustered: the high frequency phrases by the Mutual Information Index (co-occurrence frequency over the squared product of
frequencies), and the low frequency phrases by associating with each high frequency phrase those phrases whose co-occurrence with it, divided by their total occurrence, exceeded .5. Extra-discipline basic research
papers range from 15% to 25% of total citing papers each year, with no evident latency period. There is a four-year latency period for applications papers.
|
1148 |
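The rule described for low frequency phrases in the Kostoff et al. summary can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name, the dictionary layout, and the sample phrases and counts are all hypothetical, and only the threshold of .5 comes from the summary.

```python
def assign_low_frequency_phrases(cooccurrence, totals, threshold=0.5):
    """Attach low frequency phrases to high frequency phrases.

    cooccurrence: {(low_phrase, high_phrase): co-occurrence count}
    totals:       {low_phrase: total occurrence count of that phrase}
    A low frequency phrase joins a high frequency phrase's cluster when
    co-occurrence / total occurrence exceeds the threshold.
    """
    clusters = {}
    for (low, high), count in cooccurrence.items():
        if count / totals[low] > threshold:
            clusters.setdefault(high, []).append(low)
    return clusters


# Hypothetical counts, invented purely to exercise the rule.
sample_cooccurrence = {
    ("avalanche size", "granular"): 3,
    ("avalanche size", "vibration"): 1,
}
sample_totals = {"avalanche size": 4}
sample_clusters = assign_low_frequency_phrases(sample_cooccurrence, sample_totals)
```

In this invented example the phrase co-occurs with "granular" in three of its four occurrences (ratio .75), so it is assigned there, while the ratio of .25 for "vibration" falls below the threshold.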
| |
Extracting Macroscopic Information from Web Links Mike Thelwall Published online 19 September 2001
Thelwall investigates whether any of four web link calculations can be shown to correlate with university research productivity as measured by a government research assessment exercise. U.K. university web sites
were indexed by a crawler designed for comprehensive coverage of their pages and subdomains. From the crawl results, lists of all pages linked to by at least one page from another U.K. university were extracted,
with counts of the pages linking to them. For a sample of 25, target pages were classified by information type, and for each university a summary was created of the number of external links from other U.K.
universities to each of the classed types. The results were used to calculate web impact factors, first using all back links normalized by full time faculty FTE, then using only those back links classified as
research related, with the same denominator. These were compared with the 1996 official rating exercise, with the all-link measure attaining a significant .8 Pearson correlation coefficient and the research-only
numerator yielding a significant .9. A search was also made using AltaVista's advanced query syntax to acquire the number of pages to which back links exist, and these were used to create web impact factors for the
sample universities. The correlation with the external ranking was a significant .78. Using AltaVista page counts for denominators, web impact factors still correlated significantly with the external rankings,
although with a lower coefficient.
|
1157
|
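The two calculations named in the Thelwall summary, a web impact factor formed by dividing back links by full time faculty FTE and a Pearson correlation against external ratings, can be sketched as below. The function names and the figures are hypothetical and do not reproduce the study's data; the sketch shows only the arithmetic.

```python
from math import sqrt


def web_impact_factor(back_links: float, faculty_fte: float) -> float:
    """Back links from other universities, normalized by full time faculty FTE."""
    if faculty_fte <= 0:
        raise ValueError("faculty_fte must be positive")
    return back_links / faculty_fte


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical figures for three universities: (back links, faculty FTE).
link_data = [(200, 100), (900, 300), (2000, 500)]
impact_factors = [web_impact_factor(links, fte) for links, fte in link_data]
ratings = [2.0, 3.0, 4.0]  # invented research ratings, for illustration only
r = pearson(impact_factors, ratings)
```

Restricting the numerator to research related back links, as in the second measure, would change only the counts passed to `web_impact_factor`; the denominator and the correlation step stay the same.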
| |
Seeking Explanation in Theory: Reflections on the Social Practices of Organizations that Distribute Public Use Microdata Files for Research Purposes Alice Robbin and Heather Koball
Published online 19 September 2001
Finally, a survey, website analysis, and follow-up by Robbin and Koball of the methods that 20 survey research organizations use to limit disclosure of confidential material indicate that, despite the availability
of extensive research on statistical disclosure limitation methods to minimize such risks, few such precautions are taken. Risk is reduced by data conditioning methods, which restrict data by eliminating sensitive
variables and using grouping and interval techniques. It may also be reduced by restricting access. Only one of the 20 organizations had instituted any form of restricted access to longitudinal data. Linked
administrative data was sometimes suppressed, summarized, or injected with error. The work culture of the various organizations and their changing staff meant that strict rules were not normally applied to
restrictions on longitudinal data. The language and practice of Statistical Disclosure Limitation are not universally known among survey organization staffs.
|
1169 |
BOOK REVIEWS |
| |
Knowledge Management: Classic and Contemporary Works, edited by Daryl Morey, Mark Maybury, and Bhavani Thuraisingham John Cullen Published online 11 September 2001 |
1190 |
| |
Peer-to-Peer: Harnessing the Benefits of a Disruptive Technology, edited by Andy Oram Lisa A. Ennis Published online 19 September 2001 |
1191
|
| |
LETTER TO THE EDITOR |
1193 |
|