Journal of the Association for Information Science and Technology

Table of Contents

Volume 54, Issue 7


 

In This Issue

593

In this issue
Bert Boyce
 

 

Research Article

594-602

The Connection Between the Research of a University and Counts of Links to Its Web Pages: An Investigation Based upon a Classification of the Relationships of Pages to the Research of the Host University
Mike Thelwall and Gareth Harries
Published online 13 March 2003

Given evidence that patterns of web linking between universities can be strongly associated with research productivity, Thelwall and Harries attempt to identify reasons for this association by looking at characterization (the nature of linked-to pages) and the effect of different link counting models. Using Thelwall's database of the link structures of 107 major universities, they created a file for each university containing all pages in its web site that were targets of links from other universities in the database. All pages that had at least 5% as many links as the home page were categorized into a four-part classification: not locally created; not academic content; high-link pages such as databases or gateway pages; and other. From this grouping, new link structure databases for each category were constructed, and 56 lists of link counts were created and correlated with published research productivity scores for 108 UK universities using several different counting models. Both the models and the categorization affect the correlation coefficients, and it appears that choosing categories most related to a university's research will result in stronger associations.
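
The correlation step described above can be illustrated with a minimal sketch. It assumes a plain Pearson coefficient (the summary does not say which coefficient the authors used), and the category names, link counts, and research scores are hypothetical values invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one link count per university for each page category,
# and one research productivity score per university (same order).
link_counts = {
    "all_pages": [120, 340, 90, 410, 60],
    "research_related": [30, 150, 20, 200, 10],
}
research_scores = [2.1, 4.0, 1.8, 4.6, 1.2]

# One correlation coefficient per category of linked-to pages.
correlations = {
    category: pearson(counts, research_scores)
    for category, counts in link_counts.items()
}
```

Comparing the entries of `correlations` across categories is the kind of comparison the summary refers to when it says that the categorization affects the coefficients.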
 

611-620

Adapting Measures of Clumping Strength to Assess Term-Term Similarity
Abraham Bookstein, Vladimir Kulyukin, Timo Raita and John Nicholson
Published online 13 March 2003

Bookstein et al. construct measures of semantic term association based upon a statistical model of language and on capturing the peculiarities found in text generation. The theory is that terms that share the same content will be found together, or clump, within a document in the places where that semantic content is discussed, and that variation from random term distribution will indicate such text segments. If one computes a clumping measure for a term only over those portions of text where a second term is present, it is likely to differ from the same measure computed for the first term over the whole text. In fact, if the two terms carry the same content, the first term may appear to occur at random by the clumping measure in the context of the second term, even though it is strongly clumped outside this context; a comparison of the two measurements should therefore indicate term association. A basic association measure constructed in this manner takes into account not only the number of documents in which a term occurs but also the number of occurrences. It is also possible to design a measure that takes into account the clumping measure generated when the second term is specifically excluded, which would cover the case where the first term's clumping strength depends on the second term's strength. An experiment using twenty content terms from the Columbia Encyclopedia database computed their association scores with all other terms and identified the 100 pairs with the highest scores. Judges then rated the term associations as "successes," "failures," or "can't says." Precision-type measures, computed both with "can't says" excluded and with them counted as failures, were quite high.
It appears to be possible to distinguish between symmetric and asymmetric associations on a purely statistical basis, since each term in a pair may influence the other's clumping behavior in the same manner, or one may influence the other but not the reverse.
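
The comparison of a clumping measure over the whole text with the same measure restricted to the context of a second term can be sketched as follows. This is only an illustration: the variance-to-mean ratio used here is a generic stand-in for a clumping measure, not the measure developed by Bookstein et al., and the segments and terms are hypothetical.

```python
def dispersion_index(counts):
    """Variance-to-mean ratio of per-segment counts.

    Roughly 1 for Poisson-like (random) counts and above 1 for clumped counts.
    Returns 0.0 when there are no segments or the term never occurs.
    """
    if not counts:
        return 0.0
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return variance / mean


def conditional_clumping(segments, term_a, term_b):
    """Clumping of term_a over all segments, and only over segments containing term_b."""
    all_counts = [seg.count(term_a) for seg in segments]
    context_counts = [seg.count(term_a) for seg in segments if term_b in seg]
    return dispersion_index(all_counts), dispersion_index(context_counts)


# Hypothetical tokenized text segments.
segments = [
    ["heart", "artery", "heart", "blood", "heart"],
    ["heart", "heart", "artery", "heart"],
    ["river", "bank"],
    ["tree", "leaf"],
    ["stone", "road"],
    ["artery", "heart", "heart", "heart"],
]

overall, in_context = conditional_clumping(segments, "heart", "artery")
```

In this toy data "heart" is clumped over the whole text but shows no clumping within the segments that contain "artery", which is the kind of gap between the two measurements that the summary treats as a sign of association.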
 

SPECIAL TOPIC SECTION: WEB RETRIEVAL AND MINING
Guest Editor: Hsinchun Chen

621-624

Introduction to the JASIST Special Topic Section on Web Retrieval and Mining: A Machine Learning Perspective
Hsinchun Chen
Published online 13 March 2003

This special issue consists of six papers that report research in Web retrieval and mining. Most papers apply or adapt various pre-Web retrieval and analysis techniques to interesting and challenging Web-based applications.

The Web has become the world's largest knowledge repository. Extracting knowledge from the Web efficiently and effectively is becoming increasingly important for various Web applications. The current Web still consists of more information than knowledge. Also, most Web mining activities are still in their early stages and will continue to develop as the Web evolves. We hope this collection of research papers will help advance our knowledge and understanding of this fascinating and evolving field of Web retrieval and mining.
 

625-637

Client-Side Monitoring for Web Mining
Kurt D. Fenstermacher and Mark Ginsburg
Published online 17 March 2003

Client-Side Monitoring for Web Mining, by Fenstermacher and Ginsburg, proposes a client-side monitoring system that is unobtrusive and supports flexible data collection. Moreover, the proposed framework encompasses client-side applications (such as standard office productivity tools) beyond the Web browser.
 

638-649

Relevant Term Suggestion in Interactive Web Search Based on Contextual Information in Query Session Logs
Chien-Kang Huang, Lee-Feng Chien and Yen-Jen Oyang
Published online 13 March 2003

Relevant Term Suggestion in Interactive Web Search Based on Contextual Information in Query Session Logs, by Huang, Chien, and Oyang, proposes a query log-based term suggestion approach to interactive Web search. Using this approach, relevant terms suggested for a user query are those that co-occur in similar query sessions from search engine logs, rather than in the retrieved documents. Their experiments showed that the proposed approach can exploit the contextual information in a user query session to make useful suggestions.
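
The idea of drawing suggestions from query sessions rather than from retrieved documents can be sketched as below. This is a simplified illustration, not the authors' algorithm: it treats any session containing the exact query as a "similar" session and ranks candidates by raw co-occurrence counts. The function name `suggest_terms` and the sample sessions are hypothetical.

```python
from collections import Counter


def suggest_terms(query, sessions, top_n=3):
    """Rank queries that share a session with `query` by number of shared sessions."""
    counts = Counter()
    for session in sessions:
        unique = set(session)  # count each query at most once per session
        if query in unique:
            counts.update(unique - {query})
    # Sort by descending count, then alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


# Hypothetical query sessions taken from a search engine log.
sessions = [
    ["jaguar", "jaguar price", "car dealer"],
    ["jaguar", "big cats", "zoo"],
    ["jaguar", "car dealer", "jaguar xk8"],
    ["python", "snake"],
]

suggestions = suggest_terms("jaguar", sessions)
```

A fuller treatment would also need a notion of session similarity, so that sessions with related but not identical queries can contribute candidates.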
 

650-659

DocCube: Multi-Dimensional Visualisation and Exploration of Large Document Sets
Josiane Mothe, Claude Chrisment, Bernard Dousset, and Joel Alaux
Published online 13 March 2003

DocCube: Multi-Dimensional Visualisation and Exploration of Large Document Sets, by Mothe, Chrisment, Dousset, and Alaux, presents a novel user interface that provides global visualization of large document sets to help users formulate queries and access documents. Concept hierarchies are introduced to facilitate browsing.
 

660-670

A Novel Method for Discovering Fuzzy Sequential Patterns Using the Simple Fuzzy Partition Method
Ruey-Shun Chen and Yi-Chung Hu
Published online 28 March 2003

A Novel Method for Discovering Fuzzy Sequential Patterns Using the Simple Fuzzy Partition Method, by Chen and Hu, proposes a fuzzy data mining technique to discover fuzzy sequential patterns.
 

671-682

Automatic Generation of English/Chinese Thesaurus Based on a Parallel Corpus in Laws
Christopher C. Yang and Johnny Luk
Published online 28 March 2003

Automatic Generation of English/Chinese Thesaurus Based on a Parallel Corpus in Laws, by Yang and Luk, describes a project that aims to address cross-lingual semantic interoperability by developing a cross-lingual thesaurus based on an English/Chinese parallel corpus. Their experiments showed that such a thesaurus is useful in suggesting relevant terms in a different language.
 

683-694

HelpfulMed: Intelligent Searching for Medical Information over the Internet
Hsinchun Chen, Ann M. Lally, Bin Zhu, and Michael Chau
Published online 13 March 2003

HelpfulMed: Intelligent Searching for Medical Information over the Internet, by Chen, Lally, Zhu, and Chau, describes an intelligent, web-based medical portal that supports meta searching, vertical search engine creation, term suggestion, and knowledge map browsing, all in an integrated web-based architecture. Initial user evaluations of the system were promising in comparison with traditional medical search engines.
 

 

Book Review

695-697

Looking for Information: A Survey of Research on Information Seeking, Needs, and Behavior, by Donald O. Case
Reviewed by Reijo Savolainen
Published online 24 March 2003
 

 

Calls for Papers

698-699

Published online 24-28 March 2003



Association for Information Science and Technology
8555 16th Street, Suite 850, Silver Spring, Maryland 20910, USA
Tel. 301-495-0900, Fax: 301-495-0810 | E-mail: asis@asis.org

Copyright © 2003, Association for Information Science and Technology