 |
| |
In This Issue |
593 |
In this issue Bert Boyce |
| |
Research Article |
594-602 |
The Connection Between the Research of a University and Counts of Links to Its Web Pages: An Investigation Based upon a Classification of the Relationships of Pages to the Research of the Host University
Mike Thelwall and Gareth Harries
Published online 13 March 2003
Given evidence that patterns of web linking between universities can be strongly associated with research productivity, Thelwall and Harries attempt to identify reasons for this
association by looking at the nature of the linked-to pages and the effect of different link counting models. Using Thelwall's database of the link structures of 107 major universities, a file for each
university was created containing all pages in its web site that were targets of links from other universities in the database. All pages that had at least 5% as many links as the home page were categorized into a
four-part classification: not locally created; not academic content; high link pages like databases or gateway pages; and other. From this grouping new link structure databases for each category were constructed, and 56 lists
of link counts were created and correlated with published research productivity scores for 108 UK universities using several different counting models. Both the models and the categorization affect the correlation
coefficients, and it appears that choosing categories most related to a university's research will result in stronger associations.
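The page selection and correlation steps described above can be sketched roughly as follows. This is only an illustration, not the authors' procedure: the function names, the sample inlink counts, and the use of a Pearson coefficient are all assumptions, since the summary does not say which correlation statistic or counting model was applied.

```python
from math import sqrt

def pages_to_classify(inlink_counts, home_page, threshold=0.05):
    """Return pages with at least `threshold` times the home page's inlinks."""
    cutoff = threshold * inlink_counts[home_page]
    return sorted(page for page, n in inlink_counts.items() if n >= cutoff)

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (assumed statistic)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical inlink counts for one university's pages; "/" is the home page.
inlinks = {"/": 200, "/research/lab": 40, "/staff/cv": 3, "/mirror/docs": 10}
selected = pages_to_classify(inlinks, "/")
```

With these made-up counts the cutoff is 10 links, so `/staff/cv` is dropped and the other three pages would go forward for manual classification. Per-category link counts for each university could then be passed to `pearson` alongside the research productivity scores.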
|
611-620
|
Adapting Measures of Clumping Strength to Assess Term-Term Similarity
Abraham Bookstein, Vladimir Kulyukin, Timo Raita and John Nicholson
Published online 13 March 2003
Bookstein et al. construct measures of semantic term association based
upon a statistical model of language and the capture of the peculiarities to be found in text generation. The theory is that terms that share the same content will be found together, or clump, within a document in the
places where that semantic content is discussed, and that variation from random term distribution will indicate such text segments. If one computes a clumping measure for a term only over those portions of text where a
second term is present, it is likely to differ from that same measure computed for the first term over the whole text. In fact, if the two terms carry the same content, the first term may appear to occur
at random by the clumping measure in the context of the second term, even though it is strongly clumped outside this context, and thus a comparison of the two measurements should indicate term association. A basic
association measure is constructed in this manner which takes into account not only the number of documents in which a term occurs, but also the number of occurrences, although it is also possible to design a measure that takes into
account the clumping measure that is generated when the second term is specifically excluded. This would cover the case where the first term's clumping strength is dependent upon the second term's strength. An
experiment using twenty content terms from the Columbia Encyclopedia database found their association scores with all other terms and the 100 pairs with the highest scores. Judges then rated the term associations as
"successes," "failures," or "can't says." Precision-type measures were then computed both with "can't says" not counted and with them counted as failures, and were quite high. It appears to
be possible to distinguish between symmetric and asymmetric associations purely on a statistical basis, since each term in a pair may either influence the other's clumping behavior in the same manner or one may influence
the other but not the reverse.
|
621-624
|
SPECIAL TOPIC SECTION: WEB RETRIEVAL AND MINING
Guest Editor: Hsinchun Chen
Introduction to the JASIST Special Topic Section on Web Retrieval and Mining: A Machine Learning Perspective
Hsinchun Chen
Published online 13 March 2003
This special issue consists of six papers that report research in web retrieval and mining. Most papers apply or adapt various pre-web retrieval and analysis techniques to other
interesting and challenging web-based applications. The Web has become the world's largest knowledge repository. Extracting knowledge from the Web efficiently and effectively is becoming increasingly important for
various Web applications. The current Web still consists of more information than knowledge. Also, most Web mining activities are still in their early stages and will continue to develop as the Web
evolves. We hope this collection of research papers will help advance our knowledge and understanding of this fascinating and evolving field of web retrieval and mining.
|
625-637 |
Client-Side Monitoring for Web Mining
Kurt D. Fenstermacher and Mark Ginsburg
Published online 17 March 2003
Client-Side Monitoring for Web Mining, by Fenstermacher and Ginsburg, proposes a client-side monitoring system that is unobtrusive and supports
flexible data collection. Moreover, the proposed framework encompasses client-side applications (such as standard office productivity tools) beyond the Web browser.
|
638-649 |
Relevant Term Suggestion in Interactive Web Search Based on Contextual Information in Query Session Logs
Chien-Kang Huang, Lee-Feng Chien and Yen-Jen Oyang
Published online 13 March 2003
Relevant Term Suggestion in Interactive Web Search Based on Contextual
Information in Query Session Logs, by Huang, Chien, and Oyang, proposes a query log-based term suggestion approach to interactive Web search. Using this approach, relevant terms suggested for a user query are those that
co-occur in similar query sessions from search engine logs, rather than in the retrieved documents. Their experiments showed that the proposed approach can exploit the contextual information in a user query session to
make useful suggestions.
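The general idea of suggesting terms from session co-occurrence can be illustrated with a minimal sketch. It is not the method of Huang, Chien, and Oyang: it assumes each session is simply a list of query strings, treats "similar sessions" as sessions containing the same query, and ranks by raw co-occurrence count. The function name and sample log are hypothetical.

```python
from collections import Counter

def suggest_terms(sessions, query, top_n=3):
    """Rank queries that share a session with `query` by co-occurrence count."""
    counts = Counter()
    for session in sessions:
        if query in session:
            # Count each co-occurring query once per session.
            counts.update(q for q in set(session) if q != query)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

# Hypothetical query session log: one list of queries per user session.
sessions = [
    ["jaguar", "jaguar car", "used cars"],
    ["jaguar", "big cats", "jaguar habitat"],
    ["jaguar", "jaguar car", "car dealers"],
    ["python", "python tutorial"],
]
suggestions = suggest_terms(sessions, "jaguar")
```

Here the suggestions come entirely from what other users typed in the same sessions, not from the text of any retrieved document, which is the contrast the summary draws.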
|
650-659 |
DocCube: Multi-Dimensional Visualisation and Exploration of Large Document Sets
Josiane Mothe, Claude Chrisment, Bernard Dousset, and Joel Alaux
Published online 13 March 2003
DocCube: Multi-Dimensional Visualisation and Exploration of Large Document
Sets, by Mothe, Chrisment, Dousset, and Alaux, presents a novel user interface that provides global visualization of large document sets to help users formulate queries and access documents. Concept hierarchies are
introduced to facilitate browsing.
|
660-670
|
A Novel Method for Discovering Fuzzy Sequential Patterns Using the Simple Fuzzy Partition Method
Ruey-Shun Chen and Yi-Chung Hu
Published online 28 March 2003
A Novel Method for Discovering Fuzzy Sequential Patterns Using the Simple Fuzzy Partition Method, by
Chen and Hu, proposes a fuzzy data mining technique to discover fuzzy sequential patterns.
|
671-682 |
Automatic Generation of English/Chinese Thesaurus Based on a Parallel Corpus in Laws
Christopher C. Yang and Johnny Luk
Published online 28 March 2003
Automatic Generation of English/Chinese Thesaurus Based on a Parallel Corpus in Laws, by Yang and Luk,
describes a project that aims to address cross-lingual semantic interoperability by developing a cross-lingual thesaurus based on an English/Chinese parallel corpus. Their experiments showed that such a thesaurus is
useful in suggesting relevant terms in a different language.
|
683-694
|
HelpfulMed: Intelligent Searching for Medical Information over the Internet
Hsinchun Chen, Ann M. Lally, Bin Zhu, and Michael Chau
Published online 13 March 2003
HelpfulMed: Intelligent Searching for Medical Information over the Internet, by Chen,
Lally, Zhu, and Chau, describes an intelligent, web-based medical portal that supports meta searching, vertical search engine creation, term suggestion, and knowledge map browsing, all in an integrated web-based
architecture. Initial user evaluations of the system were promising in comparison to traditional medical search engines.
|
| |
Book Review |
695-697 |
Looking for Information: A Survey of Research on Information Seeking, Needs and Behavior, by Donald O. Case
Reviewed by Reijo Savolainen
Published online 24 March 2003 |
| |
Calls for Papers |
698-699 |
Published online 24-28 March 2003 |
|