Friday

Continuing Education Courses

9:00


 

Practical Text Mining

Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them.  While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form.  In this course, we will present the general theory of Text Mining and will demonstrate several systems that use these principles to enable interactive exploration of large textual collections.  We will describe generic techniques for text categorization and information extraction that are used in these systems.  The systems to be presented are:

  • KDT - Knowledge Discovery in Texts
  • FACT - discovers associations among the keywords that label the items in a collection of textual documents
  • Document Explorer - provides a high-level language for interactive exploration of textual collections

We will present a general architecture for text mining and will outline the algorithms and data structures behind the systems.  We will give special emphasis to incremental algorithms and to efficient data structures.  The course will cover the state of the art in this rapidly growing area of research.
 

 

Prerequisites:

None
 

 

Instructor:

Ronen Feldman

Ronen Feldman is a senior lecturer at the Mathematics and Computer Science Department of Bar-Ilan University in Israel, and the Director of the Data Mining Laboratory.  He received his B.Sc. in Math, Physics and Computer Science from Cornell University in N.Y.  His main research is in the area of Machine Learning and Data Mining.  In particular, he is now pioneering the application of data mining techniques to textual collections.  He is the recipient of several grants for research and development of dedicated text-mining systems.  These systems work on plain text collections and on the Internet.  He has authored numerous papers on scheduling, theory revision, text mining and association generation.  He serves as a consultant to leading Israeli companies such as El-Al, Telrad, Bezek, Israel Electric Company, and the National Coal Company.

 

9:00


 

Thesauri for Indexing and Retrieval

The design and development of information retrieval thesauri are the primary focus of this workshop. The formulation of terms, relationships and other navigation mechanisms for ANSI/NISO standard thesauri is included.  Specialized thesaurus management software packages are discussed, and one or more packages will be demonstrated. Since the importance of text retrieval software is increasing rapidly, the place of thesauri alongside such software is considered. Thesauri may be used for both indexing and retrieval, and with new technologies the balance of emphasis is shifting toward retrieval. The impact of this change on thesaurus design will be considered.

This introductory course is designed for database developers and editors.  Some knowledge of indexing is useful, but no special background is required as long as you have basic information management training.

 

Prerequisites:

None
 

 

Instructor:

Jessica Milstead

Dr. Jessica Milstead is Principal of The JELEM Company, which consults in developing indexes and thesauri. She works with database publishers and clients on developing indexing schemes, thesauri, and end-user search tools. She has taught indexing as a faculty member and in continuing education programs. Jessica is on the Standards Development Committee of NISO and authored Thesaurus of Information Science and Librarianship (ASIS, 1994 and second edition 1998).

 

9:00


 

Information Product Development: Enabling Knowledge-based Systems

Strictly Limited to 30 Participants!
Two-Day Course

Mechanisms for selecting, identifying, and organizing data and information are critical to finding and using them later. These later uses may serve the original purposes, but more often they serve purposes different from those for which the data and information were created. Recent business theories on issues such as innovation and intellectual asset management are leading to the view that business data and information are assets, and that companies need to handle them accordingly.

Through a series of in-depth presentations and interactive group activities, participants in this workshop will:

  • Examine the relationships between real workplace needs and the data, information, and knowledge created within an organization every day.
  • Practice methods for identifying, describing, and categorizing existing data and information resources.
  • Learn about methods for automatically identifying, describing, and categorizing existing data and information resources.
  • Develop a knowledge-based information management solution for a real workplace problem.
 

Prerequisites:

None
 

 

Instructors:

Joseph Busch, Mark Butler, Ron Daniel and Paul O'Leary

Joseph Busch, Vice President for Information Product Development, DATAFUSION, Inc.
Joseph is a leading authority in the field of information science. His focus is on productizing DATAFUSION's digital library technologies for real-world applications. Joseph comes to DATAFUSION from the Getty Information Institute, where he was a Program Manager for ten years. He has been widely published in the field of information science, and maintains active participation in key professional organizations and standards committees. He has also brought information software products to market, and has extensive project management experience. Prior to joining the Getty, Joseph was a Manager at Price Waterhouse in their Boston office. He earned a B.A. from Portland State University and a Master of Library Science degree from the State University of New York, Albany.

Mark Butler, Information Scientist, DATAFUSION, Inc.
Mark leads the design and development of the DATAFUSION metathesaurus and metadata repositories. He coordinates the development of tools to automate resource discovery, descriptive metadata, and classification. Mark earned a B.A. in Political Science, and a Masters in Library and Information Studies from the University of California, Berkeley. He is presently completing his Ph.D. in Library and Information Studies. Prior to moving to California, he was a Vice President at the Roper Organization, managing the production of Roper Reports, a syndicated service of national in-person surveys.

Ron Daniel, Senior Information Scientist, DATAFUSION, Inc.
Ron is an information standards leader. He is responsible for implementing the XML architecture for DATAFUSION repositories and KnowledgeMaps™, DATAFUSION's proprietary data visualization scheme. Before coming to DATAFUSION, Ron was a scientist at Los Alamos National Laboratory, where he was involved in projects addressing the lab's need for a large-scale, long-duration information infrastructure. He is an active participant in the digital library research community, specializing in identifiers and metadata, with a number of publications and invited presentations to his credit. Ron also represents DATAFUSION on a number of standards committees focussed on those areas, such as the Dublin Core effort and the W3C's Resource Description Framework. He earned his Ph.D. in Electrical Engineering from Oklahoma State University, and was a post-doctoral research associate at Cambridge University before returning to the USA and working at Los Alamos.

Paul O'Leary, Senior Information Scientist, DATAFUSION, Inc.
Paul leads the design of the DATAFUSION metathesaurus repository. He is an expert on vocabulary resources and also leads the development of XML tools for processing vocabularies for loading into DATAFUSION repositories. Prior to joining DATAFUSION, Paul developed search agents and knowledge bases for information retrieval and filtering applications in the Biomedical and Transportation industries. He is completing a doctorate at the School of Information Management and Systems at the University of California, Berkeley on retrieval system design and standards for electronic text systems.

 

Last Updated: Wednesday, July 07, 1999
