Bulletin of the American Society for Information Science and Technology, Vol. 28, No. 5, June/July 2002
Editor's note: Mr. Arora's paper placed second in the 2001 SIG/III International Paper competition. The paper has been condensed for publication in the Bulletin.
Network-Enabled Digitized Collection at the Central Library, IIT Delhi
by Jagdish Arora
Jagdish Arora is head of computer applications in the Central Library at the Indian Institute of Technology, Hauz Khas, New Delhi - 110 016; telephone: 91-11-6591452, 91-11-6591467; fax: 91-11-6862037, 6855227; e-mail: jarora42@hotmail.com.
The emergence of the Internet, particularly the World Wide Web (WWW), as a new medium of information delivery, coupled with availability of powerful hardware, software and networking
technology, has triggered large-scale commercial and non-commercial digitization programs the world over. An increasing number of publishers are using the Internet as a global way to offer their publications to the international
community of scientists and technologists resulting in the large-scale appearance of STM (scientific, technical, medical) electronic journals on the Web. The number of electronic journals has grown in dramatic proportion from less
than 10 in 1989 to more than 8500 in April 2000. The 37th edition of Ulrich's International Periodicals Directory
(1999) reports that, of the 157,000 serials listed in the directory, 10,332 were available exclusively online or in addition to a paper counterpart. Internet and Web technology together provide an unparalleled medium for
delivery of information with great speed and economy. Moreover, Web-based electronic information products not only eliminate paper, physical storage and transportation costs, they also offer a host of other possibilities for
incorporating multimedia and hyperlink features into electronic documents hitherto impossible on paper media. Web-based electronic information products are exerting ever-increasing pressure on traditional libraries,
which, in turn, are committing larger portions of their budgets for either procuring or accessing Web-based online or full-text search services, CD-ROM products, online databases, multimedia products, etc. The libraries and
information centers, as consumers of electronic journals and online databases, are benefiting greatly from this technology-driven revolution. The information products of the technological revolution, in turn, have triggered
a major shift in the traditional practices and policies of buying, storing and accessing journals. During the past decade great progress has been made in both theoretical and practical research in digital libraries.
Besides acquiring and buying access to digital collections, academic and research libraries are making efforts to initiate digital library projects in their respective institutions to build their own digital collections. The
increasing commitment to Web-based digitized collections at the Central Library, Indian Institute of Technology (IIT) Delhi coincides with the installation of a fiber optics-based Campus LAN connected to a 2 Mbps radio link with
VSNL, enabling faster Internet access for the academic community of the Institute. The availability of this high-speed Internet connection has led to a number of sponsored and unsponsored projects for building network-based
digitized collections within the framework of traditional library and information services at the Central Library, IIT Delhi. This article outlines the various constituents of its digital library program.
The Campus LAN & the Internet Connection at IIT Delhi
The Campus LAN at IIT Delhi consists of a state-of-the-art switched and routed network with a fiber-optics backbone and enhanced CAT-5 UTP cabling. The LAN
consists of more than 1400 switched network access points, which are configured into 35 virtual LANs to cover each department, center, central facility and administration. Three routers have been configured in a hot standby mode to
interconnect these virtual LANs and to create a DMZ (secure) LAN and a non-routed administration LAN. The old Institute-bridged LAN, consisting of more than 15 Thick Ethernet backbone segments, has also been connected to the new LAN
through one of the switched network access points. The Campus LAN is connected to the Internet through a PIX Firewall, a DMZ LAN, an external router, and the 2 Mbps radio link with VSNL mentioned above. The firewall
protects the Campus LAN from unauthorized access from the Internet, performs network address translation (NAT) from private, internal IP network numbers (10.0.0.0/8, Class A network) to legal IP numbers (202.141.68.0/22,
Class C Networks), and provides controlled access to the IIT Delhi WWW, mail and DNS services as virtual resources on the DMZ LAN. Access servers provide free PPP-dialup access to the Campus LAN and the Internet through 32 modems
(33.6 Kbps) and 32 internal lines of the EPABX to all faculty resident on the campus. Twenty-one switches are interconnected in a tree topology, with Fast Ethernet trunking, providing 200 Mbps full duplex communication paths. In
anticipation of an increase in Internet and cross-country traffic, a memorandum of understanding has been signed with ERNET Society for an additional 2 Mbps terrestrial link, which will become operational soon. Cyber
Cafés are also operational in each of eight Institute hostels with 19 or 20 access points in each café. The Institute has provided 5 to 10 PCs for each Cyber Café, all of which are connected to the Institute backbone over fiber
links, with 200 Mbps full duplex communication paths. Eventually the network will be extended to each hostel room, which will require an additional 3200 access points. This work has already started. In the hospital, 33 network access
points are provided, out of which 11 will provide Internet access to doctors and the remaining 22 will be used for computerization of hospital activities.
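The address translation described above can be made concrete with a little arithmetic. The following is a minimal sketch using only the two address blocks quoted in the article; the function name summarize is illustrative, and nothing here reflects how the firewall itself is configured, which the article does not describe.

```python
import ipaddress

# The two address blocks quoted in the article.
private_block = ipaddress.ip_network("10.0.0.0/8")      # internal, private addresses
public_block = ipaddress.ip_network("202.141.68.0/22")  # registered ("legal") addresses


def summarize():
    """Return the size of each block and the /24 networks inside the /22."""
    # A /22 is four contiguous /24 (Class C-sized) networks.
    class_c = [str(net) for net in public_block.subnets(new_prefix=24)]
    return {
        "private_addresses": private_block.num_addresses,
        "public_addresses": public_block.num_addresses,
        "class_c_networks": class_c,
        "private_is_rfc1918": private_block.is_private,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

The private block offers far more addresses than the registered block, which is why translation at the firewall is needed before internal hosts can reach the Internet.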
Building the Digital Collection at the Central Library, IIT Delhi
The Central Library at IIT Delhi is using a multi-pronged approach to build up network-enabled digitized collections. The Library, by policy,
acquires material in electronic form in preference to print form wherever possible. Besides acquiring and buying access to digital collections, efforts have been made to initiate in-house digitization of documents. A number of
digitization projects are in various stages of execution. Major network-enabled digitized collections at the Central Library are described below:
Buying Access to Web-based Full-text Digitized Collections in the Library. The Library has been providing Web-based full-text access to several electronic journals since 1998. License agreements were signed with
electronic publishers after negotiations to get maximum benefits for the users. The license agreement signed with the Elsevier Science Publishers provides access to all 1100 journals on the ScienceDirect site with download options
without any restrictions. In all, the full text of about 1,450 electronic journals can be accessed. All IP addresses used in the Institute are authorized and enabled to access the above-mentioned electronic collections.
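IP-based authorization of this kind amounts to a range check on the requesting address. The sketch below is an illustration only: it assumes that the registered block quoted earlier in the article (202.141.68.0/22) is the range enabled with a publisher, and the names LICENSED_BLOCK and is_licensed are hypothetical. How any particular publisher implements the check is not stated in the article.

```python
import ipaddress

# Assumption: the Institute's registered block from the article is the range
# enabled with the publisher. Private 10.x.x.x addresses never reach the
# publisher, because the firewall translates them first.
LICENSED_BLOCK = ipaddress.ip_network("202.141.68.0/22")


def is_licensed(client_ip: str) -> bool:
    """Return True if the requesting address falls inside the licensed block."""
    return ipaddress.ip_address(client_ip) in LICENSED_BLOCK
```

Under these assumptions, is_licensed("202.141.68.10") is true, while an address outside the block, such as 202.141.72.1, is refused.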
Building a Digital Collection In-house: Converting Datasets That Are "Born Digital." Most libraries and the institutions implementing digital libraries invariably have datasets that were originally created in digital
format. Doctoral dissertations submitted to universities and research institutions are highly valuable documents that qualify to be an important component of any digital library implementation. In addition, the Institute has annual
reports, prospectuses, courses of studies, technical reports and other datasets that might be included in a digital collection. The items listed above are invariably composed in a word processing program or desktop publishing
package. Such documents can be converted into HTML, PostScript and PDF using tools like Acrobat 5.0 or Acrobat Exchange. Online converters are also available through Adobe's site. Some publications, namely the
Prospectus, the Course of Studies (Undergraduate and Postgraduate) and IIT Delhi at a Glance have already been converted into PDF from their native format in PageMaker. The content pages of each of these
publications are linked to their respective descriptions using Acrobat Catalogue. These four publications are given to visiting dignitaries on CDs with a Web-based interface. Initiatives have also been taken for
electronic submission of theses and dissertations. Under this program old Ph.D. theses and dissertations would be scanned and made accessible on the Web as part of the Networked Digital Library of Theses and Dissertations (NDLTD)
initiative. Projects sponsored by the Department of Biotechnology (DBT) and the Ministry of Human Resource Development (MHRD) provide funds for scanning of Ph.D. theses submitted to IIT Delhi. Figure 1 illustrates the process
involved in digitization of Ph.D. theses and dissertations at IIT Delhi.
Building a Digital Collection In-House: Conversion of Existing Print Media into Digital Format. Several digital library projects are
concerned with providing digital access to materials that already exist within traditional libraries as print media. Scanning page images is the only practical way for institutions such as libraries to convert existing paper
collections (legacy documents) when the original data is not available in computer-processible formats convertible into HTML/SGML or in other structured or unstructured text. There are several large projects using page images as
their primary storage format, including project JSTOR (
Capturing page images is comparatively easy and inexpensive. It is also a faithful reproduction of the original, maintaining page integrity and originality. Scanned textual images, however, are not searchable unless they are
processed with OCR, which is a highly error-prone process, especially when it involves scientific texts. The facility set up in the Central Library, IIT Delhi, for scanning deteriorating and fragile old volumes of journals consists of
The old, fragile and bound volumes of journals are first scanned using OmniDoc 1.1. Scanned images of articles from an individual issue are then exported as TIFF (version 5), while the indexing information for the issue is exported as a plain
text file. While the TIFF files are preserved for archival purposes, a PDF is derived from the TIFF file using Acrobat Exchange version 3.0. The text files, consisting of the author, title and location information of an article,
are pulled together in a content page, which is coded in HTML and hyperlinked to the article images in PDF format. The images of the articles in PDF format, along with the associated Web interface, are put up on the Campus intranet,
where users can access them through the Library's home page. The project was sponsored by the All India Council for Technical Education. The process is shown in Figure 2.
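The step in which the index text is turned into a hyperlinked content page can be sketched as follows. This is a minimal illustration, not the Library's actual procedure: the function name build_contents_page, the record fields and the sample file names are all hypothetical, and the real export format of the indexing file is not described in the article.

```python
from html import escape


def build_contents_page(issue_title, records):
    """Build an HTML content page linking each article entry to its PDF.

    records is a list of dicts with the keys author, title, pages and pdf
    (hypothetical field names standing in for the exported index data).
    """
    items = []
    for rec in records:
        items.append(
            '<li><a href="{pdf}">{title}</a> by {author} (pp. {pages})</li>'.format(
                pdf=escape(rec["pdf"], quote=True),
                title=escape(rec["title"]),
                author=escape(rec["author"]),
                pages=escape(rec["pages"]),
            )
        )
    return (
        "<html><head><title>{t}</title></head><body>\n"
        "<h1>{t}</h1>\n<ul>\n{items}\n</ul>\n</body></html>\n"
    ).format(t=escape(issue_title), items="\n".join(items))


# Invented sample data, for illustration only.
sample_records = [
    {"author": "A. Author", "title": "Heat & Mass Transfer",
     "pages": "1-12", "pdf": "v01n1_001.pdf"},
    {"author": "B. Writer", "title": "Notes on Fatigue Testing",
     "pages": "13-20", "pdf": "v01n1_013.pdf"},
]

if __name__ == "__main__":
    print(build_contents_page("Sample Journal, Vol. 1, No. 1", sample_records))
```

Escaping the index text before writing it into HTML matters here because scanned titles often contain characters such as the ampersand, which would otherwise break the page.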
Subject Portal at the Central Library Website. The home page of the Central Library serves as a structured and organized guide to the electronic resources available on the Internet. The portal site is updated regularly.
The home page provides more than 2500 links to electronic resources on the Web. It can be accessed both on the Internet and through the IIT Intranet at the following sites:
http://www.iitd.ernet.in/library (Internet)
Other Digital Collections. The Central Library has acquired European Patent Office information, the Indian Standards database and many bibliographic databases on CD-ROM. In addition, its OPAC is a major resource. The Libsys package, bought in June 1998, has been fully implemented for computerization of all activities in the library including acquisition, cataloguing, circulation and serials control. All faculty, staff, researchers and postgraduate students are already enrolled for the computerized circulation system. The undergraduate students are being enrolled in the last phase, which will mark a complete transition from the manual to the computerized circulation system. The library's online public access catalogue (OPAC) is operational both on Intranet and Internet. It can be accessed online to search more than 130,000 bibliographic records, available in the library database through the Web-based search interface or with the Libsys Windows client.
CD-ROM-based Search Services through a CD NET System. With the advent of CD-ROM technology in the mid-1980s several bibliographic databases, which were earlier available only through online vendors, started appearing on CDs at an affordable price. CD-ROM-based search services were established at the Central Library, IIT Delhi in 1991. The library acquired three CD-ROM workstations and four important bibliographic databases on CDs, namely COMPENDEX Plus (1985+), INSPEC (1990+), METADEX (1990+) and World Research Database. The Advisory Committee for the Library made a conscious decision to discontinue the print version of indexing and abstracting services in favor of their CD-ROM counterparts, if available. The search for a suitable CD-ROM networking system was started in 1994 with the receipt of special grants to the library from the Ministry of Human Resource Development (MHRD) for developing CD-ROM search services.
One requirement was that the selected CD-ROM networking system could be hooked to the then-existing 10Base-T Ethernet-based campus LAN. A Web-based CD networking system was finally procured from Meridian Data, Inc. (USA) after a series of technical presentations and negotiations. This system enabled campus-wide access to the CD-ROM databases to which the library subscribed. These databases are mostly bibliographic.
Silver Platter's Electronic Reference Library (ERL). The CD-ROM networking solution procured in 1998 had several limitations, including slow access, repeated failure of CD-ROM drives, the need to configure each client and download the CD-sharing application onto it, and a limit on the number of databases that could be made available online. The Library decided to replace this system with one that allowed the contents of a CD-ROM disc to be transferred onto the hard disk of a server. As a solution, IIT Delhi has recently adopted Silver Platter's ERL technology. Once the ERL server is fully implemented, the CD-ROM networking solution mentioned above will be used only for databases that are not ERL-compliant.
Web-Based Access to the Materials Science Collection. The IIT Delhi Library has Web-based access to a group of databases
called the Materials Science Collection (including Metadex) through M/s Cambridge Scientific Abstracts. The collection is made available under a consortium subscription in which National Aerospace Laboratories (NAL)
acts as the leader of the consortium and M/s Informatics India executes the orders on its behalf. Each consortium member has gained substantially, both in savings on the subscription amount and in
the number of databases made accessible under the Materials Science Collection. The Materials Science Collection is accessible at http://www.csa.com/. All IP addresses used by the
Institute are enabled for access to the databases under this subscription.
Web-based Access to the Databases Developed In-House on Micro CDS/ISIS. The Central Library has developed a number of databases
in-house using the Micro CDS/ISIS package of UNESCO for specialized collections, to handle activities that cannot be handled easily using Libsys. These databases have now been ported to the WWW/ISIS interface to facilitate
simultaneous access by users on Internet and Intranet. The databases, accessible at
Directory of Online Interactive Courseware in IT
A portal site on "Web-based Online Interactive Courseware in Information Technology" has been launched and is available at
- Database of Online Courseware: The database is currently designed in Microsoft Access. Efforts are being made to export the database to a more robust RDBMS like Oracle or MySQL.
- ODBC Driver: The ODBC (Open Database Connectivity) drivers for most of the important databases are built into the operating system. ODBC + ASP was preferred to CGI + PERL.
- Browsing and Search Software designed in ASP: The directory provides a user-friendly browsing and search interface, designed using ASP, that displays courses for broad subject categories, deriving the data from the back-end database above.
- Site Administration and Maintenance: Suitable interfaces have been developed to facilitate site administration, maintenance and update of the database and the website. A Web-based interface was developed and is being used for data entry of surrogate records from multiple locations. Administrative interfaces are available to edit records. An interface has been developed to generate administrative reports and statistics in various formats.
Access Infrastructure for Digital Collections at the IIT Delhi
An effective and efficient access mechanism that allows a user to browse, search and navigate digital resources becomes necessary as the electronic resources of a collection grow in number and complexity.
The access infrastructure for digital resources at IIT Delhi thus consists of the following components as reviewed above: the Libsys OPAC/WebPAC, the websites for special collections, such as those developed for the Directory of Online Interactive Courseware in Information Technology and the scanned journals, linkages between bibliographic citations and full-text of journal articles, online access to many journals, and the Central Library's home page subject gateway.
Conclusion
The Central Library, IIT Delhi has intensified its computerization and Web-based activities and services with the availability of faster Internet connections and the willingness of authorities to provide additional funds for computerization of the library and for developing digital resources. The development has attracted appreciation and compliments from the users. The users are actively helping develop the subject portal. Several other initiatives are underway to further intensify the build-up of digital resources at IIT Delhi. Tools, techniques and protocols are now available for this purpose. The libraries need to identify collections that need to be digitized. It is important to join hands with sister organizations to begin collaborative digitization programs.
Copyright © 2002, American Society for Information Science and Technology