Electronic Recordkeeping

Implementation of Imaging Technology for Recordkeeping at the World Bank

by Clive D. Smith
This article describes the evolution of an electronic document management system that includes recordkeeping components and how the Pittsburgh requirements were used to evaluate it at one crucial point in its development. Some key features of this effort from a recordkeeping perspective are the following:

The World Bank's Knowledge Base

The World Bank is a large international organization, headquartered in Washington, DC, with offices in 81 developing countries and Brussels, London, New York, Paris and Tokyo. When established in 1946, the main function of the bank was to lend money to developing countries for the purpose of improving their economies and reducing poverty. During the 50 years of its existence, the role of the bank has changed. Whereas, when it was established, it was the principal source of funds for development lending, now it is only one of many sources of such funds. However, it has become a unique repository of accumulated experience and knowledge, which is, of course, contained primarily in the bank's records and archives. Consequently, the retrieval of information from the bank's records has come to assume greater importance. This is evidenced in part by the adoption of a new disclosure policy, under which the bank opened a Public Information Center in January 1994. The Center makes available copies of many categories of bank documents.

Paper or Electronic?

The records that contain the sort of information in most demand are documents such as reports, correspondence, minutes, transcripts or agreements. For at least 10 years, those produced within the bank have been generated, for the most part, electronically. At the same time, however, most of the bank's correspondents have not been capable of receiving documents in any medium other than paper, and most of the documents received by the bank have been in paper form. This is still the case today.

Also, as many as 25 separate files may be used to generate a single complex bank document. Recent studies have shown that, even with improvements in technology, it is still easier to publish such complex documents from printouts generated from the various electronic files used to create them, than to try to combine those files. Consequently, paper has continued to be the medium for preservation, even for the documents created electronically. Furthermore, some members of the World Bank Group are engaged in litigation that can occur in almost any country in the world. Therefore, the bank's recordkeeping practices cannot be governed solely by U.S. rules of evidence, legislation or practice in the admissibility of records. For these reasons, the bank's official records are still primarily in paper.

Long Term Strategy for Electronic Document Management

Nevertheless, for many years the bank has been developing a capability to manage its documents (and its records) in electronic form. A paper prepared in 1989 envisaged a three-tiered architecture, allowing for the management of electronic documents at the institutional level (Tier 3), and within locally managed (i.e., departmental or work group) and individual (i.e., personal) systems (Tiers 2 and 1, respectively). The paper noted that the tools for Tier 1 were generally available and in place, but that Tier 2 required the greatest attention and experiment. For Tier 3, the principal problems were the lack of a single vehicle to deliver text services to all staff and the integration of text bases with other related information. Subsequent refinement of this architectural model has tended to focus on the state of documents at the various levels: documents in Tier 1 are considered to be an individual's working or reference copies; they are not records. Documents in Tier 2 are being worked on or used collaboratively and are records, but may not require preservation. Documents in Tier 3 are in their final form and are records that do require preservation.

As tools to support requirements have gradually become available (or promise to become available), the bank has moved to position itself to transition from paper-based recordkeeping systems to systems that would enable records to be created and maintained solely in electronic form. The implementation of a standard bank-wide network (the Enterprise Network) has solved the problem of a bank-wide delivery mechanism. It has greatly facilitated the sharing of documents and the standardization of applications and platforms. Document imaging technology has provided an additional means of implementing Tier 3 - capturing of final-form documents at an institutional level and making them accessible across the institution. The more recent development of reliable collaborative or work group tools promises to enable Tier 2.

Initial Imaging Projects: Tier 3 Developments

Over five years ago, the bank started looking at imaging technology as a means of improving the accessibility of many of its documents. The first collection of documents considered for scanning was the Bank Reports collection. This collection included economic and sector reports, country studies, project appraisal and evaluation reports, research and working papers. For many years, these documents have been systematically cataloged, abstracted and microfiched.

The technology selected was Electronic Filing System (EFS) from Excalibur Technologies. This was chosen primarily because of its search engine, especially its "fuzzy" search logic, which eliminated the need for clean text for searching purposes. More than 13,000 documents have now been scanned and included in this collection, and they are available to bank staff via the bank's Intranet, although the Intranet interface does not support searches on the text of the documents. Staff can search by a range of profile fields, including country, sector, date, author, title, report number, loan number and type of document. They can view either the image or the text of each page or can place a request for a paper copy. Copies are printed from the image files and have proven to be of better quality than copies reproduced from the microfiche.

An early decision was not to store the images and text files within EFS; rather they are stored on separate servers and EFS merely contains pointers to their location. This was not only to reduce dependency on proprietary software, but also to facilitate simultaneous use of other access software that might become available. This has already enabled the documents to be made available via the bank's Intranet, and is also expected to permit access to the same document from a variety of access points utilizing whatever technology or software is most appropriate for the user. Similarly, the master copy of the profile data is also maintained outside EFS in a suite of Oracle tables which can be accessed via other software that might be more appropriate for particular users or in particular situations.
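The pointer arrangement described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the bank's implementation: the class names, the entity ID and the server paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentPointer:
    """Catalog entry: records where the files live, not their content."""
    entity_id: str
    image_location: str  # hypothetical path on a separate image server
    text_location: str   # hypothetical path on a separate text server


class PointerCatalog:
    """Hypothetical stand-in for the pointer store held by the retrieval system."""

    def __init__(self):
        self._pointers = {}

    def register(self, pointer):
        self._pointers[pointer.entity_id] = pointer

    def locate(self, entity_id):
        return self._pointers[entity_id]


catalog = PointerCatalog()
catalog.register(
    DocumentPointer(
        entity_id="example-0001",
        image_location="//image-server/reports/example-0001.tif",
        text_location="//text-server/reports/example-0001.txt",
    )
)


def search_client_view(entity_id):
    """One access route: a search client that displays the text file."""
    return catalog.locate(entity_id).text_location


def intranet_view(entity_id):
    """Another access route: a browser view that displays the page image."""
    return catalog.locate(entity_id).image_location
```

Because the files sit outside the retrieval software, replacing or adding a front end changes only how a pointer is resolved; the stored images and text are not touched.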

Introducing Imaging into an Operational Unit

Initial enthusiasm for the use of imaging technology for the Bank Reports led to a pilot implementation for basic recordkeeping in one of the bank's operational divisions responsible for developing and supervising lending projects. Studies of the recordkeeping systems used by such divisions over the past few years had concluded not only that there were multiple sets of files documenting the same lending projects, but also that none of these sets was complete. Implementing an imaging system for records had the potential not only to eliminate wasteful duplication, but also to improve the capture rate. If staff could access the complete file at their desktop, there would be less incentive for them to retain the paper copies in their offices.

The studies of the recordkeeping systems had already resulted in a major initiative ("File Improvement Program") under which teams of records managers visited individual work units to establish a comprehensive filing system within each. The records managers analyzed the business processes, workflows and information needs of the work units in order to design relevant and useful filing plans, and this work has also provided a sound basis for the design of appropriate imaging systems for records.

EFS is built around the metaphor of a fileroom. Each EFS application is an electronic fileroom, containing electronic cabinets, which in turn contain electronic drawers, which hold electronic folders, which contain the documents themselves. To be filed in EFS, a document must fit into this hierarchy. With minor modifications, the filing plans that had been developed for paper records were easily adapted to fit this electronic hierarchy. This was an important factor in gaining user acceptance - the users were presented with a familiar environment. The fileroom structure for the bank's main operational vice presidencies is shown at Figure 1.
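The fileroom hierarchy can be expressed as a simple data structure. The following Python sketch is illustrative only: the class and method names are hypothetical, and the sample values are taken from the sample profile in Figure 2.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FilingLocation:
    """Hypothetical model of the fileroom > cabinet > drawer > folder hierarchy."""
    fileroom: str
    cabinet: str
    drawer: str
    folder: str

    def __post_init__(self):
        # A document must fit the hierarchy, so no level may be left blank.
        for level in fields(self):
            if not getattr(self, level.name).strip():
                raise ValueError(f"missing level: {level.name}")

    def path(self):
        """Return the location as a readable path, outermost level first."""
        return " / ".join(getattr(self, level.name) for level in fields(self))


location = FilingLocation(
    fileroom="2",
    cabinet="Mexico",
    drawer="MX-2nd Decentralization & Regional Development",
    folder="General Correspondence",
)
print(location.path())
```

A paper filing plan that already names a country, a project and a file category maps onto these levels with little change, which is the adaptation the text describes.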

Metadata for Retrieval and Recordkeeping

Each document is profiled for the purposes of management and retrieval. This metadata consists of about 40 attributes (or control fields), about half of which are mandatory. Many of the mandatory fields default on selection of another attribute. For example, the selection of a Project Name will trigger defaults for Project ID, Loan #, Credit #, Trust Fund #, Sector and Task Manager. Some of the attributes, such as Date Stored and Entity ID, are supplied automatically by the system. In most cases, the data entry operator has to enter values for only five or six attributes, some of which can be selected from pick lists.
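The defaulting behavior can be sketched in a few lines of Python. This is a minimal illustration, assuming a lookup table keyed on Project Name; the table, the function name and the dictionary layout are hypothetical, while the sample values come from Figure 2.

```python
from datetime import date

# Hypothetical lookup table: attributes that default when a project is selected.
PROJECT_DEFAULTS = {
    "MX-2nd Decentralization & Regional Development": {
        "Project ID": "MXPA7702",
        "Loan #": "3790",
        "Credit #": "",
        "Trust Fund #": "",
        "Sector": "Public Sector Management",
        "Task Manager": "Sant'Anna, Anna--LAMXC",
    },
}


def build_profile(project_name, operator_entries, stored_on):
    """Combine defaulted, operator-entered and system-supplied attributes."""
    profile = {"Project Name": project_name}
    # Selecting the project triggers defaults for the related attributes.
    profile.update(PROJECT_DEFAULTS.get(project_name, {}))
    # The operator keys in only the handful of remaining values.
    profile.update(operator_entries)
    # Supplied automatically by the system rather than by the operator.
    profile["Date Stored"] = stored_on.isoformat()
    return profile


profile = build_profile(
    "MX-2nd Decentralization & Regional Development",
    {
        "Doc. Date": "1996-10-18",
        "Doc. Type": "memorandum",
        "Doc. Name": "Memo fr Sant'Anna re audits & suggested measures",
    },
    stored_on=date(1996, 10, 25),
)
```

In this sketch the operator supplies three values and the remaining attributes are filled in by default or by the system, which mirrors the division of labor described above.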

The profile attributes serve a number of purposes. While some are almost purely descriptive and are intended to facilitate retrieval, others are included specifically for records management or system management purposes. For example, Originating Unit tells us which organizational unit first received or created the document. Owner tells us which organizational unit is now responsible for the business process or transaction to which the document relates; the two can differ because the bank is subject to frequent reorganization. Business Process will later facilitate appraisal and disposition, as will Document Type, although in practice over three-quarters of the documents are either memoranda or letters. Retrofit tells us whether the document was scanned in the normal course of business or as part of a backfile conversion. Action Flag is intended for use with the correspondence management system under development (see below) and will be used to trigger appropriate routing and tracking mechanisms, depending upon whether a document requires action or is merely for information.

Different business processes require different combinations of profile attributes for document descriptions. For example, documents relating to general country information or the country portfolio performance are not related to lending projects, and so a number of profile attributes (Project Name, Loan #, etc.) are not required. Because these attributes are mandatory for describing project-related documents, it has been necessary to develop a number of customized data entry screens, so that the data entry operator has to select the screen appropriate to the relevant business process. A sample profile is at Figure 2.
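The idea of a data entry screen per business process amounts to a different set of mandatory attributes for each screen. The Python sketch below illustrates this under stated assumptions: the screen names and the exact attribute sets are hypothetical, and only the fact that Project Name and Loan # are mandatory for project-related documents comes from the text.

```python
# Hypothetical attribute sets; the real screens carry many more fields.
COMMON_MANDATORY = {"Doc. Name", "Doc. Date", "Doc. Type", "Country/Area"}
PROJECT_ONLY = {"Project Name", "Loan #"}

SCREENS = {
    "project": COMMON_MANDATORY | PROJECT_ONLY,
    "country": COMMON_MANDATORY,
}


def missing_attributes(screen, profile):
    """Return the mandatory attributes that are absent or blank for a screen."""
    required = SCREENS[screen]
    return sorted(name for name in required if not profile.get(name, "").strip())


# Illustrative country-level document with no project attributes.
country_doc = {
    "Doc. Name": "Memo re country portfolio performance",
    "Doc. Date": "1996-10-18",
    "Doc. Type": "memorandum",
    "Country/Area": "Mexico",
}

print(missing_attributes("country", country_doc))
print(missing_attributes("project", country_doc))
```

The same document passes validation on the country screen but would be rejected on the project screen, which is why the operator has to choose the screen that matches the business process.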

In practice, the profile attribute that causes the most difficulty for the data entry operators is Document Name. The maximum length permitted for this attribute is 100 characters, and within this limit the operators have to construct a meaningful name for the document. The Document Date (in the format YYYY-MM-DD) is concatenated with Document Name to construct an EFS document label; the inclusion of the date enables EFS to display the list of documents within a folder in chronological order, the most rational order for a recordkeeping system. EFS assumes that within the same folder a document label is unique; the inclusion of the date within the label helps to achieve this, but operators need to bear this in mind also when constructing a document name.
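The label rules can be illustrated with a short Python sketch. It assumes a single space between the date and the name and models a folder as a set of labels; both are assumptions made for illustration, as are the function names and the second sample document.

```python
from datetime import date

MAX_NAME_LENGTH = 100  # limit on Document Name stated in the text


def make_label(doc_date, doc_name):
    """Build a document label from Document Date and Document Name."""
    name = doc_name.strip()
    if not name:
        raise ValueError("Document Name is required")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"Document Name exceeds {MAX_NAME_LENGTH} characters")
    # Assumption: a single space separates the date from the name.
    return f"{doc_date.isoformat()} {name}"


def file_document(folder_labels, doc_date, doc_name):
    """Add a label to a folder, enforcing uniqueness within that folder."""
    label = make_label(doc_date, doc_name)
    if label in folder_labels:
        raise ValueError(f"label already used in this folder: {label}")
    folder_labels.add(label)
    return label


folder = set()
file_document(folder, date(1996, 10, 18), "Memo fr Sant'Anna re audits & suggested measures")
file_document(folder, date(1996, 4, 2), "Reply to Minister's letter re cost overrun")

# Dates in YYYY-MM-DD form sort as text in date order,
# so a plain sort lists the folder chronologically.
chronological = sorted(folder)
```

Two documents with the same date and the same name would produce the same label, which is why operators still need to keep uniqueness in mind when wording the name.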

The Pittsburgh Requirements

In 1995, the bank made a study of the EFS-based imaging system, as developed at that time, to assess whether, and to what extent, the system met functional requirements for recordkeeping. In the course of developing systems and assisting users of electronic documents, the bank had conducted several exercises to collect requirements, both from system users and from its archives and records management staff. However, for this exercise, the bank decided to use the ones then newly developed by the NHPRC-funded research project at the University of Pittsburgh.

Although the EFS system with its associated documents and Oracle databases is intended to be a recordkeeping system, it is important to note that records are not created within the electronic system itself; rather, they are captured and stored within the electronic system after they have been created. Also, whereas EFS serves as a retrieval system for the images, the images and electronic texts themselves are separately stored, as are the profiles. The profile data is thus not actually attached to the document, as is envisaged in the Pittsburgh approach.

The Pittsburgh study recommended four tactics for meeting requirements: policies, design, implementation and standards. The bank concluded that all are necessary and also placed a high degree of stress on training.

Generally the study concluded that the electronic system met most of the requirements either wholly or partly. There was some uncertainty about the others. For example, the system came out well in the following areas:

However, the electronic system was less impressive in the area of Compliant Organization. It could only partly be said to comply with legal requirements. This latter conclusion is not particularly surprising considering the plethora of jurisdictions within which the bank operates. Given that the electronic system is not intended to include all records, these findings were sufficient encouragement to press forward.

Admittedly, some requirements are implemented somewhat awkwardly in the present system. For instance, the requirement for complete records calls for linkages between records to be preserved. EFS has no way of linking related documents, such as a letter and an attachment, or a letter and a reply. Consequently, attachments and covering letters have to be treated as a single document, and the name needs to show that the document includes both, e.g., "Letter enclosing feasibility study for bridge on highway." This adds to the problem of creating a document name. Similarly, although a letter and a reply are treated as separate documents, it is often useful for the reply's Document Name to refer to the original letter, e.g., "Reply to Minister's letter of Apr 2 re cost overrun."

Extending the System to Other Electronic Media

Not all documents need to be scanned into the system. Documents that are created electronically, such as Word documents and electronic mail messages, can be imported into the system. EFS supports viewers for a number of word processing, spreadsheet and other formats, which enables the documents to be held in or accessed by EFS in their original formats. While this will inexorably increase the volume of documents to be migrated to new versions of the software in the future, it does have two short-term advantages: the text is clean for searching purposes, and users can easily copy and re-use the documents in their original format. In compliance with the requirement that records be inviolable, EFS does not, of course, allow users to edit the original copies. However, to meet authenticity requirements, any documents that would normally be signed or initialed must be scanned as well, so that the signatures or initials are preserved. This practice may be relaxed when documents can be produced in secure systems with electronic signatures.

In units where the imaging system has been implemented, incoming documents, such as mail or faxes, are scanned before being forwarded to the relevant action officer. Because EFS does not now support document routing and tracking, the paper originals are still forwarded to the action officer. Action officers are then responsible for forwarding the original paper copy to the relevant paper files. The file copies of outgoing correspondence are forwarded to the scan station for scanning prior to filing. Electronic mail messages, of course, are received or sent directly by the action officers, who then must forward them electronically to the scan station for import into the system. At this stage, the paper files are, for the most part, still regarded as the records.

Dispensing with Paper Files in the Work Unit?

Nonetheless, at least one unit manager has expressed a determination to move to a paperless office, which means ceasing to maintain paper files. Paper documents will be forwarded directly to the archives once the action officer has taken the appropriate action. Procedures are still being developed - the archives do not have the resources to maintain a substitute filing system. However it is important that the paper copy of any document can be retrieved quickly and easily if needed, particularly since some documents, especially voluminous reports, are only partly scanned. The procedures are likely to include the archiving of bundles of scanned documents, the electronic profile of each document containing a reference to the archive location of the original. It will not, however, be a simple matter to retrieve all the papers relating to a particular project or business transaction should that be required, as these will be scattered through many such bundles.

Adding Workflow to Create Tier 2

The bank has begun the process of converting its electronic mail system to one based on Lotus Notes. The adoption of Lotus Notes has also facilitated the development of a correspondence management (document routing and tracking) system that can access the stored images and text files. When this is deployed, not only will it give managers greater control over the management of their projects and business transactions, it will also enable paper documents to be archived immediately after they are scanned. Incoming correspondence will no longer need to circulate because action officers and responsible managers will be notified of its arrival in an electronic mail message that will contain a pointer to the document. An electronic log will record all subsequent action taken with regard to the document. This development will add some Tier 2 capability. Tools for collaborative authoring, version control, etc., are now under active investigation, in particular to assess how they manage recordkeeping requirements, and these will further enhance the development of Tier 2.

The implementation of an imaging system for correspondence and documents, however, has not reduced the workload in managing records. Instead of individual items being classified and filed, they are now scanned and described. In fact, database descriptions are now at the individual item level, rather than at the file level. If anything, the number of information professionals seems likely to increase. This, however, is offset by other savings. Documents no longer need to be copied and physically distributed. Only one paper copy is retained, eliminating a large volume of duplication in the archives. Any number of staff can have simultaneous access to the images, and their search times have been reduced significantly. Office space requirements for the storage of paper documents have also been reduced significantly. Further economies are expected when field offices are added to the system.


Current electronic recordkeeping systems in the bank are still evolving to meet the full requirements for preserving the evidentiary value of records in its complex environment. The bank still relies heavily on paper records, but on paper record systems and procedures that have been specifically designed to facilitate movement to an electronic environment. These systems have lacked any comprehensive document or correspondence registration or control from the moment of creation or receipt. The current systems rely entirely on staff forwarding record items to filing centers for inclusion in the recordkeeping systems, and there is no way of knowing whether all items have been so forwarded. Consequently, we expect the eventual bank-wide implementation of the imaging system, combined with the proposed correspondence management system, to significantly improve the bank's recordkeeping in all its aspects, and consequently the availability of information.

Clive Smith has been the bank group archivist for The World Bank Group since August 1991. Prior to that, he was chief archivist at Westpac Banking Corporation in Sydney, and had previously worked for Australian Archives in Canberra. He has held various positions, including president, in the Australian Society of Archivists. He may be reached at The World Bank, 1818 H St. NW, Washington, DC 20433; 512/473-5214; e-mail: csmith@worldbank.org.

Figure 1

Typical Fileroom Structure for a Vice Presidency

Cabinet              Drawer                             Folder

<Country or Area>    _Cofinancing & Aid Coordination    _General
                                                        <Organization>
                     _Country Information               General
                     _Country Program                   Country Assistance Strategy
                                                        Country Economic Memorandum
                                                        Country Portfolio Performance Review
                                                        Country Policy Framework Paper
                     _Grants & Trust Funds              <Project Name>
                     _Sector Work                       <Project Name>
                     _Topical Information               <Sector or Program Objective>
                     <Project Name>                     Bank Reports
                                                        General Correspondence


Names for entities in < > are supplied by control fields.
Names are sometimes prefixed with an underscore to force sorting to the top.

Figure 2

Sample Document Profile

Project Name    MX-2nd Decentralization & Regional Development
Doc. Date       1996-10-18
Date Stored     1996-10-25
Loan #          3790
Credit #
Country/Area    Mexico
Project ID      MXPA7702
Doc. Type       memorandum
Task Mgr.       Sant'Anna, Anna--LAMXC
Sector          Public Sector Management
Category        General Correspondence
Bid #
Ext. Fin. #
Fund #
Report #
Security        Official Use Only
Bus. Process    A.2.a.5. Supervision - Accounting & Auditing
Accession #
Box #
Orig. Unit      LASLG
Organization    IBRD
Doc. Version
Label           19961018 Memo fr Sant'Anna re audits & suggested measures
Folder          General Correspondence
Drawer          MX-2nd Decentralization & Regional Development
Cabinet         Mexico
Fileroom        2
Doc. Name       Memo fr Sant'Anna re audits & suggested measures
Volume #
Entity ID       00008124296102512474800
Profile Type    00
Doc. Source     C
Division Tab
Sub Tab
Alt Task Mgr
Action Flag