Government Documents and Libraries:
The Impact of the Digital Revolution

by William Miller and Margaret S. Walker

Radical changes, some beneficial, others potentially disastrous, are now occurring in the way the U.S. Federal Government is making information available to researchers and to the public. Under pressure from both the Executive and Legislative branches, federal departments and agencies are rapidly converting their publications to electronic format. As they do so, some are also turning away from the central publishing agency, the U.S. Government Printing Office (GPO), with its standard distribution channels. These changes raise many questions about how available U.S. government information will be in the future.

Brief Background

The U.S. government has had a 150-year history of disseminating its publications free of charge (though they are expensive to print, ship, and process) to the general public through a network of designated depository libraries, now numbering approximately 1,400. Most of these Depository Libraries are at colleges and universities, and scholars have come to depend on this system to provide unfettered access to economic and statistical data, scientific studies, and almost anything else either generated by or funded by the U.S. government. Libraries have been able to acquire large federal documents collections for the cost of processing the materials and providing reference assistance in their use. Even for small depositories which have not been able to afford to receive and process many of the available publications of the GPO (which has been the world's largest publisher for some time), cooperation within the Depository Library Program has meant almost unlimited access to federal publications for scholars, albeit with some time delays.

Advantages of Electronic Access

The abolition of the current print-based system is being hurried along by Congress to save millions of dollars in printing and shipping costs, and the abolition of printed documents will certainly represent a small but real budgetary saving for the federal government. More importantly, the ubiquity of electronic government documents, like that of all other electronic material, offers great potential advantages. Documents could be made available to many researchers simultaneously, at locations most convenient to them, and no one will be able to misshelve a volume, steal it, or rip out a vital page.

Along with eliminating the problems of accessing the limited universe of physical items, electronic access would potentially offer a speed of access which would be a boon to many, especially those seeking current economic, demographic, or scientific data. Having immediate access to the newest Consumer Price Index Detailed Report the day it is published, without having to wait for the item to be printed, packed, shipped, mailed, received, processed, and shelved, would obviously be highly desirable. Online access to the Commerce Business Daily, the Federal Register, and recently passed laws provides up-to-date information to scholars and businesses all over the United States on equal terms, without giving anyone an advantage based on geographical location or position on a shipping and delivery schedule. Such aspects of electronic information have the capacity to enhance scholarly inquiry substantially.

Desirability of Print

Before dismissing paper publication of documents as unnecessary, however, we must consider several factors. First, the availability of documents in electronic form will not eliminate printing in many cases, but will instead simply shift the cost of printing from the GPO to libraries and individual scholars. Current experience shows that electronic access does not decrease the need to print. Just as no one wishes to read Moby Dick on a computer screen, few will care to read or analyze long documents without printing them out first. The difference is that libraries and their users will now bear the larger cost of printing these copies. Much material, such as the Economic Report of the President, would be desirable in both formats: one for immediate access and one for long-term reference.

Long-term readability. Moreover, print is the only proven long-term storage mechanism. Printed volumes, and perhaps microfiche, will remain readable for hundreds of years, and perhaps for much longer, but we do not know whether CD-ROMs, floppy disks, and Internet information produced today will be available to scholars in the future. New generations of equipment, which evolve at an ever-faster pace, make older generations of storage media difficult or impossible to use. For example:

  • The 1960 Census of Population tapes are now unreadable, and we can depend only on the printed volumes.
  • The U.S. Public Health Service destroyed 200 reels of 17-year-old computer tapes of its own data last year because "no one could find out what the names and numbers on them meant."
  • Federal Depository libraries have received compact disks and floppy disks that can only be used on Apple or Macintosh computers. The future accessibility of the information on these products is now somewhat in doubt, given Apple Computer's current difficulties.

Preservation responsibilities. Part of the responsibility of libraries and of academia is not only to make current information available, but also to preserve information and keep it available for future generations. Information posted on the Internet today may be gone tomorrow. Commercial entities and governments show little interest in the long-term maintenance and availability of research information. Electronic information is easily wiped out or altered; a printed volume is a good guarantor of data integrity, both literally and figuratively. Electronic storage media have been shown to degrade physically over time, and degradation may take many other forms as well.

Governments and their political leanings change. Will shifting political fortunes determine what electronic version of a given text will prevail, and wipe out one version in favor of another? When there is no longer a version of record, or when the version of record can be changed with the stroke of a key, how will we know?

Immediate Challenges

In the short run, the transition of government information to electronic form poses some more mundane but nevertheless very real challenges. In order to be a player in a digital environment, organizations are going to have to make a set of investments. Beyond the obvious need for terminals and Internet access, libraries will need to be able to receive and download very large files, on a faster, more reliable Internet, and on campus pipes which are wider. Some agency publications also require the purchase of costly commercial software such as GIS. Some electronic government documents cannot be printed without a laser printer, a relatively costly item, especially compared to the ten cents it would have cost to photocopy a page.

Libraries are now scrambling to keep up with the hardware and software required to access materials which formerly needed only processing and shelving. As for their Internet access problems, they are dependent on the overall institution's ability to upgrade its data transmission capability.

Many people still believe that "electronic" means "free," and this is especially true in the world of government documents. In the new electronic world, however, government agencies are increasingly going their own ways, leaving out the Depository libraries and (in response to cost-cutting and downsizing efforts) offering access to their Internet databases on a subscription basis only.

Use of commercial producers. Some agencies have bid their publications out to commercial producers for distribution and sale. Quasi-independent agencies such as the National Technical Information Service have already been selling certain classes of government-funded research material on a cost-recovery basis.

The Bureau of the Census recently advertised, but then quickly pulled the advertisement for, a free demonstration of its electronic population series on the Internet, which was to precede a fee-based service.

The Department of Commerce offers a subscription to STAT-USA on the Internet, where one can find "the best business, trade and economic information that the federal government has to offer." Despite the fact that your taxes have already paid to gather and disseminate the information, you will have to pay again if you actually want to use it. Depository libraries can have one free subscription, are prohibited from sharing the one password, and can only network it by paying additional charges based on the number of "potential" users. These charges can run into thousands of dollars.

Yet one more example: Depository libraries were recently notified that the National Trade Data Bank (NTDB), one of the most popular government-produced CD-ROMs, and one to which the Department of Commerce frequently refers business people, must now be used on a stand-alone computer workstation, and can no longer be networked without payment of an additional fee.

Government Information Availability in the Future

The timetable for this conversion of most government publications from paper to electronic form, originally set for two years, was recently lengthened to a more realistic five to seven years. This change will allow for a more orderly transition, and may alert governmental officials to major problems in time to ameliorate them. Because most individuals do not have computers with Internet access, a pell-mell rush to eliminate all print documents is ill-advised. After all, the Depository System, with its free availability of materials, was created to share with the public the fruits of research and other basic information which the public itself has already paid for.

One course of action would be to create a selective system of electronic information which supplements the traditional availability of government documents, rather than replacing it wholesale. Print will continue to have an important place in the preservation and furtherance of knowledge. It would be a comparatively small investment to continue to publish government information in a permanent form, while also making that which is appropriate available electronically.

Nevertheless, pie-in-the-sky futurism and budget cutting make it likely that much research material which is currently available at little cost or trouble will increasingly be lost, and that we will experience some dislocation and gaps in the information record to the detriment of research capabilities, before the importance of print and the undependability of electronic formats come to be fully understood in government circles. Hardware, software, subscription, and training costs are being passed along to libraries, along with the preservation challenge. The National Archives and Records Administration, the agency responsible for preserving the government's permanent records, has been overwhelmed by electronic information and has postponed acceptance of some agencies' electronic records because of the amount of information being generated. Certainly there are projects on the drawing board to preserve snapshots of the entire Internet at various moments, but these are not yet reality.

The outcry from college and university faculty members, beginning a year or two from now, is already predictable, as they discover that the information they need and used to find readily available, free of charge in their libraries, is suddenly either unavailable, or costly to retrieve. Meanwhile, one can only hope that we will learn from our impending mistakes, that the mistakes will not be widespread or irreversible, that there will be a place for all formats in the government depository program of the future, and that economics will not override the U.S. government's admirable tradition of sharing information with the academic and general public with as few barriers as possible.

-- William Miller is Director, and Margaret S. Walker is Documents Librarian, Florida Atlantic University.


JSTOR: The Vanguard of the Retrospective Digital Library

by Richard DeGennaro

"No one in his right mind would do what you are proposing." That was the advice a prominent publisher gave William G. Bowen, President of the Andrew W. Mellon Foundation and founder of JSTOR, when he described the JSTOR concept to him in 1994.

History records many instances where significant breakthroughs were made by the naïve visionary on the periphery who does not know that what he is proposing "cannot be done." Bill Bowen had no idea how difficult it would be to bring his idea to fruition, but he was convinced that JSTOR would be a boon to scholars, publishers, and librarians. With the support of the Mellon Foundation he turned an impossible dream into an idea whose time had come.

JSTOR's Beginnings

Originally a project of the Andrew W. Mellon Foundation, JSTOR is now an independent not-for-profit organization with a mission to help the scholarly community take advantage of advances in information technologies. JSTOR's initial objective is to develop a trusted archive of core scholarly journal literature, with an emphasis on the retrospective conversion of the entire backfiles of key journals. It is anticipated that in the future the objectives will be expanded and other related projects will be initiated.

In pursuing its mission, JSTOR is taking a system-wide approach, taking into account the needs of those involved in the field of scholarly communication: librarians, publishers, and individual scholars and students. This system-wide approach has required a willingness to reach compromise in order to accommodate the sometimes conflicting perspectives of JSTOR's constituents. Publishers needed terms that would preserve their copyright and their paper subscriptions. Librarians and publishers alike needed to define the limits of interlibrary loan, the definition of "authorized users" and the permitted use of JSTOR materials by those authorized users. With the resolution of these and other critical issues, JSTOR is now offering the first comprehensive collection of copyrighted retrospective journals in digital form to libraries and library users.1

Principal goals. JSTOR's principal goal during the coming three years is to create a comprehensive database of the complete back files of core journals in 10-15 fields in the humanities and social sciences. The page images are scanned and bit-mapped to preservation standards at 600 dpi. The text is also optically scanned and the product is manually upgraded to 99.95 percent accuracy. This ASCII text serves as an index to the contents of the journals. The tables of contents are keyed and provide a convenient means of browsing the journals. In order to protect the publishers' paper subscriptions, there is a 3- to 5-year gap between the back file and the current year. Each year another volume will be added to the database and archive.

Agreements reached. Agreements have been negotiated with the publishers of some 45 core journals, the first 22 of which became accessible to Charter Participants in January 1997. The JSTOR collection is being actively marketed to libraries in the U.S. and Canada. The pricing schedule is based on the Carnegie Classification of Institutions of Higher Education, with some modifications. JSTOR is committed to making a minimum of 100 journal titles available by the year 2000 as Phase 1 of a continuing effort.

Why Digitize These Journals?

In recent years there has been a good deal of discussion and speculation about the feasibility and usefulness of digitizing large quantities of printed books and journals in library stacks. There are some who believe that it is technically and economically feasible and intellectually desirable to digitize "everything," and there are others who think that the value of retrospective materials is limited and would not justify the high cost of digitizing and maintaining them. JSTOR takes the middle way.

The JSTOR project is based on the conviction that it is intellectually desirable and economically feasible to digitize, maintain, and distribute a carefully selected body of core journals and other retrospective resources provided that the cost can be shared among a very large number of libraries. And it is also assumed that the cost savings to participating libraries will more than offset the fees that they pay. The Andrew W. Mellon Foundation has made grants of over $4 million to establish JSTOR with the expectation that JSTOR will become an on-going, self-supporting, not-for-profit enterprise serving the needs of scholars, librarians, and publishers.

The JSTOR core journal collection is important to libraries for two reasons:

  • it greatly enhances user access, and
  • it provides significant savings in library space and operational costs.

Enhanced User Access

The JSTOR database is to a library's collection of journals what a library's catalog is to its collection of books and journals, but whereas the catalog merely points to the book or journal, JSTOR indexes every significant word or phrase and instantly delivers the text to the user's desktop and enables printing on demand. This is what makes JSTOR a transforming scholarly resource.

In this easily accessible electronic form these core journals will acquire an importance that they did not have as bound volumes in library stacks. The JSTOR database is becoming a new and vital library resource. With desktop access, users will mine the contents of these journals in ways and to an extent that was simply impossible and inconceivable with the bound volumes. In addition, the JSTOR archive is complete while many (if not most) of the paper sets in libraries are incomplete with missing and mutilated volumes and defaced pages.

Digitized information tends to be more useful and valuable than print information. Experience with card catalogs and electronic catalogs has shown that users prefer the convenience of the electronic catalog and neglect the cards. The same will be true of those journals that have been "jstored"; they will be used in preference to those that are only available in bound volumes. The use and value of the back files of paper journals will diminish as the number of digitized journals increases. This is why JSTOR is being so careful in the selection of the titles to be included in the database.

Savings in the Cost of Library Space and Operations

The JSTOR database meets preservation standards and faithfully replicates the contents of the journals, including advertising pages, membership lists, and the like. But JSTOR is much more than a mere replication of the contents of the paper or microfilm copies of the original journals. As has been noted, the Internet-accessible JSTOR database is a totally new and unique research resource that will supersede the paper and microfilm sets of the journals it contains. To be sure, it is essential that some major research libraries retain their bound sets of these journals for archival purposes, but most libraries could discard them or send them to secondary storage and free space in the bound journal stacks. Of course, it may take some time for librarians and library users to gain enough experience with and confidence in JSTOR to take this now seemingly drastic step. This decision will be much easier for that large number of libraries whose sets of these journals are largely scattered and incomplete.

JSTOR's original purpose was to save valuable stack space in libraries by substituting digitized versions of core journals for the bound sets that occupy large amounts of space in thousands of libraries. Saving space is still a major JSTOR goal and a compelling argument can be made for it if the cost of library space is viewed from an institutional perspective. Presidents, provosts, and chief financial officers are painfully aware of the cost of library space.2 Because the cost of building and maintaining library space is usually not part of the library's operating budget, librarians have little incentive to factor the cost of space into their decision making. The cost of JSTOR will come out of the library budget, but the cost of the space it saves will accrue to the institution. Space is important to librarians because there is never enough, but they have little appreciation of its true cost. Space is still seen as a free good in most libraries, but there is some evidence of a trend toward charging space costs to library budgets. This has already happened at Harvard and a few other libraries.

At Harvard the cost of maintaining existing library space and of using space in the Harvard Depository (HD) is charged to the library's operating budget. Since Widener Library is filled to capacity, for every new volume it acquires, a volume must be sent to the Depository. The annual cost of storing an average volume in HD is 25 cents. Added to this is a $2 accession fee and a $3 charge for each volume retrieval. This does not include the staff cost for selecting and processing the volumes to be stored.
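The Harvard Depository charges can be combined into a rough per-volume figure. The sketch below is only illustrative: it uses the three fees quoted above, assumes the accession fee is paid once per volume, and leaves out the staff costs the text excludes. The function name and the sample storage period and retrieval count are hypothetical.

```python
from decimal import Decimal

# Fees quoted in the text for the Harvard Depository (HD), in dollars.
ANNUAL_STORAGE_FEE = Decimal("0.25")  # per volume, per year
ACCESSION_FEE = Decimal("2.00")       # per volume, assumed to be a one-time charge
RETRIEVAL_FEE = Decimal("3.00")       # per volume, per retrieval


def depository_cost(years_stored: int, retrievals: int) -> Decimal:
    """Estimated HD charges for one volume, excluding staff costs."""
    return (
        ACCESSION_FEE
        + ANNUAL_STORAGE_FEE * years_stored
        + RETRIEVAL_FEE * retrievals
    )


if __name__ == "__main__":
    # Hypothetical case: one volume stored for 10 years and retrieved once.
    print(depository_cost(years_stored=10, retrievals=1))
```

On these assumptions, a volume stored for ten years and retrieved once would incur $7.50 in charges, most of it from the accession and retrieval fees rather than from storage itself.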

Brian Hawkins, Vice President for Academic Planning and Administration at Brown University, has been questioning the financial viability of the traditional research library in a series of papers. He makes this sobering assessment in his most recent paper:

"While the problems associated with the acquisition of new information are alarming, focusing on this set of costs masks the magnitude of the real problem. If we proceed with the library model as we have known it, the costs associated with storing and archiving the information will bankrupt our institutions of higher education."3

In the same paper Hawkins estimates the cost of physically housing a single volume at $20, assuming new building costs at $170 per square foot. In addition, he estimates annual maintenance costs at approximately $1 per volume at Brown.

We estimate that there will be 6,400 volumes in the 100 titles in Phase 1 of JSTOR. These volumes would normally occupy 568 square feet of library shelving space. It would cost $113,600 ($200/sq.ft. x 568 sq.ft.) to construct the space occupied by JSTOR volumes, plus $1 per volume, or $6,400 per year, to maintain it.
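The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers given in the paragraph above; the function names and the ten-year horizon in the usage example are illustrative assumptions, not part of the original estimate.

```python
# Figures from the text; all names here are illustrative.
VOLUMES = 6_400                    # volumes in the 100 Phase 1 titles
SHELF_AREA_SQ_FT = 568             # shelving space those volumes occupy
CONSTRUCTION_COST_PER_SQ_FT = 200  # dollars, the rate used in the text
MAINTENANCE_COST_PER_VOLUME = 1    # dollars per volume per year


def construction_cost(area_sq_ft: int, cost_per_sq_ft: int) -> int:
    """One-time cost of building the shelving space."""
    return area_sq_ft * cost_per_sq_ft


def annual_maintenance_cost(volumes: int, cost_per_volume: int) -> int:
    """Yearly cost of maintaining the space the volumes occupy."""
    return volumes * cost_per_volume


def cumulative_cost(years: int) -> int:
    """Construction plus maintenance over a given number of years."""
    build = construction_cost(SHELF_AREA_SQ_FT, CONSTRUCTION_COST_PER_SQ_FT)
    upkeep = annual_maintenance_cost(VOLUMES, MAINTENANCE_COST_PER_VOLUME)
    return build + years * upkeep


if __name__ == "__main__":
    print(construction_cost(SHELF_AREA_SQ_FT, CONSTRUCTION_COST_PER_SQ_FT))
    print(annual_maintenance_cost(VOLUMES, MAINTENANCE_COST_PER_VOLUME))
    # Hypothetical horizon: ten years of maintenance on top of construction.
    print(cumulative_cost(10))
```

The first two results reproduce the $113,600 and $6,400 figures in the text; over an assumed ten-year period the combined cost for a single library would be $177,600.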


This discussion is limited to space and building maintenance costs in a single library. There are other potential savings in library operational costs such as savings in binding, preservation, repairs, retrieval, and reshelving. When these potential cost savings for a single library are multiplied by the number of libraries in a consortium, a state, in the United States, or the world, the numbers are truly staggering. Add to this the "have not" libraries that could gain access to this rich resource for the first time through JSTOR. The Andrew W. Mellon Foundation has a strong commitment to supporting higher education internationally. Its global perspective on the value of JSTOR accounts for its keen interest in and generous funding for this pioneering initiative.

-- Richard DeGennaro, former Librarian of Harvard College Library, is Senior Library Advisor and member of the Board of Trustees of JSTOR.


