Vol. 23, No. 6 July 2003

Can Electronic Scholarship Survive?

by Deanna B. Marcum and Gerald George

As a college president, provost, dean, or head librarian, what would you think if your institution’s groundskeeper came rushing into your office with the following exclamation: “Do you know that all over this campus faculty members are caching library material in flimsy containers underground?!”

If you decided the groundskeeper had not simply gone insane, you likely would ask—Why?! Why are faculty members housing library material outside the library? And why underground instead of where others can easily find and use the material? And why in flimsy containers that put the material at risk?

Crazy as this sounds, it is happening. That is, something analogous is becoming commonplace in colleges and universities. The analogy may be less than exact, but the key point is that certain intellectual assets of higher-education institutions are at risk. Here is how:

Digital technology is enabling—indeed, encouraging—faculty members to develop resources in electronic formats, for use in research and teaching. Faculty develop these resources on their local computer systems outside the library. These resources are electronic equivalents of, and sometimes digital surrogates of, books and other traditional library materials.

Although not in cardboard boxes, these e-resources are nonetheless at risk because their electronic formats are less enduring than paper, and may quickly be rendered unreadable as the software and hardware on which they have been generated become obsolete. For potential users beyond the creators, such e-scholarship might as well be in subterranean caches, because few know how to find it, and accessing it may be difficult for those who do.

This matters to academic administrators because much potentially valuable e-scholarship, produced using college resources of time and technology, is relatively inaccessible now and may become totally lost to the future.

Types of E-scholarship

What kind of “e-scholarship” are we talking about? Here are some simple examples:

A professor creates an electronic database of documents to facilitate research on a particular topic.

A teacher develops a new course with an online syllabus linked to electronically accessible, assigned readings.

Collaborating scholars set up a Web site for use in reporting and exchanging research results quickly.

A campus learning center creates new, computer-generated classroom audiovisual materials.

A department of art or architecture assembles a collection of digitized images for study and teaching.

A college museum puts a virtual exhibit online.

Some resources of these kinds may be of little future value. But those of lasting value belong in a responsible repository that can preserve them and make them widely accessible for a long time. Traditionally that is the campus library’s role.

Campus libraries, however, do not often know about Professor X’s research database of searchable digital documents or Professor Y’s teaching collection of re-combinable digitized images, let alone have a chance to assess the value of these resources. That is, libraries are unaware of such resources until Professors X and Y, realizing their own inability to manage, preserve, and provide access to their e-collections, beg libraries to take responsibility for maintaining them. That, too, is increasingly happening.

Existing E-resources at Risk

More complex and more expensively produced kinds of electronic resources also may be at risk. Some clearly have enduring value. Here is a generalized example:

History. Historians at a particular university decide to establish a center for the study of a major phenomenon in modern U.S. history, e.g., the Great Depression of the 1930s, the Civil Rights Movement, or the “9/11” terrorist attack. As part of the documentation, they wish to gather personal reminiscences from individuals involved, ordinary Americans as well as leaders, about their participation, their impressions, and the effects of the events on their lives.

The historians wish to include materials of multiple kinds: written and printed, photographed and filmed, artistically depicted in various media, and orally recorded (such as oral-history testimonies). If the historic event took place within the computer era, the history center also may seek relevant e-mail messages, word-processing files, and other digital documents. And the center may wish to combine electronically created material with digitized material into a collection that can be electronically searched, recombined for many kinds of analysis, and drawn upon for classroom use and scholarly publication, in electronic as well as printed formats. The center may want to include hypertext, which can be electronically linked to other documents for immediate access, and hypermedia, which can combine texts, graphics, sound, and video. All this will make possible new kinds of research and new tools for teaching. But who will ensure that it remains accessible into the future?

Physics. Or consider this example, generalized from actual developments. Scientists at a particular university establish an institute for research into a particular branch of physics. The physicists involve historians of science to document what the institute is doing. Because modern physics advances less through experiments by individuals than through massive collaborative projects, documentation of projects as they develop will be more useful than individual biographies written later. Scientists and historians collaborate to put working papers and annotated documents online, along with interviews collected from many of the scientists on research teams. Because scientists themselves are asked to help document developments, software is customized to make it relatively easy for them to do so.

This documentary project has features that many current e-scholarship projects share. The project is experimental, designed to take advantage of technologies that, themselves, are under development. The project is open-ended, adding digital material as the scientific work progresses. The project is interactive, involving exchanges of input and output within computer systems. The project makes intensive use of software, which is critical for understanding and using the content. The project embraces multiple kinds of documents and file formats. And the results are designed for dissemination and use via the Web rather than for formal, traditional, peer-reviewed publication.

But again, who will perpetually save these resources?

Creators vs Preservers

The developers of such electronic resources understandably concentrate on creating, not preserving, the content. They are concerned with such questions as these: What kind of computer system will accommodate the desired data? What software will facilitate bringing diverse materials into the special collection being assembled? Content creators make use of proprietary formats, customize the software, or do whatever else it takes to get the stuff into their aggregations. Only then, if ever, might their attention shift to how to get the data back out in forms that can be used for more than their own purposes and permanently preserved.

Preservation for persistent access may require creation of a lot of metadata—information about the structure, format, and other characteristics of digital material (subject, title, author, file size). Metadata is needed to manage a project’s electronic content, preserve it, and provide access to it. And metadata, itself, needs preservation. Taking time to create such metadata may seem a burden to scientists and historians busily engaged in creating collections for their own purposes.
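To make the idea of metadata concrete, here is a minimal sketch in Python of a record that carries the characteristics named above (subject, title, author, format, file size). The class name, field names, and sample values are assumptions made for illustration; they do not follow any particular library metadata standard. Because metadata itself needs preservation, the sketch also writes the record out as plain-text JSON and reads it back.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class MetadataRecord:
    """Hypothetical record; field names are illustrative, not a standard."""
    title: str
    author: str
    subject: str
    file_format: str      # e.g. a MIME type
    file_size_bytes: int


def to_json(record: MetadataRecord) -> str:
    # Plain-text JSON is one simple way to store metadata
    # independently of the software that created it.
    return json.dumps(asdict(record), sort_keys=True, indent=2)


def from_json(text: str) -> MetadataRecord:
    return MetadataRecord(**json.loads(text))


# Sample values are invented for the example.
record = MetadataRecord(
    title="Oral-history interview (sample)",
    author="Professor X",
    subject="Great Depression",
    file_format="audio/wav",
    file_size_bytes=1_048_576,
)

serialized = to_json(record)
restored = from_json(serialized)
```

A record that survives this round trip unchanged can be stored alongside the content it describes and read later by other software.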

If scholars decide to seek a library’s help to preserve such collections, they may be unable or unwilling to meet libraries’ needs to receive material in standard formats that facilitate preservation. Outside of computer science departments, how many faculty members, assembling digital databases for their own use, concern themselves with “persistent identifiers” (a name that does not change as a digital object moves from one computer system to another), “presentation forms” (the forms in which a user sees data), and “administrative metadata” (information about data of use in managing it)? But such arcane-sounding technical considerations require attention if a repository is to ensure the integrity, authenticity, and accessibility of electronic resources into the future.
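The notion of a persistent identifier can be illustrated with a small sketch in Python. A resolver keeps a table that maps each unchanging name to the object's current location, so that citations keep working after the object moves to another computer system. The `Resolver` class, the identifier string, and the server addresses are all hypothetical; real identifier systems are considerably more elaborate.

```python
class Resolver:
    """Hypothetical lookup table from persistent names to current locations."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, pid: str, location: str) -> None:
        # The name is assigned once, when the object is first deposited.
        if pid in self._locations:
            raise ValueError(f"identifier already registered: {pid}")
        self._locations[pid] = location

    def move(self, pid: str, new_location: str) -> None:
        # Only the location changes; the name users cite stays the same.
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_location

    def resolve(self, pid: str) -> str:
        return self._locations[pid]


# Invented identifier and addresses, for illustration only.
PID = "example-repo:images/0001"

resolver = Resolver()
resolver.register(PID, "https://old-server.example.edu/art/img0001.tif")
before = resolver.resolve(PID)

# The image collection migrates to a new system; the name is untouched.
resolver.move(PID, "https://repository.example.edu/objects/0001")
after = resolver.resolve(PID)
```

The point of the design is that a scholar's citation records only the name, never the server address, so the repository is free to relocate the object.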

A Web site, or a database containing texts, images, and sound, linked electronically to other texts, etc., so that computer users can go back and forth among them, is not just another kind of journal or book. A library cannot simply buy a digital database, log it into the library catalog, and put it on a reference shelf. To preserve electronic resources and provide electronic access to them, libraries must invest in computer infrastructures, tools, and software for “ingesting” data in preservable formats, “migrating” it to new media as old tapes and disks deteriorate and original systems become obsolete, maintaining its integrity, and verifying its authenticity. The library must help users find the e-resources they need, “present” such material in readable forms on the monitors of faculty and students, and electronically restrict access to those authorized to get it. The multiplying forms of e-scholarship are not the kinds of resources that libraries have traditionally preserved.
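One common way to maintain integrity through ingest and migration is to record a checksum for each file at intake and recompute it whenever the file is copied or audited. The following Python sketch illustrates that idea under simplifying assumptions: in-memory dictionaries stand in for old and new storage media, and the function names and sample file names are invented for the example rather than taken from any repository system.

```python
import hashlib


def checksum(data: bytes) -> str:
    # SHA-256 digest of the file's bytes, as a hexadecimal string.
    return hashlib.sha256(data).hexdigest()


def ingest(store: dict, manifest: dict, name: str, data: bytes) -> None:
    """Accept a file and record its checksum in the manifest."""
    store[name] = data
    manifest[name] = checksum(data)


def verify(store: dict, manifest: dict) -> list:
    """Return the names of files that are missing or no longer match."""
    return sorted(
        name
        for name, expected in manifest.items()
        if name not in store or checksum(store[name]) != expected
    )


def migrate(old_store: dict, new_store: dict, manifest: dict) -> list:
    """Copy every file to new storage, then verify the copies."""
    for name, data in old_store.items():
        new_store[name] = data
    return verify(new_store, manifest)


# Dictionaries stand in for an aging disk and its replacement.
old_disk, new_disk, manifest = {}, {}, {}
ingest(old_disk, manifest, "interview.txt", b"Oral-history transcript")
ingest(old_disk, manifest, "photo.tif", b"\x49\x49\x2a\x00 image bytes")

migration_failures = migrate(old_disk, new_disk, manifest)

# Simulate later deterioration of one file on the new medium.
new_disk["photo.tif"] = b"\x49\x49\x2a\x00 image bytEs"
damaged = verify(new_disk, manifest)
```

A check of this kind detects that bits have changed; it does not by itself guarantee that a future computer can still display the file, which is the separate problem of format obsolescence.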

Who Should, and Who Can, Preserve E-scholarship?

The major issues are technological and financial. Who should—and who can—persistently preserve e-scholarship created by scholars, or research institutes, or learning-service centers, or publishers, or individual libraries themselves? Who should bear the cost of making e-scholarship persistently accessible? One frequently mentioned option is that discipline-based learned societies of scholars, working across institutions, take responsibility for the preservation of e-scholarship at least in their individual fields. Another option is for governments and private foundations to exercise leadership in organizing and financing preservation.

However unsettled the responsibility questions may be, some libraries already are beginning to acquire e-scholarship. Despite uncertainties about how long current methods can preserve digital information in complex forms, several large universities are building electronic repositories for digital resources that they create and lease, and a few are also attending to the preservation of electronic resources created outside of libraries by faculty and even students.

Electronic Repositories

Typically, the electronic repositories that libraries are building are not just rooms down in the basement somewhere with shelving for computer tapes and disks. They are combinations of computer hardware called servers, which control access to resources; software for managing the resources’ content; and interfaces that permit faculty members or other resource creators to submit electronic files through a Web page into the software for preservation. The electronic files then reside on a hard drive attached to a server, which is maintained with others, not usually in a campus library, but in a secure, centralized “machine room” on the campus, where, among other things, there are backup electricity generators in case of power outages.

Harvard. Among its various schools, Harvard University has more than 100 libraries, which are individually responsible for selecting what will be preserved. The libraries have access to central services, including preservation services from an electronic-resource repository. For digital resources created outside the library system, the repository also offers preservation services to varying degrees. Resources that come to the repository in specified, normative formats may be preserved for use; resources in other formats may be preserved by keeping bits in order without responsibility by the repository, itself, for ensuring that their original appearance can be reproduced on future computer screens.

University of California. Through the California Digital Library, the libraries of the University of California are developing a digital preservation program that will serve all of the system’s campuses. Libraries, research institutes, laboratories, and museums within the system can provide their local e-resource developers with preservation services, supported by a system-wide infrastructure that meets shared preservation needs. Each unit is able to draw upon centrally held digital materials and customize them in ways that their particular patrons need. And through an E-Scholarship Program, the libraries have begun to take in, for long-term management, certain faculty-created resources, such as data sets and “preprints.”

Stanford. Similarly, university libraries at Stanford are building a digital repository for content considered worthy of preservation from several sources. Such content may come from digital content already owned by the libraries, from new digital-conversion projects, from externally purchased digital content (such as e-journals and e-texts), from archival and other donations, and from sources of continuing submissions such as Stanford’s course-management system. Eventually, Stanford hopes to offer preservation services to publishers and others off-campus. Already Stanford is testing a preservation system for electronic journals called LOCKSS (Lots of Copies Keep Stuff Safe). LOCKSS provides software through which participating university libraries receive journals as published and preserve them in caches that are capable of checking each other for content validity and completeness, and replenishing each other if needed.
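The general idea of caches that check and replenish one another can be sketched in a few lines of Python. This is not the LOCKSS software or its actual protocol; it is a deliberately simplified illustration in which each cache is a dictionary, the caches compare digests of their copies, and any copy that disagrees with the majority (or is missing) is replaced. The function names, the majority rule, and the sample data are all assumptions.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(caches: list, key: str) -> int:
    """Compare copies of one item across caches and repair dissenters.

    Returns the number of caches that were repaired. Raises ValueError
    when no version is held by more than half of all caches.
    """
    votes = Counter(digest(c[key]) for c in caches if key in c)
    if not votes:
        raise ValueError(f"no cache holds {key!r}")
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(caches) / 2:
        raise ValueError("no majority version; manual review needed")

    # Take the agreed content from any cache that holds it.
    good = next(
        c[key] for c in caches if key in c and digest(c[key]) == winning_digest
    )
    repaired = 0
    for cache in caches:
        if cache.get(key) != good:
            cache[key] = good  # replenish a damaged or missing copy
            repaired += 1
    return repaired


# Five hypothetical library caches: three intact, one damaged, one empty.
article = b"Journal issue 12, article 3"
caches = [
    {"issue12/art3": article},
    {"issue12/art3": article},
    {"issue12/art3": article},
    {"issue12/art3": b"Journal issue 12, art#cle 3"},
    {},
]

repaired_count = audit_and_repair(caches, "issue12/art3")
```

The sketch shows why many independent copies help: a single damaged copy is outvoted and restored, whereas a lone copy would have nothing to be compared against.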

MIT. One of the most ambitious repositories under development is MIT’s “DSpace.” This repository primarily serves faculty members who contribute content using a workflow submission system or contract with the repository to process intake. MIT is collaborating with several other universities’ libraries to develop a federation that will use DSpace to preserve electronic resources for research and teaching developed by their faculties. In collaboration with the Hewlett Packard corporation, MIT is exploring ways for the DSpace repository to accept material in non-standard formats. And the MIT team hopes to develop “personal archiving” techniques to help faculty members prepare for preservation even as they create digital materials.

Librarians at MIT find much awareness among faculty members there that digital materials are at risk for the future. These librarians see DSpace as a way not only to provide for preservation but also to enlist faculty members as partners in doing so. Also, MIT librarians are keeping track of DSpace costs to inform a business plan that may include revenue-generating activities for services such as reformatting and metadata creation.

Smaller schools. Large institutions are far from the only ones in which digital-resource creation is flowering. Many librarians in other colleges and universities would like to help their faculties preserve their creations but so far lack the financial means and technological infrastructure to do so. Even if it were possible, however, would it be advisable for every college and university to develop the full range of technological tools and services required to preserve electronic scholarly resources? Or would a few large repositories, offering preservation services to multiple institutions, be adequate?


Such issues are under discussion within the National Digital Information Infrastructure and Preservation Program (NDIIPP). In 2001, the U.S. Congress appropriated $100 million to enable the Library of Congress (LC) to work with other agencies, federal and private, on a national strategy for the preservation of digital materials of all kinds. With $5 million of that appropriation, LC has commissioned studies and sponsored meetings involving electronic-resource creators, distributors, and users. The resulting plan [1] envisions the incremental development of a national digital preservation infrastructure, built collectively within a network of participating institutions, and capable of supporting preservation activities by multiple institutions and communities of all sizes.

The NDIIPP plan is less a solution than a roadmap toward one; it does not alleviate the need for academic administrators to grapple with preservation issues on their individual campuses now. In a new report that analyzes the issues involved in preserving electronic, “new-model scholarship,” Abby Smith concludes,

Everyone who has a stake in access to digital information has a stake in the preservation of digital data . . . . To continue investing heavily in creating digital information assets without shoring up their long-term accessibility is like building castles on sand. [2]

What Can Be Done Now

Ms. Smith’s report suggests several things that higher-education administrators, librarians, and faculty members could do to help the survival of e-scholarship.

By Faculty. Faculty members could help by calling their e-scholarship projects to the attention of their campus libraries at the outset, letting librarians know what use is intended for each project, what audiences will be and could become users of the resources, and how long into the future the resources might be useful. Also from the beginning of e-scholarship projects, faculty members could adopt, so far as possible, standard and non-proprietary data formats, which would make it easier for libraries to preserve digital data. Ideally, faculty members would find out whether libraries on their campuses would be interested in preserving their e-scholarship, and if so, what technical specifications projects would have to meet to enable the library to take in, preserve, and provide access to the results.

By Librarians. Librarians could help by being willing to work with faculty members and other data creators in all phases of their projects, by clearly identifying policies and requirements for accepting e-scholarship for preservation and access, and by actually taking some material into custody for preservation experimentation. The experiments could include risk assessments of digital formats favored by e-scholarship creators, and tests of possibilities for reducing format vulnerabilities. Librarians could identify policies on their Web sites and aggressively invite faculty to consider e-scholarship preservation. Librarians could enter into partnerships with scholars, learned societies, and scholarly publishers to seek common approaches to the creation and preservation of digital material. Librarians at schools too small to support e-scholarship programs could help faculty find other places to preserve their e-scholarship.

By Administrators. Campus executives could encourage their librarians and faculty members to join in these efforts to preserve and extend access to e-scholarship, and reward them for doing so, making clear that what is at stake is the need to extract maximum benefit from resources that are part of the intellectual assets of the individual college or university as a whole. Executives also could promote the training of faculty and students in online research skills needed to find and make use of such digital scholarly resources. Also, because costs of digital preservation and access may exceed the budget capacities of many institutions, their administrators could seek collaborative solutions with other colleges and universities.

If nothing else, everyone concerned can keep abreast of the progress of e-scholarship preservation projects such as those described above to see what develops that can help us all deal with this universally growing need.

Deanna Marcum is President and Gerald George is Special Projects Associate, Council on Library and Information Resources (CLIR), Washington, D.C.


[1] Library of Congress, Preserving Our Digital Heritage. Plan for the National Digital Information Infrastructure and Preservation Program. Washington, D.C.: Library of Congress, 2003. Available at

[2] Abby Smith, New-Model Scholarship: How Will It Survive? Washington, D.C.: Council on Library and Information Resources, 2003. Available at

