Editors: Ann P. Dougherty, Mountainside Publishing;
Richard M. Dougherty, University of Michigan, Emeritus
Contributing Editors: William Miller, Florida Atlantic University; Barbara Fister, Gustavus Adolphus College;
Mignon Adams, University of the Sciences in Philadelphia; Kathleen Miller, Florida Gulf Coast University;
Steven J. Bell, Temple University; Larry Hardesty, Winona State University; Mark Tucker, Abilene Christian University
Libraries and e-Books: Perception and Reality
by William Miller
Not a week goes by without some seemingly important new development in the world of electronic books, mostly centered on the Kindle or on Google’s mass digitization project. In academia, another common concern right now is how quickly the shift to electronic textbooks will occur. Academic libraries are not normally the focus of these discussions, but for libraries, the shift is as profound and inevitable as the shift to electronic journals was.
A Myriad of New Issues to Resolve
Libraries must have the systems infrastructure to make e-books easily available to users via their web sites. Interlibrary lending protocols will need to be completely rethought because in many cases electronic books are leased or rented rather than purchased. There are major advantages to researchers, such as the fact that books will increasingly remain in print, electronically speaking, and the universe of books available for research will burgeon. Storage needs for academic institutions will ease. Options for the visually impaired will improve. Cooperative collection development and inter-institutional storage of lesser-used print materials will also gain momentum. Libraries which experience hurricanes, floods, or fire will enter the electronic book age willy-nilly, and rapidly be able to reconstitute the bulk of their collections.
There are certainly issues for society as a whole to consider as we shift to e-books. Overall access will be broadened, but the digital divide between those who have access and those who do not may only increase. Dependency on e-book readers, cell phones, and other devices which are not biodegradable continues the ecological issue posed by computer hardware, and in some cases a malfunctioning e-book reader will cause sudden loss of access to books previously downloaded and paid for.
What role will academic libraries play in this new e-book environment? It is understood that academic libraries are the major repository of printed books, but some may think that they are in the process of becoming outmoded because of Google’s efforts and the materials available from Amazon, Borders, and Barnes and Noble. Will users now be able to do without libraries? Public library users who want only current popular material and are willing to pay for it will certainly be able to satisfy most of their needs without libraries. But what about the academic library user? Do Google and the commercial sites now offer everything, or will they? If all e-books will soon be available online, will academic libraries still be needed for access to books? The answer, in a word, is “yes.”
The Electronic Book Landscape
What are the main components of the e-book universe? There are obviously the newer, mass-market, “born digital” books or new books which are first printed and then soon issued electronically by their commercial publishers, with an eye for sales to individuals to use on personal computers and mobile devices; there are a host of free, full-text electronic books created by volunteer organizations such as Project Gutenberg, the Open Content Alliance, the Internet Archive, and university libraries everywhere; there are more specialized academic volumes being created by scholarly publishers such as Cambridge, Taylor & Francis, and Springer, for whom the primary customer is libraries; and then there is the massive Google Books Project. Some of these suppliers are better known than others, but academic libraries are providing books from all of these major categories, and remain the primary avenue of access for most students and faculty.
Follow the Money
Academic libraries will continue to play a crucial role in the provision of e-books for students and scholars, as they continue to do for journals, for monetary reasons. Questions of availability of hardware aside (and libraries are still the primary location for internet access for many), the first and fairly simple thing to consider is cost.
“Electronic” is not a synonym for “free.” Intellectual property is the coin of the realm, and copyright is the guarantor of the currency. Academia and governments provide digitized information “for free” because it is paid for by tuition or taxes.
With truly rare exceptions, however, publishers do not make their books available free of charge. One need only look at the Amazon or Barnes and Noble sites to realize that current, copyrighted publications are not available for free, though in some cases they may be less costly in electronic form than they are in paper. The profit motive is very robust. Early studies on electronic textbooks indicate that in many cases they are not much less expensive than the paper versions, and have the disadvantage that students cannot sell them back after the term is over, nor can they share them as many do now with the printed texts.
Regardless of how inexpensive electronic books might be, copyrighted texts will have to cost something. Anyone who deals with students knows that they will rarely buy a book that they do not have to purchase, and sometimes that includes even their required textbooks. Nor can faculty afford to buy the vast majority of the books that they might need to conduct their research.
Degrees of Accessibility
Not all electronic books are created equal, or offer equivalent functionality. A person can look at a book’s table of contents using Amazon’s “Look Inside!” function, but in reality he or she cannot look very far inside the book: access is limited to just a few pages, with no text searching functionality at all. The entire site is designed simply to entice one to buy items, not to read them without charge or conduct research using them. The same is true, to a lesser extent, of Google’s Book Search product. The free materials created by libraries and the highly expensive materials created by academic publishers offer much more functionality, because they are designed to promote research, not just to entice purchases.
The Google Books Product
The Google Books product consists of three components, each with a different degree of accessibility: full-text, searchable out-of-copyright books scanned by Google (and provided to it by libraries); copyrighted books which Google has scanned willy-nilly without publisher permission, and for which “snippets” of several sentences are available; and “publisher partner” books from which Google has publishers’ permission to provide more extensive sample pages.
The out-of-copyright books in Google are really the most useful ones, from a research perspective. These books are in PDF form, which captures a photographic image of the original, and also in Optical Character Recognition (OCR) mode. The search function allows one to look for specific words and phrases in the text, but does not offer Boolean or other complex searching capability such as proximity searching, and has a limit of 100 results. If a book is clearly out of copyright, Google makes the entire text available for both reading and downloading, along with the limited word searching functionality.
The books which are copyrighted and which Google digitized without permission of the copyright holder are much less accessible. There is still the availability of text searching, but the results are limited to several “snippets” of a few lines each for the reader to see. There is nothing approaching a comprehensive text search or display of the book itself. These are the books which caused publishers to cry foul and sue Google for copyright infringement.
The “Publisher Partner” program. Finally, there are those copyrighted texts for which Google already has permission to provide extensive sampling, in collaboration with the publishers. Google is essentially acting like Amazon for these titles. These are the more current texts, of which publishers allow up to 20% of the original full text to be displayed. This limited full-text option will sometimes be enough to satisfy one’s needs, but it is a crapshoot. The copyrighted books are available only in “preview” format, which means that most of the pages are removed in no rational or predictable way.
Reality check. What does this mean, in practice? Take my edited collection of articles, College Librarianship (Scarecrow, 1981). Looking at the preview on Google, the fifth page of the six-page introduction is missing—not too bad. But it becomes impossible to read much in the first half of the book (which is all there is in the preview), because of frequent omitted pages. Concerning my own article, “Faculty Status in the College Library,” which appears on pp. 118-134 in the book, only the first two pages (pp. 118-119) are in the preview. The preview ends with two random pages (140 and 142) from the following article, and the subsequent 142 pages are omitted.
I would not consider that the equivalent of browsing through the book in a bookstore (which, in any case, one could not do because the book is long out of print, though the publisher retains the copyright and has not produced a digital edition or given anyone else permission to do so). In other words, one might, by chance, find what one wants in the Google previews, but then again, one often will not, and this is not a recipe for conducting successful research.
One will inevitably want the entire text, which means either buying the book (which originally cost $15 and can now be had for $48 for a used copy on Amazon or Borders) or clicking on “Find in a library,” in the “Get this book” section on the left. My hunch is that only a library (or my mother) would actually pay $48 for this book, so realistically, libraries are one’s inevitable choice for meaningful access to an out-of-print, copyrighted title.
A valuable service. For the moment, at least, Google is providing quite a service by making all three of these categories of books available. This is especially true for the out-of-copyright materials easily available in full text. Google developed proprietary technology for mass digitization of these books, and while the result is occasionally a picture of a hand rather than a page of a book, these books on the whole are well done and can be wonderful resources for people needing access to materials not widely available in most libraries.
For example, I used nineteenth-century illustrated editions of Robert Burns, not owned in paper at my library, to help my American students understand the heavy Scottish dialect in Burns’ poetry. As long as Google continues to make such materials freely available, it is performing a valuable service for students and scholars.
Adding Value to e-Books
The efforts of commercial entities such as Amazon and Google are admirable, but the utility of their products cannot match the value that some non-profits, for-profit vendors, and libraries are adding to electronic books through such labor-intensive activity as keyboarding and proofreading, and through enhancements such as proximity searching (find X near instances of Y).
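To make “find X near instances of Y” concrete, here is a minimal sketch of a proximity search in Python. It is an illustration only, not the method of any vendor named in this article; the function name `proximity_search`, the `window` parameter, and the sample sentence are all assumptions made for the example.

```python
import re

def proximity_search(text, x, y, window=5):
    """Return (i, j) word-position pairs where x occurs within `window` words of y.

    Hypothetical helper for illustration; real products index texts far more efficiently.
    """
    # Split the text into lowercase words, ignoring punctuation.
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    x_pos = [i for i, w in enumerate(words) if w == x.lower()]
    y_pos = [i for i, w in enumerate(words) if w == y.lower()]
    # Keep only the pairs that fall inside the window.
    return [(i, j) for i in x_pos for j in y_pos if i != j and abs(i - j) <= window]

# Invented sample text, not drawn from any real collection.
sample = ("The library licensed the book. Years later the publisher "
          "withdrew the book from every library.")
hits = proximity_search(sample, "library", "book", window=4)
print(hits)
```

Narrowing the window to two words returns no matches for this sample, which is the point of the feature: the researcher controls how close the two terms must be.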
Project Gutenberg. This private, non-profit effort, begun in 1971 at the University of Illinois, has so far managed to key in 30,000 texts (of non-copyrighted works) and to proofread them as well, using an army of volunteers. Such efforts greatly enhance the accuracy offered by OCR alone, and set the stage for automated, machine-generated audiobooks, automated translation into other languages, and in-depth searching for specific words and phrases anywhere in the book, as well as for comparison among books.
The Early English Books Online Text Creation Partnership (EEBO-TCP). This is a project carried out primarily by libraries at Oxford and the University of Michigan, with funding from the Council on Library and Information Resources and subscription fees from about 100 academic libraries. It is based on the ProQuest Information and Learning database, Early English Books Online, which provides digital reproductions of every book in the Short Title Catalog of Early English Books printed between 1473 and 1700. It is possible to purchase this database by itself, and many libraries have done so. The actual books in this product are too rare or fragile to have been given to Google to digitize for its Books product.
The EEBO database, with its digitized photographic representations of these rare and valuable books, is quite useful, and makes it possible for people to see the originals without expensive travel to the handful of libraries which own them. However, many of these early texts are poorly printed and next to unreadable except to the most learned and experienced, because of their odd fonts, confusing layouts, and inconsistent spellings.
Unlike the more modern titles being digitized by Google, visual reproductions of these texts are not sufficient to enable reading, in many cases. In OCR versions, the marginal glosses so common in older texts become arbitrarily run into the text. Stray marks become letters. “Research” might become “hesearch,” “Clarendon” might become “Clakendon” [these are actual examples] and thus be lost in a search for instances of the words “research” and “Clarendon.” So OCR scanning is often nearly useless in helping to create readable, searchable versions of these older works.
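The effect of such OCR errors on searching can be shown in a few lines of Python. This is a sketch built on an invented sentence that contains the two misreadings quoted above. The fuzzy lookup uses Python’s standard `difflib` module and is a substitute technique shown only for contrast; it is not how EEBO-TCP works. The names `exact_hits` and `fuzzy_hits` and the 0.8 cutoff are assumptions for the example.

```python
import difflib

# Invented sentence containing the two OCR misreadings quoted in the article.
ocr_text = "The hesearch was printed at the Clakendon press"
tokens = ocr_text.lower().split()

def exact_hits(term):
    """Exact word match, as in a simple word search."""
    return [t for t in tokens if t == term.lower()]

def fuzzy_hits(term, cutoff=0.8):
    """Approximate match: tokens whose similarity to the term meets the cutoff."""
    return difflib.get_close_matches(term.lower(), tokens, n=5, cutoff=cutoff)

print(exact_hits("research"))   # the exact search misses the garbled word
print(fuzzy_hits("research"))   # the approximate search recovers it
print(fuzzy_hits("Clarendon"))
```

Approximate matching can recover some garbled words, but it can also return false matches, and it does nothing for glosses that have been run into the text.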
Therefore, starting with OCR of the digital reproductions, EEBO Text Creation Partnership staff at Oxford and Michigan improve upon it by creating keyed-in text versions of these early English works, and proofread the results for additional accuracy. This effort, though very costly, results in texts of very high usefulness and accuracy, and forms the basis for scholarly projects such as concordances and creation of standard editions of major works.
The searchability also enables scholars to study specific themes in ways previously impossible to explore. Phase One of this project resulted in creation of 25,000 texts, and an additional 25,000 are planned for the next phase, if funding can be secured from library partners.
Intelex. There are commercial vendors, also, which do provide comparable searchability and quality texts, and they depend primarily on libraries to purchase these value-added products on behalf of their users. Of particular note as an exemplar is Intelex and its Past Masters series. Creating authoritative electronic editions of major figures like Aquinas, Aristotle, Augustine, Darwin, etc., this vendor takes care to base its electronic full-text books on standard scholarly editions, often licensing them from university presses and other reputable publishers; in cases where a standard scholarly edition does not yet exist, Intelex works with scholars who are in the process of creating one.
Using such carefully created texts enables the reader to go far beyond what one could find in an index; one can search every word in a text, and the searches are linked to specific pages. Creating concordances and exporting text to a word processor goes from being a laborious process to being a relatively trivial one. For its editions of non-English works (such as Wittgenstein’s Nachlass or Descartes’ Oeuvres Complètes), Intelex also provides an English translation.
As is the case with the EEBO product, the Intelex databases enable the scholar to follow such things as the orthographical eccentricities of particular authors, examine word usage in context, and cross-search one work against another by the same author, or by any other author in the database.
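To suggest why a concordance becomes nearly trivial once an accurate text exists, here is a minimal keyword-in-context sketch in Python. It is not Intelex’s software; the function name `concordance`, the `width` parameter, and the sample passage are invented for illustration.

```python
import re

def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    words = re.findall(r"\w+", text)
    lines = []
    for i, w in enumerate(words):
        if w.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, w, right))
    return lines

# Invented sample passage.
passage = "A good book is a friend. A bad book is a burden. Every book finds a reader."
for left, word, right in concordance(passage, "book", width=2):
    print(f"{left:>15} [{word}] {right}")
```

The same loop run over a poorly OCR’d text would silently skip every garbled occurrence, which is why the accuracy of the underlying text matters so much.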
Alexander Street Press. Of similar note is “The Romantic Era Redefined,” a product of Alexander Street Press, which reproduces 100,000 pages of critical editions of minor authors of the British Romantic era, along with accompanying letters, diaries, speeches, lectures, and journal articles. Searchability features are similar to those in the Intelex products, along with page images of the original texts themselves. A scholar could, for instance, search for the presence of a particular word or phrase in all of the works by women, or compare one author’s use of a theme with that of another author or authors. This kind of research would be crushingly tedious with either printed works alone, or with PDFs that offered very little searchability.
Library Book Collections: Living like Janus
Academic libraries are not in a mad rush to divest themselves of their printed collections, and they still buy printed books, but it might surprise some to learn that there are many smaller academic libraries today whose book collections are predominantly electronic, and this trend will certainly intensify. We are not talking here only about older, standard titles, but also about frontlist, current titles in all fields.
Electronic collections. OCLC’s NetLibrary offers 200,000 electronic books in 20 languages, and in every field of study, licensed from today’s publishers—titles like Freakonomics.
To add books about computing and web design, the easiest method is to subscribe to O’Reilly Media’s Safari Books Online product, which offers immediate access to books on Dreamweaver, C++, Photoshop, etc. As an added bonus, the electronic Safari titles will not immediately be stolen, as is so often the case with printed books in high-demand areas. Pages will not be ripped out, text can be downloaded, and if the library licenses it, multiple users can access the same book at the same time.
For areas in which current information is most crucial, such as medicine, who would not prefer Elsevier’s MDConsult, which continuously updates basic texts, to printed texts which are outdated as soon as they appear?
Other major publishers and their imprints, from Cambridge University Press to Springer, IGI Global, and Taylor & Francis, provide thousands of titles in every field of current endeavor. Libraries can also order “bespoke” collections, commissioning ebrary or other companies to digitize works of targeted interest to specific academic needs, as has been done extensively in England, where universities have used ebrary to create tailored digital book collections.
Findable and discoverable. One problem with library e-book collections up till now has been “findability”; even with our online catalogs, academic library collections have been difficult to navigate and much that we pay for has not been easily “discoverable” (a new library buzzword). Many of our electronic books (such as those included in aggregations) might not be individually cataloged and therefore might be hidden from casual users.
Google’s one-step searching has outshone library catalog search options, and most average users have followed their instinct for ease of searching over quality of results. This disparity will change over the next few years, however, as libraries implement a new generation of Google-like “discovery tools” which will enable much more comprehensive searching of our holdings.
Some things that will not change. Privacy, a continuing concern in the internet age as vendors gather more and more information about who you are and what you are reading, will remain a priority for the academic library, as will intellectual freedom. Libraries may suggest additional materials to you, based on what you are currently using, but they will not use this information to create a dossier on you for future reference, and they will pay for the additional materials they suggest for your use, not charge you for them. The needs of users, rather than quarterly profit and loss statements, will be the overriding goal, achieved through facilitation of research and individualized assistance.
The crucial point to remember regarding electronic books is that, as with traditional print books, libraries will continue to be the primary medium of access. Yes, some users will purchase some books, including textbooks, on Kindle or its forthcoming competitors, but most users do not and really could not purchase all of their own scholarly resources, either in print or in electronic form. The cost of premier products such as those from Intelex and Alexander Street Press would be beyond the means of all but a few dedicated researchers. Some out-of-copyright books and certainly many other classes of materials will be available via the Internet, free of charge (or at least it seems so at present).
Most books, however, will cost something, be they in print or electronic. So the more things change, the more they will remain essentially the same, though the medium of transmission will change.
—Bill Miller <email@example.com>
Library Issues: Briefings for Faculty and Administrators (ISSN 0734-3035) is published bimonthly beginning September 1980 by Mountainside Publishing Co., Inc., 321 S. Main St., #213, Ann Arbor, MI 48104; (734) 662-3925. Library Issues, Vol. 30, no. 1 © 2009 by Mountainside Publishing Co., Inc. Subscriptions: $84/one year; $144/two years. Additional subscriptions to same address $26 each/year. Address all correspondence to Library Issues, P.O. Box 8330, Ann Arbor, MI 48107. (Fax: 734-662-4450; E-mail: firstname.lastname@example.org) Subscribers have permission to photocopy articles free of charge for distribution on their own campus. Library Issues is available online with a password at http://www.libraryissues.com
Produced by Mountainside Publishing Co., Inc.