Book Review

“Ambition in the Stacks: Google and the History of Academic Libraries”: Scott Richard St. Louis on Deanna Marcum and Roger C. Schonfeld’s *Along Came Google: A History of Library Digitization*

The Book

Along Came Google: A History of Library Digitization

The Author(s)

Deanna Marcum and Roger C. Schonfeld

At a moment of widespread digital disillusionment, it can be surprising to recall just how near in time we remain to far brighter attitudes about the role of technology in democratic life. Consider, for example, Bill Clinton’s words during his presidency: “It is a time to build … an America where every child can stretch a hand across a keyboard and reach every book ever written, every painting ever painted, every symphony ever composed” (January 1998 State of the Union Address).[1] Optimistic ambition, at a time when the Web must have felt so fresh and new, surely was easy to come by.

The Google Books project, announced just a few years later in December 2004, was very much a product of such thinking. Writing in The Atlantic in 2017, James Somers called it “the most significant humanities project of our time” with good reason: “You were going to get one-click access to the full text of nearly every book that’s ever been published. Books still in print you’d have to pay for, but everything else – a collection slated to grow larger than the holdings of the Library of Congress, Harvard, the University of Michigan, at any of the great national libraries of Europe – would have been available for free at terminals that were going to be placed in every local library that wanted one.”[2] With such a captivating ideal in mind, two big questions emerge. What went wrong? Where does such a vision fit within the broader intellectual history of libraries and publishers? For the late Deanna Marcum and for Roger C. Schonfeld, these questions animate Along Came Google: A History of Library Digitization. Marcum, at one time Associate Librarian for Library Services at the Library of Congress, and Schonfeld – currently Vice President for Organizational Strategy and Libraries, Scholarly Communication, and Museums at Ithaka S+R – bring considerable expertise to the project of situating Google’s efforts within a history forged by librarians and technologists, publishers and scholars alike. This concise and lively book, enriched by interviews with several key figures in the Google Books saga, charts the rise and fall of the project from original announcement to vision-altering legal conflict. It amounts to essential reading for anyone seeking to understand in context a missed opportunity of great significance in the ongoing story of how scholarship circulates.

To begin, Marcum and Schonfeld observe that the late twentieth and early twenty-first centuries “marked the transformation of libraries from builders and preservers of collections to information nodes that connect information seekers with resources from all over the world” (page 2). This transformation, a century in the making, began humbly, with the development of interlibrary loan. In 1886, University of California, Berkeley librarian U.L. Rowell initiated a partnership with the California State Library, enabling faculty to request delivery of state library materials to their campus (page 16). Rowell continued building an interlibrary loan network into the 1890s, when the American Library Association created a “profession-wide form … that all libraries could use to request materials from other libraries in the United States” via the postal service: “Even though the process was time-intensive, it gave scholars and researchers an opportunity to request materials that would otherwise be unavailable to them” (page 17). Such were the beginnings of a lasting staple in the scholarly resource-sharing infrastructure that is all too easy to take for granted today.

After World War II, higher education boomed in the United States. Robust funding enabled scholarly publication to proliferate, and the most prestigious research institutions built huge library collections, prompting some campus administrators to regard their library systems as excessively burdensome from a financial standpoint. This perception led in turn to academic librarians paying more attention to arrangements for sharing resources across institutions, connecting patrons with the materials they needed without allowing acquisition costs to grow unchecked (pages 21-24). Budget-conscious administrators looking for opportunities to make an impact? Librarians keen to manage resources judiciously while carefully molding perceptions of their work’s value on the wider campus? Plus ça change, plus c’est la même chose!

Of course, collaboration among libraries before the Web extended beyond interlibrary loan. Indeed, such collaboration “became much easier when automation pioneer Henriette Avram” – formerly of the National Security Agency – “introduced Machine Readable Cataloging (MARC) at the Library of Congress in the 1960s” and thus enabled libraries far and wide to begin making “the mental transition from stand-alone institutions with locally developed rules for bibliographic control to nodes in a national network” (pages 28-31). Central to galvanizing such a transition was the creation of the Ohio College Library Center by Frederick Kilgour and colleagues in 1967; just over a decade later, in 1978, this organization would change its name to OCLC, Inc., today known for maintaining the essential WorldCat resource (pages 31-32). The first chapter of the book thus establishes vital historical context: “These opportunities arose from nothing more than the digital availability of what we today call the metadata about our collections. No digitization of the underlying content had yet been undertaken, but the stage was set for thinking of libraries as a national network of information resources” (page 32). Librarians in the United States had accomplished much in their first century of professionalization. Still, much remained to be done.

In the second chapter, Marcum and Schonfeld explore digitization-related efforts other than Google Books. Covered in this section are Gloriana St. Clair and Raj Reddy of the Million Book Project at Carnegie Mellon University (pages 40-43), Brewster Kahle of the Internet Archive (pages 43-47), James H. Billington and Laura Campbell of the Library of Congress (pages 47-50), Paul Courant and Wendy Pradt Lougee of the University of Michigan (pages 50-54), Patricia Battin of the Digital Library Federation (pages 56-64), and William G. Bowen of the Mellon Foundation and JSTOR (pages 64-69). The primary purpose of surveying this cast of characters – hailing from academic administration, computer science, economics, grantmaking, librarianship, and strategic planning – is to illustrate that any effort toward realizing the dream of a comprehensive digital library, centralized or not, would demand contributions drawn “from a variety of different dreamers” capable of seeing “real possibilities of being able to make library collections easily and freely accessible” (page 70). No one way of thinking alone can prove sufficient; instead, interdependence is essential.

Chapter three introduces Google to the story. In December 2004, Google announced a partnership with five major library systems: the University of Michigan, Stanford, Harvard, the New York Public Library, and Oxford University. These partnerships were expected to allow for the digitization of ten million books (pages 88-90). Why would Google invest any of its vast resources in a project so obviously replete with legal risk from an intellectual property perspective? Marcum and Schonfeld explain that “Google’s vision was ‘to organize the world’s information and make it universally accessible,’ which conveniently enough also served the interests of the advertising-driven business model that would propel the company and its founders to unimaginable riches … This was something new: a strong technology partner with the deep pockets and audacious mind-set to try to make something tangible happen … and library leaders willing to take big risks in order to achieve their goals” (pages 73-82). The early years of the twenty-first century therefore played host to serious potential for dramatically expanded access to some of the world’s greatest library collections.

In chapter four, Marcum and Schonfeld explain that the “biggest beneficiaries of the Google book digitization project stood to be those outside academia – individuals who previously had at best limited access to the long tail of society’s cultural and intellectual heritage. Mass digitization stood to make a meaningful, if incomplete, improvement in this access. Many were enthusiastic about the possibility” (page 96). Chapter five, however, reveals that not everyone paying attention to the endeavor cheered it onward wholeheartedly: “In this moment of possibility, the academy and its allies were not universally supportive. Without any question, some were jealous about the ability of a commercial enterprise to take such a bold step toward universal access. Others were concerned about the displacement of libraries and librarians. But many also feared the prospect of a single major entity gaining outsized influence over the digital future of society’s shared intellectual and cultural heritage” (page 105). Conflict was on its way. Would the original vision prevail?

Chapter six details the legal trouble into which the Google Books project fell, and includes the workings of an ill-fated settlement forged by the various concerned parties:

“With the [copyright] status of approximately 75 percent of recorded texts being uncertain, neither publishers nor librarians had been willing to take the risk of investing large sums to digitize these materials. Google was willing to take the risk … Public domain works would be available in their entirety, with every page made available in full. For those books that were still in print, Google promised to work with publishers to determine what parts of their books would be accessible and under what conditions. For works for which it had no permission to display and which might still be under copyright, Google would only show ‘snippets.’ The copyright owner could opt out of participating at any time after establishing ownership of the copyright … Asking publishers to opt their titles out, rather than accepting the burden of working with publishers to negotiate for which titles would be included, appears to have enraged the publishers” (pages 135-141).

Consequences came swiftly: “The Authors Guild, along with a handful of individual authors, filed suit against Google in the United States District Court for the Southern District of New York on September 20, 2005. Five publishers – McGraw-Hill, Pearson, Penguin, Simon and Schuster, and John Wiley & Sons – sued Google in the same court one month later on October 19, 2005. Ultimately, the two cases were consolidated into a single action” (page 142). Google’s technology-intensive approach to a longstanding utopian dream collided with the stubborn realities of copyright in the American legal context. Here, some words from former Harvard University Librarian Robert Darnton are germane: “I’m in favor of copyright, don’t get me wrong, but really … to keep most American literature behind copyright barrier … I think is crazy.”[3] This barrier demanded attention, crazy or not.

Even so, all was not yet lost. Parties to the lawsuit worked out a settlement:

“publishers would be able to participate in a collective licensing regime for out-of-print books. Google would be able to display and sell the digital versions of these … books, but 63 percent of the revenue would go into escrow for the Book Rights Registry. The registry’s funds would be distributed to rights holders as they came forward to claim their rights. In cases where ownership was not clear, registry funds would be used to sort out who actually owned the rights. The proposed settlement had the virtue of achieving the authors’ and publishers’ dreams of being paid something for their work and having their works read by larger audiences” (page 144, emphasis in the original).

Richard Sarnoff, then chairman of the board for the Association of American Publishers, “frankly admitted that the publishers had a great deal to gain from mass digitization. They had been concerned for a long time about the great number of books that were out of print but still in copyright … Digitization was costly. Publishers were not set up to do that work. Most importantly, they did not know if the demand for these older works would justify the expense of digitizing them” (page 145). Add these concerns to the onerous warehousing costs diminishing publishers’ profit margins, detailed by Marcum and Schonfeld on pages 134-135, and it becomes clear that this settlement was promising for publishers and Google alike.

So, why did the settlement fail? In a nutshell, antitrust concerns expressed by the federal government: “With the Justice Department recommending against the settlement, the Court rejected it and sent it back to the parties to renegotiate … a revised settlement agreement that was far more narrowly scoped … Regrettably, the bold idea the founders of Google had for making all the world’s knowledge accessible was not to be realized” (pages 150-151). Chapter seven begins on a solemn note: “With the failure of the ambitious settlement agreement, the prospect of radically democratizing access to these digitized materials ended” (page 160). Copyright issues brought Google to court, and antitrust considerations dealt an additional blow that was perhaps more surprising.

What, then, is the ultimate significance of the Google Books project? Looking back on mass digitization efforts through a historical lens, for Marcum and Schonfeld, “it is clear that while many individuals and organizations played vital roles, none was more significant than that of Google. Even though the project that resulted and the impacts that it had were ultimately limited relative to the vision, millions of books have been digitized, the information they contain was made more discoverable, and access to many of them improved dramatically” (page 5). Chapter eight strikes a more hopeful tone: “instead of a single coordinated program to provide digital access for the entire intellectual and cultural record that is easy to use and ubiquitously accessible, perhaps we can imagine another model for the universal library: one that is the accumulation of many efforts, all of them ultimately incomplete, controlled by an array of different actors” (page 194, emphasis in the original). From interlibrary loan and Google Books and the Internet Archive to HathiTrust and beyond, such an accumulation is already taking form. Marcum and Schonfeld discuss HathiTrust extensively (pages 165-187). Progress doesn’t always sprint ahead, but sometimes – when enough talented and well-meaning people get involved – it stumbles forward. However, we remain far away from a world of universal digital access even just for those in academia, let alone for the curious public at large.

In an epilogue, Marcum and Schonfeld resist facile conclusions about the ultimate fate of the Google Books project, remarking instead that we are left with a persistent challenge on the road to a comprehensive accumulation of freely accessible digital library services: “The current system of local funding for libraries makes it difficult to justify large expenditures to address national problems” (page 210). When collaboration is necessary to address system-level issues and goals, who dedicates their resources first? Who gets to ride along for free? The consequences of such constraints are potentially grave: “It is not clear that new generations will be able to distinguish trustworthy knowledge from misinformation” (page 210). To this bleak illustration of high stakes, I might add – sadly – that it often seems large segments of all generations already are ill-equipped to make this distinction for themselves. Libraries, as institutions advocating for information literacy and energized by civic purpose, surely have a role to play in alleviating the social ills of our digital age. What role the dream of a universal library for everyone might play in such alleviation remains up for discussion. Sometimes, though, dreaming big – from interlibrary loan to MARC to mass digitization – might be more practical than it seems.

[1] Marcum and Schonfeld include Clinton’s words in their book on page 151. For a full transcript, visit the following resource from the Miller Center at the University of Virginia: https://millercenter.org/the-presidency/presidential-speeches/january-27-1998-state-union-address

[2] James Somers, “Torching the Modern-Day Library of Alexandria,” The Atlantic, April 20, 2017, https://www.theatlantic.com/technology/archive/2017/04/the-tragedy-of-google-books/523320

[3] Ibid.

About the Reviewer

Scott Richard St. Louis holds an MS in information science from the University of Michigan. He represents only himself with his posts.