By Paula J. Hane
The HathiTrust Digital Library announced on June 18, 2013, that it will partner with the recently launched Digital Public Library of America (DPLA) to expand discovery and use of HathiTrust’s public domain and other openly available content. DPLA provides an online portal to freely available digital material held by libraries, archives, and museums across the U.S. This collaboration is noteworthy for bringing together two complementary efforts: HathiTrust, with its academic and research library focus, and the DPLA, with a more public library focus.
Of HathiTrust’s nearly 11 million volumes, the 3.5 million that are in the public domain or have been made publicly available by rights holders will become a DPLA “content hub,” accessible to DPLA users via the sharing of metadata records. HathiTrust will continue to host and preserve the digitized volumes.
According to HathiTrust executive director John Wilkin, the partnership reflects the complementary nature of the two organizations. “The first priority of HathiTrust has always been preservation,” he said. “But to fulfill the preservation mission, we must provide access: content that can’t be found and used risks being forgotten.” Wilkin stressed that HathiTrust will continue to enhance its own discovery and access platform, first launched in 2008. But DPLA puts HathiTrust’s collection before a broader audience, alongside innovative search and use tools, including timelines, maps, and a growing number of apps.
A notable part of the announcement is the support from OCLC, the worldwide library cooperative, for the contribution of records possibly derived from its WorldCat database. The HathiTrust metadata will be contributed under the terms of a Creative Commons “CC0” license. Sandy Yee, chair of the OCLC Board of Trustees, explains that DPLA’s Data Use Best Practices, which request that users provide attribution to metadata providers, are in keeping with OCLC community data norms. Yee said, “We are very pleased to support the discovery of this rich aggregation of freely available texts via the DPLA. Their work and that of HathiTrust amplifies and extends the efforts of the thousands of library contributors to the OCLC cooperative.” (OCLC normally recommends use of the Open Data Commons Attribution License (ODC-BY).)
The partnership officially began on June 18, and the data is now being transferred from HathiTrust to the DPLA. DPLA will be working to add a special interface for books to supplement its novel map and timeline browsing interfaces, but the HathiTrust content will be available through the current site as soon as the data is loaded.
Dan Cohen, DPLA’s executive director, said, “Over the last five years, HathiTrust has built an incredible digital infrastructure to store the scanned holdings of its many university and library partners, and we in turn look forward to providing a large general audience for these valuable works, and new pathways into them.”
For a recent interview with Cohen, view the video on This Week in Libraries, TWIL #98.
A subset of some 1 million volumes from HathiTrust has been searchable since April 2013 in StackLife, which was developed independently of the DPLA as an example of how people can create their own “front end” to the DPLA’s collection, mashing it up with other collections. That makes the DPLA especially useful, since it doesn’t have to supply the only way of accessing its data. StackLife, created by Harvard’s Library Innovation Lab, shows you a book’s spine on the shelf, as it would appear in a physical library.
HathiTrust began in 2008 as a collaboration of the thirteen universities of the Committee on Institutional Cooperation, the University of California system, and the University of Virginia to establish a repository to archive and share their digitized collections. HathiTrust has quickly expanded to include additional partners and to provide those partners with an easy means to archive their digital content.
The initial focus of the partnership has been on preserving and providing access to digitized book and journal content from the partner library collections. This includes both in-copyright and public domain materials digitized by Google, the Internet Archive, and Microsoft, as well as through in-house initiatives.
Wilkin successfully guided HathiTrust’s defense when the Authors Guild sued HathiTrust and a handful of its partner libraries in 2011. The court ruled that HathiTrust’s book digitization project is protected by fair-use principles. Recently the Guild filed an appeal; this was followed by coalitions of libraries, colleges, and universities filing friend-of-the-court briefs supporting the HathiTrust defendants. The Guild and its co-plaintiffs will be filing their reply brief later this month. Wilkin has been named Dean of Libraries and University Librarian at the University of Illinois at Urbana-Champaign, effective Aug. 16, 2013, and will be stepping down from his position at HathiTrust.
The DPLA, which officially launched on April 18, 2013, is a large-scale, collaborative project working toward “the creation of a unique and consolidated digital library platform, ensuring America’s cultural and scientific record is free and publicly accessible online through a single access point, available anytime and anywhere.” The new portal delivers millions of materials found in American archives, libraries, museums, and cultural heritage institutions to students, teachers, scholars, and the public.
DPLA represents “[m]any decades in the visioning, two and a half years in the planning, with a small steering committee and an incubation hub at the helm, and featuring dozens of great libraries, universities and archives involved in hundreds of meetings, workshops, plenary meetings, and hackathons, attracting thousands of volunteers backed by millions of foundation and government dollars,” according to Doron Weber, vice chair of the DPLA Steering Committee and vice president of programs at the Alfred P. Sloan Foundation, a major, active funder of the project.
Partners are a critical component of DPLA. In addition to HathiTrust, DPLA is collaborating with Europeana, The National Archives, the New York Public Library (NYPL), the Smithsonian Institution, and many other partners.
Paul Piper, writing in the March/April 2013 issue of Online Searcher, saw the importance of both of these worthy projects well before the HathiTrust/DPLA partnership was announced:
“One can’t help being deeply excited by the prospects these directions herald for the library and information field. Never before have collaborations of such enormous scope and complexity been possible. Whether they will succeed or not can only be measured in the future. How they will be measured is also up in the air.
“Some would already call both projects a success, as they have identified the possible realms for future digital libraries. But one thing is certain: The collaborative and technical structures created by both of these projects will have major ramifications on how future generations do research. And that’s a very good thing.”
Paula J. Hane is a freelance writer and editor covering the library and information industries. She was formerly Information Today, Inc.’s news bureau chief and editor of NewsBreaks. Her email address is firstname.lastname@example.org.