Impatiently Waiting for the Innovation to Begin: PDF & the Future of ‘Reading’

Rapid ebook adoption has been credited both to the rise of sophisticated mobile devices (led by smartphones, internet-linked ereaders, and tablets) and to the portability and convenience of having information easily accessible on the go. In the popular consumer market, EPUB and MOBI formats have made significant inroads. However, the rather static PDF still dominates in most academic/library markets.

Ebooks continue to gain popularity and acceptance, with sales rising 43% in 2012 as measured by BookStats, the market research data from the Association of American Publishers (AAP) and the Book Industry Study Group. Although this self-reported growth is far below the triple-digit increases of the previous 3 years, it is still significant. Their analysis found that 457 million ebooks were sold in 2012 (still significantly fewer than the 557 million hardcovers), and their update on first quarter 2013 found 5% growth in ebook sales. BookStats data is based on information voluntarily submitted by 1,192 publishers (not all publishers participate), and although not comprehensive, it provides the best available measure of the market in the U.S. Nielsen Research, reporting on global ebook sales with a focus on fiction, shows similarly slowed growth while still predicting that fiction ebooks will surpass print in 2014.

However, today the scholarly ebook and publishing markets are still plagued by issues of functionality, accessibility, and DRM. On Sept. 5, 2013, Wiley Online Library alerted customers that it would be imposing limits on user downloads as a first step to address “a growing number of deliberate attempts to gain access without paying” for “publishers’ and other content providers’ Intellectual Property (IP).” Christopher M. McKenzie, vice president, Global Intellectual Property Management at Wiley, explained that “effective immediately, there will be a limit on downloading activity by registered users on Wiley Online Library. These users will be restricted to 100 full-text article/chapter/encyclopedia entries per day based on the previous day’s usage. Currently customers who access content via IP authentication will not be limited. Going forward monitoring and alerting mechanisms will be rolled out to cover all access methods.”

PDFs offer publishers a stable format that is easier to control and meter for access. Because of their page-view appearance, they were a natural choice years ago for journal and book publishers that wanted to enter new online markets and to save on production by publishing their works electronically rather than in print. PDF was stable, was based on the same models and production systems as print materials, and made it quite easy to protect the copyright in the materials while still expanding market reach. Economically, it served publishers well by ending the need for print inventories and a backorder process. PDF also had the advantage of preventing any aftermarket sales through used-book dealers or even tag sales.

Publishing conversion house Aptara’s Peter Rogers notes that “PDF has become the major e-distribution format for the following reason…. In the majority of markets, digital still remains the smaller part of the revenue and when producing paper you need the PDF format to print. Publishers can use this as their e product so keeping costs down. Slightly different in some publishing fields where search and republishing is an important aspect (STM, legal etc…) where XML first workflows are now the norm but when producing to multiple platforms in multiple formats there [are] extra costs that need to be factored in to the revenue equation. Another reason why there has been slow take-up (this really applies in Europe) is that the publishing arm[s] have not really accepted the layout of electronic formats and still feel that the paper layout is the thing of beauty that you cannot get in epublishing yet.”

The persistence of PDF, read through Adobe Reader, as a format of choice now seems archaic: it offers few links or added media “bells and whistles,” it is static, and it is difficult to port to newer mobile devices or to open up to annotations and other user input. Rogers believes that change will come, but perhaps slowly in traditional library markets. “STM will lead the way as search is money in this sector; education will follow as the students are demanding more features and flexibility in their texts. Trade books see these features as ways to enhance and increase their markets. Again the push backs are cost of creation, hardware/software compatibility, and versioning.”

Given the costs of conversion and publisher concerns about digital rights, libraries will probably need to keep supporting legacy PDF/Adobe Reader formats well into the future. The cost of converting to newer formats will remain an economic issue for publishers. Just as libraries are often saddled with CD-ROM discs that don’t play on most contemporary operating systems, archival collections of PDFs created from traditional print processes will probably fill libraries’ virtual shelves for years to come.

The Pace of Change is Accelerating

The year 2013 may prove to be pivotal for the transformation of formerly print media with the release of EPUB 3, built on HTML5. It offers developers a rich set of tools and options for creating new types of creative and informational media, so that we may soon need a new word for these products in place of “ebooks.” The transition of EPUB from version 2 to 3 maps directly to the transition from ereader models (where black and white text dominates) to tablets, with their options for color, video, and interactivity.

EPUB, as defined by the International Digital Publishing Forum (IDPF), “is the distribution and interchange format standard for digital publications and documents based on Web Standards. EPUB defines a means of representing, packaging and encoding structured and semantically enhanced Web content—including XHTML, CSS, SVG, images, and other resources—for distribution in a single-file format. EPUB allows publishers to produce and send a single digital publication file through distribution and offers consumers interoperability between software/hardware for unencrypted, reflowable digital books and other publications.” How long journal and STEM publishers hang on to PDF as the format of choice is, today, anyone’s guess.

EPUB 2, released in 2007, was a major milestone for the industry and remains the standard in place for most ereaders and ebooks today. In 2011, the IDPF released EPUB 3. IDPF’s executive director Bill McCoy explains that today “EPUB is by far the most prevalent format used for ebooks.” PDF, he explains, is fading fast among today’s publishers. “PDF represents [a] very small percentage [of] overall ebook sales at this point, although it is still widely used for ad hoc document interchange and some categories of commercial publications (like journals and STM), but even that is rapidly changing. EPUB moves beyond the final-form limitation of PDF, provides robust accessibility support and EPUB 3, the latest version, converges portable documents with HTML5 and the modern web platform.”

Many conversion services exist for publishers today. Chris Paddilla, Convert A Book CEO, notes that PDFs are already becoming scarce in the mobile environment. “PDF’s mobile presence will continue to decline in favor of newer technologies like ‘Enhanced EPUB,’ which not only allows for read-along technology, but also highlights each word as it is read aloud by the device. There is a push for content to be responsive. Responsive being a way of designing a website and content to look good on multiple devices and platforms according to their screen size. I see something a bit more coming and it might just be through EPUB 3.”

More might be coming to readers, but how will libraries be involved? Some book publishers (not willing to be quoted here) still suggest that libraries, with their role in preservation and access, may not have a central role in future information distribution in the digital age.

We Are Both Part of the Problem and Part of the Solution

“PDFs have two claims on publishers,” publishing consultant Joseph Esposito explains, “one is cultural, the other economic. The cultural issue is that the PDF is a fixed format. Publishers (and librarians) are married to the idea of the fixed text. It becomes the version of record; it stands for all time, etc. Contrast this to the shifting shapes of the web. The economic issue is that investing in formats beyond PDF doesn’t always yield more money—you end up selling the same number of units to the same number of customers regardless of what format the content is in. So moving away from the PDF is an innovation without a clear economic component…these issues are hard for publishers to overcome.”

Good e-Reader blogger Mercy Pilkington believes emphatically that consumer ebook developers have now “moved beyond PDF! All Kindle books are MOBI files and all Nook, Kobo, Sony, Smashwords, and most other retailers/distributors are in EPUB today.”

Creative development appears to be running far ahead of issues like DRM. At this point, it would appear that decades of our collections may remain in PDF (or digital print) versions unless libraries are able to pay the costs of conversion, as many have done when buying digitized newspaper backfiles, whenever these might become available.

As newer types of media and publication are developed, our former “maps” to information will need to change as well. Page numbers may cease to have any relevance. Dates of publication may have little meaning with the ability to constantly update information. How will we find, cite, and use these new resources? As information becomes more fluid in the future, academe will need to find ways to adapt to these changes. This may prove to be every bit as challenging as adapting to the new technologies themselves.

Nancy K. Herther is Sociology/Anthropology Librarian at the University of Minnesota, Twin Cities campus.