Guest blog by John Parsons
The rise of digital STM publishing, and the ongoing discussion about open access and subscription-based models, have led some to conclude that these changes inexorably lead to lower overall publication costs. The reality is more complex.
In my last blog post, I discussed the open access (OA) model for scholarly STM publishing. In a nutshell, OA allows peer-reviewed articles to be accessed and read without cost to the reader. Instead of relying on subscriptions, funding for such articles comes from a variety of sources, including article processing charges (APCs).
There are many misconceptions about OA, including the mistaken notion that OA journals are not peer reviewed (false) and that authors typically pay APCs out of pocket (also false). However, a more serious problem occurs when we fail to account for all the costs of scholarly publishing—not just the obvious ones.
Digital Doesn’t Mean Free
Part of the problem is the Internet itself. Search engines have given us the ability (in theory) to find the information we need. Many non-scholarly publishers, particularly newspapers, have published content for anyone to read in the misbegotten hope of selling more online advertising. The more idealistic among us have given many TED Talks on the virtue of giving away content, trusting that those who receive it, or at least some of them, will reciprocate.
What may work for a rock band does not necessarily work in publishing, however. This is partly because publishing is a complex process, with many of its functions unknown to the average scholar or reader.
Behind the Screens
The obvious publication costs of scholarly publishing—peer review, editing, XML transformation, metadata management, image validation, and so on—are daunting for anyone starting a new journal. If they want to be considered seriously, publications using the “Gold” open access model have to be able to handle these production costs over the long term. They also have to invest in other ways—to enhance their brand, and provide many of the services that scholars and researchers may take for granted.
The first of these hidden costs is the handling of metadata. The OA publishing model, and digital publishing in general, has resulted in an explosion of available content, including not only peer-reviewed articles but also the data on which they are based. Consistent metadata is critical to finding any given needle in a growing number of haystacks. Metadata is also what links research to its later updates (think Crossref) and tracks errata.
The trouble is that metadata is easy to visualize, but it takes work and resources to implement well. Take, for example, the seemingly simple matter of author name fields. The field for author surname (or family name, or last name) is typically text, but how does it accommodate non-Latin characters or accents? Does it easily handle the fact that surnames in countries like China are not the “last” name? The problem is usually not with the field itself, but with how it’s used in a given platform or workflow.
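To make the author-name questions above concrete, here is a minimal Python sketch. It is not taken from any particular publishing platform: the `AuthorName` record, its field names, and the `normalize` helper are all hypothetical. The idea it illustrates is to store name parts by role (family, given) rather than by position, keep the display order as separate data, and use Unicode normalization so that accented and non-Latin names compare consistently.

```python
import unicodedata
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Trim whitespace and apply NFC, so an accented letter typed as one
    code point equals the same letter typed as base + combining mark."""
    return unicodedata.normalize("NFC", text.strip())


@dataclass(frozen=True)
class AuthorName:
    """Hypothetical record: name parts are stored by role, not position."""
    family: str
    given: str
    family_first: bool = False  # display order, e.g. for many Chinese names

    def display(self) -> str:
        # Choose the order from data instead of assuming "surname is last".
        if self.family_first:
            parts = (self.family, self.given)
        else:
            parts = (self.given, self.family)
        return " ".join(p for p in parts if p)

    def sort_key(self) -> str:
        # NFKD splits accented letters into base letter + combining mark;
        # dropping the marks gives an accent-insensitive key for sorting.
        decomposed = unicodedata.normalize("NFKD", self.family)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        return stripped.casefold()
```

With this shape, `AuthorName("Li", "Na", family_first=True)` displays with the family name first, while `AuthorName("García", "Ana")` displays given name first and still sorts alongside "Garcia". A production system would need more than this (multiple family names, particles, mononyms), which is exactly where the hidden cost comes from.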
Another hidden metadata cost is the emergence of standards, and how well each publishing workflow handles them. For example, the unique author identifier ORCID has gained prominence, but researchers and contributors may not automatically use it. There are many such metadata conventions, each representing a cost the publisher absorbs so that scholars can focus on their work without undue publishing distractions.
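As one small illustration of what handling a standard involves, an ORCID iD ends in a check character computed with the ISO 7064 MOD 11-2 algorithm, so a workflow can catch many typos before a record is ever submitted. The sketch below assumes the hyphenated 16-character form; the function names are illustrative, and a real workflow would also confirm the iD against the ORCID registry rather than rely on the checksum alone.

```python
import re

# Hyphenated form: four groups of four; the final character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its checksum matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]
```

A check like this only shows that an identifier is plausible, not that it belongs to the author in question. That second step is where registry lookups and author confirmation come in.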
Another hidden cost is presentation. From simple, easy-to-read typography to complex visual elements like math formulae, the publisher’s role (and the corresponding cost) has expanded. What was once a straightforward typesetting and design workflow for print has become a complex, rules-driven process for transforming Word documents and graphic elements into backend XML, which fuels distribution.
The publishing model has drastically changed from a neatly packaged “issue publication model” to a continuous publication approach. This new model delivers preprints, issues, articles, or abstracts to very specific channels. The systems and workflows that support the new publication model require configuration and customization, all of which carry production costs.
Automation Is the Key
Very few publishers can maintain the required production work in house. Technology development, staffing, and innovation are costly to sustain. The solution is to rely on a trusted solutions provider that performs such tasks for multiple journals. Typically, this involves the development of automated workflows that simplify metadata handling and presentation issues, using a rules-based approach for all predictable scenarios. This, of course, relies on a robust IT presence, something a single publisher or group typically cannot afford alone. Automated workflows involve an initial setup cost but, ideally, improve editorial quality and shorten both turnaround and time to publication.
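The rules-based approach mentioned above can be pictured with a very small Python sketch. Everything in it is an assumption made for illustration: the record layout (`title`, `doi`, `author_family`), the three rules, and the `failed_rules` helper do not describe any specific provider's system. The point is only that predictable checks can be written once as data and then applied to every manuscript, leaving people to handle the exceptions.

```python
from typing import Callable, Dict, List, NamedTuple

Record = Dict[str, str]  # hypothetical flat metadata record


class Rule(NamedTuple):
    name: str                        # human-readable description of the rule
    check: Callable[[Record], bool]  # returns True when the record passes


# Illustrative rules only; a real workflow would have many more.
RULES: List[Rule] = [
    Rule("title is present", lambda r: bool(r.get("title", "").strip())),
    Rule("DOI starts with '10.'", lambda r: r.get("doi", "").startswith("10.")),
    Rule("author family name is present",
         lambda r: bool(r.get("author_family", "").strip())),
]


def failed_rules(record: Record, rules: List[Rule] = RULES) -> List[str]:
    """Apply every rule and return the names of those the record fails."""
    return [rule.name for rule in rules if not rule.check(record)]
```

A record that passes every rule returns an empty list and can move on automatically. Anything else is routed to an editor along with the names of the rules it failed.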
By offloading the routine, data-intensive parts of the publishing workflow to a competent service provider, publishers and scholars can spend more time on actual content and less time on the mechanics of making it accessible to and usable by other researchers.