AI and NLP for Publishers: How Artificial Intelligence & Natural Language Processing Are Transforming Scholarly Communications

A free report from Cenveo Publisher Services

You may have heard how artificial intelligence (AI) is being deployed within the information industry to combat fake news, detect plagiarism, and even recommend content to users. Until now, however, AI has had minimal impact on the content creation and editorial functions of the publishing ecosystem. For scholarly publishers in particular, AI capabilities have advanced to the point where they can automate significant portions of publishing workflows, with massive implications for publishers, their authors, and the research community.

AI is a method by which humans train machines to identify patterns and learn new patterns. It involves developing algorithms that enable machines to quickly process large swaths of data, recognize the patterns within that data, and make decisions or recommendations based on that analysis.

Natural language processing (NLP) incorporates grammar analysis into machine learning. A computer program is trained to recognize the noun, verb, and object in a sentence, and to understand the structure of words in order to discern their meaning.

With NLP technology, publishers can automate simple editing and formatting tasks and focus their energy on adding greater value to the content. They can also manage more journal submissions or speed up tedious peer review without significantly increasing staff or production costs.

Traditionally, all articles submitted to an academic journal undergo a similar process with multiple rounds of corrections and changes before copyediting, formatting, composition and proofing. All told, this system could take several weeks before the article is published.

On the other hand, AI and NLP technology can implement pre-set grammar and formatting rules to analyze the content and score articles for quality. The technology will automatically correct minor grammar and punctuation errors, and flag more complex issues that may need an editor’s attention. Journal submissions that are high-quality can advance straight to the typesetting and composition stage.

AI & NLP technology can flag content that requires an editor's review

Because editing is often the most time-consuming part of the production process, fast tracking high-quality articles to the composition stage can save a significant amount of time for publishers—while also improving the author experience.
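The triage described above (correct minor errors automatically, flag complex issues for an editor, and fast-track high-quality submissions) might be sketched as follows. This is a minimal illustration, not the system discussed in the report: the rules, the `triage` function, the `TriageResult` type, and the 0.9 threshold are all assumptions made for the example, and a production system would rely on trained NLP models rather than a few regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rules, for illustration only.
AUTO_FIXES = [
    (re.compile(r" {2,}"), " "),                          # collapse repeated spaces
    (re.compile(r"\s+([,.;:])"), r"\1"),                  # no space before punctuation
    (re.compile(r"\b(\w+) \1\b", re.IGNORECASE), r"\1"),  # doubled words ("are are")
]
FLAG_RULES = {
    "very long sentence (over 40 words)": lambda s: len(s.split()) > 40,
    "unresolved citation placeholder": lambda s: "[citation needed]" in s.lower(),
}
FAST_TRACK_THRESHOLD = 0.9  # assumed cut-off, not a published figure


@dataclass
class TriageResult:
    text: str                 # manuscript text after automatic corrections
    fixes_applied: int        # number of minor corrections made
    flags: list = field(default_factory=list)  # issues needing an editor
    score: float = 1.0        # share of sentences with no flag
    route: str = "typesetting"


def triage(manuscript: str) -> TriageResult:
    # Step 1: apply the automatic corrections and count them.
    text, fixes = manuscript, 0
    for pattern, replacement in AUTO_FIXES:
        text, count = pattern.subn(replacement, text)
        fixes += count

    # Step 2: flag sentences that need an editor's attention.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    flags, flagged_sentences = [], set()
    for number, sentence in enumerate(sentences, start=1):
        for label, rule in FLAG_RULES.items():
            if rule(sentence):
                flags.append(f"sentence {number}: {label}")
                flagged_sentences.add(number)

    # Step 3: score the manuscript and choose a route.
    score = 1.0 - len(flagged_sentences) / max(len(sentences), 1)
    route = "typesetting" if score >= FAST_TRACK_THRESHOLD else "editor review"
    return TriageResult(text, fixes, flags, score, route)


result = triage("The  results are are clear . We confirm the hypothesis.")
print(result.route, result.fixes_applied, result.text)
```

A submission with only minor, automatically correctable errors is routed straight to typesetting, while one with a flagged sentence is held for an editor.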

In our latest report, AI and NLP for Publishers, we explore how AI and NLP are being used today in scholarly publishing and how they may impact the evolution of research. We also explore how the technology works and how publishers like Taylor & Francis are, with the help of Cenveo Publisher Services, realizing the benefits of intelligent automation.

Download the free report.

 

Smart Suite 2.0 Released - A New Approach to Pre-editing, Copyediting, Production, and Content Delivery

Smart Suite Version 2.0 is a cloud-based ecosystem of publishing tools that streamlines the production of high-quality content. The system has a completely redesigned user interface (UI) and tighter integration with high-speed production engines to solve the challenges related to multi-channel publishing.

Smart Suite 2.0 is the next generation publishing engine that focuses on a combination of artificial intelligence, including NLP, and system intelligence that eliminates human intervention and achieves the goal of high-speed publishing with editorial excellence. Smart Suite auto generates multiple outputs, including PDF, XML, HTML, EPUB, and MOBI from a manuscript in record-setting time.
— Francis Xavier, VP of Operations at Cenveo Publisher Services

Offering a fresh approach to streamline production, the unified toolset comprises four modules that seamlessly advance content through publishing workflows while validating and maintaining mark-up language behind the scenes.

  • Smart Edit is a pre-edit, copyedit, and conversion tool that incorporates natural language processing (NLP) and artificial intelligence (AI) to benefit publishers not only in terms of editorial quality but also better, faster markup and delivery to output channels.
  • Smart Compose is a fully automated production engine that ingests structured output from Smart Edit and generates page proofs. Designed to work with both 3B2 and InDesign, built-in styles based on publisher specifications guarantee consistent, high-quality layouts.
  • Smart Proof provides authors and editors with a browser-based correction tool that captures changes and allows for valid round tripping of XML.
  • Smart Track brings everything together in one easy UI that logs content transactions. The kanban-styled UI presents a familiar workflow overview with drill-down capabilities that track issues and improve both system and individual performance.

Smart Suite is fully configurable for specific publisher requirements and content types. Customized data such as taxonomic dictionaries, and industry integrations such as FundRef, GenBank, and ORCID, enhance the system based on publisher requirements.

 

Mike Groth

Michael Groth is Director of Marketing at Cenveo Publisher Services, where he oversees all aspects of marketing strategy and implementation across digital, social, conference, advertising and PR channels. Mike has spent over 20 years in marketing for scholarly publishing, previously at Emerald, Ingenta, Publishers Communication Group, the New England Journal of Medicine and Wolters Kluwer. He has made the rounds at information industry events, organized conference sessions, presented at SSP, ALA, ER&L and Charleston, and blogged on topics ranging from market trends, library budgets and research impact, to emerging markets and online communities. Twitter Handle: @mikegroth72

Taylor & Francis Group Awards Full-Service Production for Global Journal Content to Cenveo

Cenveo’s Technological Innovation Aligns With Taylor & Francis’ Journal Publishing Vision

Cenveo announces a major increase in full-service content production for Taylor & Francis’ global journal production program. Taylor & Francis selected Cenveo as a core content service provider to support its continued growth.

As a world-leading academic and professional publisher, Taylor & Francis cultivates knowledge through its commitment to quality. Taylor & Francis identified in Cenveo a shared vision to develop production workflows designed to improve the velocity of research dissemination. This planned strategic initiative enhances customer experience for Taylor & Francis' contributor base, particularly newer generations of researchers and scientists, without alienating its traditional market.

“The critical piece that convinced us Cenveo was the right partner was their technology stack supports our publishing model and provides real-world, expedited publication turnaround times using AI and natural language processing technology,” explains Stewart Gardiner, Global Production Director of Journals at Taylor & Francis Group. “The organizational and operational innovations Cenveo proposed to support a rapid scale-up in production volumes were something we haven’t seen from other providers and were clearly based on lessons learned in previous ramp-ups.”

In February 2018, Cenveo announced a financial restructure and reorganization to strengthen its fiscal health. Mr. Gardiner remarks, “Given the company is currently reorganizing following a Chapter 11 process, our legal and financial people looked at Cenveo closely and came to the view that this is a relatively straightforward debt for equity restructure. Refinancing of this sort is not out of line with what one might expect for a company in Cenveo’s market position, scale, and acquisition history.”

Cenveo and Taylor & Francis have shared a long work history prior to this fivefold increase in volume. The transition process has already begun and onboarding the additional Taylor & Francis work is scheduled to take place in structured phases throughout the remainder of 2018.

“This major win is a result of considerable work and effort that we have put into the next generation of Smart Suite combined with a focus on operational excellence,” explains Atul Goel, EVP Global Content Operations and President and COO of India Operations at Cenveo. “We are grateful for the trust placed in Cenveo by Taylor & Francis and heartened that Cenveo’s long-term vision of innovative publishing workflows aligns with a global leader in publishing.”

Cenveo is consistently rated as one of the highest performing content service providers by its customers. Cenveo’s ongoing commitment to publishers and extensive experience with volume ramp-up is further demonstrated by its significant investments in technology and staff.

Accessibility FAQs

The topic of accessibility is a priority for all types of publishers in 2018 and we project it's the year the majority will invest in making content accessible for all readers.

Cenveo Publisher Services recently hosted a webinar on accessibility: "Digital Equality - The Importance of Accessibility in Your Publishing Strategy." If you did not catch the live webinar, you can stream it here. We received so many great questions during the webinar. However, we ran out of time before we could answer every one!

Following is a list of FAQs about content accessibility:

For decorative images, can you use alt text that reads something like “decorative image, yellow tulips”? Or is the null tag better?

A: Individuals who use read-aloud software or screen reader software frequently experience what’s called ‘audio fatigue.’ To prevent that, you want to limit what information they have to listen to. So if an image is purely decorative, it should be skipped completely.

If you are using HTML or PDF, use empty (null) alt text, written as “” (in HTML, alt="").

In MS Word, you typically should leave the description field blank instead of using “”, only because Word will read it out loud as “begin quote, end quote,” and the reader will have to listen to it. The meaning will be understood but it's unnecessary and distracting.

Should alt text be limited to 130 characters?

A: Best practice is to use 4 to 10 words for short alt text and not exceed 100 characters in total. However, the long description should be detailed and describe the image in a meaningful way.
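The limits in the answer above are easy to check automatically before alt text goes to review. The sketch below is only an illustration: the function name `check_short_alt` is hypothetical, and the limits are the ones quoted in this answer rather than values taken from a formal standard.

```python
# Limits quoted in the answer above (assumed here to apply to short alt text only).
MIN_WORDS, MAX_WORDS, MAX_CHARS = 4, 10, 100


def check_short_alt(alt: str) -> list[str]:
    """Return a list of problems with a short alt text; an empty list means it passes."""
    problems = []
    word_count = len(alt.split())
    if not MIN_WORDS <= word_count <= MAX_WORDS:
        problems.append(f"{word_count} words (aim for {MIN_WORDS} to {MAX_WORDS})")
    if len(alt) > MAX_CHARS:
        problems.append(f"{len(alt)} characters (limit is {MAX_CHARS})")
    return problems


print(check_short_alt("Yellow tulips in a clay pot"))  # passes both checks
print(check_short_alt("Tulips"))                       # too few words
```

A check like this only covers length. Whether the text describes the image in a meaningful way still needs a human reviewer.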

Who should write the alt text? Author, copyeditor, production editor? Our eproduction team is unsure whether we can just write alt text (especially when other people are reluctant to do so).

A: Writing alt text requires both an understanding of the accessibility guidelines for alt text and subject-matter knowledge, especially for complex images. The best practice is to work with a service provider fluent in the process and then have the author review.

As a beginner in this field, I'm interested in the basic technical details of what "semantic structure" means and which "metadata" should be accessible.

A: Semantic structuring provides meaningful tag names for key elements in the content (to facilitate search and discovery). Metadata comprises details about the book, such as the title, author, ISBN, and subject, as well as the accessible qualities the product possesses. Appropriate semantic structuring and metadata depend on how the content is published and the formats produced. There are specific guidelines for web content, eBook (EPUB3), PDF, digital products (multimedia), etc.

If you would like more instruction and help, please click the Learn More link at the top of this page and we're happy to help.

Videos with audio should have captions, a transcript, and a video description. Is this a best practice recommendation or are all three required by law?

A: All three are part of the Section 508 requirements and WCAG 2.0. And so, yes, all three are required to make your video fully accessible to deaf, blind, and deaf-blind students.

What is the best way to make chemistry content accessible - in some cases thousands of molecular images? Considering ChemML is not broadly used or browser compatible, is it best to add alt text for each molecule?

A: Yes, capturing the alt text for each molecule is the best approach, considering the lack of support by assistive technologies or screen readers. A library of the molecules with alt text can be created so that the descriptions can be reused.
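A reusable library of this kind can be as simple as a lookup table keyed by molecule name. The sketch below is an assumption about how such a library might be organized, not a description of an existing tool: the names `ALT_TEXT_LIBRARY` and `alt_text_for`, the file names, and the two sample entries are all hypothetical.

```python
# Hypothetical library: molecule name -> alt text written once and reused.
ALT_TEXT_LIBRARY = {
    "benzene": "Benzene: a six-carbon ring with alternating double bonds",
    "water": "Water: one oxygen atom bonded to two hydrogen atoms",
}


def alt_text_for(image_files: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Look up alt text for each image.

    image_files maps an image file name to the molecule it shows.
    Returns the resolved alt texts and the files that still need one written.
    """
    resolved, missing = {}, []
    for filename, molecule in image_files.items():
        key = molecule.strip().lower()
        if key in ALT_TEXT_LIBRARY:
            resolved[filename] = ALT_TEXT_LIBRARY[key]
        else:
            missing.append(filename)
    return resolved, missing


resolved, missing = alt_text_for({"fig1.png": "Benzene", "fig2.png": "caffeine"})
print(resolved)
print("Still needs alt text:", missing)
```

Each molecule is described once, and every later image of the same molecule picks up the existing description. Only the files reported as missing need new alt text.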

Is there a standard for accuracy of closed captioned transcription of recorded educational/technical content?

A: The FCC closed captioning quality standards went into effect April 2014. This is of course for televised programming in support of the hearing impaired, but a lot of the standards apply to educational videos as well. More information can be found here.

How do I find out more about building accessibility in Adobe InDesign that transfers to Adobe Acrobat PDF files?

A: Here is a good resource from Adobe: Creating accessible PDF documents with InDesign CS6. We can help create validated accessible files or test ones you've created. Click the Learn More button at the top of this page for more information.

On a math test, if we describe an image of a graph in alt text, we have technically answered the question. How would you make the image accessible to blind students without giving away the answer?

A: In that case, you would describe the visual appearance of the chart or the graph without interpreting the results. And you can find good examples of this at the Diagram Center website.

If you’re using a chart or a graph on a web page, you may want to provide an interpretation of the data so students will learn how to interpret it. But if it’s on a quiz or a homework assignment, you only want to describe the visual appearance of the chart or the graph so that students can draw inferences themselves.

Can tables be accessible? Can you group a table and just give a summary? Or do you need to tag the table with header rows and table cells, etc.?

A: Tables can be made accessible. They should be tagged as per the accessibility guidelines, and complex or large tables should be accompanied by a summary.

Tables are the best way to display data accessibly, but how well they work depends on the software you use to create them. MS Word does not allow you to provide column headers, so you should only use simple tables there.

You can create accessible tables using HTML. If you use a learning management system, it should have an HTML editor. In general, you should not have nested tables. You should break them up into several smaller, individual tables.
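As an illustration of what tagged header rows and cells look like in HTML, the sketch below builds a simple table with a caption, column headers marked with `scope="col"`, and row headers marked with `scope="row"`. The function name `accessible_table` and the sample data are made up for the example, and the output is a starting point to paste into an HTML editor, not a complete accessibility check.

```python
from html import escape


def accessible_table(caption, headers, rows):
    """Build a simple (non-nested) HTML table with a caption and scoped headers."""
    head = "".join(f'<th scope="col">{escape(str(h))}</th>' for h in headers)
    body = []
    for first, *rest in rows:
        # The first cell of each row acts as the row header.
        cells = f'<th scope="row">{escape(str(first))}</th>'
        cells += "".join(f"<td>{escape(str(c))}</td>" for c in rest)
        body.append(f"<tr>{cells}</tr>")
    return (
        f"<table><caption>{escape(caption)}</caption>"
        f"<thead><tr>{head}</tr></thead>"
        f"<tbody>{''.join(body)}</tbody></table>"
    )


# Sample data invented for the example.
print(accessible_table(
    "Enrollment by year",
    ["Year", "Students"],
    [[2017, 120], [2018, 150]],
))
```

The caption gives the table a short title, and the `scope` attributes tell a screen reader which header applies to each data cell.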

Do you know if publishers have a department devoted to making their products accessible?

A: The degree to which publishers are producing accessible products varies greatly. However, as regulatory deadlines kick in, more educational publishers are discovering that they risk losing substantial market share if they cannot provide content in an accessible format.

What is the breakdown of different disabilities among students that constitutes one-third to one-half of students with disabilities?

A: Please refer to the report The State of Learning Disabilities. Although it was published in 2014, the information it contains is still useful.

Can you make a separate page for something that can’t be made accessible (say, using a Flash element)?

A: Absolutely. As long as you make the equivalent content readily available.

Does WCAG 2.0 cover dyslexic-friendly fonts?

A: No, it does not. The one success criterion that mentions typeface design is Level AAA, and even it only recommends sans serif typefaces and not even as a compliance issue.

What about dynamic Content Management Systems, like WordPress? Or eLearning authoring applications? Any recipes for Articulate, Camtasia, Lectora, or Adobe eLearning Suite?

A: WordPress can be made 100% WCAG 2.0 compliant. So can many other CMSs. We have a course and learning guide that goes through all of WCAG 2.0, including recipes for special platforms such as Articulate.

I’ve seen the Section 508 checklist. However, is there a checklist of things we can/should check for in the documents that you spoke about?

A: Yes. Essentially you need to walk through all the applicable WCAG 2.0 success criteria through the lens of a document. A simple checklist can be found at the following websites:

What if my website contains content that cannot be made accessible?

A: Some content, by its very nature, may not be made accessible. In such cases, the information provided must be made available to individuals with a disability in an equally effective manner. The Technical Guidelines provide suggestions for how to provide accessible descriptive content by which a person using accommodating technologies could understand what the inaccessible content is about. Note that using more established or more widely used technologies may be equally effective for all students, and allow for full accessibility.

Can I just cut and paste an image caption into an alt text field?

A: No. Alternative text should not be redundant with adjacent or body text.

We make content accessible only when required; typically after publication. Would it be more expensive to integrate accessibility for all titles at the onset of production?

A: Integrating accessibility at the onset of production is the recommended approach. It not only helps control cost but also ensures that the multiple products generated at the end of the production cycle inherit the accessible qualities, with no additional spending required to retrofit them. It is more expensive in the long run to build accessibility into your workflow after publication.

What content requires a text equivalent?

A: Anything that is not text must have a text equivalent: pictures, image maps, video, sound, form controls, scripts, and colors.

Do all images need a text equivalent?

A: Any image that conveys information should have a text alternative. However, images that do not convey any information (decoration) should have an empty equivalent (in HTML, simply alt=""), so that people and assistive technologies know that they can be ignored.
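The difference between an empty alt attribute and a missing one can be checked automatically. The sketch below uses Python's standard `html.parser` module to sort the images in an HTML fragment into three groups. The class name `AltAudit` and the sample markup are assumptions for illustration, and the audit cannot tell whether an image marked as decorative really is decorative.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Sort <img> tags by the state of their alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []     # no alt attribute at all: needs attention
        self.decorative = []  # alt="": assistive technology can skip these
        self.described = []   # non-empty alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.missing.append(src)
        elif not (attributes["alt"] or "").strip():
            self.decorative.append(src)
        else:
            self.described.append(src)


audit = AltAudit()
audit.feed(
    '<img src="chart.png" alt="Bar chart of sales by region">'
    '<img src="divider.png" alt="">'
    '<img src="photo.png">'
)
print("Missing alt:", audit.missing)
print("Decorative:", audit.decorative)
print("Described:", audit.described)
```

Images in the missing group are the ones to fix first, because a screen reader may fall back to announcing the file name when no alt attribute is present.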

How is Cenveo Publisher Services working with higher education publishers to move them towards accessibility?

A: Accessibility is integrated in our workflows to produce products that are born accessible. We endeavor to ensure all products are accessible and educate customers on the importance and benefits of accessibility as well as the legal compliance mandates. We have always recommended a born accessible product rather than retrofitting content for accessibility, which typically involves additional costs.

How can I get started making my content accessible?

A: Easy! Just grab a copy of our accessibility RFQ form, fill it out, and return it to info.psg@cenveo.com, and we will get you started.

 

Digital Solutions in India 2017 | A Special Report From Publishers Weekly

The annual report from Publishers Weekly (PW) that details service providers in India and the depth of solutions they offer in the global publishing market is now available. We are proud to take part in this special report that also captures a short list of accomplishments that Cenveo has experienced over the past year.

Recent Customer Success Stories

Cenveo Publisher Services recently worked with a global education publisher to develop an HTML5-based flashcard engine that offers flip card-styled content. “The end product combines terms and definitions with all types of media support to enhance user interaction and engagement,” explains marketing director Marianne Calilhanna, adding that the engine also “has complex assessment content built into the application to test knowledge about those terms and definitions learned.”

The entire application, which is WCAG 2.0 AA-compatible, was tested on three different browsers on three operating systems (iOS, OSX, and Windows). “It was also tested by an accessibility certification authority to ensure that the product is easily accessible by differently-abled users. The WCAG 2.0 AA compliance guidelines were thoroughly applied to the engine, including the colors used, color contrast, and settings panel. Then there was the use of large and well-spaced interactive elements or virtual controls, and the reinforcement of texts and visuals to ensure that no essential information was conveyed by audio alone,” says Calilhanna.

The next project from a major educational publisher was about creating and developing core content and supporting materials without hiring authors. “At first glance, it sounded like a cost-saving approach but it was actually more complex than that. Anyone involved with publishing educational content understands the deep and often hidden costs related to publishing and production,” Calilhanna says. “Our client, by partnering with Cenveo to develop and author higher-ed curriculum content, effectively bypassed ongoing royalties and permissions. This has resulted in lower costs and a positive P&L for the publisher, with savings passed on to students.”

Check out the full report:

It is interesting to note the following observation from PW:

 
During PW’s trip to India to visit participants in this report early in the year, some digital solutions vendors—and their main U.S. clients in some cases—were already rethinking their business collaboration with plans of forming partnerships or joint ventures to sidestep the IT outsourcing/immigration issues. Some are looking into setting up branches in the U.S. to offer onshore and hybrid services, while a few more are checking out companies to take over and therefore have immediate U.S. representation.
— Publishers Weekly
 

At Cenveo Publisher Services, onshore and hybrid solutions have long been an option available from our portfolio of services. Whether it's full-service production management or peer review management services, we work with publishers to implement a workflow that best fits their content and their budget: offshore, onshore, or hybrid.

