Beyond Print – Maximizing Content Value through Digital Transformation in Publishing

Publishers see a future where digital solutions drive growth through new, innovative products. Linking knowledge across silos and disciplines requires us to uncover, untangle, and make available existing content. There are huge opportunities to upsell and add value to in-house and 3rd party content through a Digital-First Strategy.

Estafet accelerates the organisation and extraction of knowledge from text such as journals and books. We also help streamline your process of publication to cut time-to-market. This enables publishers to link and add value across multiple content types to produce new insights. We already use AI to reveal patterns and support enhanced decision-making.

Vision

Our goal is to help publishers untangle, clean, extract, and publish their content to:

  • Enable a digital-first strategy
  • Create new digital revenue streams
  • Break down silos and extract insights across multiple datasets
  • Explore and grow new markets
  • Respond rapidly to new opportunities

Think about the way data are organised in your own organisation. Readers and researchers are often interested in both books and journals, using books as a way into a topic and journals to keep up-to-date and understand specific research in greater depth. Many publishers get as far as making content available as, for example, a PDF, but do not help the user follow an interest across disciplines by bringing book content together with journals and offering recommended reading based on search history. Linking content in this way also helps both publishers and academics understand the impact of published works.

Our vision therefore includes both accelerating the publication process (by making it easier to create, review, publish and link published content) and broadening the impact of this work by bringing it to a wider audience. This often means breaking down silos, since books are often available separately from journals, as well as machine-reading the content to add tags and form knowledge graphs.

Whilst we know what some of our users need today, we can’t know what problems they will want to solve in future. We therefore need to build small components (or “services”) which each solve a single problem, such as tagging content based on an ontology, or substituting a branded drug name with the generic name to enable linking of academic papers. We can then re-use these components in different ways to solve new problems. This cuts time-to-market, making it quick and cheap to test out new ideas for products.
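As a minimal sketch of this idea, the two example services named above can be written as small functions that share one interface and are chained in whatever order a product needs. The drug mapping, the miniature ontology, and the function names below are all hypothetical and purely illustrative; a real service would draw on curated vocabularies.

```python
import re
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Service = Callable[[Document], Document]

# Hypothetical, illustrative mapping of brand names to generic names.
BRAND_TO_GENERIC = {"lipitor": "atorvastatin", "advil": "ibuprofen"}

# Hypothetical miniature ontology: tag -> terms that trigger it.
ONTOLOGY = {
    "cardiology": ["cholesterol", "statin", "atorvastatin"],
    "pain-management": ["ibuprofen", "analgesic"],
}


def normalise_drug_names(doc: Document) -> Document:
    """Service 1: replace branded drug names with their generic names."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, BRAND_TO_GENERIC)) + r")\b",
        re.IGNORECASE,
    )
    text = pattern.sub(lambda m: BRAND_TO_GENERIC[m.group(0).lower()], doc["text"])
    return {**doc, "text": text}


def tag_by_ontology(doc: Document) -> Document:
    """Service 2: attach every ontology tag whose terms appear in the text."""
    words = set(re.findall(r"[a-z]+", doc["text"].lower()))
    tags = sorted(tag for tag, terms in ONTOLOGY.items() if words & set(terms))
    return {**doc, "tags": tags}


def run_pipeline(doc: Document, services: List[Service]) -> Document:
    """Apply each single-purpose service in turn."""
    for service in services:
        doc = service(doc)
    return doc


paper = {"text": "Lipitor lowered cholesterol in the trial cohort."}
result = run_pipeline(paper, [normalise_drug_names, tag_by_ontology])
print(result["text"])
print(result["tags"])
```

Because each service takes a document and returns a document, a new product idea can be trialled by reordering or swapping services rather than rewriting them.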

Where are you on the innovation curve?

We can think of the path to this vision as a number of steps, shown in the following diagram. First, we need to know what is available and then work out how to standardise it (e.g. through a canonical data model), clean it, and make it available regardless of the silo from which it originates. We can then focus on the path to publication, making it quick to categorise, link and subscribe to content. At this point, we might want to incorporate 3rd party or acquired data – the user is probably uninterested in who owns the data so long as they can trust its provenance. We might also want to include datasets along with journal content to support researchers, or help users make connections between different disciplines – think about how biology relates to organic chemistry, which itself relates to molecular physics. Next, we create foundational services that allow us to assemble new software products for users. For example, we have a current project that involves extracting chemistry structure diagrams from a daily pipeline of pharmaceutical drug patents, converting the meaning of each diagram into a computer-searchable form. This allows our pharmaceutical customers to compare new drugs in development with approved and tested drugs to accelerate time to market and combat disease.

Approach

The most effective way to achieve a digital-first strategy is to take a thin slice through our approach described in the following diagram:

This slice would not cover all the content available to a publisher. Instead, Discovery could select a few related domains and consider how the content could be made available, who would be interested, and what value they would attribute to it. In the Data Strategy, we can then look at the tools and mechanisms we would need, build an ontology that would allow us to link the data, and think about how 3rd party content could be integrated. We would also design the data pipelines: the automated process for refreshing the data platform as new information becomes available.

At the Innovation stage, we can start to build out reusable services and design mockups or wireframes to trial ideas with customers. We want to explore the underlying consumer need: do they want a faster search, or are they just asking for that so that they can find all the published content on a specific topic of interest? If so, could we automate that for them to deliver a much more useful product which we could extend across multiple silos of data?

We’ve identified a number of products in the Growth stage, all of which we’ve built already for previous customers. The purpose is to identify potential target markets, assemble new products quickly and trial them to see what brings most value.

Case Study: Breaking silos and creating new insights

  • Combine interdisciplinary sources such as research data, public data, and free-form text from publications
  • Flexible cloud services that could collaborate on many different types of problems
  • Generate new insights through data enrichment, harmonisation, Natural Language Processing, Machine Learning and Knowledge Graphs
  • Products include Corpora-as-a-Service (all available, relevant information gathered on-demand into a single cloud platform)

Case Study: Cutting time-to-publish

  • Our customer needed to reduce time to publication for journal articles by enabling authors to find the right contract for each article. Search needed to be faster and more accurate. 
  • Elasticsearch autocompleted name matches, including alternate names, language support, and interchangeable terms across 100,000,000 documents
  • The result was a 30 ms search response on each keystroke across the 60,000 different organisations
  • With 2 million API calls per day, the overall time saving was huge with accuracy increased by 80%
  • This freed the customer to work on value-generating opportunities and increased operational efficiency.
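The source does not describe how the autocomplete was built, so the following is only a sketch of one common Elasticsearch pattern for search-on-each-keystroke: index name fields with the `search_as_you_type` field type and query them with a `multi_match` query of type `bool_prefix`. The field names `name` and `alt_names` are assumptions made for illustration. The code only builds the request bodies as plain dictionaries, so it runs without a cluster.

```python
from typing import Any, Dict, List

# Hypothetical field names; the real index structure is not given in the source.
NAME_FIELDS = ["name", "alt_names"]


def build_mapping() -> Dict[str, Any]:
    """Index mapping that stores each name field as search_as_you_type."""
    return {
        "mappings": {
            "properties": {f: {"type": "search_as_you_type"} for f in NAME_FIELDS}
        }
    }


def build_autocomplete_query(typed: str, size: int = 10) -> Dict[str, Any]:
    """Search body for the text typed so far.

    The ._2gram and ._3gram subfields are created automatically by the
    search_as_you_type field type; bool_prefix treats the last term as a prefix.
    """
    fields: List[str] = []
    for f in NAME_FIELDS:
        fields += [f, f"{f}._2gram", f"{f}._3gram"]
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": typed,
                "type": "bool_prefix",
                "fields": fields,
            }
        },
    }


body = build_autocomplete_query("univ of cam")
print(body)
```

On each keystroke the client would send a freshly built body to the index's `_search` endpoint. Including `alt_names` in the field list is what lets a match on an alternate name surface the same organisation.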

Case Study: Smart search across journals and books

  • A single user experience across many registers of published works
  • A high quality, canonical data model
  • Improves the reach and quality of published works, so that customers find information they need and evaluate research impact.
  • The Works Registry ensures:
    • Published works are registered in a single location available to all
    • Duplicate works are not stored more than once
    • The best view of works for given search criteria is returned, with a confidence factor for linked works
    • There is a common identifier for all known works
    • Completeness for any missing works
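The registry guarantees above can be illustrated with a small in-memory sketch. It is not the Works Registry itself: the class name, the normalisation rule, the hash-based identifier, and the use of string similarity as a "confidence factor" are all assumptions chosen to keep the example short.

```python
import hashlib
import re
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def _normalise(title: str) -> str:
    """Lower-case the title and collapse punctuation so variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


@dataclass
class WorksRegistry:
    """Hypothetical single registry keyed by a common work identifier."""

    works: Dict[str, dict] = field(default_factory=dict)

    def register(self, title: str, year: int, source: str) -> str:
        """Register a work once; repeat registrations only add their source."""
        key = f"{_normalise(title)}|{year}"
        work_id = hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]
        record = self.works.setdefault(
            work_id, {"title": title, "year": year, "sources": []}
        )
        if source not in record["sources"]:
            record["sources"].append(source)
        return work_id

    def search(self, query: str, limit: int = 5) -> List[Tuple[str, float]]:
        """Return (work_id, confidence) pairs, best match first."""
        q = _normalise(query)
        scored = [
            (wid, SequenceMatcher(None, q, _normalise(w["title"])).ratio())
            for wid, w in self.works.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


registry = WorksRegistry()
a = registry.register("Organic Chemistry: An Introduction", 2019, "books")
b = registry.register("organic chemistry - an introduction", 2019, "journals-platform")
c = registry.register("Molecular Physics Today", 2021, "journals")
print(a == b, len(registry.works))
print(registry.search("organic chemistry introduction"))
```

The two differently formatted records for the same book resolve to one identifier and one stored work, while search results carry a score that a caller can treat as a confidence factor.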

Benefits

  • Enable Digital-First strategy
  • Achieve growth through new revenue streams
  • Maximise value of content
  • Combine interdisciplinary data to break silos
  • Experienced consultants make delivery faster, cheaper, and lower risk for you
  • Rapid innovation with short time-to-market