AWS for M&E Blog
Reach plc delivers impactful journalism with AI driven Guten powered by AWS
This post is co-authored by Lewis James, Senior Data Scientist at Reach plc and Dan Taffler, Group Director of Data and Analytics at Reach plc.
News occurs at breakneck speeds. Publishers need to constantly evaluate and innovate business processes to accelerate the process of publishing—from initiation to publication. Changing this became one publisher’s mission—creating their own generative AI product powered by Amazon Web Services (AWS) to help journalists ethically stay, moment-by-moment, on the leading edge of news.
The mission
As the largest commercial, national and regional news publisher in the United Kingdom (UK) and Ireland, Reach plc have an in-house editorial team with thousands of journalists. They cover over 120 brands (such as The Mirror, The Express, Daily Record, Manchester Evening News and Daily Star) that empower, enlighten and entertain audiences. Every month, Reach delivers news to 69 percent of the online UK audience, as well as 11 percent of the online US audience through the recent creation of US brands (such as The Mirror US).
As part of their mission to free up journalists to focus on high-value and impactful journalism, Reach’s Data Science team developed Guten. Guten is a product powered by foundation models implemented in Amazon Bedrock—automating manual, non-core tasks like adding article tags and drafting content for journalists to build on.
Guten product origins
The project that would become Guten started from thinking deeply about Reach’s underlying business metrics. Reach wanted to cut the time spent using multiple systems to do non-core manual tasks (adding hyperlinks to articles or combining different metrics across disparate analytics tools). They also wanted to accelerate the time to publication for breaking news, while increasing overall page views across all Reach brands.
With generative AI as a new strategic inflection point for the publishing industry, Reach decided to leverage it by starting with a proof of concept. This led to a deep collaboration with their newsrooms, working backwards to identify constraints, bottlenecks and repetitive tasks inhibiting journalists’ day-to-day.
Through user research, Reach identified three challenges:
- Frequent context-switching between different tools to perform their job, impacting breaking news and article performance
- Repetitive low-complexity tasks such as search engine optimisation (SEO) backlinking and copying content between systems
- Complex decision making for current trends, topic authority and source content availability based on intuition, rather than statistical analysis
How Guten has changed the game
It derives its name from the German goldsmith and inventor Johann Gutenberg, who created the printing press in 1448. Guten abstracts away complexity and democratizes access to data science and generative AI. Reach’s journalists don’t need to worry about prompt engineering, foundation model selection or evaluations. They are able to focus on adding the most value in their roles without needing extra technical skills to understand AI.
Starting with a handful of committed journalists from The Mirror and OK Magazine, Guten was successfully piloted. Within a year it was quickly scaled up and out—adopted across Reach’s other publications through a rapid iteration of the product.
Guten optimises every part of Reach’s editorial workflow including ingestion of news wires, wire article suggestions, article idea recommendations, and content generation. It even integrates with content management systems (CMS) and other business tools. Guten has significantly increased the release speed of breaking news, reducing time to publish from 9 minutes to 90 seconds.
For Reach’s Content Hub, a dedicated team of journalists produce traffic-driving content for multiple national and regional brands. Guten provides the ability to take a piece of Reach content and redraft it in the style and tone of their other publications without needing to rewrite every version of the same article from scratch.
Since its inception two years ago, Guten has assisted with driving over 1.8 billion page views in 2024.
Guten’s AWS architecture
Guten’s architecture leverages key AWS services (such as AWS Fargate, Amazon Elastic Container Service (Amazon ECS), Amazon OpenSearch Service, Amazon Bedrock, AWS Step Functions and Amazon Simple Storage Service (Amazon S3)).
The following steps describe a typical workflow for Guten users:
- AWS Step Functions orchestrates processing of human evaluated articles from an S3 bucket.
- Using the human evaluated articles, Guten’s model service generates vector embeddings using Cohere Embed v3. Vector embeddings are stored in an Amazon OpenSearch Service knowledge base, supporting semantic search with built-in algorithms like Cosine and Euclidian similarity.
- Journalists select the source content to generate a draft using the Guten UI, a web application running on AWS Fargate for Amazon ECS.
- Guten’s model service retrieves similar articles from the target publications by using the knowledge base and generates a draft using models (like Anthropic’s Claude Sonnet 4) on Amazon Bedrock.
- Journalists perform a human-in-the-loop review of all generated copy, make any necessary changes, then use the Guten UI to publish the final article to the content management system.
Leveraging responsible AI to scale adoption
When scaling out adoption of Guten, Reach made safety the guiding principle. Although highly capable and powerful, generative AI introduces unique risks, such as returning hallucinations that damage a reader’s trust or language that doesn’t capture a brand’s style.
Reach continually improves Guten to reduce hallucinations in the model generated output. Journalists understand this risk and help with the mechanisms to identify and resolve mistakes made by the models. Guten was designed to facilitate a human-in-the-loop review for all generated output. Journalists can easily amend or change content prior to final publication. This has accelerated Guten’s adoption across the business.
Another risk specific to the publishing industry is quotation fidelity—ensuring any quotations from source to generated copy remain unchanged. Misquoting sources can have significant legal ramifications and hence, Guten alerts the user in the UI to any detected changes in quotations. Moreover, Guten’s model evaluation incorporates scoring for quote fidelity, reducing the likelihood of changes in quotations between the source and generated copy.
Additionally, Guten highlights mismatches in entities extracted from the copy in both the source and generated text. These include categories such as people, places, and dates. Differences in extracted entities could be indicative of hallucinations and require special attention.
Capturing brand style
Capturing brand style and editorial preferences for over 120 Reach publications poses a significant technical challenge. Inaccurate style capture requires excessive editing post-generation—adversely impacting the time to publish an article.
Here are some of Reach’s publication’s style and tone differences:
- A complete spectrum of political leanings (such as between The Express and The Mirror).
- A wide variety of target reader demographics—publications like OK! Magazine focuses on specific topics such as celebrities and entertainment, while In Your Area focuses on local news, information and community.
- American publications need to localize for American language conventions.
Reach’s Data Science team tailors model outputs to the destination publication using both human and rules-based evaluations. The evaluations used are ever-changing and form continual points of experimentation. Guten then leverages publication-specific data to differentiate the model output, both for title and article body generation. Feedback from Reach journalists has been essential for establishing a tight feedback loop, it also builds the trust necessary for responsible AI adoption.
Currently, Guten provides three mechanisms to provide user feedback:
- Thumbs Up/Down feedback for headline and article generation serves as a user satisfaction signal when split-testing model variants
- Journalists can provide text-based feedback for a particular generated article
- Users can submit both bugs and product feature requests directly through the UI, helping shape Guten in the longer-term
Constantly innovating and evolving
The news never rests, and neither is Reach as they continue to innovate and evolve Guten’s capabilities. Reach has the following features in active development:
- Content recommendations using generative AI to save time matching wire content, publications and journalists
- Image recommendation and selection optimisation to improve SEO performance
- Optimising and orchestrating social media channels to influence organic page views
Conclusion
Guten integrates the application of data science, data engineering, and traditional software engineering techniques (powered by AWS services) with Reach’s editorial process. They are able to accurately accelerate the process of publishing, from initiation to publication—even between different publications. By working backwards from the editorial process and deeply understanding user feedback, Reach continues to empower journalists to deliver ethically impactful journalism with help from generative AI.
To uncover the power of AWS to empower publisher workflows, reach out to an AWS for Media & Entertainment representative to learn more.
Further reading
- Introducing Claude 4 in Amazon Bedrock, the most powerful models for coding from Anthropic
- Increase engagement with localized content using Amazon Bedrock Flows
- Enabling publishers to customize content while maintaining editorial oversight with Amazon Bedrock
- How UK’s Reach is using AI to help produce more content faster
- Reach plans to move 300 journalists into central traffic-driving content hub