AWS for Industries

Sonrai reduced research timelines by 70% with AWS HealthOmics

Bioinformatics workflows, from RNA-seq to metabolomics, have become increasingly complex and data-intensive. For researchers in the biotech and pharmaceutical industries, extracting meaningful insights from vast amounts of biological data is a daily challenge.

Recognizing this, Sonrai, a frontrunner in precision medicine cloud technologies, set out to transform the landscape of large-scale bioinformatics analysis. We’ll discuss Sonrai’s journey in harnessing Amazon Web Services (AWS) HealthOmics to develop a more efficient, secure, and user-friendly platform for hosting diverse bioinformatics pipelines.

After deploying HealthOmics, research study timelines improved by 70 percent, while experimental costs plummeted by up to 98.6 percent. We’ll explore how this cloud-based solution not only streamlined complex workflows, but also empowered researchers to navigate the sea of biological data more quickly and at lower costs.

Opportunity: Simplifying precision medicine

Sonrai is a pioneering company that provides advanced analytics solutions tailored to the needs of the biotech and pharmaceutical industries. With its flagship product, Sonrai Discovery, the company empowers researchers to analyse and interpret complex omics data, such as genomics, transcriptomics, and quantitative proteomics. This is crucial for understanding diseases and developing targeted therapies.

For many modalities, analysing data generated at the lab bench can present several hurdles. For instance, some data sets, like Illumina short-read RNA-seq data, can run to hundreds or thousands of gigabytes. Analysing data at this scale can exceed the storage and processing capacity of a typical lab workstation, or take several days to complete.

When labs develop their own proprietary scripts for RNA sequencing analysis, they often skip crucial quality control steps, like adapter removal, or use outdated software versions. This lack of standardization leads to a critical problem: poor reproducibility of results. The consequences can be severe—published papers may need to be retracted and promising drug candidates could fail in clinical trials, ultimately wasting both time and investment.

Studies have shown that a significant proportion of RNA-seq samples can fail quality control (QC) due to inadequate lab processes or poor sample handling. For instance, in one study using formalin-fixed paraffin-embedded (FFPE) samples, 40 samples failed at the pre-capture library step, and additional samples were flagged for QC failures based on bioinformatics metrics (such as low median sample-wise correlation, low number of mapped reads, or low number of detectable genes).
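The bioinformatics metrics mentioned above lend themselves to a simple automated check. The following is a minimal sketch, not Sonrai's or the cited study's method: the `SampleQC` fields, the three threshold constants, and the sample values are all hypothetical and chosen only to illustrate how samples might be flagged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleQC:
    sample_id: str
    mapped_reads: int          # reads aligned to the reference
    detected_genes: int        # genes with non-zero expression
    median_correlation: float  # median correlation against the other samples


# Hypothetical cut-offs for illustration only; real thresholds depend on
# the assay, the library preparation, and the sequencing depth.
MIN_MAPPED_READS = 10_000_000
MIN_DETECTED_GENES = 12_000
MIN_MEDIAN_CORRELATION = 0.80


def qc_failures(sample: SampleQC) -> List[str]:
    """Return the names of the QC checks this sample fails (empty if none)."""
    failures = []
    if sample.mapped_reads < MIN_MAPPED_READS:
        failures.append("low_mapped_reads")
    if sample.detected_genes < MIN_DETECTED_GENES:
        failures.append("low_detected_genes")
    if sample.median_correlation < MIN_MEDIAN_CORRELATION:
        failures.append("low_median_correlation")
    return failures


# Made-up example samples.
samples = [
    SampleQC("S1", 42_000_000, 17_500, 0.93),
    SampleQC("S2", 3_200_000, 9_800, 0.91),
    SampleQC("S3", 38_000_000, 16_900, 0.61),
]

# Keep only the samples that fail at least one check.
flagged = {s.sample_id: qc_failures(s) for s in samples if qc_failures(s)}
print(flagged)
```

Applying the same explicit checks to every sample is one way a standardized pipeline can avoid the ad hoc, per-lab QC decisions described above.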

Furthermore, in-house pipelines commonly depend on one or a few individuals, which frequently leads to poor documentation, non-portable code, and technical debt, making the pipelines difficult to understand, maintain, and update. Lastly, these in-house pipelines rarely consider data security or pharma compliance, presenting a critical legal risk.

Sonrai recognized the need for a solution that could computationally scale to handle large datasets and offer standard, yet customizable, workflows built on sound software engineering principles. They also saw a need to provide an intuitive interface enabling customers to process data efficiently. “We want researchers to focus on the science instead of spending time on writing script and managing technical infrastructure,” said Dr. Matthew Alderdice, Head of Data Science at Sonrai.

The Sonrai solution

AWS HealthOmics accelerates scientific breakthroughs at scale with fully managed biological workflows. It can host nf-core pipelines, which are community-curated, highly standardized bioinformatics workflows built using the Nextflow workflow management system. With HealthOmics, Sonrai hosts its nf-core pipelines of choice, selected from over 60 available pipelines. This allows pipelines for a wide range of data modalities (such as whole genome and targeted genomics, mass spec proteomics, and bulk and single cell transcriptomics) to be executed without any custom scripting or development.

Furthermore, HealthOmics automatically scales storage and compute requirements when running these pipelines. This eliminates the need for Sonrai customers to determine resource requirements beforehand. Sonrai’s HealthOmics-driven platform is entirely AWS-native, utilizing a wide array of services to deliver high-performance analytics while maintaining strict compliance with industry standards.
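To make this concrete, here is a minimal sketch of how a run of a hosted workflow might be requested through the HealthOmics `StartRun` API using boto3. It is an illustration under stated assumptions, not Sonrai's implementation: the workflow ID, role ARN, bucket names, and the `input` workflow parameter are placeholders, and choosing `DYNAMIC` run storage is just one way to avoid sizing storage up front.

```python
def build_run_request(workflow_id, role_arn, input_uri, output_uri, run_name):
    """Assemble keyword arguments for the HealthOmics StartRun API."""
    return {
        "workflowId": workflow_id,
        "roleArn": role_arn,
        "name": run_name,
        # Workflow parameters are pipeline-specific; "input" is an assumed
        # parameter name used here purely for illustration.
        "parameters": {"input": input_uri},
        "outputUri": output_uri,
        # With DYNAMIC run storage, no storage capacity is specified up front.
        "storageType": "DYNAMIC",
    }


def submit_run(request):
    """Submit the run. Needs boto3 and AWS credentials; not called below."""
    import boto3  # imported lazily so the sketch loads without boto3

    omics = boto3.client("omics")
    return omics.start_run(**request)["id"]


# Placeholder values; none of these identifiers refer to real resources.
request = build_run_request(
    workflow_id="1234567",
    role_arn="arn:aws:iam::123456789012:role/ExampleOmicsRunRole",
    input_uri="s3://example-input-bucket/samplesheet.csv",
    output_uri="s3://example-output-bucket/runs/",
    run_name="rnaseq-example-run",
)
print(request["storageType"])
```

Separating request construction from submission keeps the parameters easy to inspect and test before any AWS call is made.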

Figure 1 – Sonrai high-level architecture

Key components of their AWS-powered solution include:

  • Automated infrastructure deployment: Using AWS Cloud Development Kit (AWS CDK), Sonrai automates the deployment of a secure, pharma-compliant environment that adheres to best practices in data security and governance, including HIPAA, ISO, and FedRAMP.
  • Cost-efficient data storage: Amazon Simple Storage Service (Amazon S3) provides scalable and cost-effective storage solutions essential for managing terabyte-sized datasets.
  • Enhanced analysis tools: Sonrai uses Amazon Athena, an interactive query service that streamlines analysis of data in Amazon S3 using standard SQL, to analyse these datasets. Athena is serverless, so Sonrai does not need to set up or manage additional infrastructure and pays only for the queries it runs. Sonrai also uses Amazon SageMaker Studio Lab, which integrates with AWS HealthOmics, to offer users a streamlined interface for accessing bioinformatics pipelines and analysing data.
  • AI-driven insights: Leveraging Amazon Bedrock, which offers a choice of high-performing foundation models (FMs) from leading AI companies, Sonrai automates the interpretation of pipeline results. This enables researchers to receive detailed, AI-powered insights without manual review.
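As an illustration of the Athena component in the list above, the sketch below assembles a standard SQL query over pipeline results stored in Amazon S3 and prepares it for Athena's `StartQueryExecution` API. The database, table, column names (`gene_id`, `tpm`), and results location are hypothetical; the source does not describe Sonrai's actual schema or queries.

```python
def build_athena_query(database, table, min_tpm, results_uri):
    """Assemble keyword arguments for Athena's StartQueryExecution API."""
    # Values are inlined for readability; untrusted input should be
    # validated or passed as query parameters instead.
    sql = (
        f"SELECT gene_id, AVG(tpm) AS mean_tpm "
        f"FROM {table} "
        f"GROUP BY gene_id "
        f"HAVING AVG(tpm) >= {float(min_tpm)} "
        f"ORDER BY mean_tpm DESC"
    )
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": results_uri},
    }


def run_query(request):
    """Start the query. Needs boto3 and AWS credentials; not called below."""
    import boto3  # imported lazily so the sketch loads without boto3

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Hypothetical database, table, and bucket names.
query = build_athena_query(
    database="example_omics_db",
    table="rnaseq_gene_expression",
    min_tpm=5,
    results_uri="s3://example-athena-results/",
)
print(query["QueryString"])
```

Because Athena is serverless, a call like `run_query(query)` would need only the query text and an S3 location for results, with no cluster to provision.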

Delivering business outcomes with AWS HealthOmics

By integrating AWS services, Sonrai has achieved several critical business outcomes for its clients, including:

  • Cost reduction: Clients experience up to a 98.6 percent reduction in costs compared to other analytics platforms for the execution of pipelines.
  • Improved research study timelines: Deployment of new R&D pipelines was reduced from weeks to days, with an average reduction in research study timelines of 70 percent.
  • Accelerated processing: With parallel and asynchronous data processing, AWS significantly reduces data analysis run times—enhancing operational efficiency.
  • Streamlined infrastructure management: Automated resource scaling and infrastructure deployment let Sonrai clients use its services seamlessly.
  • Consistency and compliance: Standardized pipelines and robust compliance measures help ensure that every analysis is accurate, reproducible, and pharma-compliant.

What’s next for Sonrai

Sonrai is committed to deepening its collaboration with AWS, with future efforts focused on enhancing Sonrai Discovery. The company intends to integrate additional AWS HealthOmics features (including the sequence store, which uses Amazon S3 Intelligent-Tiering for storage cost optimization). Incorporating more features will continue to help its clients advance the frontiers of biotech and pharmaceutical analytics.

“We are excited about our collaboration with AWS on HealthOmics. Bioinformatics is a field plagued by inconsistency and irreproducibility—it’s allowed us to deploy a huge number of workflows in an incredibly consistent, optimized, and reliable way to our customers. We have gone from new deployments taking weeks per pipeline to less than a day,” said Kai Lawson-McDowall, Senior Bioinformatician at Sonrai.

Through its use of AWS services and innovative approaches, Sonrai has democratized bioinformatics by creating accessible tools for users of all technical backgrounds. As Dr. Alderdice explains, “With AWS HealthOmics, we are able to provide biotech and pharma companies with advanced analytics to accelerate drug discovery and significantly cut research timelines by 70 percent.”

Contact an AWS Representative to learn how we can help accelerate your business.

Jonah Craig

Jonah Craig is a Startup Solutions Architect for AWS. He works with startup customers across the UK and Ireland focusing on developing AI, machine learning (ML) and generative AI solutions. Jonah regularly speaks on stage at AWS conferences (such as the annual AWS London Summit and the AWS Dublin Cloud Day). In his spare time, he enjoys creating music and releasing it on Spotify.

Kai Lawson-McDowall

Kai Lawson-McDowall is a Senior Bioinformatician at Sonrai who has extensive experience in biotech organizations ranging from early-stage startups to industry giants throughout all stages of the clinical development pipeline. Kai has a specialization in the development and deployment of secure, scalable, and cost-effective cloud architectures for bioinformatics, and is passionate about helping organizations leverage these tools to get the most from their data.

Matthew Alderdice

Matthew Alderdice is Head of Data Science at Sonrai, where he leads a team building cloud‑native bioinformatics pipelines, clinical‑grade ML models, and real‑time analytics on AWS. He has helped lead the development of Sonrai Discovery, a precision‑medicine platform that accelerates pre‑clinical through Phase III trials for Top 20 pharma by unifying multimodal data and compliant workflows. With a decade at the intersection of bioinformatics and data science, Matthew has championed scalable, cost‑aware architectures that translate omics insights into improved patient outcomes.