AWS for Industries
Bayer imaging FM classifies drug targets using Amazon SageMaker HyperPod
This blog is co-authored by: Marc Osterland, Lisa Schneider, Adrian Wolny, and Vladislav Kim from Bayer AG, Pharmaceuticals
The pharmaceutical industry is starting to adopt artificial intelligence (AI) foundation models (FMs) to enhance research and development workflows. Bayer Pharmaceuticals, a division of Bayer AG (a multinational pharmaceutical and agricultural company headquartered in Germany), wanted to extract insights from their development data sets, so their data scientists turned to Amazon Web Services (AWS) for assistance.
By harnessing the power of Amazon SageMaker HyperPod, Bayer trained and utilized new FMs in just a few short months. Their scientific team can now process vast amounts of biomedical imaging data, train sophisticated machine learning (ML) models, and identify promising drug candidates based on phenotypic signatures. As Bayer continues to innovate, their work with AWS helps to pave the way for faster, more efficient pharmaceutical R&D. We’ll explore how Bayer uses AWS services to transform their research processes and drive innovation in pharmaceutical development.
Cell Painting is a technique that uses fluorescent dyes in biological imaging. It has been widely adopted in the pharmaceutical industry for high-content screening (HCS) and for understanding how specific genetic, physiological, or drug-binding actions change cell function. The Cell Painting assay, paired with therapeutic molecules, can reveal subtle phenotypic changes in the drug discovery workflow, leading to mechanistic insights or new drug targets.
In digital pathology, FMs are addressing the scalability challenges faced by biopharma organizations that need to analyze millions of histopathological images. Traditionally, morphological profiling uses human-engineered features such as shape, size, and texture to obtain a vector representation of histopathological images. This is a labor-intensive process, and the resulting features do not generalize well across datasets.
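To make the idea of a human-engineered feature vector concrete, here is a minimal sketch in plain Python. It assumes a cell has already been segmented into a binary mask, and the function name and the particular features (area, bounding box, aspect ratio, extent) are illustrative choices, not the feature set used by Bayer or by any specific profiling tool.

```python
from typing import Dict, List


def handcrafted_features(mask: List[List[int]]) -> Dict[str, float]:
    """Compute a few simple size and shape descriptors from a binary mask.

    `mask` is a 2D grid where 1 marks a pixel belonging to the cell.
    The chosen descriptors are illustrative assumptions only.
    """
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask contains no foreground pixels")

    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    area = len(coords)
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1

    return {
        "area": float(area),
        "bbox_height": float(height),
        "bbox_width": float(width),
        "aspect_ratio": width / height,
        "extent": area / (height * width),  # fraction of the bounding box that is filled
    }


# A tiny hand-made mask standing in for one segmented cell.
mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
features = handcrafted_features(mask)

# Fix the key order so every cell yields a vector with the same layout.
feature_vector = [features[name] for name in sorted(features)]
```

Every descriptor here had to be chosen and coded by a person, and each one depends on a segmentation mask being available first.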
Instead of relying on traditional human-engineered feature extractors, AI models can process complex morphological data with remarkable consistency and speed. Initiatives like BigPicture and industry collaborations, such as the Bayer AG partnership with Aignostics, underscore the growing recognition that AI-powered analysis has become indispensable for modern pharmaceutical R&D.
The Challenge: Analyzing millions of large images
Training FMs for Cell Painting and digital pathology requires processing millions of images. By leveraging Amazon SageMaker HyperPod, Bayer research scientists trained multiple large-scale self-supervised imaging foundation models. They trained DINO and MAE (from Meta AI) and SimCLR (available from Cornell University), drawing on data from the Cell Painting Gallery (available from the Registry of Open Data on AWS). As shown in Figure 1, these FMs allowed the computer to recognize features across three types of cell treatments.
Figure 1: DINO self-attention maps. Cell Painting image crops and self-attention maps of the DINO attention heads in the last layer. Example images for DMSO, FK-866 and NVS-PAK1-1. The color scale in the self-attention maps represents the level of attention from the DINO [CLS] token, with lighter areas indicating higher attention. DINO was trained on the multisource data with the ViT-S architecture. This image is courtesy of Scientific Reports and Bayer AG.
Similarly, the Bayer digital pathology and histopathology teams faced significant challenges: analyzing millions of microscopy images to understand morphological and phenotypical changes. Bayer decided to explore whether large-scale self-supervised imaging FMs, which do not require image segmentation for morphological profiling, are a more efficient alternative.
These imaging foundation models can also be applied to other drug discovery workflows at Bayer, such as analyzing enormous (10,000 x 10,000 px) slides to help digital pathologists identify cancerous cells in human tissues. However, these imaging FMs require training on extensive datasets, a task that demands substantial computational resources and scalability.
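Slides of this size are typically split into smaller tiles before a vision model sees them. The sketch below shows one way to compute tile positions for a 10,000 x 10,000 px slide. The 224 px tile size and the non-overlapping stride are assumptions chosen for illustration (224 px is a common input size for vision transformers); the source does not state how Bayer tiles its slides, and the function name is hypothetical.

```python
from typing import Iterator, List, Tuple


def _axis_origins(length: int, tile: int, stride: int) -> List[int]:
    """Return tile start offsets along one axis, covering it completely."""
    origins = list(range(0, length - tile + 1, stride))
    # Shift one extra tile inward so the far edge is covered without padding.
    if origins[-1] != length - tile:
        origins.append(length - tile)
    return origins


def tile_origins(width: int, height: int, tile: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) top-left corners of square tiles that cover the slide."""
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if tile > width or tile > height:
        raise ValueError("tile must fit inside the slide")
    for y in _axis_origins(height, tile, stride):
        for x in _axis_origins(width, tile, stride):
            yield (x, y)


SLIDE_PX = 10_000  # slide edge length quoted in the text
TILE_PX = 224      # assumed tile size, not taken from the source

tiles = list(tile_origins(SLIDE_PX, SLIDE_PX, TILE_PX, stride=TILE_PX))
print(f"{len(tiles)} tiles per slide")
```

Under these assumptions a single slide yields 2,025 tiles (45 per axis), which gives a sense of how quickly the image count grows across a large slide archive.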
Solution overview
Bayer needed a way to provide a flexible, high-performance environment for FM development and training. Enter Amazon SageMaker HyperPod: its seamless integration with current Bayer infrastructure makes it appear as another resource in their network. SageMaker HyperPod allowed for reservation of a cluster of four ml.p4de.24xlarge Amazon Elastic Compute Cloud (Amazon EC2) instances, each with 8 NVIDIA A100 GPUs, with 80 GB of GPU RAM for each device. The science team pre-trained the FMs with 50 TB of data from cell culture images and histopathological slides. They trained the FMs continuously for three weeks to learn features and segmentation patterns.
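As a quick back-of-the-envelope check of the cluster size described above, the following sketch totals the GPU count and GPU memory. The instance count, GPUs per instance, memory per GPU, and dataset size come from the text; the per-GPU data share assumes the dataset is split evenly across GPUs, which is a simplification for illustration only.

```python
# Figures taken from the cluster description in the text.
INSTANCES = 4            # ml.p4de.24xlarge instances
GPUS_PER_INSTANCE = 8    # NVIDIA A100 GPUs per instance
GPU_MEMORY_GB = 80       # GPU RAM per device
DATASET_TB = 50          # pre-training data volume

total_gpus = INSTANCES * GPUS_PER_INSTANCE
total_gpu_memory_gb = total_gpus * GPU_MEMORY_GB

# Assumption: data is sharded evenly across all GPUs.
data_per_gpu_tb = DATASET_TB / total_gpus

print(f"GPUs in cluster:        {total_gpus}")
print(f"Aggregate GPU memory:   {total_gpu_memory_gb} GB")
print(f"Data per GPU (even split): {data_per_gpu_tb:.4f} TB")
```

This works out to 32 GPUs and 2,560 GB of aggregate GPU memory, with roughly 1.56 TB of data per GPU under the even-split assumption.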
SageMaker HyperPod provides a persistent, robust cluster for FM training and job queuing. It also provides a developer-friendly environment to debug and inspect the running jobs to verify the full utilization of GPU resources. Bayer uses SageMaker HyperPod to maintain deep infrastructure control. The builders securely connected using Session Manager (a fully managed tool of AWS Systems Manager) to manage the ml.p4de.24xlarge instances for advanced model training, infrastructure management, and debugging.
To maximize availability, SageMaker HyperPod maintains a pool of dedicated and spare instances, which minimizes downtime during critical node replacements. Through this ability, Bayer was able to automatically swap out any failing nodes and restart the model training from the last saved checkpoint. This freed up time for the Bayer Research team.
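Resuming from the last saved checkpoint only works if the training script can locate that checkpoint when it restarts. The sketch below shows one simple way a script might do this. The file naming scheme (`checkpoint_step_<N>.pt`), the helper names, and the directory layout are all hypothetical; this is not the SageMaker HyperPod API or the Bayer training code.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical naming scheme for checkpoint files.
CHECKPOINT_PATTERN = re.compile(r"^checkpoint_step_(\d+)\.pt$")


def latest_checkpoint(directory: Path) -> Optional[Tuple[int, Path]]:
    """Return (step, path) of the newest checkpoint, or None if none exist."""
    best: Optional[Tuple[int, Path]] = None
    for path in directory.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match is None:
            continue  # ignore unrelated files
        step = int(match.group(1))  # compare numerically, not as strings
        if best is None or step > best[0]:
            best = (step, path)
    return best


def resume_step(directory: Path) -> int:
    """Return the step at which training should continue."""
    found = latest_checkpoint(directory)
    return 0 if found is None else found[0] + 1


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp)
    fresh_start = resume_step(ckpt_dir)  # no checkpoints yet

    for step in (100, 200, 1000):
        (ckpt_dir / f"checkpoint_step_{step}.pt").touch()
    (ckpt_dir / "notes.txt").touch()

    restart_step = resume_step(ckpt_dir)
    latest_name = latest_checkpoint(ckpt_dir)[1].name
```

Parsing the step as an integer matters: a plain string sort would rank `checkpoint_step_200.pt` above `checkpoint_step_1000.pt` and resume from the wrong place.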
Bayer team members also needed observability tools to help monitor and manage the workload. The SageMaker HyperPod health monitoring agent continuously monitors for and detects potential issues, including memory exhaustion, disk failures, GPU anomalies, kernel deadlocks, container runtime issues, and out-of-memory (OOM) crashes. Based on the underlying issue, the monitoring agent either replaces or reboots the node. Integration of SageMaker HyperPod with other observability services (such as Amazon Managed Service for Prometheus and Amazon Managed Grafana) offers the Bayer team deeper insights into cluster performance, health, and utilization. These integrations also help shorten development time.
Figure 2 is a high-level architecture diagram of the workflow Bayer researchers use with SageMaker HyperPod. It shows how the various cluster components interact with each other and with other AWS services, such as Amazon FSx for Lustre and Amazon Simple Storage Service (Amazon S3).
Figure 2: Reference architecture
Benefits of FM workflows
The SageMaker HyperPod-powered workflow has already made a significant impact on the Bayer drug discovery process:
- Can analyze data from 100,000 compounds in HCS experiments
- Helps identify top therapeutic candidates from vast datasets
- Runs analysis jobs quickly
Overall, this new phenotypic imaging FM accelerates the Bayer drug discovery pipeline.
Looking ahead
As Bayer continues to push the boundaries of ML in drug discovery, they’re exploring various technical processes and mechanisms to do more data science with the same resources. The team has implemented dynamic workload scaling to accommodate growing demand from their 20-person team. The Bayer team is considering implementing better ML experiment tracking with fully managed MLflow on Amazon SageMaker, and SageMaker training plans to schedule resources efficiently. The team is also exploring various Amazon SageMaker inference options, based on requirements, to serve their FMs to their digital pathology and histopathology teams.
Conclusion
Through the partnership with AWS, the Bayer Research team has been able to implement AI foundation model training to help accelerate their research findings. Bayer can now analyze data from 100,000 compounds in HCS experiments to identify top therapeutic candidates in a shorter timeframe than traditional solutions.
To learn more about foundation model training on AWS, please contact an AWS Life Sciences representative.
Further reading
- Guidance for Training Protein Language Models (ESM-2) with Amazon SageMaker HyperPod
- Maximize business outcomes with machine learning on AWS
- Announcing the new cluster creation experience for Amazon SageMaker HyperPod
- Self-supervision advances morphological profiling by unlocking powerful image representations