Validate radiology reports using Amazon Nova

Every day, radiologists around the world face a growing challenge: dealing with an increasing number of medical images while racing to deliver faster results. Adhering to the latest American College of Radiology (ACR) Appropriateness Criteria can be difficult, especially when patient care is on the line.

At Amazon Web Services (AWS), we’ve been thinking a lot about this. What if cloud-based AI solutions could help handle the increasing workload without sacrificing accuracy?

In this post, we explore how generative AI services from AWS can transform the radiology workflow, and we show how they can help you deliver better patient care while maintaining the high standards that matter most for improved health outcomes.

AI-powered radiology report validation solution

We have developed an innovative approach using foundation models (FMs) to automate the validation of radiology reports. At the heart of our solution lies Amazon Nova Lite, a multi-modal FM with document understanding capabilities deployed on Amazon Bedrock.

Solution benefits

These are the key benefits you can achieve using our solution:

Automated identification of missing anatomical structures
Comprehensive guideline adherence verification
Clear appropriateness categorization
Structured feedback generation

To begin, we enabled the FM to parse the guidance document, allowing it to understand the criteria for appropriate imaging. We then pass a prompt to the FM, along with a radiology report for validation, to assess ACR guideline adherence, diagnostic completeness, and missing anatomical structures (Figure 1). To test the report validator, we used the MIMIC Chest X-ray (MIMIC-CXR) Database v2.0.0, a large publicly available dataset of chest radiographs with free-text radiology reports, and the ACR Appropriateness Criteria® for Routine Chest Imaging.

It is also possible to create a knowledge base from the appropriate guidance document and query it to validate radiology reports. Our solution, as a proof of concept, does not use a knowledge base at this time.

Sample code

The following Python snippets show how we used the Converse API in Amazon Bedrock to interact with the model. The sample radiology report below, taken from the MIMIC-CXR dataset, was validated against the ACR guidance criteria:

sample_report = (
    "There is no focal consolidation, "
    "pleural effusion or pneumothorax. Bilateral nodular opacities that most "
    "likely represent nipple shadows. The cardiomediastinal silhouette is normal. "
    "Clips project over the left lung, potentially within the breast. "
    "The imaged upper abdomen is unremarkable. Chronic deformity of the "
    "posterior left sixth and seventh ribs are noted."
)

The following prompt was used to validate the radiology report above (we appended the prompt to the radiology report):

prompt_postpend = (
    "Does the above radiology report adhere to "
    "the ACR guidelines mentioned in the document? "
    "Is it detailed enough to provide a diagnosis? "
    "Is the report missing any key anatomical structures? "
    "Does the report meet the "
    "quality standards of the ACR guidelines? "
    "Please provide terse actionable feedback and do not "
    "try to summarize the report itself."
)
# Separate the report from the questions so the two do not run together
prompt = sample_report + "\n\n" + prompt_postpend

The validation document containing the ACR guidelines is read as follows:

val_doc = "<validation_document.pdf>"  # placeholder: path to the ACR guidance PDF
print("Validation document: ", val_doc)
with open(val_doc, 'rb') as file:
    pdf_bytes = file.read()

We defined a message that carries both the PDF document and the prompt to the model:

messages = [{
    "role": "user",
    "content": [
        {"document": {
            "format": "pdf",
            "name": "DocumentPDFmessages",
            "source": {"bytes": pdf_bytes}}
        },
        {"text": prompt}
    ]
}]
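If the document name is derived from a file name rather than hard-coded, it may need cleaning first, because the Converse API restricts document names to alphanumeric characters, single whitespace characters, hyphens, parentheses, and square brackets. The following is a minimal sketch of a message builder under that assumption; `safe_document_name` and `build_messages` are hypothetical helper names, not part of the original solution.

```python
import re


def safe_document_name(name: str) -> str:
    """Reduce a name to characters the Converse API accepts for documents."""
    # Replace anything other than alphanumerics, whitespace, hyphens,
    # parentheses, and square brackets with a space.
    cleaned = re.sub(r"[^A-Za-z0-9\s\-\(\)\[\]]", " ", name)
    # Collapse whitespace runs, since consecutive whitespace is not allowed.
    return re.sub(r"\s+", " ", cleaned).strip() or "document"


def build_messages(pdf_bytes: bytes, prompt: str,
                   name: str = "DocumentPDFmessages") -> list:
    """Build a single user message holding the PDF and the text prompt."""
    return [{
        "role": "user",
        "content": [
            {"document": {
                "format": "pdf",
                "name": safe_document_name(name),
                "source": {"bytes": pdf_bytes},
            }},
            {"text": prompt},
        ],
    }]
```

The returned list has the same shape as the `messages` variable shown above, so it could be passed to the Converse API unchanged.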

Then we set the model_id variable to Amazon Nova Lite and defined the inference parameters, including the maximum number of tokens for the model response:

inf_params = {"maxTokens": 200, "topP": 0.1, "temperature": 0.3}
model_id = "us.amazon.nova-lite-v1:0"

The API used to converse with the model is as follows:

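import boto3
bedrock_agent_client = boto3.client("bedrock-runtime")  # Converse is exposed by the Bedrock runtime client
model_response = bedrock_agent_client.converse(
    modelId=model_id, messages=messages, inferenceConfig=inf_params)
response_text = model_response['output']['message']['content'][0]['text']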
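With `maxTokens` set to 200, the feedback can be cut off before the model finishes. The Converse response includes a `stopReason` field that reports this case as `max_tokens`. The sketch below, which is an illustration rather than part of the original solution, joins all text blocks in the response and flags truncation; `extract_feedback` is a hypothetical helper name.

```python
def extract_feedback(model_response: dict) -> tuple:
    """Return (feedback_text, truncated) from a Converse API response dict."""
    blocks = model_response["output"]["message"]["content"]
    # A response can contain several content blocks; keep only the text ones.
    text = "".join(block["text"] for block in blocks if "text" in block)
    # "max_tokens" means the model stopped because it hit the token limit.
    truncated = model_response.get("stopReason") == "max_tokens"
    return text, truncated
```

If `truncated` is true, one option is to raise `maxTokens` and repeat the call before showing the feedback to a radiologist.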

Architecture

Figure 1 shows the high-level architecture of our implementation. In this sample implementation, a radiology report is uploaded to an Amazon Simple Storage Service (Amazon S3) bucket. The upload triggers an AWS Lambda function, which calls the Amazon Nova Lite model through Amazon Bedrock.

Figure 1 – Architecture diagram of the radiology report validation solution

The workflow begins with text-based radiology reports that undergo an extract, transform, and load (ETL) operation. The data is then stored in an S3 bucket. A guidance document, such as the ACR Appropriateness Criteria guideline, is processed using Amazon Nova Lite for document understanding through Amazon Bedrock. The radiology report validation solution, built on Amazon Bedrock, then prompts the foundation model with the report to be validated.

Depending on your use case, the actual implementation of this architecture may vary.
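As one illustration of the S3-to-Lambda-to-Bedrock path described in this section, the following sketch shows what a Lambda handler might look like. It is not the deployed function: the bucket name, object key, and shortened question text are hypothetical placeholders, and it assumes that reports are stored as UTF-8 text files and that the guidance PDF sits in its own S3 location.

```python
import urllib.parse

MODEL_ID = "us.amazon.nova-lite-v1:0"
GUIDANCE_BUCKET = "example-guidance-bucket"  # hypothetical bucket name
GUIDANCE_KEY = "acr/guidance.pdf"            # hypothetical object key
QUESTION = ("Does the above radiology report adhere to the ACR guidelines "
            "mentioned in the document? Please provide terse actionable feedback.")


def uploaded_objects(event: dict) -> list:
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(s3_info["object"]["key"])
        pairs.append((s3_info["bucket"]["name"], key))
    return pairs


def lambda_handler(event, context):
    import boto3  # imported here so the parsing helper is testable without it

    s3 = boto3.client("s3")
    bedrock = boto3.client("bedrock-runtime")
    guidance = s3.get_object(Bucket=GUIDANCE_BUCKET, Key=GUIDANCE_KEY)
    pdf_bytes = guidance["Body"].read()

    results = []
    for bucket, key in uploaded_objects(event):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        report = body.decode("utf-8")
        response = bedrock.converse(
            modelId=MODEL_ID,
            messages=[{"role": "user", "content": [
                {"document": {"format": "pdf", "name": "ACR guidance",
                              "source": {"bytes": pdf_bytes}}},
                {"text": report + "\n\n" + QUESTION},
            ]}],
            inferenceConfig={"maxTokens": 200, "topP": 0.1, "temperature": 0.3},
        )
        feedback = response["output"]["message"]["content"][0]["text"]
        results.append({"report": key, "feedback": feedback})
    return results
```

In this sketch the function's execution role would need read access to both S3 locations and permission to invoke the model. Where the feedback is stored afterward is left open, because the post does not specify it.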

Evaluation

The validation solution automates chest X-ray report validation, providing comprehensive guideline adherence analysis. One of the critical tasks the solution demonstrates is it

As prompted, the solution also analyzes the report for missing anatomical structures based on the ACR criteria (Figure 2). Figure 2 shows the missing structures in the report and their corresponding references in the source document. The sample output (Figure 3) on the right-hand side shows the level of adherence to the ACR guidelines with an explanation of the validation.

Figure 2 – Detection of anatomical structures in radiology report

Figure 3 – Appropriateness rating of a radiology report

The validation solution additionally assesses the radiology report for quality based on the appropriateness category indicated in the ACR criteria. The ACR guideline defines four appropriateness categories:

Usually Appropriate: The imaging procedure or treatment is indicated in the specified clinical scenarios at a favorable risk-benefit ratio for patients.
May Be Appropriate: The imaging procedure or treatment may be indicated in the specified clinical scenarios as an alternative to imaging procedures or treatments with a more favorable risk-benefit ratio, or the risk-benefit ratio for patients is equivocal.
May Be Appropriate (Disagreement): The individual ratings are too dispersed from the panel median. The different label provides transparency regarding the panel’s recommendation. “May be appropriate” is the rating category and a rating of 5 is assigned.
Usually Not Appropriate: The imaging procedure or treatment is unlikely to be indicated in the specified clinical scenarios, or the risk-benefit ratio for patients is likely to be unfavorable.
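Because the model returns free text, downstream code may want to recover which of these four categories the feedback names. The following is a minimal sketch under the assumption that the model repeats a category label verbatim; `Appropriateness` and `find_category` are hypothetical names, and the post does not describe how (or whether) the solution parses the output.

```python
from enum import Enum
from typing import Optional


class Appropriateness(Enum):
    USUALLY_APPROPRIATE = "Usually Appropriate"
    MAY_BE_APPROPRIATE = "May Be Appropriate"
    MAY_BE_APPROPRIATE_DISAGREEMENT = "May Be Appropriate (Disagreement)"
    USUALLY_NOT_APPROPRIATE = "Usually Not Appropriate"


def find_category(feedback: str) -> Optional[Appropriateness]:
    """Return the category whose label appears in the feedback, if any."""
    lowered = feedback.lower()
    # Check longer labels first so "May Be Appropriate (Disagreement)"
    # is not mistaken for plain "May Be Appropriate".
    by_length = sorted(Appropriateness, key=lambda c: len(c.value), reverse=True)
    for category in by_length:
        if category.value.lower() in lowered:
            return category
    return None
```

A result of `None` means no label was found, in which case the feedback is probably best routed to a person rather than guessed at.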

Impact on clinical practice

The AI-based radiology report validator can reduce the burden on radiologists while improving efficiency. The solution also provides structured feedback for residents in training, making it a valuable educational tool. Furthermore, this method can be quickly scaled across different anatomical regions and modalities, making it a versatile solution for various radiological applications.

Looking forward

While our results are promising, we recognize that real-world validation is crucial. The next step is to continue testing the solution’s effectiveness in clinical settings and to expand its capabilities to cover more imaging modalities.

In an era where efficiency and accuracy are paramount, AI-powered validation solutions represent a significant step forward. Automating the compliance-checking process can save radiologists time while improving patient care through more consistent and accurate diagnostic reporting.

This research demonstrates the growing potential of generative AI in healthcare, particularly in maintaining and improving quality standards in radiology. As we continue to develop and refine solutions, the future of radiology looks increasingly efficient and precise.

Contact an AWS Representative to learn how we can help accelerate your business.

AWS for Industries
