AWS for Industries

Building an AI Stack for Banking on AWS

The banks that are delivering value from generative AI (gen AI) and machine learning (ML) investments are building AI stacks that are scalable and well governed. These stacks reliably accelerate proofs of concept and proofs of value into production so that business needs are met.

Introduction

There is a lot going on in banks related to both gen AI and more ‘traditional’ ML. Many Amazon Web Services (AWS) customers are at various stages: talking about it, experimenting with it, testing proofs of concept (POCs) and value, and putting business-driven use cases into production. Not all of our customers are finding it easy to get to the production stage and deliver consistent value at scale with gen AI and ML technology.

Breaking out of a cycle of POCs that fail either to deliver the benefits sought, or to generate the necessary confidence among stakeholders (internal and external) for scaled production, can seem elusive. These challenges are linked to the ‘how’ of technology adoption rather than the ‘what’ or ‘why’, which we’ll address.

Banks as regulated entities are used to dealing with large amounts of data (frequently personally identifiable information, or PII). This data itself is subject to regulations controlling how it is collected, stored, processed and used. As a consequence, bank executives, compared to those in other industries, require far more transparency on gen AI/ML models and the potential operational risks associated with them. Banks have strict obligations to understand, manage and report risk.

These challenges and complications can be addressed by developing existing enterprise risk management frameworks to reflect the characteristics of the technology, and by using technology tools and services developed to mitigate the risks. Many of our customers are prioritizing business use cases that focus on internal operations and ‘human in the loop’ solutions, rather than direct customer-facing tools, while they develop more robust control frameworks. For example, some customers are increasing the automation of customer onboarding and Know Your Customer/anti-money laundering (KYC/AML) checks. They are also automating some application processes, delivering material business benefits at low risk.

Customers are also using tools and services that extract and analyze vast document libraries of unstructured data to support credit decisions or develop hyper-personalized content. Others are deploying enhanced natural language chatbots and call center assistants that can brief agents on incoming calls from historic account information or call transcripts, and suggest next steps.

Identifying potential business use cases isn’t the real challenge for banks. The challenge is being more consistent in how their AI stack delivers. Inconsistency can mean wasted effort on ideas that never reach production reliably. This can restrict a bank’s capacity and capability for innovation and limit the value of better customer outcomes and more efficient operations.

Customers making the most progress on their generative AI and ML journey are creating a flywheel effect of ideas, execution at scale, and reusable experience and patterns to solve problems that look similar. This can increase the speed and predictability of innovation while lowering both the cost and risk of execution.

The pace of change in gen AI and ML technology is very fast, and this can add to the challenge as well. For example, we believe agentic solutions will push the boundaries of current business use cases in the next 12 months. However, most banks are only experimenting with internal-facing examples of gen AI agents at the moment, despite the technology’s potential. Parameta, the data business of TP ICAP, is a notable exception here, automating client-facing operational communications through email and reducing the time and effort needed to process customer requests.

Customers are also challenged with getting gen AI use cases into production consistently in order to deliver the scaled benefits that were originally sought. This is often because of governance and control process failures, rather than the technology itself.

A repeatable path to value, based on industry best practice, has emerged. It rests on a set of organizational skills and capabilities that help banks determine ‘how’ they might use their AI stack for long-term benefit. The ‘how’ needs unpacking, however, as it requires the consideration of a number of different, critically important factors.

And these considerations do not even cover the evolving nature of gen AI and ML technology itself.

Unpacking the ‘how’ of an AI stack for banking

Customers, regardless of whether they are banks or not, typically think about the ‘how’ for their AI stack in terms of the following components:

  1. Technology
  2. Organizational structure served
  3. Governance and control
  4. Financial management
  5. Security
  6. Compliance and resilience

What follows is a top-down view of the first four components, leaving security, compliance and resilience for a more technical discussion another time. Each component should, however, be thought of as part of an integrated whole, with each working hand in hand without blocking the others.

1 – Technology

The technology layer can be divided into three parts:

a) Gen AI foundational components: Allow companies and their AI developers to build reusable microservices that can support any type of use case
b) Blueprints and templates: Accelerate the development and setup of the most common use cases
c) Ready-to-use applications: Enhance existing applications so end users can keep the tools they use every day, with new gen AI features that make their work more effective

Taking each in turn:

a – gen AI foundational components

Customers accelerating the use of gen AI in a well-governed and controlled way build a technology stack with foundational components that ensure business outcomes are delivered and risks are managed. These customers have retained focus on these components over time, refining them and improving their effectiveness through practical experience, rather than spending too much time and resource upfront defining them in full theoretical detail. This has helped those businesses build early momentum and develop capabilities.

Foundational components are not limited to, but include:

  • Gen AI and ML gateway: Provides access to different models and cost controls on usage, with security and authentication.
  • Model evaluation: A way for a company’s AI developers and data scientists to compare the accuracy, explainability, latency and costs of their solutions. This includes when gen AI and ML are leveraging data, not just the model by itself. Evaluations are normally automated (using other ML models) or performed by humans. However, a technique called large language model (LLM)-as-a-judge can now deliver testing results of human-like quality at a fraction of the cost and time of running human evaluations.
  • Guardrails/responsible AI: Protect the application and the user from harmful, toxic and violent content. They can also block additional denied topics and remove PII. Guardrails can help prevent attacks such as jailbreaking, in which attackers exploit vulnerabilities in gen AI systems to bypass their ethical guidelines and perform restricted actions. Guardrails and responsible AI can also help mitigate hallucinations using automated reasoning (for example, to check compliance against a specific policy).
  • Model monitoring: Verifies the solution is working as expected once in production and reports any anomalies.
  • Observability analytics: Collects data from production environments and makes it available to check that things are working as expected and to create reports.
  • Automation with continuous integration and continuous delivery (CI/CD) of new production pipelines: Standardizes and governs the deployment process for faster deployments.
  • Agentic solution builder: Framework to help build and control agents.
  • Retrieval Augmented Generation (RAG) builder: Supports the creation of embedding databases for faster document search and retrieval.
  • Prompt management: Stores, shares and version controls complex prompts.
  • Model fine-tuning: Trains models on specific tasks and supports model distillation (when a large model teaches a smaller model a particular task).
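To illustrate how the gateway and guardrail components might fit together, here is a minimal Python sketch of a hypothetical gateway that routes requests to registered models, redacts PII on the way in and out, and tracks usage for cost control. All names here (GenAIGateway, the regex patterns, the "echo" model) are illustrative assumptions, not a real AWS API:

```python
import re
from dataclasses import dataclass


@dataclass
class GatewayUsage:
    calls: int = 0
    tokens: int = 0


class GenAIGateway:
    """Minimal gateway sketch: routes requests to a model, applies a
    PII-redaction guardrail, and tracks usage for cost control."""

    # Illustrative guardrail patterns: US SSNs and email addresses.
    PII_PATTERNS = [
        (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
        (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    ]

    def __init__(self, models):
        self.models = models   # model name -> callable(prompt) -> response
        self.usage = {}        # model name -> GatewayUsage

    def redact(self, text):
        for pattern, token in self.PII_PATTERNS:
            text = pattern.sub(token, text)
        return text

    def invoke(self, model_name, prompt):
        prompt = self.redact(prompt)                      # guardrail on input
        response = self.models[model_name](prompt)
        usage = self.usage.setdefault(model_name, GatewayUsage())
        usage.calls += 1
        usage.tokens += len(prompt.split()) + len(response.split())
        return self.redact(response)                      # guardrail on output


# A stand-in model; a real gateway would call a hosted foundation model.
gateway = GenAIGateway({"echo": lambda p: f"You said: {p}"})
reply = gateway.invoke("echo", "Contact jane@example.com about account 123-45-6789")
print(reply)   # PII is replaced by [EMAIL] and [SSN] tokens
```

A production gateway would add authentication, per-team cost budgets, and model routing policies, but the separation of concerns shown here (routing, guardrails, usage metering) is the core idea.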

These foundational components are typically built in a modular, microservices style, and should be independent of each other. This allows for quicker modifications as gen AI research and technology move forward and new techniques are added.

The cloud is the best option for this type of fast-paced innovation, as it allows services and capabilities to be constantly updated and improved.

b – Blueprints and templates

Blueprints and templates provide a framework for the most common use cases, saving a company’s AI developers and data science teams precious time on the development lifecycle.

Examples of these are:

  • Internal knowledge searches (using RAG) for retrieving information from internal documents/data sources
  • Text-to-SQL (natural language data queries for business analysts)
  • Translations (such as Japanese to English)
  • Summarizations (analyst research)
  • Document creation (customer emails, procurement or legal documents)
  • Image and video (marketing content creation)
  • Audio and speech (contact center and customer facing communication applications)
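To make the internal knowledge search (RAG) blueprint concrete, the following deliberately simplified Python sketch shows the retrieval step, using bag-of-words vectors and cosine similarity as a stand-in for the embedding model and vector database a production blueprint would use; the documents and query are made-up examples:

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words vector. A real blueprint would
    call an embedding model and store vectors in a vector database."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


documents = [
    "Mortgage applications require proof of income and identity.",
    "Our savings accounts pay interest monthly.",
    "Credit card disputes must be raised within 60 days.",
]
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question, k=1):
    """Return the k most similar documents to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


# The retrieved passage would then be added to the LLM prompt as context.
context = retrieve("What documents do I need for a mortgage application?")
print(context[0])
```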

c – Ready-to-use applications

The last level of abstraction is to either enhance existing business applications with gen AI/ML or create new ones that leverage these technologies and offer them directly to business users. This might include using existing environmental, social, and governance (ESG) analysis tools, loan and mortgage processing solutions, or customer 360 viewers, or enhancing an existing claims processing tool to deliver greater business value.

Solutions such as these can be built within 3-5 months by a company’s gen AI developers. We recommend customers have 2-3 use cases ready to onboard, to start showing value to the business immediately. We also suggest that customers start with the minimum number of foundational components needed for those 2-3 use cases; more components can be built as they are needed and the complexity of the use cases increases.

For example, the bare minimum most early gen AI business use cases require is a gateway, model evaluation, guardrails and model monitoring. RAG, an agentic builder and prompt flows can be added at a later stage. This reduces complexity, increases production speed and lowers upfront development costs.

2 – Organizational structure served

There is no right or wrong approach to how customers establish a gen AI development environment and the teams necessary to run it successfully. The best option is the one that suits the way your business is organized and managed. This is not a one size fits all solution. However, being purposeful with the design up front, so it meets the needs of most stakeholders, is key. Simplicity can be a feature that is worth making other trade-offs for.

Purposeful design considerations:

  • Centralized operating model: All data science and operations teams are centralized within a single team or organization for the benefit of the whole business. Concentrating expertise and delivery capability like this can be valuable to smaller organizations, or where gen AI/ML use needs to be tightly governed and controlled.
  • Decentralized operating model: Each business unit is free to choose and own how it builds its data and gen AI development environment. Federated businesses operate like this by design. Decentralization is an option for very fragmented organizations that may not share a technology stack or find it hard to pass costs between units. It is, however, still a best practice to keep the technology stacks as similar as possible to avoid technical debt.
  • Mixed operating model: Smaller business units depend on central teams for expertise and re-usable best practices. Larger business units can have their own, but try to adhere to some enterprise-wide general policies and guidelines.
  • Gen AI/ML Center of Excellence (COE): COEs for the enablement of gen AI/ML applications and operations are a popular and effective way to build operational skills and disseminate best practices. As discussed in Designing a Cloud Center of Excellence, there is no fixed pattern beyond what works for your organization. They provide the most value when advocating for the use of gen AI and providing expert advice on ‘how’ to implement successful, defined governance and approval processes. They can also establish security and compliance standards, sharing best practices and evaluating new tools and approaches for delivery. COEs support the development of a gen AI environment that can scale the technology while reducing operational costs and risks. This frees the business to focus on the delivery of prioritized use cases. COEs do not need to be a permanent team; once a solution is up and performing as required, they can be folded back into execution roles within the business.

Choosing the best organization operating model

Customers typically design their AI/ML operating model to explicitly reflect how their organization is run and the resources that they have available. Being explicit and purposeful with this decision helps banks to deliver their AI strategy.

Figure 1 – Alternative organizational models for AI/ML adoption

3 – Governance and control

When it comes to gen AI and ML, there are typically two important governance components to consider. The first is a centralized model registry that lets compliance, risk, business, solution owners, and auditors know which models are in production (and which internal version of each). Registries can also catalog data lineage (how the models were built), record their business purpose, the risks they create and how those are governed and controlled, and who their owners are. A centralized registry is very important because strict regulations apply, especially to credit risk models, which today are typically traditional ML models.
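A centralized registry can start as a simple catalog of records, as in the Python sketch below. The class and field names here are illustrative assumptions, but they capture the attributes risk, compliance and audit teams typically need: owner, version, business purpose, risk tier and data lineage:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ModelRecord:
    name: str
    version: str
    owner: str
    business_purpose: str
    risk_tier: str            # e.g. "low", "medium", "high"
    data_lineage: list        # datasets the model was built from
    registered: date = field(default_factory=date.today)


class ModelRegistry:
    """Minimal central registry sketch: one place for risk, compliance
    and audit teams to see what is in production and who owns it."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[(record.name, record.version)] = record

    def in_production(self):
        return sorted(self._records.values(), key=lambda r: (r.name, r.version))

    def owned_by(self, owner):
        return [r for r in self._records.values() if r.owner == owner]


registry = ModelRegistry()
registry.register(ModelRecord(
    name="credit-risk-pd", version="2.1", owner="risk-analytics",
    business_purpose="Probability-of-default scoring",
    risk_tier="high", data_lineage=["loans_2020_2024", "bureau_feed_v3"],
))
print(len(registry.in_production()))
```

In practice this catalog would be backed by a managed service with access controls and an audit trail, but the queryable record of ownership, purpose and lineage is the essential feature.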

The second part of governance, which applies to all gen AI/ML use cases, is the approval process before a use case is moved to production. Companies need to adhere to the regulations that apply to them, but also to internal policies created by their compliance and risk teams, and use cases frequently need internal approval. Companies can now use gen AI techniques like Automated Reasoning to accelerate this process. Automated Reasoning works by creating a mathematical model of a policy that can then be interrogated to see whether a use case is compliant. It explains why it is, or is not, with mathematical certainty, mitigating the risk of hallucinations in the answer.
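Automated Reasoning builds a formal mathematical model of a policy and proves compliance; the rule-based Python sketch below is a much simpler stand-in that only illustrates the shape of an explainable approval check. Every rule name, field and approved-model value here is a made-up example:

```python
# Each rule returns (passed, reason) so every decision is explainable.
POLICY_RULES = [
    ("no_pii_in_prompts",
     lambda uc: (not uc["sends_pii_to_model"],
                 "prompts must not contain raw PII")),
    ("human_in_the_loop",
     lambda uc: (uc["human_review"] or uc["risk_tier"] == "low",
                 "medium/high-risk use cases need human review")),
    ("approved_model_only",
     lambda uc: (uc["model"] in {"model-a", "model-b"},
                 "only approved models may be used")),
]


def check_compliance(use_case):
    """Evaluate a proposed use case against every policy rule and
    return (compliant, list of failed rules with reasons)."""
    failures = []
    for name, rule in POLICY_RULES:
        passed, reason = rule(use_case)
        if not passed:
            failures.append((name, reason))
    return len(failures) == 0, failures


use_case = {"sends_pii_to_model": False, "human_review": True,
            "risk_tier": "medium", "model": "model-a"}
ok, failures = check_compliance(use_case)
print(ok)   # True: every rule passed
```

Unlike this sketch, Automated Reasoning can prove compliance across all possible inputs rather than checking one case at a time, which is what gives it mathematical certainty.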

4 – Financial management

Finance Operations (FinOps) is an operational framework and evolving cultural practice that creates financial accountability through collaboration between engineering, finance, and business teams. Gaining executive sponsorship to drive FinOps success is a critical first step in building a culture of financial discipline. As industries and organizations across the world adopt cloud capabilities and deploy technology solutions, such as gen AI/ML, increased attention has turned towards necessary collaborations across functions. It is important to make data-driven financial decisions and to track the business value delivered from the investment. For a deeper dive into this topic read Unveiling Blind Spots in FinOps.

Delivering change

Building a solution that supports the ongoing execution of a gen AI/ML strategy, the delivery of sustainable levels of innovation, and the reliable realization of the business benefits that flow from them requires change management. AWS can help you through this with a six-phase approach.

Change management

Robust change management processes enable businesses to manage the people, process and sometimes cultural change that is necessary to maximize the value of the technology being used and mitigate implementation risks.

Figure 2 – Three phases of technology deployment

MLOps solution delivery change management

Customers making the most progress towards their gen AI and ML goals typically deploy an ML Operations (MLOps) approach to streamline the lifecycle of ML models from development to deployment and monitoring. The approach helps to ensure scalability, reproducibility, and continuous integration/deployment of models, enabling banks to efficiently manage production-level reliability, compliance and performance.
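As one concrete example of the monitoring side of MLOps, the sketch below computes the Population Stability Index (PSI), a drift metric widely used in banking, over a model's score distribution. The bucket proportions are made-up data, and the 0.2 alert threshold is only a common rule of thumb:

```python
import math


def psi(expected_props, actual_props):
    """Population Stability Index: compares a model's score distribution
    in production against the distribution seen at validation time.
    Inputs are per-bucket proportions that each sum to 1."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected_props, actual_props)
               if e > 0 and a > 0)


baseline = [0.25, 0.25, 0.25, 0.25]     # score distribution at validation
production = [0.24, 0.26, 0.25, 0.25]   # distribution observed in production

drift = psi(baseline, production)
ALERT_THRESHOLD = 0.2                   # rule of thumb: above ~0.2, investigate
print(drift < ALERT_THRESHOLD)          # True: no alert for this small shift
```

In a full MLOps pipeline this check would run on a schedule against live inference logs, raising an alert (and potentially triggering retraining) when the threshold is crossed.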

Figure 3 – Building MLOps capabilities

Working with your account team, industry specialists, and AWS Professional Services, we help our customers’ technology stack teams build the necessary processes and best practices. Our six-phase approach helps to align interests and create shared objectives. AWS customers can also leverage the AWS generative AI Launchpad proposition to accelerate the delivery of gen AI business benefits and consistently generate returns from this investment.

AWS and you

AWS offers bank leaders a comprehensive range of generative AI/ML services, along with the broad and deep set of compute, storage, database, inference and analytical tools necessary to deliver the business value they can unlock.

Correctly configured technology can deliver secure, resilient, compliant, cost effective and scalable solutions. The ‘how’ of a bank’s adoption of technology, and implementation, to meet their organization’s needs is what separates successful banks from the rest. The adoption of an AI stack for banking is as much a business journey as it is a technology one.

AWS can help its customers:

  • Build awareness: Offering training and support, from the C-suite to data scientists.
  • Establish foundations: Advise on best practices for governance and control, along with the prioritization of use cases and AI/ML/gen AI operations.
  • Be aware of emerging capabilities and technology developments: Stay informed about vanguard innovations, making it quicker for you to adopt new technology and features.
  • Integrate into your business operations: Have access to the broadest set of services that integrate with your applications.

Conclusion

We discussed how challenging it can be for banks (or any industry) to move from the POC phase of gen AI and ML adoption to scalable, reliable, compliant production that delivers value based on business needs. Organizations need business leaders who understand that building an AI stack and supporting it with the right people, processes, governance and control is essential. Establishing the right AI stack for your company’s needs can increase the speed and predictability of innovation while lowering both the cost and risk of execution.

Contact an AWS Representative to discuss how we can help accelerate your business, or visit AWS for Financial Services to learn more.

Further reading

Richard Caven

Richard Caven is a Worldwide Banking Specialist at AWS. He is responsible for the development and execution of strategic initiatives to help customers migrate to the cloud and drive their digital transformation journey. Richard joined AWS in 2018 from Barclays, where he was a Managing Director and COO for the Global Treasury function.

Silvia Prieto


Silvia is the Head of generative AI and ML for Global Financial Services companies in EMEA, Asia-Pacific and Japan. In her current job, she is responsible for shaping and delivering AI Go-To-Market strategies and thought leadership. Silvia plays a crucial role in assisting organizations in understanding the nuances of these innovative technologies and their applications, as well as providing guidance on large-scale implementation.