AWS for Games Blog

Designing compliant and secure betting and gaming applications on AWS

In the betting and gaming (B&G) industry, operators must navigate complex regulatory requirements while maintaining exceptional user experiences. Local governing bodies mandate specific rules about where workloads can run and how data must be handled, making traditional cloud deployments challenging.

Many gaming operators have found success by implementing hybrid architectures on Amazon Web Services (AWS) that combine edge computing with cloud services. Through this approach they can process data closer to players for ultra-low latency, while verifying compliance with local data sovereignty laws.

Operators of B&G workloads (such as online casinos and sportsbooks) are bound by strict regulatory requirements, put in place by the local governing body. These regulations provide prescriptive guidance on what workloads can be run in a specific jurisdiction, and even where they can be operated.

For B&G customers looking to leverage AWS Cloud capabilities, a common approach involves deploying a combination of AWS Local Zones, Wavelength Zones, and AWS Outposts alongside the parent AWS Region. This hybrid strategy provides organizations with a way to deliver low-latency performance for customer-facing applications while adhering to data residency regulations.

The hybrid architecture on AWS offers a two-part solution:

  1. Regulated components run at the edge to meet local jurisdiction requirements
  2. Non-regulated workloads use Regional AWS infrastructure

The architecture allows access to the full range of AWS cloud services and scalability benefits. The hybrid deployment model, built on AWS Global Infrastructure, facilitates management and streamlines the architecture by providing operators with consistent APIs and core services across all locations.

AWS hybrid and edge services provide B&G operators with a solution to meet data residency and network performance requirements. However, each of these offerings typically represents a single physical location, making it essential for operators to design architectures that address resiliency needs. Using B&G customer implementations as a case study, let’s take a look at some of the architectural patterns for site-level Disaster Recovery (DR) using AWS hybrid and edge offerings.

The patterns for resiliency

The term resiliency refers to how an application can recover from infrastructure failures while meeting its Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Unlike AWS Regions, which contain three or more Availability Zones (AZs) with native connectivity, the majority of locations where a Local Zone or Wavelength Zone is available today consist of a single site. Furthermore, AWS Outposts are deployed in datacenters selected by the customer.

Failure to consider the impact of site-specific disruptions can result in prolonged service interruption. This can lead to significant revenue loss in the short-term and, if frequent, customer attrition in the longer-term.

Customers should leverage AWS Regions to host B&G workloads where possible. However, this option is not always viable. For example, operators in the United States are required to host regulated workload components within the boundary of each state in which they operate. In states without an AWS Region, resiliency must be achieved by combining deployment options from AWS Local Zones, AWS Wavelength Zones, and AWS Outposts.

AWS Local Zones are managed AWS locations found in large metropolitan areas around the world. AWS Wavelength Zones are similar, but are hosted in a partner Communications Service Provider’s (CSP) datacenter (such as Verizon in the United States). Lastly, Outposts bring managed compute to a datacenter location specified by the customer.

Ordered by customer preference, the following list outlines the combinations that customers have access to. The order stems from key factors such as management overhead, the ability to scale as needed, and cost:

  1. Regional deployment (Highest Preference)
  2. Local Zone and Wavelength Zone deployment
  3. Local Zone and Outpost deployment
  4. Wavelength Zone and Outpost deployment
  5. Outpost for primary and secondary sites

For details on which offerings are available for B&G use in a specific jurisdiction, please consult with your AWS Account Team. In the subsequent sections, we focus on implementation details for each option and lessons learned from the field.

When selecting an option, additional tests should be run to verify that the latency matches your workload’s requirements. Furthermore, consider deploying non-regulated components (for example, those not bound by jurisdictional requirements) in the AWS Region.
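
Before committing to an option, a simple connect-time measurement against candidate endpoints can give a first approximation of the latency end users will experience. The following is a minimal Python sketch under that assumption; the hostnames are hypothetical placeholders for test targets you would stand up in each candidate location.

```python
# Minimal latency probe. The endpoints below are hypothetical placeholders
# for test targets deployed in each candidate location.
import socket
import statistics
import time

CANDIDATE_ENDPOINTS = {
    "region": "test.region.example.com",
    "local-zone": "test.lz.example.com",
    "wavelength-zone": "test.wlz.example.com",
}

def tcp_connect_rtt_ms(host: str, port: int = 443, samples: int = 10) -> float:
    """Return the median TCP connect time to host:port, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)

for name, host in CANDIDATE_ENDPOINTS.items():
    print(f"{name}: {tcp_connect_rtt_ms(host):.1f} ms median connect time")
```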

1. Regional deployment

If an AWS Region is available and approved for use in a jurisdiction, it should be the first option for workload deployment. The Region provides the most scalability for compute and storage. Furthermore, AZs deliver the quickest path towards establishing resiliency, allowing stateful workload components (such as databases and cache) to be synchronized through native low-latency, high throughput inter-connectivity. In the United States, Ohio is an example where B&G operators opt to use the AWS Region to deploy regulated workload components.

Customers leveraging AWS Regions, in addition to hybrid and edge locations, for their complete deployment strategy should consider standardizing on a core set of services and capabilities that exist across all locations. This streamlines pipeline management and creates consistency in how workloads perform across different locations, regardless of the deployment modality.

This image depicts a simple architecture with two EC2 instances spread across two Availability Zones, with an arrow between them showing that data can flow between the two instances.

Figure 1: AWS Region-based resiliency architecture leveraging multiple Availability Zones.

The preceding diagram shows a high-level view of how AZs can be used to achieve fault tolerance. AWS manages the connectivity and confirms physical separation between locations, so end users can focus on the workload. Workloads in one Availability Zone can communicate with workloads in another Availability Zone using native connectivity. No additional configuration is required.

In the example of Figure 1, the transactional database is representative of a regulated workload that needs to reside within the state boundary. Networking constructs (such as Amazon Virtual Private Cloud (Amazon VPC)) extend across AZs in a Region. This allows instances in one AZ to communicate with instances in another with no additional overhead.
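
To make the setup concrete, the following boto3 sketch creates a VPC with one subnet in each of two AZs; the Region, AZ names, and CIDR ranges are illustrative assumptions rather than prescribed values.

```python
# A minimal sketch of the Figure 1 layout: one VPC whose subnets span two
# Availability Zones. Region, AZ names, and CIDRs are illustrative.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-2")  # e.g., Ohio

vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]

# One subnet per AZ; instances in either subnet can reach the other over
# the VPC's native cross-AZ connectivity with no additional configuration.
for az, cidr in [("us-east-2a", "10.0.1.0/24"), ("us-east-2b", "10.0.2.0/24")]:
    subnet = ec2.create_subnet(VpcId=vpc_id, CidrBlock=cidr, AvailabilityZone=az)
    print(az, subnet["Subnet"]["SubnetId"])
```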

The AWS Region allows for direct ingress, or for indirect ingress by tunneling through other Regions. Customers may opt for the former approach to reduce roundtrip latency from end-user clients. The latter approach trades some latency for a centralized ingress point for public traffic, which streamlines security by creating a single set of public endpoints to observe and protect. In both scenarios, Regional AWS offerings such as AWS WAF and Amazon CloudFront can be leveraged.

2. Local Zone and Wavelength Zone deployment

If an AWS Region cannot be leveraged, a combination of Local Zone and Wavelength Zone provides the next best option to meet operational resiliency requirements. Both offerings provide operators the ability to deploy core AWS offerings (such as Amazon Elastic Compute Cloud (Amazon EC2) instances, Application Load Balancers and Amazon Elastic Kubernetes Service (Amazon EKS)) without needing to manage networking or a physical footprint.

This deployment model comes with specific architectural considerations. First, we need to account for ingress traffic flow from end users. Second, we must consider how traffic moves between the Wavelength Zone and Local Zone. This is especially important for replicating stateful components. Let’s examine these traffic flows in detail.

This image shows a high-level user journey of accessing backend services from an external mobile device. The image shows traffic entering the AWS Region through a CloudFront Distribution. If allowed, it makes its way down to a Regional Availability Zone, where it is then proxied to either a Local Zone or a Wavelength Zone. The diagram also depicts how traffic can flow between a Local Zone and a Wavelength Zone. A TLS/SSL VPN is set up allowing communication between the two sites. The VPN tunnel flows through the Internet Gateway on the Local Zone and the Carrier Gateway on the Wavelength Zone.

Figure 2: Alternative resiliency architecture leveraging Local Zones and Wavelength Zones.

North-South traffic flow summary:

  1. Traffic enters through Regional endpoints. In Figure 2, a CloudFront distribution is protected by AWS WAF and AWS Shield Advanced against common attacks.
  2. Traffic then reaches a fleet of proxy instances. These can be web servers (such as NGINX running on EC2 instances) with routing logic that evaluates each incoming request and forwards it to the appropriate site (a minimal routing sketch follows this list).
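
The routing logic itself can be simple. The following Python sketch illustrates the idea with a hypothetical "X-Jurisdiction" request header and made-up internal endpoints; a production fleet would more likely run NGINX or a similar proxy, as noted above.

```python
# Minimal illustration of the proxy fleet's routing decision. The header
# name and backend endpoints are hypothetical.
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

# Map each jurisdiction to its regulated edge deployment (illustrative).
EDGE_BACKENDS = {
    "nj": "http://lz-nj.internal.example.com",
    "ny": "http://wlz-ny.internal.example.com",
}

class JurisdictionProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = EDGE_BACKENDS.get(self.headers.get("X-Jurisdiction", ""))
        if backend is None:
            self.send_error(421, "Unknown jurisdiction")
            return
        # Forward the request to the edge site and relay the response body.
        with urllib.request.urlopen(backend + self.path) as upstream:
            status, body = upstream.status, upstream.read()
        self.send_response(status)
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 8080), JurisdictionProxy).serve_forever()
```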

Currently, traffic can only directly ingress into a Wavelength Zone through the Carrier Gateway (CGW) if the source is on the same network as the partner telecommunications provider. To overcome this limitation, we recommend proxying all traffic through the AWS Region. Both Wavelength Zones and Local Zones are anchored to an AWS Region through managed circuits, referred to as the Service Link. These extend the control plane and the data plane, enabling resources in AZ subnets to seamlessly communicate with those at the edge.
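
As a concrete sketch of extending a VPC into a Wavelength Zone behind a CGW, the boto3 calls below opt in to the zone group, create an edge subnet, and route its traffic through a Carrier Gateway. The zone group, zone name, and resource IDs are illustrative placeholders.

```python
# A minimal sketch: opt in to a Wavelength Zone group, create a subnet there,
# and route its traffic through a Carrier Gateway. Names/IDs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
VPC_ID = "vpc-0123456789abcdef0"  # VPC already anchored in the parent Region

# Opt in to the Wavelength Zone group before creating resources in it.
ec2.modify_availability_zone_group(GroupName="us-east-1-wl1", OptInStatus="opted-in")

# Subnet in the Wavelength Zone, in the same VPC as the Regional subnets.
subnet = ec2.create_subnet(
    VpcId=VPC_ID,
    CidrBlock="10.0.10.0/24",
    AvailabilityZone="us-east-1-wl1-bos-wlz-1",
)["Subnet"]

# The Carrier Gateway is the Wavelength Zone's entry/exit point to the
# partner CSP network.
cgw = ec2.create_carrier_gateway(VpcId=VPC_ID)["CarrierGateway"]

rt = ec2.create_route_table(VpcId=VPC_ID)["RouteTable"]
ec2.create_route(
    RouteTableId=rt["RouteTableId"],
    DestinationCidrBlock="0.0.0.0/0",
    CarrierGatewayId=cgw["CarrierGatewayId"],
)
ec2.associate_route_table(RouteTableId=rt["RouteTableId"], SubnetId=subnet["SubnetId"])
```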

Traffic can traverse between a Local Zone and Wavelength Zone through a TLS or SSL VPN tunnel. Additionally, consider proxying end-user traffic to edge locations from the Region. Maintaining a single ingress point is less complicated than one for each edge location.

Beyond bypassing network restrictions, traffic traversal through a Service Link also has security and operational advantages. Flowing through the Region provides access to security tooling (such as AWS WAF and Shield Advanced) not available at the edge. Customer managed security appliances would need to be deployed for similar functionality. Furthermore, creating a centralized ingress that routes traffic based on the intended destination provides additional, non-DNS based avenues for workload failover.

East-West traffic flow summary:

  1. More-specific routing (MSR) is leveraged to point to firewall Elastic Network Interfaces (ENIs) as the next hop for traffic traversing between Wavelength Zone and Local Zone subnets.
  2. Connectivity between the Wavelength Zone and the Local Zone is established through an SSL or TLS VPN tunnel. The tunnel is initiated on the Wavelength Zone side.
  3. Changes are synced between the primary copy hosted in the Local Zone and a secondary copy hosted in the Wavelength Zone through the VPN tunnel. This forms a bidirectional channel for communication.

The ability for traffic to flow in the East-West direction, so that stateful components like databases can be synchronized, is critical to a highly available or fault-tolerant deployment. While the Wavelength Zone and Local Zone are each connected to the Region by a native Service Link, it is not possible to traverse between the sites through this path: traffic traversing two Service Links will be dropped.

This image depicts an anti-pattern for traffic flow between a Wavelength Zone and a Local Zone. In the image, traffic exiting the Local Zone over the Service Link and attempting to ingress through the Wavelength Zone Service Link is blocked. A red “X” denotes that traffic egressing the Local Zone and bound for the Wavelength Zone will be dropped when it reaches the AWS Region.

Figure 3: Traversal through multiple Service Links—resulting in dropped traffic.

Currently, Local Zone traffic is treated as coming from a non-telecommunications network and is therefore blocked by perimeter firewalls. To enable East-West traversal between the Local Zone Internet Gateway (IGW) and the Wavelength Zone CGW, we must rely on an SSL or TLS VPN tunnel initiated from the Wavelength Zone. A security appliance needs to be deployed in both the Local Zone and the Wavelength Zone (for example, Fortinet FortiGate Next-Generation Firewall), and more-specific routes (MSR) must be configured to use the appliance’s ENI as the next hop.
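
The MSR configuration amounts to two boto3 calls, sketched below with placeholder IDs and CIDRs: disable the source/destination check on the appliance's ENI so it can forward traffic it did not originate, then add a route for the peer site's CIDR pointing at that ENI.

```python
# A minimal MSR sketch; all IDs and CIDRs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

FIREWALL_ENI = "eni-0123456789abcdef0"  # security appliance in this zone
PEER_SITE_CIDR = "10.0.20.0/24"         # subnet at the other edge location

# The appliance forwards traffic it did not originate, so the ENI's
# source/destination check must be disabled.
ec2.modify_network_interface_attribute(
    NetworkInterfaceId=FIREWALL_ENI,
    SourceDestCheck={"Value": False},
)

# More-specific route: overrides the VPC local route for the peer CIDR so
# East-West traffic takes the VPN overlay instead of the Service Link path.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationCidrBlock=PEER_SITE_CIDR,
    NetworkInterfaceId=FIREWALL_ENI,
)
```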

The VPN tunnel acts as an overlay, enabling direct bidirectional communication between resources in the two locations. This tunnel can flow entirely over the internet, or through AWS Partners like Megaport. The latter approach reduces the impact of possible network congestion along the traffic traversal route. It is critical that the tunnel is initiated from the Wavelength Zone or traffic will be dropped by the CGW.

3. Local Zone and Outpost deployment

In locations where a combination of a Wavelength Zone and a Local Zone is not available (or the Wavelength Zone is not yet approved for B&G workload usage), customers may augment the Local Zone with AWS Outposts deployed at a location of their choosing.

Unlike the first two resiliency options, this approach requires upfront planning from the customer, such as selecting the datacenter where the Outpost will be installed and planning compute capacity ahead of time.

East-West traffic flow summary:

  1. The MSR functionality can be used alongside AWS Direct Connect or the public internet to facilitate East-West traffic traversal between an Outpost and a Local Zone.
  2. Security appliances may be used to encrypt traffic as it flows between sites, and as a target for MSR when a single VPC spans the Local Zone and Outpost.

Figure 4 depicts the architectural pattern to send traffic directly between a Local Zone and an Outpost with high performance and low latency using Direct Connect. Traffic sent out of the Outpost Local Gateway (LGW) flows into a Virtual Private Gateway (VGW) attached to the Local Zone VPC.

For a direct, low-latency path, we recommend leveraging a Direct Connect Private Virtual Interface (VIF) that terminates on a VGW attached to the Local Zone VPC. Note that the Transit Gateway (TGW) is a Regional construct that does not extend to the Local Zone.

The example leverages two VPCs, one at each location. This is necessary to create a symmetric route between the sites: the VGW cannot be used as the target of a more-specific routing entry, so with a single VPC, intra-VPC traffic exiting the Local Zone would be sent over the local route, taking it through the Region and over the Service Links. As discussed previously, this results in dropped packets.
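
A minimal boto3 sketch of the Figure 4 connectivity follows: create a VGW, attach it to the Local Zone VPC, and provision a Private VIF that terminates on it. The connection ID, VLAN, and ASN are placeholders for values from your own Direct Connect setup.

```python
# A minimal sketch of the Figure 4 pattern; IDs, VLAN, and ASN are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
dx = boto3.client("directconnect", region_name="us-east-1")

# Virtual Private Gateway attached to the Local Zone VPC.
vgw_id = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]["VpnGatewayId"]
ec2.attach_vpn_gateway(VpnGatewayId=vgw_id, VpcId="vpc-0123456789abcdef0")

# Private VIF over an existing Direct Connect connection, terminating on the VGW.
dx.create_private_virtual_interface(
    connectionId="dxcon-0123abcd",
    newPrivateVirtualInterface={
        "virtualInterfaceName": "outpost-to-localzone",
        "vlan": 101,
        "asn": 65001,  # customer-side BGP ASN
        "virtualGatewayId": vgw_id,
    },
)
```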

This image shows the most direct flow for traffic traversal between a Local Zone and an Outpost. Traffic leaving the Outpost egresses through the Local Gateway, flows to a Direct Connect Point-of-Presence, and subsequently takes a Private VIF to the Local Zone. The ingress point at the Local Zone is a Virtual Private Gateway. Once the traffic reaches the Local Zone, it can access resources such as EC2 instances residing in Local Zone subnets.

Figure 4: Direct Connect for East-West traversal between AWS Local Zones and Outposts.

An alternative approach to Figure 4, if a single VPC is desirable, is to deploy Security Appliances on either side and establish a VPN tunnel. A more-specific route can be created on either side using the appliance’s ENI. An upside to this approach, at the cost of management overhead, is the native encryption provided by the VPN tunnel.

Customers who do not wish to establish a Direct Connect between sites, and do not want to expose backend systems to the internet, can also leverage the security appliance approach for East-West connectivity. This increases latency and requires publicly routable IP addresses for the appliances.

This diagram shows the internet-based traffic traversal pattern between an Outpost and a Local Zone. In the diagram, both the Outpost and the Local Zone contain a TLS/SSL VPN appliance. The Outpost site’s Local Gateway leverages CoIP/NAT to map private IP addresses to publicly routable ones. A tunnel is initiated from the Local Zone over the internet; the traffic flows through the Internet Gateway on the Local Zone side and reaches the Local Gateway on the Outpost side. This creates a bidirectional flow for traffic.

Figure 5: Traffic traversal over internet with a single VPC and Security Appliances.

4. Wavelength Zone and Outpost deployment

Where a Local Zone is not available, or not approved for use, an alternative may be to leverage a Wavelength Zone in addition to an Outpost. The East-West traffic patterns are similar to those of the Local Zone and Wavelength Zone deployment model (option 2) discussed earlier.

There are two options to consider here:

  1. Create a TLS or SSL VPN Tunnel between the Wavelength Zone and the Outpost site over a public network using Security Appliances.
  2. Use an AWS Partner (such as Megaport) to create a secure path between the Wavelength Zone and the Outpost. This pattern is useful in cases where the Outpost site may not have access to public IPs, or the operator does not wish to directly expose their datacenter to the internet. To establish East-West connectivity, a customer can terminate a Multiprotocol Label Switching (MPLS) or Direct Connect circuit into a provider like Megaport. From there, Megaport Virtual Edge (MVE) appliances, including the hub VPN server, can be used to establish TLS or SSL VPN connectivity with the Wavelength Zone. This pattern is depicted in the following diagram.

The diagram outlines a direct traffic flow between an Outpost and a Wavelength Zone location. The traffic egresses through the Outpost’s Local Gateway and reaches a Direct Connect Point-of-Presence, where it is handed off to Megaport over their Transit VXC offering. From Megaport, an SSL/TLS VPN tunnel is established over the internet. The traffic flows through the Wavelength Zone Carrier Gateway and terminates at a TLS/SSL VPN appliance. Resources sitting behind the VPN appliance are accessible using this route.

Figure 6: Minimizing traffic between Outpost and Wavelength Zone by leveraging Megaport.

5. Outpost for primary and secondary sites

The final scenario reflects a jurisdiction where neither Wavelength Zones nor Local Zones exist, or where they are not available for regulated B&G use. In this scenario, an operator can leverage Outposts racks for both the primary and secondary site. This deployment pattern presents the highest operational overhead and trades away elasticity: capacity is limited to the pre-defined instance types ordered with the racks, so the customer must plan end-user capacity upfront.

Traffic can flow in the North-South direction between the Outpost and the AWS Region through a Service Link. This is the native path used for the control plane and for intra-VPC traffic following the route table’s local entry.

For traffic with high packet rates or throughput, we recommend bypassing the Service Link and using the LGW instead. More-specific routes must be explicitly configured in Regional and Outpost subnets to mitigate asymmetric traffic traversal. The overarching architecture is outlined in the following diagram.
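
The more-specific route on the Outpost side can be sketched with a single boto3 call; the route table ID, Regional CIDR, and LGW ID below are placeholders, and a mirrored route is needed on the Regional side to keep the path symmetric.

```python
# A minimal sketch of steering Region-bound Outpost traffic out the Local
# Gateway rather than the Service Link. IDs and CIDRs are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",    # Outpost subnet route table
    DestinationCidrBlock="10.0.0.0/24",      # Regional subnet CIDR
    LocalGatewayId="lgw-0123456789abcdef0",  # Outpost Local Gateway
)
```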

This diagram depicts the path for traffic traversal between an Outpost and the AWS Region. In the diagram, a single VPC extends from the AWS Region to the Outpost. Routing is configured so that traffic bound for Outpost subnets is routed through a Transit Gateway. From there, a Direct Connect path is taken to reach the datacenter and subsequently the Local Gateway.

Figure 7: High performance path for traffic in the North-South direction.

We can also facilitate East-West connectivity between the primary and secondary site by leveraging the LGW. As discussed previously, traffic cannot traverse multiple Service Links; it will be dropped.

To establish East-West connectivity between Outpost sites:

  1. Leverage public or MPLS-based connectivity between the two sites.
  2. Use the SiteLink functionality of Direct Connect to connect two otherwise isolated sites together. If both sites leverage Direct Connect for North-South traffic traversal, enabling SiteLink causes routes to be advertised to AWS routing equipment in the Direct Connect Points-of-Presence. This creates a low-latency, high-performance path between sites using AWS infrastructure (a minimal sketch follows Figure 8).

This diagram outlines the scenario where two Outpost sites are used for resiliency. It depicts traffic flow from Outpost Site 1 to Outpost Site 2 over Direct Connect. Traffic egresses from Outpost Site 1 through the Local Gateway and proceeds to the Direct Connect Point-of-Presence for Site 1. With SiteLink enabled, traffic then proceeds to the Direct Connect Point-of-Presence for Site 2, from which it can reach the Local Gateway and Outpost resources in Site 2.

Figure 8: Outposts using Direct Connect SiteLink.
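
Assuming each site already has a Private VIF over Direct Connect, enabling SiteLink is a per-VIF attribute change, sketched below with placeholder VIF IDs.

```python
# A minimal sketch: enable SiteLink on the Private VIF at each Outpost site
# so routes are exchanged between the two Points-of-Presence. IDs are
# placeholders.
import boto3

dx = boto3.client("directconnect", region_name="us-east-1")

for vif_id in ("dxvif-site1abc", "dxvif-site2def"):
    dx.update_virtual_interface_attributes(
        virtualInterfaceId=vif_id,
        enableSiteLink=True,
    )
```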

Conclusion

AWS hybrid and edge offerings (such as Wavelength Zones, Local Zones, and AWS Outposts) provide operators the ability to deploy regulated betting and gaming workloads while meeting compliance requirements. We explored five different combinations of offerings that can be used to achieve resiliency, and detailed the design patterns seen working with B&G customers.

To start developing your B&G workload on the AWS Cloud, contact your AWS Account Team, or contact us.

Manthan Raval

Manthan Raval is a Principal Solutions Architect at AWS specializing in the gaming industry, where he helps customers architect secure, high-performance, scalable cloud solutions. He has guided numerous organizations in their cloud transformation journeys, developing robust architectures that drive business value. He shares his knowledge through public speaking, technical workshops, and customer education sessions across the globe.

Robert Belson

Robert is a Developer Advocate in the AWS Worldwide Telecom Business Unit, specializing in AWS Edge Computing. He focuses on working with the developer community and large enterprise customers to solve their business challenges using automation, hybrid networking and the edge cloud.

Santosh Vallurupalli

Santosh Vallurupalli is a Sr. Solutions Architect at AWS. Santosh specializes in networking, containers, and migrations and enjoys helping customers in their journey of cloud adoption and building cloud native solutions for challenging issues.