Microsoft Workloads on AWS

Optimize CPUs best practices for SQL Server workloads – continued

In this blog post, we will extend the analysis of the Optimize CPUs feature provided in a previous publication to other Amazon Elastic Compute Cloud (Amazon EC2) instances suitable for deployment of Microsoft SQL Server workloads on the Amazon Web Services (AWS) Cloud. Amazon EC2 now allows customers to modify an instance’s CPU options to optimize the licensing costs of Microsoft Windows license-included workloads. You can now customize the number of vCPUs and/or disable hyperthreading on Windows Server and SQL Server license-included instances to save on vCPU-based licensing costs.

Introduction

In our previous blog post, we detailed the Optimize CPUs feature for the EC2 r6idn instance type. There are many other instance types available, each with different combinations of CPU, memory, storage, and networking capacity.

Due to the multitude of AWS instance families and multiple variations within each family, we are unable to test SQL Server performance on all available Amazon EC2 instance types. We limited our testing to instances matching criteria established for SQL Server workloads.

We focused our SQL Server performance testing on Memory Optimized instances, which offer 8, 16, and 32 GB of RAM per vCPU. Access to a large amount of RAM is beneficial for SQL Server. Within the Memory Optimized instance type, we selected instances with the highest IOPS as SQL Server performance strongly correlates with the performance of the storage subsystem. Our EC2 instance selection was limited to those that use Intel or AMD processors as SQL Server at present does not support ARM-based AWS Graviton Processors. Finally, we focused only on the latest generations of the instance types that matched our criteria.

In addition to the EC2 r6idn instance, which we analyzed in our previous blog post, our selection criteria left us with four instance types for additional analysis. There are multiple instance sizes within each instance type, but we limited our testing to the largest and mid-tier instances within each type. The list of instances included in testing is presented in Table 1.

Type Size #CPU CPU Clock RAM IOPS
r7i 48xlarge 192 3.2 GHz 1.5 TiB 240,000
24xlarge 96 768 GiB 120,000
r7a 48xlarge 192 3.7 GHz 1.5 TiB 240,000
24xlarge 96 768 GiB 120,000
x2idn 32xlarge 128 3.5 GHz 1.0 TiB 260,000
16xlarge 64 2.0 TiB 173,333
x2iedn 32xlarge 128 3.5 GHz 4.0TiB 260,000
16xlarge 64 2.0 TiB 130,000

Table 1. Instances selected for performance testing.

Performance test configuration

From the results published in our previous blog post, disabling hyper-threading provides the most significant reduction in license cost as both Windows and SQL Server licensing depends on the number of active processors. Disabling hyper-threading cuts the number of processors by half, significantly reducing the license cost. Further cost optimization is possible by disabling some of the cores, but it offers diminishing returns and generally requires additional performance testing to avoid shifting a workload to the CPU-bound category.

Our focus during performance testing was to compare the performance of SQL Server on the native instance with the performance achieved on the same instance after disabling hyper-threading. The EC2 r7a instance type presented a challenge with this approach as it does not support hyper-threading and all virtual CPUs are actual physical cores. Following the same pattern, we tested the EC2 r7a instances with default number of active CPUs and then we disabled half of them using the Optimize CPUs feature to simulate disabling the hyper-threading.

To run the performance tests, we used an industry-standard HammerDB benchmarking tool using TPCC-like workload emulating OLTP databases. For the larger instances (32xlarge and 48xlarge) in each type we used the 8 TiB HammerDB database (75,000 warehouses). For the smaller instances (16xlarge and 24xlarge respectively) of each type, which have less RAM, we used the 3.5 TiB database (30,000 warehouses). We configured HammerDB performance tests with the “Use All Warehouses” option to increase the I/O load on the system.

Performance test results

Table 2 summarizes performance test results for native instances and the same instances with hyper-threading disabled. Please note, as mentioned earlier, the EC2 r7a instance type does not use hyper-threading, so for our tests, we disabled half of the cores. We also calculated the on-demand monthly cost for respective license included instance configurations with Windows Server and Microsoft SQL Server Enterprise Edition using pricing for the US East (N. Virginia) region.

Instance Type

RAM

(GB)

MAX IOPS vCPU

CPU

(%)

Monthly cost Perf.
(TPM)
Perf.
(relative)
Cost Savings Cost Savings
(relative)
r6idn.32x 1,024 400,000 128 49% $48,466.86 1,787,732  
64 75% $28,797.74 1,778,883 99.50% $19,669.12 41%
r7i.48x 1,536 240,000 192 30% $68,278.94 1,122,192  
96 56% $38,775.26 1,120,590 99.86% $29,503.68 43%
r7a.48x 1,536 240,000 192 28% $69,670.03 1,119,488  
96 52% $40,166.35 1,119,560 100.01% $29,503.68 42%
x2idn.32x 2,048 260,000 128 27% $49,074.98 1,311,519  
64 48% $29,405.86 1,307,481 99.69% $19,669.12 40%
x2iedn.32x 4,096 260,000 128 36% $58,811.72 1,652,385  
64 52% $39,142.60 1,619,810 98.03% $19,669.12 33%

Table 2. Effect of disabling hyper-threading on performance and cost for large instances

In Table 2, SQL Server performance levels, provided in transactions-per-minute (TPM) achieved, is listed for each instance type. In some cases, disabling hyper-threading resulted in slightly lower performance, while in some other cases slightly larger. All these differences are within the expected HammerDB performance test deviations, so we may assume that disabling hyper-threading or reducing the number of cores for the EC2 r7a instance did not affect observed performance.

As expected, disabling hyper-threading or some cores resulted in increase in CPU utilization, but in none of the cases reaching levels exceeding 75%. Disabling hyper-threading or some cores resulted in significant cost savings between 33% and 43%! For most of the instance types we tested, cost savings exceeded 40% except for the EC2 x2iedn instance type, where savings are 33% due to the relatively high cost of the hardware as compared to the license cost of Windows Server and Microsoft SQL Server Enterprise edition. Results for mid-tier instances of each of the instance type that we analyzed, presented in Table 3, closely matched results for the largest instances in terms of the relative cost savings. Again, we see relative cost savings between 40% and 43% for all instance types we tested except for the EC2 x2iedn instance type, which lagged with the relative cost savings of 33%. Disabling hyper-threading and reducing the number of cores did not affect performance but resulted in an increase in CPU utilization.

Instance Type RAM MAX IOPS vCPU

CPU

(%)

Monthly

cost

Perf.
(TPM)
Perf.
(relative)
Cost Savings Cost Savings
(relative)
r6idn.16x 512 200,000 64 39% $24,233.43 924,974  
32 59% $14,398.87 919,199 99.38% $9,834.56 41%
r7i.24x 768 120,000 96 26% $34,139.47 574,159  
48 51% $19,387.63 574,656 100.09% $14,751.84 43%
r7a.24x 768 120,000 96 24% $34,835.02 587,171  
48 48% $20,083.18 586,643 99.91% $14,751.84 42%
x2idn.16x 1,024 173,333 64 36% $24,537.49 955,941  
32 58% $14,702.93 960,984 100.53% $9,834.56 40%
x2iedn.16x 2,048 130,000 64 32% $29,405.86 956,963  
32 46% $19,571.30 928,396 97.01% $9,834.56 33%

Table 3. Effect of disabling hyper-threading on performance and cost for mid-tier instances

Encouraged by these results, we decided to test how far we can go with disabling cores on EC2 r7a instances until the workload became CPU-bound. The results of this extended series of tests are presented in Table 4.

Instance Type RAM MAX IOPS vCPU

CPU

(%)

Monthly cost Perf.
(TPM)
Perf.
(relative)
Cost Savings Cost Savings
(relative)
r7a.24x 768 120,000 96 24% $34,835.02 587,171
48 48% $20,083.18 586,643 99.91% $14,751.84 42%
36 62% $16,395.22 589,366 100.37% $18,439.80 53%
24 85% $12,707.26 592,581 100.92% $22,127.76 64%
r7a.48x 1,536 240,000 192 28% $69,670.03 1,119,488
96 52% $40,166.35 1,119,560 100.01% $29,503.68 42%
72 69% $32,790.43 1,119,164 99.97% $36,879.60 53%
48 91% $25,414.51 1,118,900 99.95% $44,255.52 64%

Table 4. Effect of disabling cores on performance and cost for EC2 r7a instance type

Instances of this type have very fast AMD EPYC 4th generation (AMD EPYC 9R14) processors offering up to 3.7 GHz clock speed. This allowed us to reduce the number of cores by 75% before the workload became CPU-bound. This reduction in the number of cores resulted in cost savings of up to 64%! However, this also resulted in increasing CPU utilization to 85% and 91% respectively so we cannot recommend such a deep level of CPU optimization for every workload without specific load testing using real workloads.

Conclusion

Extending Optimize CPU feature to license-included deployments, offers cost savings for both Windows and SQL Server licenses. This results in a higher level of cost savings than were showed in our original blog post, which calculated savings only for SQL Server licenses assuming BYOL deployment.

SQL Server performance tests under various configurations, as discussed in this blog post, support the following conclusions and allow us to provide the following recommendations for effective use of the Optimize CPU feature.

  • Disabling hyper-threading results in ~45% cost reduction (SQL Server EE) or 50% reduction in the number of required SQL Server and Windows licenses without a negative effect on performance when testing with HammerDB.
  • For EC2 r7a instances, which do not support hyper-threading, disabling half of the cores appears to be a safe approach to reduce the cost without affecting SQL Server performance.
  • Disabling more than 50% in number of active CPU cores may provide some additional cost savings but would require performance testing on real workloads.
  • Applying Optimize CPUs feature results in a significant reduction in cost per 1,000 TPM.

These results were obtained using synthetic OLTP-type workloads. Your results may differ depending on the type of your database workload. Also, the cost and cost reduction estimates are calculated using baseline Microsoft license costs. Due to the different licensing options or negotiated discounts, your actual cost savings may deviate from the estimates provided in this blog post.


Vikas Babu Gali

Vikas Babu Gali

Vikas Babu Gali is a Senior Specialist Solutions Architect, focusing on Microsoft Workloads at Amazon Web Services. Vikas provides architectural guidance and technical assistance to customers across different industry verticals accelerating their cloud adoption.

Reghardt van Rooyen

Reghardt van Rooyen

Reghardt van Rooyen is a Senior Specialist Solutions Architect, focusing on Microsoft Workloads, at Amazon Web Services. Utilizing his 14 years of SQL Server database administration and leadership experience, Reghardt specializes in architecting high throughput SQL Server HADR solutions for enterprise customers. Always inquisitive, he explores AWS infrastructure and SQL Server database performance limits to ensure customers’ implementations are performant and cost optimized. As a native of South Africa, Reghardt enjoys rugby, BBQ, and spending time with his family and friends outdoors.

Alex Zarenin

Alex Zarenin

Alex Zarenin is a Principal Solutions Architect with Amazon Web Services. He works with financial services companies designing and implementing a broad array of technical solutions. With domain expertise in Microsoft technologies, Alex has more than 30 years of technical experience in both the commercial and public sectors. Alex holds a Ph.D. in Computer Science from NYU.