Microsoft Workloads on AWS
Optimize CPUs best practices for SQL Server workloads – continued
In this blog post, we will extend the analysis of the Optimize CPUs feature provided in a previous publication to other Amazon Elastic Compute Cloud (Amazon EC2) instances suitable for deployment of Microsoft SQL Server workloads on the Amazon Web Services (AWS) Cloud. Amazon EC2 now allows customers to modify an instance’s CPU options to optimize the licensing costs of Microsoft Windows license-included workloads. You can now customize the number of vCPUs and/or disable hyperthreading on Windows Server and SQL Server license-included instances to save on vCPU-based licensing costs.
Introduction
In our previous blog post, we detailed the Optimize CPUs feature for the EC2 r6idn instance type. There are many other instance types available, each with different combinations of CPU, memory, storage, and networking capacity.
Due to the multitude of AWS instance families and multiple variations within each family, we are unable to test SQL Server performance on all available Amazon EC2 instance types. We limited our testing to instances matching criteria established for SQL Server workloads.
We focused our SQL Server performance testing on Memory Optimized instances, which offer 8, 16, and 32 GB of RAM per vCPU. Access to a large amount of RAM is beneficial for SQL Server. Within the Memory Optimized instance type, we selected instances with the highest IOPS as SQL Server performance strongly correlates with the performance of the storage subsystem. Our EC2 instance selection was limited to those that use Intel or AMD processors as SQL Server at present does not support ARM-based AWS Graviton Processors. Finally, we focused only on the latest generations of the instance types that matched our criteria.
In addition to the EC2 r6idn instance, which we analyzed in our previous blog post, our selection criteria left us with four instance types for additional analysis. There are multiple instance sizes within each instance type, but we limited our testing to the largest and mid-tier instances within each type. The list of instances included in testing is presented in Table 1.
Type | Size | #CPU | CPU Clock | RAM | IOPS |
r7i | 48xlarge | 192 | 3.2 GHz | 1.5 TiB | 240,000 |
24xlarge | 96 | 768 GiB | 120,000 | ||
r7a | 48xlarge | 192 | 3.7 GHz | 1.5 TiB | 240,000 |
24xlarge | 96 | 768 GiB | 120,000 | ||
x2idn | 32xlarge | 128 | 3.5 GHz | 1.0 TiB | 260,000 |
16xlarge | 64 | 2.0 TiB | 173,333 | ||
x2iedn | 32xlarge | 128 | 3.5 GHz | 4.0TiB | 260,000 |
16xlarge | 64 | 2.0 TiB | 130,000 |
Table 1. Instances selected for performance testing.
Performance test configuration
From the results published in our previous blog post, disabling hyper-threading provides the most significant reduction in license cost as both Windows and SQL Server licensing depends on the number of active processors. Disabling hyper-threading cuts the number of processors by half, significantly reducing the license cost. Further cost optimization is possible by disabling some of the cores, but it offers diminishing returns and generally requires additional performance testing to avoid shifting a workload to the CPU-bound category.
Our focus during performance testing was to compare the performance of SQL Server on the native instance with the performance achieved on the same instance after disabling hyper-threading. The EC2 r7a instance type presented a challenge with this approach as it does not support hyper-threading and all virtual CPUs are actual physical cores. Following the same pattern, we tested the EC2 r7a instances with default number of active CPUs and then we disabled half of them using the Optimize CPUs feature to simulate disabling the hyper-threading.
To run the performance tests, we used an industry-standard HammerDB benchmarking tool using TPCC-like workload emulating OLTP databases. For the larger instances (32xlarge and 48xlarge) in each type we used the 8 TiB HammerDB database (75,000 warehouses). For the smaller instances (16xlarge and 24xlarge respectively) of each type, which have less RAM, we used the 3.5 TiB database (30,000 warehouses). We configured HammerDB performance tests with the “Use All Warehouses” option to increase the I/O load on the system.
Performance test results
Table 2 summarizes performance test results for native instances and the same instances with hyper-threading disabled. Please note, as mentioned earlier, the EC2 r7a instance type does not use hyper-threading, so for our tests, we disabled half of the cores. We also calculated the on-demand monthly cost for respective license included instance configurations with Windows Server and Microsoft SQL Server Enterprise Edition using pricing for the US East (N. Virginia) region.
Instance Type | RAM (GB) |
MAX IOPS | vCPU | CPU (%) |
Monthly cost | Perf. (TPM) |
Perf. (relative) |
Cost Savings | Cost Savings (relative) |
r6idn.32x | 1,024 | 400,000 | 128 | 49% | $48,466.86 | 1,787,732 | |||
64 | 75% | $28,797.74 | 1,778,883 | 99.50% | $19,669.12 | 41% | |||
r7i.48x | 1,536 | 240,000 | 192 | 30% | $68,278.94 | 1,122,192 | |||
96 | 56% | $38,775.26 | 1,120,590 | 99.86% | $29,503.68 | 43% | |||
r7a.48x | 1,536 | 240,000 | 192 | 28% | $69,670.03 | 1,119,488 | |||
96 | 52% | $40,166.35 | 1,119,560 | 100.01% | $29,503.68 | 42% | |||
x2idn.32x | 2,048 | 260,000 | 128 | 27% | $49,074.98 | 1,311,519 | |||
64 | 48% | $29,405.86 | 1,307,481 | 99.69% | $19,669.12 | 40% | |||
x2iedn.32x | 4,096 | 260,000 | 128 | 36% | $58,811.72 | 1,652,385 | |||
64 | 52% | $39,142.60 | 1,619,810 | 98.03% | $19,669.12 | 33% |
Table 2. Effect of disabling hyper-threading on performance and cost for large instances
In Table 2, SQL Server performance levels, provided in transactions-per-minute (TPM) achieved, is listed for each instance type. In some cases, disabling hyper-threading resulted in slightly lower performance, while in some other cases slightly larger. All these differences are within the expected HammerDB performance test deviations, so we may assume that disabling hyper-threading or reducing the number of cores for the EC2 r7a instance did not affect observed performance.
As expected, disabling hyper-threading or some cores resulted in increase in CPU utilization, but in none of the cases reaching levels exceeding 75%. Disabling hyper-threading or some cores resulted in significant cost savings between 33% and 43%! For most of the instance types we tested, cost savings exceeded 40% except for the EC2 x2iedn instance type, where savings are 33% due to the relatively high cost of the hardware as compared to the license cost of Windows Server and Microsoft SQL Server Enterprise edition. Results for mid-tier instances of each of the instance type that we analyzed, presented in Table 3, closely matched results for the largest instances in terms of the relative cost savings. Again, we see relative cost savings between 40% and 43% for all instance types we tested except for the EC2 x2iedn instance type, which lagged with the relative cost savings of 33%. Disabling hyper-threading and reducing the number of cores did not affect performance but resulted in an increase in CPU utilization.
Instance Type | RAM | MAX IOPS | vCPU | CPU (%) |
Monthly cost |
Perf. (TPM) |
Perf. (relative) |
Cost Savings | Cost Savings (relative) |
r6idn.16x | 512 | 200,000 | 64 | 39% | $24,233.43 | 924,974 | |||
32 | 59% | $14,398.87 | 919,199 | 99.38% | $9,834.56 | 41% | |||
r7i.24x | 768 | 120,000 | 96 | 26% | $34,139.47 | 574,159 | |||
48 | 51% | $19,387.63 | 574,656 | 100.09% | $14,751.84 | 43% | |||
r7a.24x | 768 | 120,000 | 96 | 24% | $34,835.02 | 587,171 | |||
48 | 48% | $20,083.18 | 586,643 | 99.91% | $14,751.84 | 42% | |||
x2idn.16x | 1,024 | 173,333 | 64 | 36% | $24,537.49 | 955,941 | |||
32 | 58% | $14,702.93 | 960,984 | 100.53% | $9,834.56 | 40% | |||
x2iedn.16x | 2,048 | 130,000 | 64 | 32% | $29,405.86 | 956,963 | |||
32 | 46% | $19,571.30 | 928,396 | 97.01% | $9,834.56 | 33% |
Table 3. Effect of disabling hyper-threading on performance and cost for mid-tier instances
Encouraged by these results, we decided to test how far we can go with disabling cores on EC2 r7a instances until the workload became CPU-bound. The results of this extended series of tests are presented in Table 4.
Instance Type | RAM | MAX IOPS | vCPU | CPU (%) |
Monthly cost | Perf. (TPM) |
Perf. (relative) |
Cost Savings | Cost Savings (relative) |
r7a.24x | 768 | 120,000 | 96 | 24% | $34,835.02 | 587,171 | |||
48 | 48% | $20,083.18 | 586,643 | 99.91% | $14,751.84 | 42% | |||
36 | 62% | $16,395.22 | 589,366 | 100.37% | $18,439.80 | 53% | |||
24 | 85% | $12,707.26 | 592,581 | 100.92% | $22,127.76 | 64% | |||
r7a.48x | 1,536 | 240,000 | 192 | 28% | $69,670.03 | 1,119,488 | |||
96 | 52% | $40,166.35 | 1,119,560 | 100.01% | $29,503.68 | 42% | |||
72 | 69% | $32,790.43 | 1,119,164 | 99.97% | $36,879.60 | 53% | |||
48 | 91% | $25,414.51 | 1,118,900 | 99.95% | $44,255.52 | 64% |
Table 4. Effect of disabling cores on performance and cost for EC2 r7a instance type
Instances of this type have very fast AMD EPYC 4th generation (AMD EPYC 9R14) processors offering up to 3.7 GHz clock speed. This allowed us to reduce the number of cores by 75% before the workload became CPU-bound. This reduction in the number of cores resulted in cost savings of up to 64%! However, this also resulted in increasing CPU utilization to 85% and 91% respectively so we cannot recommend such a deep level of CPU optimization for every workload without specific load testing using real workloads.
Conclusion
Extending Optimize CPU feature to license-included deployments, offers cost savings for both Windows and SQL Server licenses. This results in a higher level of cost savings than were showed in our original blog post, which calculated savings only for SQL Server licenses assuming BYOL deployment.
SQL Server performance tests under various configurations, as discussed in this blog post, support the following conclusions and allow us to provide the following recommendations for effective use of the Optimize CPU feature.
- Disabling hyper-threading results in ~45% cost reduction (SQL Server EE) or 50% reduction in the number of required SQL Server and Windows licenses without a negative effect on performance when testing with HammerDB.
- For EC2 r7a instances, which do not support hyper-threading, disabling half of the cores appears to be a safe approach to reduce the cost without affecting SQL Server performance.
- Disabling more than 50% in number of active CPU cores may provide some additional cost savings but would require performance testing on real workloads.
- Applying Optimize CPUs feature results in a significant reduction in cost per 1,000 TPM.
These results were obtained using synthetic OLTP-type workloads. Your results may differ depending on the type of your database workload. Also, the cost and cost reduction estimates are calculated using baseline Microsoft license costs. Due to the different licensing options or negotiated discounts, your actual cost savings may deviate from the estimates provided in this blog post.