Optimize CPUs best practices for SQL Server workloads – continued

In this blog post, we will extend the analysis of the Optimize CPUs feature provided in a previous publication to other Amazon Elastic Compute Cloud (Amazon EC2) instances suitable for deployment of Microsoft SQL Server workloads on the Amazon Web Services (AWS) Cloud. Amazon EC2 now allows customers to modify an instance’s CPU options to optimize the licensing costs of Microsoft Windows license-included workloads. You can now customize the number of vCPUs and/or disable hyperthreading on Windows Server and SQL Server license-included instances to save on vCPU-based licensing costs.

Introduction

In our previous blog post, we detailed the Optimize CPUs feature for the EC2 r6idn instance type. There are many other instance types available, each with different combinations of CPU, memory, storage, and networking capacity.

Due to the multitude of AWS instance families and multiple variations within each family, we are unable to test SQL Server performance on all available Amazon EC2 instance types. We limited our testing to instances matching criteria established for SQL Server workloads.

We focused our SQL Server performance testing on Memory Optimized instances, which offer 8, 16, and 32 GB of RAM per vCPU. Access to a large amount of RAM is beneficial for SQL Server. Within the Memory Optimized instance type, we selected instances with the highest IOPS as SQL Server performance strongly correlates with the performance of the storage subsystem. Our EC2 instance selection was limited to those that use Intel or AMD processors as SQL Server at present does not support ARM-based AWS Graviton Processors. Finally, we focused only on the latest generations of the instance types that matched our criteria.

In addition to the EC2 r6idn instance, which we analyzed in our previous blog post, our selection criteria left us with four instance types for additional analysis. There are multiple instance sizes within each instance type, but we limited our testing to the largest and mid-tier instances within each type. The list of instances included in testing is presented in Table 1.

Type	Size	#CPU	CPU Clock	RAM	IOPS
r7i	48xlarge	192	3.2 GHz	1.5 TiB	240,000
r7i	24xlarge	96	3.2 GHz	768 GiB	120,000
r7a	48xlarge	192	3.7 GHz	1.5 TiB	240,000
r7a	24xlarge	96	3.7 GHz	768 GiB	120,000
x2idn	32xlarge	128	3.5 GHz	1.0 TiB	260,000
x2idn	16xlarge	64	3.5 GHz	2.0 TiB	173,333
x2iedn	32xlarge	128	3.5 GHz	4.0TiB	260,000
x2iedn	16xlarge	64	3.5 GHz	2.0 TiB	130,000

Table 1. Instances selected for performance testing.

Performance test configuration

From the results published in our previous blog post, disabling hyper-threading provides the most significant reduction in license cost as both Windows and SQL Server licensing depends on the number of active processors. Disabling hyper-threading cuts the number of processors by half, significantly reducing the license cost. Further cost optimization is possible by disabling some of the cores, but it offers diminishing returns and generally requires additional performance testing to avoid shifting a workload to the CPU-bound category.

Our focus during performance testing was to compare the performance of SQL Server on the native instance with the performance achieved on the same instance after disabling hyper-threading. The EC2 r7a instance type presented a challenge with this approach as it does not support hyper-threading and all virtual CPUs are actual physical cores. Following the same pattern, we tested the EC2 r7a instances with default number of active CPUs and then we disabled half of them using the Optimize CPUs feature to simulate disabling the hyper-threading.

To run the performance tests, we used an industry-standard HammerDB benchmarking tool using TPCC-like workload emulating OLTP databases. For the larger instances (32xlarge and 48xlarge) in each type we used the 8 TiB HammerDB database (75,000 warehouses). For the smaller instances (16xlarge and 24xlarge respectively) of each type, which have less RAM, we used the 3.5 TiB database (30,000 warehouses). We configured HammerDB performance tests with the “Use All Warehouses” option to increase the I/O load on the system.

Performance test results

Table 2 summarizes performance test results for native instances and the same instances with hyper-threading disabled. Please note, as mentioned earlier, the EC2 r7a instance type does not use hyper-threading, so for our tests, we disabled half of the cores. We also calculated the on-demand monthly cost for respective license included instance configurations with Windows Server and Microsoft SQL Server Enterprise Edition using pricing for the US East (N. Virginia) region.

Instance Type	RAM (GB)	MAX IOPS	vCPU	CPU (%)	Monthly cost	Perf. (TPM)	Perf. (relative)	Cost Savings	Cost Savings (relative)
r6idn.32x	1,024	400,000	128	49%	$48,466.86	1,787,732
r6idn.32x	1,024	400,000	64	75%	$28,797.74	1,778,883	99.50%	$19,669.12	41%
r7i.48x	1,536	240,000	192	30%	$68,278.94	1,122,192
r7i.48x	1,536	240,000	96	56%	$38,775.26	1,120,590	99.86%	$29,503.68	43%
r7a.48x	1,536	240,000	192	28%	$69,670.03	1,119,488
r7a.48x	1,536	240,000	96	52%	$40,166.35	1,119,560	100.01%	$29,503.68	42%
x2idn.32x	2,048	260,000	128	27%	$49,074.98	1,311,519
x2idn.32x	2,048	260,000	64	48%	$29,405.86	1,307,481	99.69%	$19,669.12	40%
x2iedn.32x	4,096	260,000	128	36%	$58,811.72	1,652,385
x2iedn.32x	4,096	260,000	64	52%	$39,142.60	1,619,810	98.03%	$19,669.12	33%

Table 2. Effect of disabling hyper-threading on performance and cost for large instances

In Table 2, SQL Server performance levels, provided in transactions-per-minute (TPM) achieved, is listed for each instance type. In some cases, disabling hyper-threading resulted in slightly lower performance, while in some other cases slightly larger. All these differences are within the expected HammerDB performance test deviations, so we may assume that disabling hyper-threading or reducing the number of cores for the EC2 r7a instance did not affect observed performance.

As expected, disabling hyper-threading or some cores resulted in increase in CPU utilization, but in none of the cases reaching levels exceeding 75%. Disabling hyper-threading or some cores resulted in significant cost savings between 33% and 43%! For most of the instance types we tested, cost savings exceeded 40% except for the EC2 x2iedn instance type, where savings are 33% due to the relatively high cost of the hardware as compared to the license cost of Windows Server and Microsoft SQL Server Enterprise edition. Results for mid-tier instances of each of the instance type that we analyzed, presented in Table 3, closely matched results for the largest instances in terms of the relative cost savings. Again, we see relative cost savings between 40% and 43% for all instance types we tested except for the EC2 x2iedn instance type, which lagged with the relative cost savings of 33%. Disabling hyper-threading and reducing the number of cores did not affect performance but resulted in an increase in CPU utilization.

Instance Type	RAM	MAX IOPS	vCPU	CPU (%)	Monthly cost	Perf. (TPM)	Perf. (relative)	Cost Savings	Cost Savings (relative)
r6idn.16x	512	200,000	64	39%	$24,233.43	924,974
r6idn.16x	512	200,000	32	59%	$14,398.87	919,199	99.38%	$9,834.56	41%
r7i.24x	768	120,000	96	26%	$34,139.47	574,159
r7i.24x	768	120,000	48	51%	$19,387.63	574,656	100.09%	$14,751.84	43%
r7a.24x	768	120,000	96	24%	$34,835.02	587,171
r7a.24x	768	120,000	48	48%	$20,083.18	586,643	99.91%	$14,751.84	42%
x2idn.16x	1,024	173,333	64	36%	$24,537.49	955,941
x2idn.16x	1,024	173,333	32	58%	$14,702.93	960,984	100.53%	$9,834.56	40%
x2iedn.16x	2,048	130,000	64	32%	$29,405.86	956,963
x2iedn.16x	2,048	130,000	32	46%	$19,571.30	928,396	97.01%	$9,834.56	33%

Table 3. Effect of disabling hyper-threading on performance and cost for mid-tier instances

Encouraged by these results, we decided to test how far we can go with disabling cores on EC2 r7a instances until the workload became CPU-bound. The results of this extended series of tests are presented in Table 4.

Instance Type	RAM	MAX IOPS	vCPU	CPU (%)	Monthly cost	Perf. (TPM)	Perf. (relative)	Cost Savings	Cost Savings (relative)
r7a.24x	768	120,000	96	24%	$34,835.02	587,171
			48	48%	$20,083.18	586,643	99.91%	$14,751.84	42%
			36	62%	$16,395.22	589,366	100.37%	$18,439.80	53%
			24	85%	$12,707.26	592,581	100.92%	$22,127.76	64%
r7a.48x	1,536	240,000	192	28%	$69,670.03	1,119,488
			96	52%	$40,166.35	1,119,560	100.01%	$29,503.68	42%
			72	69%	$32,790.43	1,119,164	99.97%	$36,879.60	53%
			48	91%	$25,414.51	1,118,900	99.95%	$44,255.52	64%

Table 4. Effect of disabling cores on performance and cost for EC2 r7a instance type

Instances of this type have very fast AMD EPYC 4th generation (AMD EPYC 9R14) processors offering up to 3.7 GHz clock speed. This allowed us to reduce the number of cores by 75% before the workload became CPU-bound. This reduction in the number of cores resulted in cost savings of up to 64%! However, this also resulted in increasing CPU utilization to 85% and 91% respectively so we cannot recommend such a deep level of CPU optimization for every workload without specific load testing using real workloads.

Conclusion

Extending Optimize CPU feature to license-included deployments, offers cost savings for both Windows and SQL Server licenses. This results in a higher level of cost savings than were showed in our original blog post, which calculated savings only for SQL Server licenses assuming BYOL deployment.

SQL Server performance tests under various configurations, as discussed in this blog post, support the following conclusions and allow us to provide the following recommendations for effective use of the Optimize CPU feature.

Disabling hyper-threading results in ~45% cost reduction (SQL Server EE) or 50% reduction in the number of required SQL Server and Windows licenses without a negative effect on performance when testing with HammerDB.
For EC2 r7a instances, which do not support hyper-threading, disabling half of the cores appears to be a safe approach to reduce the cost without affecting SQL Server performance.
Disabling more than 50% in number of active CPU cores may provide some additional cost savings but would require performance testing on real workloads.
Applying Optimize CPUs feature results in a significant reduction in cost per 1,000 TPM.

These results were obtained using synthetic OLTP-type workloads. Your results may differ depending on the type of your database workload. Also, the cost and cost reduction estimates are calculated using baseline Microsoft license costs. Due to the different licensing options or negotiated discounts, your actual cost savings may deviate from the estimates provided in this blog post.

Microsoft Workloads on AWS

Optimize CPUs best practices for SQL Server workloads – continued

Introduction

Performance test configuration

Performance test results

Conclusion

Resources

Follow

Learn

Resources

Developers

Help