AWS HPC Blog
Support for Instance Allocation Flexibility in AWS ParallelCluster 3.3
With AWS ParallelCluster, you can build an autoscaling HPC cluster that adapts its size and cost to the amount of work you have to do.
ParallelCluster accomplishes this by monitoring how many Amazon EC2 virtual CPUs (vCPUs) are needed to run the pending jobs in a Slurm queue. If that number is higher than a configurable threshold, the Slurm scheduler adds more EC2 instances to the cluster and makes them available to run jobs. In previous versions of ParallelCluster, these new instances all had to launch from the same compute resource, and each compute resource could use only one instance type. If not enough instances of the requested type were available in the Availability Zone mapped to the cluster, ParallelCluster could switch over to another compute resource and try the instance type configured there. It was not, however, possible to launch instances from multiple compute resources at the same time.
Today we’re announcing “multiple instance type allocation” in ParallelCluster 3.3.0. This feature enables you to specify multiple instance types to use when scaling up the compute resources for a Slurm job queue. Your HPC workloads will have more paths forward to get access to the EC2 capacity they need, helping you to get more computing done.
This post explains in detail how the new feature works and how to configure your cluster to use it.
What’s New
In previous versions of ParallelCluster, you defined one or more Slurm queues. You configured each queue with one or more compute resources, and each compute resource could be configured with one EC2 instance type that would be used to launch a collection of instances (Figure 1A). However, only one compute resource, and as a consequence only one instance type, could be available for instance launches at any given time.
 
 
        Figure 1: ParallelCluster configurations now support multiple instance types in a compute resource as well as a queue-level allocation strategy.
However, many EC2 instance types are close enough in architecture that they can be used interchangeably. They have the same vCPU count, processor architecture, accelerator, network capability, and so on. Customers asked to be able to combine those instance types in a single Slurm compute resource, which is what ParallelCluster 3.3.0 now enables. Specifically, you can provide a list of instance types (Figure 1B) and set an allocation strategy (Figure 1C) to optimize the cost and total time to solution of your HPC jobs.
Using Multiple Instance Type Allocation with Slurm
To take advantage of this new capability, you’ll need to upgrade ParallelCluster to 3.3.0. You can follow this online guide to help you upgrade. Next, edit your cluster configuration as described below. Finally, create a cluster using the new configuration.
Configuring Your Cluster
The schema for the ParallelCluster configuration file has changed to support flexible instance types. In this example, we define a Slurm queue called flex_od backed by Amazon EC2 On-Demand Instances. It has a single compute resource named cra. Note how this differs from earlier versions of the ParallelCluster configuration file. Rather than defining a single InstanceType, the compute resource now has a parameter named Instances. It contains three InstanceType entries, one for each instance type we want to use.
...
Scheduling:
  Scheduler: slurm
  SlurmQueues:
  - Name: flex_od
    CapacityType: ONDEMAND
    ComputeResources:
    - Name: cra
      Instances:
        - InstanceType: c6a.24xlarge
        - InstanceType: m6a.24xlarge
        - InstanceType: r6a.24xlarge
      MinCount: 0
      MaxCount: 100
    Networking:
      SubnetIds:
      - subnet-0123456789
...
Our new configuration tells ParallelCluster to launch up to 100 On-Demand Instances in total, drawn from the c6a.24xlarge, m6a.24xlarge, and r6a.24xlarge instance types, to process jobs in the flex_od queue. You can switch to using Spot Instances by changing the CapacityType to SPOT. You can also still define multiple compute resources for each queue if you have a use case that requires that configuration.
Selecting Instance Types
There are some rules that define which instance types can be combined in a compute resource. First, the CPUs must all be the same broad architecture (e.g. x86), but they can be from different manufacturers (such as Intel and AMD). They must have the same number of vCPUs, or, if CPU hyperthreading is disabled, they must have the same number of physical cores. Next, if the instances have accelerators, they must have the same number and be from the same manufacturer. Finally, if EFA is enabled for the queue, all instance types must support EFA.
You can find instance types that meet these criteria by searching in the Amazon EC2 console under Instance Types (Figure 2).
 
 
        Figure 2: The instance type search window in the Amazon EC2 console allows you to search for instance types by filtering for common properties.
You can also use the AWS CLI with a search filter. Here’s an example that finds all instance types with 64 vCPUs and the x86_64 architecture, along with example output (Figure 3).
aws ec2 describe-instance-types --region REGION-ID \
  --filters "Name=vcpu-info.default-vcpus,Values=64" "Name=processor-info.supported-architecture,Values=x86_64" \
  --query "sort_by(InstanceTypes[*].{InstanceType:InstanceType,MemoryMiB:MemoryInfo.SizeInMiB,CurrentGeneration:CurrentGeneration,VCpus:VCpuInfo.DefaultVCpus,Cores:VCpuInfo.DefaultCores,Architecture:ProcessorInfo.SupportedArchitectures[0],MaxNetworkCards:NetworkInfo.MaximumNetworkCards,EfaSupported:NetworkInfo.EfaSupported,GpuCount:GpuInfo.Gpus[0].Count,GpuManufacturer:GpuInfo.Gpus[0].Manufacturer}, &InstanceType)" \
  --output table 
 
        Figure 3: Table of Amazon EC2 instance types with 64 vCPUs based on the x86_64 architecture, generated by the AWS CLI.
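The combination rules from the start of this section can also be checked mechanically. The following is a minimal sketch, not part of ParallelCluster: it assumes the CLI query above was run with JSON output, so each row is a dictionary with the keys named in the --query expression (InstanceType, VCpus, Cores, Architecture, EfaSupported, GpuCount, GpuManufacturer). The function name incompatibilities and its two keyword arguments are hypothetical, and the sample rows are hand-entered for illustration; in practice they should come from the CLI output.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]  # one row of the CLI query output, keyed as in --query


def incompatibilities(
    rows: List[Row],
    *,
    multithreading_disabled: bool = False,  # hypothetical flag: compare cores, not vCPUs
    efa_enabled: bool = False,              # hypothetical flag: queue has EFA enabled
) -> List[str]:
    """Return human-readable rule violations; an empty list means none were found."""
    problems: List[str] = []

    # Rule 1: all CPUs must share the same broad architecture.
    if len({r["Architecture"] for r in rows}) > 1:
        problems.append("mixed CPU architectures")

    # Rule 2: same vCPU count, or same core count when hyperthreading is disabled.
    size_key = "Cores" if multithreading_disabled else "VCpus"
    if len({r[size_key] for r in rows}) > 1:
        problems.append(f"mismatched {size_key}")

    # Rule 3: same accelerator count and manufacturer.
    # Assumption: rows without GPUs carry None for both fields, and "no GPU"
    # is treated as its own value, so GPU and non-GPU types do not mix.
    if len({(r.get("GpuCount"), r.get("GpuManufacturer")) for r in rows}) > 1:
        problems.append("mismatched accelerators")

    # Rule 4: with EFA enabled, every instance type must support EFA.
    if efa_enabled:
        missing = [r["InstanceType"] for r in rows if not r["EfaSupported"]]
        if missing:
            problems.append("no EFA support: " + ", ".join(missing))

    return problems


def _row(name: str, vcpus: int, cores: int) -> Row:
    # Illustrative helper for hand-entered x86_64 rows without GPUs or EFA.
    return {
        "InstanceType": name, "VCpus": vcpus, "Cores": cores,
        "Architecture": "x86_64", "EfaSupported": False,
        "GpuCount": None, "GpuManufacturer": None,
    }


candidates = [_row("c6i.xlarge", 4, 2), _row("m6i.xlarge", 4, 2), _row("r6i.xlarge", 4, 2)]
bigger = _row("c6i.2xlarge", 8, 4)

print(incompatibilities(candidates))
print(incompatibilities(candidates + [bigger]))
```

For the three xlarge rows the function finds nothing to report, while adding the c6i.2xlarge row produces a vCPU mismatch. ParallelCluster performs its own validation when you create or update a cluster, so a check like this is only a convenience for shortlisting candidates beforehand.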
Choosing an Allocation Strategy
By default, ParallelCluster will optimize for cost by launching the least expensive instances first.
However, when you are using Spot Instances, you will probably want to optimize the chances that your jobs will run to completion instead of being interrupted. This is especially the case for workloads where it may be quite expensive to checkpoint and restart work in progress. You can configure a ParallelCluster queue with this optimization by adding an AllocationStrategy key to the queue and setting it to capacity-optimized, rather than its default value of lowest-price. Here’s an example:
...
Scheduling:
  Scheduler: slurm
  SlurmQueues:
  - Name: flex_spot
    CapacityType: SPOT
    AllocationStrategy: capacity-optimized
    ComputeResources:
    - Name: cra
      Instances:
        - InstanceType: c6i.xlarge
        - InstanceType: m6i.xlarge
        - InstanceType: r6i.xlarge
      MinCount: 0
      MaxCount: 100
    Networking:
      SubnetIds:
      - subnet-0123456789
...
Under this configuration, EC2 Fleet will look at real-time Spot capacity and launch instances into the pools with the most available capacity. This offers the possibility of fewer interruptions to running HPC jobs. This strategy may not always launch the least expensive instances, but depending on the cost of interruptions, it may still reduce the total cost or runtime of your jobs.
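To make the difference between the two strategies concrete, here is a toy model in Python. It is only a sketch of the idea described above and does not reproduce how EC2 Fleet actually makes decisions. The SpotPool class, the rank_pools function, and every price and capacity number are hypothetical values invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpotPool:
    instance_type: str
    price_per_hour: float  # hypothetical price in USD
    spare_capacity: int    # hypothetical availability score; higher means more available


def rank_pools(pools: List[SpotPool], strategy: str = "lowest-price") -> List[SpotPool]:
    """Order pools the way each strategy would prefer them in this toy model."""
    if strategy == "lowest-price":
        # Cheapest pool first.
        return sorted(pools, key=lambda p: p.price_per_hour)
    if strategy == "capacity-optimized":
        # Most available pool first, regardless of price.
        return sorted(pools, key=lambda p: p.spare_capacity, reverse=True)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Invented numbers: the cheapest pool is also the scarcest one.
pools = [
    SpotPool("c6i.xlarge", price_per_hour=0.06, spare_capacity=10),
    SpotPool("m6i.xlarge", price_per_hour=0.07, spare_capacity=80),
    SpotPool("r6i.xlarge", price_per_hour=0.09, spare_capacity=40),
]

for strategy in ("lowest-price", "capacity-optimized"):
    order = [p.instance_type for p in rank_pools(pools, strategy)]
    print(strategy, order)
```

With these invented numbers, lowest-price prefers c6i.xlarge because it is cheapest, while capacity-optimized prefers m6i.xlarge because its pool has the most headroom, which is the trade-off between price and interruption risk that the allocation strategy lets you choose.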
It’s also possible to change the AllocationStrategy dynamically without having to stop and restart the compute fleet. This can help you respond to shifting Spot availability conditions. To accomplish this, change the Slurm queue update strategy to either DRAIN or TERMINATE. DRAIN sets the nodes affected by the changed configuration to the DRAINING state: they won’t accept new jobs but will continue running any jobs already in progress. TERMINATE, on the other hand, immediately stops any running jobs on the affected nodes. You can consult the pcluster update documentation for more detail on cluster updates using this setting. Here’s an example of setting it to DRAIN:
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    QueueUpdateStrategy: DRAIN
Conclusion
With ParallelCluster 3.3, you can specify a list of EC2 instance types and, optionally, an allocation strategy to define aggregate capacity for your Slurm job scheduler queues. This gives you new flexibility in how you assemble compute capacity for your HPC workloads. You’ll need to update your ParallelCluster installation and configuration to use this new capability.
We’d love to know what you think after trying out ParallelCluster multiple instance type allocation, and how we can improve it.