Amazon EMR on EC2 Spot Instances
Performance, scale, and deep cost savings on big data workloads
Optimize Cost and Performance for Big Data Workloads with EC2 Spot Instances
Amazon EMR reduces the complexity of managing big data frameworks such as Apache Spark and Hive, while applying cloud best practices such as separating compute and storage.
Because of the scale of AWS, unused EC2 capacity is offered through Amazon EC2 Spot Instances at a discount of up to 90% compared with On-Demand pricing. EC2 can reclaim Spot capacity with a two-minute warning, but less than 5% of workloads are interrupted. Because big data workloads on EMR are fault tolerant, they can continue processing even when an instance is interrupted. Running EMR on Spot Instances reduces the cost of big data processing, makes significantly more compute capacity affordable, and shortens the time needed to process large data sets.
See Spot Instance price savings vs On-Demand by filtering for “Instance types supported by EMR” on the Spot Instance Advisor page.
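To make the link between the discount and the available compute capacity concrete, here is a minimal arithmetic sketch. The prices, the budget, and the function names are hypothetical and chosen only for illustration; real Spot prices vary by instance type, Region, and time, and 90% is the upper bound quoted above rather than a typical value.

```python
def spot_price_cents(on_demand_cents: int, discount_percent: int) -> float:
    """Hourly Spot price in cents, given an On-Demand price and a percentage discount.

    Both inputs are illustrative assumptions, not real AWS prices.
    """
    if not 0 <= discount_percent < 100:
        raise ValueError("discount_percent must be in the range [0, 100)")
    return on_demand_cents * (100 - discount_percent) / 100


def affordable_instances(hourly_budget_cents: int, price_cents: float) -> int:
    """Number of whole instances that fit within an hourly budget."""
    return int(hourly_budget_cents // price_cents)


# Hypothetical figures: an instance that costs 100 cents/hour On-Demand,
# a 90% Spot discount, and a budget of 1000 cents/hour.
on_demand = spot_price_cents(100, 0)   # no discount: 100.0 cents/hour
spot = spot_price_cents(100, 90)       # 90% discount: 10.0 cents/hour

print(affordable_instances(1000, on_demand))  # 10 instances On-Demand
print(affordable_instances(1000, spot))       # 100 instances on Spot
```

Under these assumed numbers, the same hourly budget covers ten times as many instances on Spot, which is where the shorter processing time for a parallel workload comes from.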
Additional Resources
Setting up EMR Clusters on Spot Instances
Short vs Long-Running EMR Clusters on Spot Instances