Why Amazon EC2 DL1 Instances?
Amazon EC2 DL1 instances, powered by Gaudi accelerators from Habana Labs (an Intel company), deliver low cost-to-train for deep learning models used in natural language processing, object detection, and image recognition. DL1 instances provide up to 40% better price performance for training deep learning models compared to current-generation GPU-based EC2 instances.
Amazon EC2 DL1 instances feature 8 Gaudi accelerators with 32 GiB of high bandwidth memory (HBM) per accelerator, 768 GiB of system memory, custom 2nd generation Intel Xeon Scalable processors, 400 Gbps of networking throughput, and 4 TB of local NVMe storage.
 
DL1 instances include the Habana SynapseAI® SDK, which is integrated with leading machine learning frameworks such as TensorFlow and PyTorch.
You can get started on DL1 instances using the AWS Deep Learning AMIs or AWS Deep Learning Containers, or with Amazon EKS or Amazon ECS for containerized applications. Support for DL1 instances in Amazon SageMaker is coming soon.
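As a rough illustration of what the PyTorch integration can look like in practice, the sketch below shows a single training step that targets a Gaudi accelerator when the Habana software is present and falls back to the CPU otherwise. It is a sketch under stated assumptions rather than an official sample: the `habana_frameworks` package, the `"hpu"` device name, and `mark_step()` come from Habana's PyTorch bridge rather than from this page, and the helper names `pick_device` and `train_step` are hypothetical.

```python
import importlib.util


def pick_device() -> str:
    """Return "hpu" if the Habana PyTorch bridge package is installed, else "cpu"."""
    # Assumption: the bridge is distributed as the `habana_frameworks` package.
    # Checking for it avoids a hard import failure on machines without Gaudi software.
    return "hpu" if importlib.util.find_spec("habana_frameworks") else "cpu"


def train_step(model, inputs, targets, loss_fn, optimizer, device: str) -> float:
    """Run one training step; model, loss_fn and optimizer are ordinary PyTorch objects."""
    mark_step = None
    if device == "hpu":
        # Importing the bridge makes the "hpu" device available to PyTorch.
        import habana_frameworks.torch.core as htcore
        mark_step = htcore.mark_step

    inputs, targets = inputs.to(device), targets.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    if mark_step:
        mark_step()  # lazy mode: trigger execution of the accumulated graph
    optimizer.step()
    if mark_step:
        mark_step()
    return loss.item()
```

In a typical script you would call `device = pick_device()`, move the model with `model.to(device)`, and then call `train_step(...)` inside the usual epoch loop. The rest of the training code stays framework-standard.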
Benefits
Low cost-to-train for deep learning models
Ease of use and code portability
Support for leading ML frameworks and models
Features
Powered by Gaudi Accelerators from Habana Labs
Habana SynapseAI® SDK
High Performance Networking and Storage
Built on AWS Nitro System
DL1 instances are built on the AWS Nitro System, which is a rich collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware and software to deliver high performance, high availability, and high security while also reducing virtualization overhead.
Product details
| Instance Size | vCPU | Instance Memory (GiB) | Gaudi Accelerators | Network Bandwidth (Gbps) | Accelerator Peer-to-Peer Bidirectional (Gbps) | Instance Storage (GB) | EBS Bandwidth (Gbps) | On-demand (Price/Hr) | 1-yr Reserved Instance Effective Hourly | 3-yr Reserved Instance Effective Hourly* |
|---|---|---|---|---|---|---|---|---|---|---|
| dl1.24xlarge | 96 | 768 | 8 | 400 | 100 | 4 x 1000 | 19 | $13.11 | $7.87 | $5.24 |
*Prices shown are for US East (N. Virginia) and US West (Oregon) regions.
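To put the pricing columns in perspective, the short calculation below derives the reserved-instance discount relative to on-demand and the on-demand cost per Gaudi accelerator. It uses only the dl1.24xlarge figures from the table above. The function and constant names are illustrative, and the results are simple arithmetic on list prices, not a quote for any particular workload.

```python
# dl1.24xlarge figures taken from the product details table (USD per hour).
ON_DEMAND = 13.11
RESERVED_1YR = 7.87
RESERVED_3YR = 5.24
GAUDI_ACCELERATORS = 8


def savings_pct(reserved_hourly: float, on_demand_hourly: float) -> float:
    """Percentage saved by the reserved effective rate versus on-demand."""
    return round((1 - reserved_hourly / on_demand_hourly) * 100, 1)


def cost_per_accelerator(instance_hourly: float, accelerators: int) -> float:
    """Hourly instance price divided evenly across its accelerators."""
    return round(instance_hourly / accelerators, 4)


if __name__ == "__main__":
    print("1-yr reserved savings (%):", savings_pct(RESERVED_1YR, ON_DEMAND))
    print("3-yr reserved savings (%):", savings_pct(RESERVED_3YR, ON_DEMAND))
    print("On-demand $/accelerator-hour:",
          cost_per_accelerator(ON_DEMAND, GAUDI_ACCELERATORS))
```

With the listed prices, the reserved rates work out to roughly 40% (1-year) and 60% (3-year) below on-demand, and the on-demand price is about $1.64 per accelerator-hour.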
Seagate
Seagate Technology has been a global leader offering data storage and management solutions for over 40 years. Seagate’s data science and machine learning engineers have built an advanced deep learning (DL) defect detection system and deployed it globally across the company’s manufacturing facilities. In a recent proof of concept project, Habana Gaudi exceeded the performance targets for training one of the DL semantic segmentation models currently used in Seagate’s production.
"We expect the significant price performance advantage of Amazon EC2 DL1 instances, powered by Habana Gaudi accelerators, could make a compelling future addition to AWS compute clusters. As Habana Labs continues to evolve and enables broader coverage of operators, there is potential for expanding to additional enterprise use cases, and thereby harnessing additional cost savings."
 
 
Leidos
Leidos is recognized as a Top 10 Health IT provider delivering a broad range of customizable, scalable solutions to hospitals and health systems, biomedical organizations, and every U.S. federal agency focused on health.
"One of the numerous technologies we are enabling to advance healthcare today is the use of machine learning and deep learning for disease diagnosis based on medical imaging data. Our massive data sets require timely and efficient training to aid researchers seeking to solve some of the most urgent medical mysteries. Given Leidos's and its customers' need for quick, easy, and cost-effective training for deep learning models, we are excited to have begun this journey with Intel and AWS to use Amazon EC2 DL1 instances based on Habana Gaudi AI processors. Using DL1 instances, we expect an increase in model training speed and efficiency, with a subsequent reduction in risk and cost of research and development."
 
 
Intel
Intel has created 3D Athlete Tracking technology that analyzes athlete-in-action video in real time to inform performance training processes and enhance audience experiences during competitions.
"Training our models on Amazon EC2 DL1 instances, powered by Gaudi accelerators from Habana Labs, will enable us to accurately and reliably process thousands of videos and generate associated performance data, while lowering training cost. With DL1 instances, we can now train at the speed and cost required to productively serve athletes, teams, and broadcasters of all levels across a variety of sports."
 
 
RiskFuel
RiskFuel provides real-time valuations and risk sensitivities to companies managing financial portfolios, helping them increase trading accuracy and performance.
"Two factors drew us to Amazon EC2 DL1 instances based on Habana Gaudi AI accelerators. First, we want to make sure our banking and insurance clients can run Riskfuel models that take advantage of the newest hardware. Fortunately, we found migrating our models to DL1 instances to be simple and straightforward – really, it was just a matter of changing a few lines of code. Second, training costs are a big component of our spending, and the promise of up to 40% improvement in price performance offers potentially substantial benefit to our bottom line."
 
 
Fractal
Fractal is a global leader in artificial intelligence and analytics, powering decisions in Fortune 500 companies.
"AI and deep learning are at the core of our Machine Vision capability, enabling customers to make better decisions across industries we serve. In order to improve accuracy, data sets are becoming larger and more complex, requiring larger and more complex models. This is driving the need for improved compute price performance. The new Amazon EC2 DL1 instances promise significantly lower cost training than GPU-based EC2 instances. We expect this to make training of AI models on cloud much more cost competitive and accessible than before for a broad array of clients."
 
 
Getting started
The AWS Deep Learning AMIs (DLAMI) and AWS Deep Learning Containers (DLC)
Amazon Elastic Kubernetes Service (EKS) or Elastic Container Service (ECS)