The Internet of Things on AWS – Official Blog
Transfer data from Amazon S3 to IoT Edge device
Seamlessly transferring data between cloud and edge devices is crucial for IoT applications across various industries, such as healthcare, manufacturing, autonomous vehicles, and aerospace. For example, it enables aircraft operators to seamlessly transfer software updates to aircraft fleets, eliminating the operational burden of manual updates with physical storage devices. By leveraging AWS IoT and Amazon Simple Storage Service (Amazon S3), you can establish a data transfer mechanism that enables real-time and historical data exchange between the cloud and edge devices.
Introduction
This blog post guides you through the step-by-step process of transferring data in the form of files from Amazon S3 to your IoT Edge devices.
We will be using AWS IoT Greengrass, which is an open-source edge runtime and cloud service for building, remotely deploying, and managing device software on millions of devices. IoT Greengrass provides prebuilt components for common use cases allowing you to discover, import, configure, and deploy applications and services at the edge without the need to understand different device protocols, manage credentials, or interact with external APIs. You can also create your own custom components based on your IoT use case.
In this blog, we will build and deploy a custom IoT Greengrass component that harnesses the capabilities of Amazon S3 Transfer Manager. The component receives its instructions, such as a request to download a file, through AWS IoT Jobs topics, and the parameters set on each job define the action to perform.
The S3 Transfer Manager uses the multipart upload API for uploads and byte-range fetches for downloads, so files moving from Amazon S3 to the edge device are retrieved in parts. Please see the blog for details on S3 Transfer Manager capabilities.
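To make the idea of byte-range fetches concrete, here is a minimal Python sketch that splits an object into ranges and fetches them one at a time. It is an illustration only, not the Transfer Manager implementation: the function names are hypothetical, the loop is sequential where a transfer manager would fetch parts concurrently, and `s3_client` is assumed to be any object exposing `get_object(Bucket=..., Key=..., Range=...)`, such as `boto3.client("s3")`.

```python
from typing import Iterator, Tuple


def byte_ranges(total_size: int, part_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets that together cover total_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    start = 0
    while start < total_size:
        end = min(start + part_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start: int, end: int) -> str:
    """Format an HTTP Range value, the form accepted by the S3 GetObject Range parameter."""
    return f"bytes={start}-{end}"


def download_in_ranges(s3_client, bucket: str, key: str,
                       total_size: int, part_size: int) -> bytes:
    """Fetch an object part by part and reassemble it in order (sequential sketch)."""
    chunks = []
    for start, end in byte_ranges(total_size, part_size):
        response = s3_client.get_object(
            Bucket=bucket, Key=key, Range=range_header(start, end)
        )
        chunks.append(response["Body"].read())
    return b"".join(chunks)
```

In practice you would rarely write this loop yourself: boto3's `download_file`, optionally given a `TransferConfig`, performs the splitting and runs the range requests concurrently.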
Prerequisites
To simulate an edge device, we’ll be using an EC2 instance. Before we proceed with the steps to transfer files from Amazon S3 to your instance, ensure you have the following prerequisites in place:
- An AWS account with permissions to create and access Amazon EC2 instances, AWS Systems Manager (SSM), AWS CloudFormation stacks, AWS IAM Roles and Policies, Amazon S3, AWS IoT Core, and AWS IoT Greengrass services.
- AWS CLI installed and configured on your laptop with the Session Manager plugin.
- Follow the steps in the Visual Studio Code on EC2 for Prototyping repository to deploy an EC2 instance. Use the browser-based VS Code IDE to edit files and execute the instructions.
The deployment creates the EC2 instance with an IAM Role that grants unrestricted access to all AWS resources. We recommend that you review the role attached to the EC2 instance and modify it to limit permissions to SSM, S3, IoT Core and IoT Greengrass.
Solution overview
Transferring files from Amazon S3 to an edge device involves creating a custom IoT Greengrass component called the “Download Manager”. This component is responsible for downloading files from Amazon S3 to the edge device, which, in this case, is an EC2 instance simulating an edge device. The process can be broken down into the following steps:
Step 1: Develop and package a custom IoT Greengrass Download Manager Component, which will handle the file transfer logic. Once packaged, upload this component to the designated Component and Content Bucket on Amazon S3.
Step 2: Using the AWS IoT Core service, build, publish, and deploy the Download Manager Component to the EC2 instance representing the edge device.
Step 3: Upload the files that need to be transferred to the edge device to the ‘Component and Content Bucket’ on Amazon S3.
Step 4: The Download Manager Component deployed on the EC2 instance will download the files from the Amazon S3 bucket and store them locally on the edge device’s file system.
Figure 1 – Transfer files from Amazon S3 to EC2 instance simulating edge device
Solution walkthrough
Step 1: Develop and package custom IoT Greengrass Download Manager component
1.1 Clone the custom IoT Greengrass component from aws-samples repository
1.2 Follow the instructions to configure the EC2 instance as an IoT Greengrass core device
1.3 The IoT Greengrass Development Kit Command-Line Interface (GDK CLI) reads from a configuration file named gdk-config.json to build and publish components. In gdk-config.json, replace us-west-2 with the Region where the component will be deployed, and replace the gdk_version value 1.3.0 with the version of the GDK CLI you installed.
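If you prefer to script this edit, the following Python sketch shows one way to do it. It assumes the usual gdk-config.json layout, with a `publish.region` entry under each component and a top-level `gdk_version`; the function name and the target values `eu-west-1` and `1.6.2` are placeholders, not values taken from the sample repository.

```python
import copy
import json
from pathlib import Path


def update_gdk_config(config: dict, region: str, gdk_version: str) -> dict:
    """Return a copy of the GDK config with the publish Region and gdk_version replaced."""
    updated = copy.deepcopy(config)
    for component in updated.get("component", {}).values():
        # Assumed layout: each component carries a "publish" section with a "region" key.
        component.setdefault("publish", {})["region"] = region
    updated["gdk_version"] = gdk_version
    return updated


if __name__ == "__main__":
    path = Path("gdk-config.json")
    if path.exists():
        original = json.loads(path.read_text())
        # Placeholder values: substitute your own Region and installed GDK CLI version.
        path.write_text(json.dumps(update_gdk_config(original, "eu-west-1", "1.6.2"), indent=2))
```

The function returns a modified copy instead of editing in place, so you can compare the result with the original before writing it back.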
Step 2: Build, publish, and deploy Download Manager component
2.1 You can build and publish the Download Manager Component to the Amazon S3 bucket following the instructions here.
This step will automatically create an Amazon S3 bucket titled greengrass-artifacts-YOUR_REGION-YOUR_AWS_ACCOUNT_ID. Built components are stored as objects within this Amazon S3 bucket. We will use this Amazon S3 bucket to publish the custom Download Manager component and also use this to store the assets that will be downloaded to the EC2 instance.
2.2 Follow the instructions mentioned here to allow IoT Greengrass core device to access the Amazon S3 bucket.
2.3 After publishing the Download Manager component successfully, you can find it in the AWS Management Console → AWS IoT Core → Greengrass Devices → Components → My Components.
Figure 2 – AWS IoT Core list of Greengrass components
2.4 To enable the transfer of files from the Amazon S3 bucket to the edge device, we will deploy the Download Manager component to the simulated Greengrass device running on the EC2 instance. From the component list above, click on the component titled com.example.DownloadManager and hit Deploy, choose Create new deployment and hit Next.
2.5 Provide the deployment name as My Deployment and Deployment Target as Core Device. Type in the core device name which can be found from AWS Management Console → AWS IoT Core → Greengrass Devices → Core devices, and hit Next.
2.6 Select components: Along with the custom component, we will also deploy the following AWS-provided public components:
- aws.greengrass.Nucleus – The IoT Greengrass nucleus component is a mandatory component and the minimum requirement to run IoT Greengrass Core software on an edge device.
- aws.greengrass.Cli – The IoT Greengrass CLI component provides a local command-line interface that you can use on the edge device to develop and debug components locally. The IoT Greengrass CLI lets you create local deployments and restart components on the edge device.
- aws.greengrass.TokenExchangeService – The token exchange service provides AWS credentials that can be used to interact with AWS services from the custom components. This is essential for the boto3 library to download files from the Amazon S3 bucket to the edge device.
Figure 3 – Select components to deploy
2.7 Configure components: From the list of public components, configure the Nucleus component and set the `interpolateComponentConfiguration` flag to true. We recommend setting this option so that the edge device can run IoT Greengrass components using recipe variables from the configuration. It also lets the code read the thing name from the AWS_IOT_THING_NAME environment variable, so you don’t have to hardcode it.
In the Configure components list, select the Nucleus component and hit Configure Component. Update the Configuration to Merge section as follows and hit Confirm.
Figure 4 – Configure aws.greengrass.Nucleus
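The figure is not reproduced in text form here. Based on the flag named in step 2.7, the merge document would look roughly like the JSON produced by the Python sketch below. The helper names and the fallback value are hypothetical; only the `interpolateComponentConfiguration` key and the `AWS_IOT_THING_NAME` variable come from the text above.

```python
import json
import os

# Merge document for the Nucleus component, as described in step 2.7.
NUCLEUS_MERGE_CONFIG = {"interpolateComponentConfiguration": True}


def nucleus_merge_json() -> str:
    """Render the merge document as JSON text for the Configuration to Merge box."""
    return json.dumps(NUCLEUS_MERGE_CONFIG, indent=2)


def thing_name(default: str = "unknown-thing") -> str:
    """Read the thing name from the environment instead of hardcoding it.

    The default is a placeholder for running outside of Greengrass.
    """
    return os.environ.get("AWS_IOT_THING_NAME", default)


if __name__ == "__main__":
    print(nucleus_merge_json())
```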
2.8 Keep the deployment configuration as default, proceed to the Review page, and click Deploy.
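The console steps 2.4 through 2.8 could also be scripted. The Python sketch below only builds the request that would be passed to the `create_deployment` operation of the boto3 `greengrassv2` client; it does not call AWS. The function name is hypothetical, and the component versions are placeholders that you would replace with the versions shown in your console.

```python
import json


def build_deployment_request(region: str, account_id: str, thing_name: str,
                             component_versions: dict, nucleus_merge: dict) -> dict:
    """Build keyword arguments for greengrassv2 create_deployment targeting one core device."""
    components = {
        name: {"componentVersion": version}
        for name, version in component_versions.items()
    }
    if "aws.greengrass.Nucleus" in components:
        # The merge update is passed to the API as a JSON string, not as a nested object.
        components["aws.greengrass.Nucleus"]["configurationUpdate"] = {
            "merge": json.dumps(nucleus_merge)
        }
    return {
        # A single core device is addressed by the ARN of its IoT thing.
        "targetArn": f"arn:aws:iot:{region}:{account_id}:thing/{thing_name}",
        "deploymentName": "My Deployment",
        "components": components,
    }
```

With boto3 installed and credentials configured, the request could then be submitted with `boto3.client("greengrassv2").create_deployment(**request)`.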
2.9 You can monitor the process by viewing the IoT Greengrass log file on the simulated IoT Greengrass device running on the EC2 instance. You should see “status=SUCCEEDED” in the logs.
sudo tail -f /greengrass/v2/logs/greengrass.log
2.10 Once the deployment succeeds, you can tail the logs for the custom Download Manager component on the simulated IoT Greengrass device running on the EC2 instance as shown below. You should see currentState=RUNNING in the logs.
sudo tail -f /greengrass/v2/logs/com.example.DownloadManager.log
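If you would rather check for these markers from a script than watch the tail output, a small Python helper along the following lines could scan the log lines. The function name is hypothetical, and the markers are the two strings mentioned in steps 2.9 and 2.10.

```python
from typing import Dict, Iterable


def find_markers(lines: Iterable[str], markers: Iterable[str]) -> Dict[str, bool]:
    """Report, for each marker string, whether it appears in any of the given lines."""
    found = {marker: False for marker in markers}
    for line in lines:
        for marker in found:
            if marker in line:
                found[marker] = True
    return found
```

For example, you could pass an open file handle for /greengrass/v2/logs/greengrass.log together with `["status=SUCCEEDED"]`. Reading that file typically requires the same elevated permissions as the tail commands above.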
2.11 The download folder is set to /opt/downloads when the custom Download Manager component is deployed. To monitor downloads, open a terminal window in the IDE and watch this folder.
Step 3: Upload the file to be downloaded on the edge device
The Download Manager component facilitates the transfer of files from Amazon S3 to your edge device. AWS IoT Jobs plays a crucial role in this process by enabling you to define and execute remote operations on your connected devices. With AWS IoT Jobs, you can create a job that instructs your edge device to download files from a specified Amazon S3 bucket location. This job serves as a set of instructions, guiding the Download Manager component on where to look for the desired files within the Amazon S3 bucket. Once the job is created and sent to your edge device, the Download Manager component will initiate the download process, seamlessly transferring the specified files from Amazon S3 to your edge device’s local storage.
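For orientation, the Python sketch below builds the reserved AWS IoT Jobs MQTT topic names a device typically uses, along with a status update payload. Whether the sample component uses exactly this set of topics is an assumption. The topic patterns and status values follow the AWS IoT Jobs MQTT API, while the helper names are hypothetical.

```python
import json
from typing import Dict, Optional

# Statuses a device may report for a job execution through the update topic.
VALID_STATUSES = {"IN_PROGRESS", "FAILED", "SUCCEEDED", "REJECTED"}


def jobs_topics(thing_name: str, job_id: str) -> Dict[str, str]:
    """Return the reserved Jobs topics for one thing and one job."""
    base = f"$aws/things/{thing_name}/jobs"
    return {
        "notify_next": f"{base}/notify-next",  # pushed when the next pending job changes
        "start_next": f"{base}/start-next",    # request to start the next pending job
        "update": f"{base}/{job_id}/update",   # report progress or the final result
    }


def status_update_payload(status: str,
                          details: Optional[Dict[str, str]] = None) -> str:
    """Build the JSON payload published to the update topic."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unsupported status: {status}")
    payload = {"status": status}
    if details:
        payload["statusDetails"] = details
    return json.dumps(payload)
```

A component would subscribe to the notify-next topic, perform the download described by the job document, and then publish a SUCCEEDED or FAILED payload to the update topic.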
3.1 Create a folder titled uploads in the Amazon S3 bucket (greengrass-artifacts-YOUR_REGION-YOUR_AWS_ACCOUNT_ID) created in Step 2.1. Upload the GenAI-generated image below, titled owl.png, to the uploads folder in the Amazon S3 bucket.
Figure 5 – GenAI generated image – owl.png
For simplicity, we are reusing the same Amazon S3 bucket (greengrass-artifacts-YOUR_REGION-YOUR_AWS_ACCOUNT_ID). However, as a best practice, create two separate buckets: one for IoT Greengrass components and one for the files that need to be downloaded to the edge.
3.2 After the file has been uploaded to the Amazon S3 bucket, copy the S3 URI of this image to use in the next step. The S3 URI will be s3://greengrass-artifacts-REGION-ACCOUNT_ID/uploads/owl.png
Step 4: Download file from Amazon S3 to edge device
4.1 Create the AWS IoT Job Document
4.1.1 From the AWS Management Console, navigate to AWS IoT Core → Remote actions → Jobs and click Create job.
4.1.2 Choose Create custom job.
4.1.3 Give the job a name, for example Test-1, optionally provide a description, and click Next.
4.1.4 For the Job Target, choose the core device indicated by thing name <YOUR GREENGRASS DEVICE NAME>. You may leave Thing groups empty for now.
4.1.5 For the job document, choose From template and select the AWS-Download-File template.
4.1.6 Paste the S3 URI in the downloadUrl section. The S3 URI must begin with s3://, for example s3://greengrass-artifacts-REGION-ACCOUNT_ID/uploads/owl.png
4.1.7 For the filePath, enter the sub-folder where you want the file to be downloaded. For this blog, we will use a folder titled images. Do not add a leading / to the path, as the component automatically prepends the path prefix. Click Next.
4.1.8 For job configuration and run type, select Snapshot and click Submit.
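To illustrate how the two job parameters relate to the final file location, here is a hedged Python sketch. It is not the component's actual code: the function names are hypothetical, and it assumes the component joins the /opt/downloads root, the relative filePath, and the object's file name.

```python
import posixpath
from typing import Tuple
from urllib.parse import urlparse

DOWNLOAD_ROOT = "/opt/downloads"  # download folder configured at deployment time


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid S3 URI: {uri}")
    return parsed.netloc, key


def local_destination(download_url: str, file_path: str,
                      root: str = DOWNLOAD_ROOT) -> str:
    """Resolve where the object would land on the device (assumed layout)."""
    if file_path.startswith("/"):
        raise ValueError("filePath must be relative, without a leading '/'")
    _, key = parse_s3_uri(download_url)
    return posixpath.join(root, file_path, posixpath.basename(key))
```

With the values used in this walkthrough, a downloadUrl ending in uploads/owl.png and a filePath of images resolve to a file under /opt/downloads/images.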
4.2 Tail the component log on the EC2 instance to see the download folder being created and the image titled owl.png being downloaded.
sudo tail -f /greengrass/v2/logs/com.example.DownloadManager.log
4.3 Track job progress: Each job reports execution status at both the job level and the thing level. To view it, from the AWS Management Console navigate to Jobs → Test-1 → Job executions.
Figure 6 – Track job executions
4.4 To view the status of execution from an edge device, click the checkbox for the core device under the Job executions section.
Figure 7 – View job execution status details
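The same status information is available programmatically. The Python sketch below counts executions by status using the `list_job_executions_for_job` operation of the boto3 `iot` client. The function name is hypothetical, `iot_client` is assumed to be `boto3.client("iot")` or a compatible object, and pagination is ignored for brevity.

```python
from collections import Counter


def summarize_job_executions(iot_client, job_id: str) -> Counter:
    """Count the executions of one job by status (first page of results only)."""
    response = iot_client.list_job_executions_for_job(jobId=job_id)
    return Counter(
        summary["jobExecutionSummary"]["status"]
        for summary in response.get("executionSummaries", [])
    )
```

For a job with a single target, such as Test-1 in this walkthrough, the counter holds one entry that moves from QUEUED through IN_PROGRESS to a terminal status as the device reports progress.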
4.5 Once the file has been downloaded to the EC2 instance, you can find it in the /opt/downloads/images folder on the core device.
Cleaning up
To ensure cost efficiency, this blog utilizes the AWS Free Tier for all services except the EC2 instance and EBS volume attached to the instance. The EC2 instance employed in this example requires an On-Demand t3.medium instance to accommodate both the development environment and the simulated edge device within the same underlying EC2 instance. For more information, please refer to the pricing details. Once you have completed this tutorial, remember to access the AWS Console and delete the resources created during the process by following the instructions provided. This step is crucial to prevent any unintended charges from accruing in the future.
Clean-up instructions:
- Open S3 from the AWS console, delete the contents of the Amazon S3 bucket titled greengrass-artifacts-YOUR_REGION-YOUR_AWS_ACCOUNT_ID, and then delete the bucket itself
- Open IoT Core from the AWS console and delete all the jobs from the IoT Jobs Manager Dashboard
- Open IoT Greengrass from the AWS console and delete the IoT thing Group, Thing, Certificate, Policies and Role associated with MyGreengrassCore
- Follow the cleanup instructions in the aws-samples VS Code on EC2 repository
Customer Reference
AWS customers are using this approach to transfer files from Amazon S3 to the edge device.
Conclusion
This blog post demonstrates how AWS customers can efficiently move data from Amazon S3 to their edge devices. The outlined steps enable seamless downloads of software updates, firmware updates, content, and other essential files. Real-time monitoring capabilities provide complete visibility and control over all file transfers. You can further optimize your operations by implementing pause and resume functionality covered in the blog. Additionally, you can use AWS IoT Greengrass and Amazon S3 Transfer Manager for implementing reverse data flow from edge devices to Amazon S3. Moreover, through a custom IoT Greengrass component you can facilitate the upload of logs and telemetry data, unlocking powerful opportunities for predictive maintenance, real-time analytics, and data-driven insights.