AWS for M&E Blog
How to configure a low-latency HLS workflow using AWS Media Services
The HTTP Live Streaming (HLS) protocol allows delivery of live streams to global-scale audiences. Historically, HLS has favored stream reliability over latency. Low-Latency HLS (LL-HLS) is an extension of the protocol that appeared in 2020 in the 2nd edition of the specification, enabling low-latency video streaming while maintaining scalability. It reduces live streaming latency by a factor of two. While regular HLS latency usually ranges between 12 and 30 seconds depending on the workflow configuration and the player capabilities, LL-HLS brings end-to-end workflow latency down to between 5 and 10 seconds. This enables streaming video latency to rival broadcast video latency, which is 6 seconds on average, and helps prevent viewers of a streamed live sports event from having the action spoiled by nearby TVs or by social media posts from the stadium or from broadcast viewers. The technology also opens up new use cases, like augmenting the broadcast viewing experience with additional synchronized camera feeds delivered over LL-HLS on second-screen devices, as demonstrated by Amazon Web Services (AWS) partner NativeWaves.
In May 2023, AWS Elemental MediaPackage launched support for the packaging of media streams in LL-HLS, with both transport stream (TS) and CMAF segments. This blog post explains how to configure multiple AWS Media Services (namely AWS Elemental MediaLive, MediaPackage, and Amazon CloudFront) to support LL-HLS workflows. DRM content protection with SPEKE v2 and multi-key encryption is also supported in these workflows, and documented in a previous blog post. Let’s start with a summary of the LL-HLS mechanisms so that we can map them to configuration settings.
How does LL-HLS reduce latency?
LL-HLS uses a combination of approaches to achieve this goal.
- Blocking Playlist Reload: While it was already possible to somewhat reduce latency with regular HLS by using short segments, that approach cannot guarantee predictable latency, because the video player requests HLS media playlists at arbitrary times. As a knock-on effect, it also requests media segments at arbitrary times. This unpredictable request timing leads to varying latencies and an unpredictable position of the playhead relative to the edge of the live stream. LL-HLS solves this core problem by making sure that the player always requests the most recent media playlist. The player finds the most recent media sequence number and partial segment number in the initial media playlist response, and adds incremented values as query string parameters (_HLS_msn and _HLS_part, respectively) to subsequent media playlist requests. The Blocking Playlist Reload mechanism keeps the playlist request open until the requested media sequence number and partial segment number are reached on the origin. At that point the media playlist, including references to the latest available partial segments, is returned to the player through the CDN.
- Rendition Report: This mechanism complements Blocking Playlist Reload. Rendition Report signaling tells the player the most recent media sequence number and partial segment number of the other media playlists, so that it can switch bitrate directly. Without it, the player would first have to request the new rendition's media playlist without query string parameters at switch time, just to rediscover those values.
- Partial segments: LL-HLS playlists include a mix of classical full duration segments (usually 6 seconds) and partial segments (also known as “parts”) that have a much smaller duration (usually between 500 milliseconds and 2 seconds) and roll off the playlists after exceeding a duration of three full segments from the edge of the live stream. LL-HLS players will consume these partial segments, as they are available before the full duration segments, which will allow play head positioning at a shorter distance than a full segment duration from the edge of the live stream. Partial segments are referenced in the playlists as soon as they are pushed successfully to the origin.
- Hinted partial segments: the end of LL-HLS playlists can include EXT-X-PRELOAD-HINT tags that reference partial segments that don’t yet exist at the time when the media playlist is returned to the player. Requests to such predictive parts are then issued by the player. The origin will hold onto the request until it can respond, resulting in the shortest time to partial segment delivery to the player.
- HTTP delivery optimizations: LL-HLS requires HTTP/2 on the CDN side in order to take advantage of the request multiplexing that this HTTP version offers. On the public hls-interest mailing list, Roger Pantos recently announced that iOS 17 would add support for HTTP/3, and that LL-HLS delivery over HTTP/3 would require the use of server-defined priorities described by RFC 9218 (Extensible Prioritization Scheme for HTTP) to prioritize playlist delivery. AWS will continue to update its solutions when such improvements are added to the specification.
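To make the Blocking Playlist Reload step more concrete, here is a minimal Python sketch of how a player could derive its next playlist request from the playlist it just received. The function name, the sample playlist, and the `cdn.example.com` URL are hypothetical. The sketch assumes that the parts of a segment are listed before that segment's EXTINF entry, and it ignores other tags and error handling.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_blocking_request(playlist_url: str, playlist_text: str) -> str:
    """Build the next Blocking Playlist Reload URL from the latest playlist."""
    media_sequence = 0  # EXT-X-MEDIA-SEQUENCE: number of the first listed segment
    full_segments = 0   # complete segments listed in the playlist
    pending_parts = 0   # parts already published for the segment in progress

    for raw in playlist_text.splitlines():
        line = raw.strip()
        if line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            media_sequence = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            # A full segment closes the parts that were listed just before it.
            full_segments += 1
            pending_parts = 0
        elif line.startswith("#EXT-X-PART:"):
            pending_parts += 1

    # The segment in progress follows the last complete one; the next part
    # index equals the number of parts already published for that segment.
    next_msn = media_sequence + full_segments
    next_part = pending_parts

    scheme, netloc, path, query, fragment = urlsplit(playlist_url)
    params = [(k, v) for k, v in parse_qsl(query)
              if k not in ("_HLS_msn", "_HLS_part")]
    params += [("_HLS_msn", str(next_msn)), ("_HLS_part", str(next_part))]
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))


# Hypothetical playlist: segment 100 is complete, segment 101 has one part.
SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:2
#EXT-X-PART-INF:PART-TARGET=1.0
#EXT-X-MEDIA-SEQUENCE:100
#EXT-X-PART:DURATION=1.0,URI="seg100.0.mp4"
#EXT-X-PART:DURATION=1.0,URI="seg100.1.mp4"
#EXTINF:2.0,
seg100.mp4
#EXT-X-PART:DURATION=1.0,URI="seg101.0.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="seg101.1.mp4"
"""

print(next_blocking_request("https://cdn.example.com/live/index.m3u8", SAMPLE))
```

With this sample, the next request asks for media sequence number 101 and part 1, which is also the part announced by the EXT-X-PRELOAD-HINT tag.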
If you are looking for more information on LL-HLS, a good summary of all its mechanisms is available on the Apple Developer website.
Workflow architecture and latency measurement
The configuration steps in the following sections aim to build an end-to-end workflow that combines a contribution encoder (AWS Elemental Link in the following example, but another contribution encoder could be used) pushing to MediaLive for the ABR encoding of HLS with 1s segments, then to MediaPackage for the repackaging of this ingest stream into LL-HLS with 1s parts, and finally to CloudFront for the last mile delivery of the streams. The easiest way to measure the glass-to-glass latency of such workflows is to film a clapperboard application running on a tablet, sitting alongside the LL-HLS player screen, and to take a picture of the two screens side-by-side. The difference between the two timecodes is the glass-to-glass latency.
 
 
        Figure 1 Workflow for measuring glass-to-glass latency
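The measurement itself is a simple subtraction. As a minimal sketch, the following Python helper computes the glass-to-glass latency from the two timecodes read off the photo. The function name and the sample timecodes are hypothetical, and the sketch assumes both timecodes use an `HH:MM:SS.mmm` format and fall on the same day.

```python
from datetime import datetime

TIMECODE_FORMAT = "%H:%M:%S.%f"  # assumed clapperboard display format


def glass_to_glass_latency(source_tc: str, player_tc: str) -> float:
    """Seconds between the clapperboard timecode and the one shown by the player."""
    source = datetime.strptime(source_tc, TIMECODE_FORMAT)
    player = datetime.strptime(player_tc, TIMECODE_FORMAT)
    return (source - player).total_seconds()


# Hypothetical reading: the tablet shows 14:32:10.480, the player 14:32:05.430.
latency = glass_to_glass_latency("14:32:10.480", "14:32:05.430")
print(f"glass-to-glass latency: {latency:.2f}s")
```

Averaging several photos taken at different moments gives a more stable figure than a single reading, since the player position can drift slightly over time.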
For more information about how to perform fine-grained latency measurements per workflow component, refer to a previous blog post outlining methodologies to do so.
MediaPackage configuration
First, you need to configure a Channel Group. Channel Groups are logical containers that host your channels and define the ingest and origination DNS entries of these channels. Combined with other user-defined parameters like Channel name, Endpoint name, and Manifest name, MediaPackage v2 makes all ingest and origination URLs fully predictable. In the MediaPackage console, select “Channel Groups” in the Live v2 section, and click on the “Create channel group” button.
 
 
        Figure 2 Selecting Channel group under Live v2 and creating channel group
Enter a name for the Channel Group and a description (optional).
 
 
        Figure 3 Assigning name and description to channel group
You can start creating Channels with the “Create channel” button once your Channel Group is created.
 
 
        Figure 4 Channel group Egress domain and ARN
Enter a name for the Channel and a description (optional), then select “Attach a custom IAM policy”.
 
 
        Figure 5 Assigning channel name and description
The policy field expects a JSON snippet that allows MediaPackage to identify MediaLive through the AWS Signature Version 4 (SigV4) headers that MediaLive adds to the ingest requests. Following is a reference model for this:
Replace the placeholders in the policy with your actual AWS account number, AWS Region (e.g. “us-west-2”), Channel group name (e.g. “demo-channel-group”), and Channel name (e.g. “LL-HLS-demo-channel”) values. Once your parameters have been entered, copy and paste the IAM policy into the policy field and select “Create”.
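As an illustration only, a policy of this kind could be assembled with a few lines of Python. The statement layout, the policy and statement IDs, and the `MediaLiveAccessRole` role name are assumptions made for this sketch. Treat the reference policy in the MediaPackage user guide as the authority, and adjust the role to the one your MediaLive channel actually uses.

```python
import json


def ingest_policy(account_id: str, region: str, channel_group: str,
                  channel: str, role_name: str = "MediaLiveAccessRole") -> dict:
    """Sketch of a channel policy letting a MediaLive role push to a channel."""
    channel_arn = (
        f"arn:aws:mediapackagev2:{region}:{account_id}:"
        f"channelGroup/{channel_group}/channel/{channel}"
    )
    return {
        "Version": "2012-10-17",
        "Id": "AllowMediaLiveChannelToIngestToEmpChannel",  # hypothetical ID
        "Statement": [
            {
                "Sid": "AllowMediaLiveRoleToAccessEmpChannel",  # hypothetical ID
                "Effect": "Allow",
                # Assumed: the IAM role that the MediaLive channel runs under.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"},
                "Action": "mediapackagev2:PutObject",
                "Resource": channel_arn,
            }
        ],
    }


# Placeholder account number; the other values reuse the examples above.
policy = ingest_policy("111122223333", "us-west-2",
                       "demo-channel-group", "LL-HLS-demo-channel")
print(json.dumps(policy, indent=2))
```

Generating the policy from parameters avoids copy-and-paste mistakes when the same pattern is repeated across several channels.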
Once your channel is created, select “Settings” to view the MediaPackage HLS ingestion points. Take note of the ingestion endpoint URIs as these will be required to create the ABR encoding channel in MediaLive in the next section.
 
 
        Figure 6 Channel created and showing endpoint details for ingestion of Live channel
Select “Origin endpoints” and then select “Create endpoint”. The endpoint defines the media segment parameters. Enter the name and description (optional) of the endpoint. You can either select TS or CMAF (also known as FMP4) as the container type, depending on your player support constraints. CMAF is recommended unless legacy devices are in scope. In the additional settings, the default start over window is 15 minutes (900 seconds). You can change this value up to 14 days (1209600 seconds), depending on how long you need MediaPackage to retain your ingest media segments on disk for origination.
 
 
        Figure 7 Create origin endpoint
The Segment duration value is the total length of a full segment, generated through the concatenation of multiple shorter ingest segments. It’s not the duration of partial segments, which is defined by the duration of the ingest segments coming from the ABR encoder (set to 1 second in our example MediaLive configuration that follows). The “Include IFrame-only streams” option generates one distinct IFrame-only track per video rendition present in your ingest streamset. The “Enable SCTE support” option, if selected, defines the ad insertion behavior of your endpoint. You can find detailed information on this setting in the “SCTE-35 messages” user guide section.
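The relationship between the two durations can be sketched in a few lines of Python. The helper name and the 6-second segment duration are hypothetical. The sketch assumes the configured segment duration is a whole multiple of the ingest segment duration, and works in milliseconds to avoid floating-point rounding.

```python
def parts_per_segment(segment_duration_ms: int, ingest_segment_ms: int) -> int:
    """Number of ingest segments (parts) concatenated into one full segment."""
    if ingest_segment_ms <= 0:
        raise ValueError("ingest segment duration must be positive")
    count, remainder = divmod(segment_duration_ms, ingest_segment_ms)
    if count == 0 or remainder:
        raise ValueError(
            "segment duration should be a whole multiple of the ingest segment duration"
        )
    return count


# Hypothetical endpoint: 6-second full segments built from 1-second ingest segments.
print(parts_per_segment(6000, 1000))
```

In this example each full segment is made of six 1-second ingest segments, and each of those appears as one part in the LL-HLS playlist.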
 
 
        Figure 8 Attaching public policy to endpoint
The “Encrypt content” option allows you to configure all the encryption and DRM settings that can be applied to your endpoint. MediaPackage v2 supports different options in terms of encryption schemes and DRM types, depending on the endpoint type (TS or CMAF). Please refer to the Encryption fields and SPEKE Version 2.0 presets pages in the user guide for comprehensive information on all the exposed configuration parameters.
In the Endpoint policy section, you define the origination behavior of your endpoint. Using “Don’t attach a policy” disables origination entirely, while using “Attach a custom policy” allows you to restrict origination based on a variety of conditions: AWS SigV4 if your CDN supports this authentication method when forwarding requests to origins, AWS accounts, or IP ranges for other use cases. If you select “Attach a public policy”, MediaPackage populates the policy field for you with a policy allowing public access to all of your endpoint objects, and includes the correct AWS account number, AWS Region, Channel Group, and Channel name values relevant to your endpoint. For more details on these endpoint policies, please refer to the “Origin endpoint authorization” section in the user guide.
On the same screen, you will now be able to create multiple manifests sharing the same media segments produced by the endpoint. LL-HLS playlists should be backward compatible with legacy HLS players – meaning that these players should be able to ignore all the new HLS tags related to the low latency mode. If some of your HLS players actually don’t properly ignore these tags, you can always create regular latency HLS streams through the first half of the Manifest definitions section.
 
 
        Figure 9 Add HLS manifest and enter details
The LL-HLS playlists should be configured in the second half of the Manifest definitions section. To achieve the lowest latency, it’s important to configure a Program date/time interval that is aligned with the partial segment duration (1 second in our reference configuration).
 
 
        Figure 10 Adding low latency HLS manifest and enter details
Once created, you will get the playback URL as follows.
Contribution encoder and MediaLive configuration
In most of our tests we used an AWS Elemental Link contribution encoder, for which latency can be set in milliseconds. 200 milliseconds works well as a value, but you might need to increase this buffer level depending on your network conditions. On other contribution encoders you should find similar buffering parameters that you can tweak, as well as encoding options that you can simplify to reduce the encoding latency (e.g. look-ahead or B-frames). Generally speaking, it’s good to activate timecode burning on the video whenever this encoding option is available, as it gives you a finer-grained view of how latency is split between the contribution/ABR encoders and the downstream packaging/delivery part of the workflow.
 
 
        Figure 11 Latency value in Link device
As referred to earlier, partial segment length will be configured in MediaLive, which will let downstream devices know the length of partial segments in LL-HLS manifests. In the MediaLive console, create a channel, attach an input, then add an HLS output. Select the HLS output and enter the ingestion URLs created earlier.
 
 
        Figure 12 LL-HLS origin ingestion endpoints
 
 
        Figure 13 Configure endpoint URLs in MediaLive output destination
In the output configuration, under “Manifests and Segments”, change the segment length to 1 second, and in “Stream settings” change the GOP size to 1 second. Since we are defining a segment length of 1 second here, this will be the actual length of the partial segments that players get in the LL-HLS manifests. A 1-second GOP size ensures that each fragment created by the encoder starts with a keyframe, so the player can start playback from it. GOP size is one of the main encoding parameters, with a direct impact on video bitrate and video quality, and an indirect impact on end-to-end latency. It determines how often a keyframe (or IDR frame) is available. In LL-HLS, the player requires a keyframe to start decoding, meaning it can start playback only at GOP boundaries. Longer GOPs cause higher start-up delay and higher latency. Apple’s recommended GOP size is 2 seconds. Typical LL-HLS workflow implementations have about 5 seconds of end-to-end latency when the GOP is set to 1 second.
 
 
        Figure 14 Setting values in MediaLive output manifest
 
 
        Figure 15 GOP size configuration
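The alignment between segment length and GOP size can be checked with simple arithmetic. The following Python sketch uses hypothetical helper names and assumes a constant frame rate. It converts a GOP duration into a frame count and verifies that every segment boundary falls on a GOP boundary.

```python
from fractions import Fraction


def gop_frames(gop_seconds, frame_rate) -> int:
    """GOP length in frames for a constant frame rate."""
    frames = Fraction(gop_seconds) * Fraction(frame_rate)
    if frames.denominator != 1:
        raise ValueError("GOP duration is not a whole number of frames")
    return int(frames)


def segments_start_on_keyframe(segment_seconds, gop_seconds) -> bool:
    """True when the segment length is a whole multiple of the GOP length."""
    return Fraction(segment_seconds) % Fraction(gop_seconds) == 0


print(gop_frames(1, 30))                 # frames per 1-second GOP at 30 fps
print(segments_start_on_keyframe(1, 1))  # 1-second segments, 1-second GOP
print(segments_start_on_keyframe(1, 2))  # 1-second segments, 2-second GOP
```

The last call returns False: with a 2-second GOP, every other 1-second segment would start without a keyframe, which is why the segment length and GOP size are changed together in this configuration.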
CloudFront configuration
Amazon CloudFront is a global Content Delivery Network (CDN) that securely delivers web content to users with low latency and high transfer speeds. CloudFront consists of over 120 Edge Locations located close to your viewers. CloudFront lowers the latency of delivering content using caching, request-collapsing, and TCP optimization across Amazon’s global infrastructure. In addition, CloudFront includes Regional Edge Caches (RECs), located within most AWS regions, to provide features such as mid-tier caching.
You need to create three custom policies in CloudFront to use a MediaPackage v2 configuration: a cache policy, an origin request policy, and a response headers policy. The cache policy customizes the cache key and the time-to-live (TTL) settings. The cache key settings determine whether a viewer request results in a cache hit, and including fewer values in the cache key can help increase your cache hit ratio. The TTL settings work together with the Cache-Control and Expires headers to determine how long objects in the CloudFront cache remain valid. In the CloudFront console, select “Policies”.
 
 
        Figure 16 Defining CloudFront policy
Create a custom cache policy with the following parameters and name it MediaPackage-LL-HLS-CachePolicy. Save the changes.
 
 
        Figure 17 Creating Cache policy
 
 
        Figure 18 Configuring cache policy parameters
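For readers who script their setup, the same kind of cache policy can be expressed as a configuration document. The Python sketch below builds a dictionary in the shape expected by the CloudFront `CreateCachePolicy` API (for example boto3's `create_cache_policy(CachePolicyConfig=...)`). The TTL values, the compression flags, and the inclusion of `_HLS_skip` are assumptions made for this sketch, not the settings shown in the figures. Only `_HLS_msn` and `_HLS_part` come from the mechanism described earlier.

```python
def cache_policy_config(name: str, query_strings: list[str]) -> dict:
    """Sketch of a CloudFront cache policy keyed on LL-HLS query strings."""
    return {
        "Name": name,
        "Comment": "Cache key for LL-HLS blocking playlist requests",
        # Assumed TTLs (CloudFront console defaults); the origin's
        # Cache-Control headers still apply within these bounds.
        "MinTTL": 0,
        "DefaultTTL": 86400,
        "MaxTTL": 31536000,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,     # assumption
            "EnableAcceptEncodingBrotli": False,  # assumption
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {
                "QueryStringBehavior": "whitelist",
                "QueryStrings": {
                    "Quantity": len(query_strings),
                    "Items": list(query_strings),
                },
            },
        },
    }


config = cache_policy_config(
    "MediaPackage-LL-HLS-CachePolicy",
    ["_HLS_msn", "_HLS_part", "_HLS_skip"],  # _HLS_skip is an assumption
)
print(config["ParametersInCacheKeyAndForwardedToOrigin"]["QueryStringsConfig"])
```

Keeping only the LL-HLS query strings in the cache key follows the guidance above: each distinct blocking request gets its own cache entry, while unrelated query strings do not fragment the cache.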
Now you will create the second custom policy, the origin request policy. Some information from the viewer request (such as URL query strings, HTTP headers, and cookies) is not included in the origin request by default. You need to make sure that CloudFront passes all the parameters required for LL-HLS to MediaPackage v2. Under the “Origin request” tab, click on “Create origin request policy”.
 
 
        Figure 19 Create origin request policy
Create a custom policy with the following parameters and name it MediaPackage-LL-HLS-OriginRequest. Save the changes.
 
 
        Figure 20 Parameters for origin request policy
Finally, you need to create the third policy, the Response headers policy. This policy will specify one or more HTTP headers for CloudFront to add to the responses that it sends to viewers. With a response headers policy, you can specify the desired headers and their values without changing the origin or writing code. If your origin already sends one or more of the headers that are in your response headers policy, you can choose whether CloudFront uses the header from the origin or the one specified in the policy. You will also specify the CORS headers that CloudFront adds when it responds to CORS requests. CloudFront only adds these headers in responses to CORS requests.
Create a custom response headers policy with the following parameters and name it MediaPackage-LL-HLS-ResponseHeader-policy. Save the changes.
 
 
        Figure 21 Configure Response header policy
 
 
        Figure 22 Configure response header policy parameters
Now that you have created all three policies, you can create the CloudFront distribution. In the CloudFront console, select “Create distribution”.
 
 
        Figure 23 Creating CloudFront distribution
Enter the “Origin domain” and “Origin path” from the MediaPackage endpoint you created earlier.
Please note that CloudFront will not automatically populate the origin domain name in the drop-down menu as it does for MediaPackage V1. You will need to manually enter the domain name and origin path in the respective fields.
 
Here you define one CloudFront distribution per MediaPackage channel group. Put the channel group name in the origin path, as it is common to all the channels that run under this distribution. The channel group is the top-level resource, containing the channels and origin endpoints associated with it. All underlying channels and endpoints are then served by the same CloudFront distribution, which minimizes the number of distributions to manage.
 
 
        Figure 24 Creating origin for CloudFront distribution
Next you need to define the behaviors and attach the policies that you created in the previous steps to the distribution. Navigate to “Cache key and origin requests” and select the three policies, then leave all other settings at their defaults.
 
 
        Figure 25 Selecting policies under behaviors in CloudFront console.
Save the configuration and the CloudFront distribution will be deployed in a few minutes. Check the playback using the CloudFront domain name. The playback URLs will follow the {CDN-hostname}/{channel-name}/{endpoint-name}/{manifest-name} pattern. In our example, it will be as follows:
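As a minimal sketch of the URL pattern above, the following Python helper assembles a playback URL from its components. Every value in the example call is a hypothetical placeholder (the hostname, endpoint name, and manifest file name are invented for illustration), and the `https` scheme and the `.m3u8` extension are assumptions.

```python
from urllib.parse import quote


def playback_url(cdn_hostname: str, channel: str, endpoint: str, manifest: str) -> str:
    """Assemble {CDN-hostname}/{channel-name}/{endpoint-name}/{manifest-name}."""
    path = "/".join(quote(part, safe="") for part in (channel, endpoint, manifest))
    return f"https://{cdn_hostname}/{path}"


# Placeholder values only; substitute the names from your own configuration.
print(playback_url("d111111abcdef8.cloudfront.net",
                   "LL-HLS-demo-channel", "ll-hls-endpoint", "index.m3u8"))
```

The channel group name does not appear in the path because it is already part of the origin path configured on the distribution.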
Player support status and configuration recommendations
In the Apple ecosystem, you will be able to play LL-HLS streams reliably in compiled applications that leverage AVPlayer. This player doesn’t include a latency drift compensation mechanism that would skip forward in time or accelerate the playback rate when drift happens, so you will have to implement the mechanism of your choice. Once you’ve implemented it, the URL should play fine on macOS, iOS 17/iPadOS 17, and tvOS 17. It will also play on versions 14 to 16, but with less-optimized heuristics. Direct playback in Safari mobile is not a suitable option, as this browser still needs multiple improvements (bitrate up-switching, playhead positioning predictability, and drift compensation) to offer a good user experience with LL-HLS streams.
There is still an option to properly play LL-HLS streams on Safari mobile: the hls.js player, which can leverage the new (as of iOS 17) Managed Media Source API. There are a couple of LL-HLS-related improvements planned for hls.js v1.8 (especially the support for EXT-X-PRELOAD-HINT signaling parts not yet available), but this player has been stable and usable in production since version 1.4.2, on all browsers including Safari mobile. The recommended settings are the following:
{ "debug": false, "enableWorker": false, "lowLatencyMode": true, "backBufferLength": 90, "maxLiveSyncPlaybackRate": 1.05, "liveDurationInfinity": true }
The maxLiveSyncPlaybackRate parameter impacts both the speed at which the player catches up with the live edge in case of latency drift, and the audio pitch. It is generally considered that listeners start to notice the pitch change beyond a 6% acceleration. The value of maxLiveSyncPlaybackRate is therefore a careful trade-off between how fast you want to reduce latency drift and how important it is to preserve the audio perception. A too-aggressive value will also be perceptible visually, as the player fast-forwards through the video timeline. Playback rate acceleration works well for compensating small latency drifts, but not when, for example, your player sits in a suspended browser tab and suddenly wakes up. In this case you would probably want it to jump straight to the live edge, which can be achieved by adding the two following parameters to your hls.js configuration:
"liveSyncDurationCount": 0,
"liveMaxLatencyDurationCount": 6
The first parameter tells the player to position the playhead at the live edge at the beginning of the playback session. The second one tells the player to jump to the live edge if the playhead is more than 6 full segment durations behind it.
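The trade-off between accelerating and jumping can be illustrated with a short Python sketch. This is a simplified model written for this explanation, not the algorithm used by hls.js or any other player. The function names and the target latency parameter are hypothetical, while the 1.05 rate and the 6-segment threshold mirror the settings above.

```python
def catch_up_action(latency_s: float, target_latency_s: float,
                    segment_duration_s: float, max_rate: float = 1.05,
                    max_latency_segments: int = 6) -> tuple[str, float]:
    """Decide how to correct latency drift: jump, speed up, or do nothing."""
    if latency_s > max_latency_segments * segment_duration_s:
        # Too far behind (e.g. a suspended tab waking up): seek to the live edge.
        return ("jump", 1.0)
    if latency_s > target_latency_s:
        # Small drift: absorb it by playing slightly faster than real time.
        return ("speed_up", max_rate)
    return ("steady", 1.0)


def catch_up_time(drift_s: float, rate: float) -> float:
    """Wall-clock seconds needed to absorb drift_s at the given playback rate."""
    # Playing at `rate` gains (rate - 1) seconds of media per second elapsed.
    return drift_s / (rate - 1.0)


print(catch_up_action(4.0, 3.0, 1.0))   # slightly behind the target
print(catch_up_action(9.0, 3.0, 1.0))   # more than 6 segments behind
print(round(catch_up_time(1.0, 1.05)))  # seconds to absorb 1 second of drift
```

At a 1.05 playback rate, absorbing 1 second of drift takes about 20 seconds of playback, which shows why acceleration suits small drifts while large gaps are better handled with a jump.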
Regarding other open source players, ExoPlayer on Android also includes a production-grade implementation, which doesn’t require specific configuration to reach optimal latency. Shaka Player also has an implementation visible in its nightly player build, but at the time of writing this blog post, we haven’t achieved satisfactory results with this player.
On the commercial players side, we successfully validated playback on THEOplayer and JW Player. Additional player partners are currently working on their implementations.
Latency results
Using the workflow described previously (Link > MediaLive > MediaPackage > CloudFront > Player), we tested two scenarios with different MediaLive segment and GOP durations: 1 second and 2 seconds. On the MediaPackage output, these result respectively in 1-second parts with a PART-HOLD-BACK value of 3 seconds, and 2-second parts with a PART-HOLD-BACK value of 6 seconds. The first scenario prioritizes latency, while the second one prioritizes encoding efficiency. Using the players that we consider production-ready, we obtained these results:
| Parts duration | PART-HOLD-BACK | hls.js (v1.4.2) | AVPlayer (v16.5) | Exoplayer (v2.18.6) | THEOplayer (v4.11) | JW Player (v8.27) | 
| 1 second | 3 seconds | 5.0s latency | 5.95s latency | 4.04s latency | 4.70s latency | 5.90s latency | 
| 2 seconds | 6 seconds | 8.0s latency | 9.75s latency | 5.32s latency | 6.06s latency | 9.45s latency | 
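The figures in this table can be explored with a few lines of Python. The sketch below assumes that PART-HOLD-BACK is three times the part duration, which matches both scenarios in the table and the minimum recommended by the HLS specification. The helper name and the data structure are hypothetical, and the latency values are copied from the table.

```python
def part_hold_back(part_duration_s: float, multiplier: float = 3.0) -> float:
    """PART-HOLD-BACK as a multiple of the part duration (assumed 3x)."""
    return multiplier * part_duration_s


# Measured latency in seconds per player: (1-second parts, 2-second parts).
MEASURED = {
    "hls.js": (5.0, 8.0),
    "AVPlayer": (5.95, 9.75),
    "ExoPlayer": (4.04, 5.32),
    "THEOplayer": (4.70, 6.06),
    "JW Player": (5.90, 9.45),
}

for part_s in (1, 2):
    print(f"{part_s}s parts -> PART-HOLD-BACK {part_hold_back(part_s):.0f}s")

for player, (latency_1s, latency_2s) in MEASURED.items():
    print(f"{player}: +{latency_2s - latency_1s:.2f}s with 2-second parts")
```

The second loop shows how much latency each player adds when moving from 1-second to 2-second parts, which helps quantify the cost of prioritizing encoding efficiency.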
As a comparison, an alternative workflow where the ABR encoding is done on-premises (AWS Elemental Live > MediaPackage > CloudFront > Player) reduces latency by 400 ms.
Conclusion
This blog post explained how to set up end-to-end LL-HLS workflows leveraging MediaLive, MediaPackage, and CloudFront. The walkthrough includes all the configuration parameters required for reducing latency and provides detail about commercial and open source video players – allowing you to reduce glass-to-glass latency to 5 seconds. Give it a try, it’s easy to configure! Low latency allows you to compete with broadcast latency and to create new user experiences not possible before. Stay tuned for further latency improvements and additional low latency standards support on the AWS for M&E Blog.