AWS for M&E Blog
Part 4: How to compete with broadcast latency using current adaptive bitrate technologies
Part 1: Defining and Measuring Latency 
 Part 2: Recommended Optimizations for Encoding, Packaging, and CDN Delivery
 Part 3: Recommended Optimizations for Video Players
 Part 4: Reference Architectures and Tests Results (this post)
Part 4: Reference Architectures and Tests Results
 
In previous installments of this blog series, we explored the options for optimizing latency across the video processing and delivery chain and in video players. Now, let’s examine some reference architectures that can provide the best results if low latency is your priority when deploying a live video workflow, along with the end-to-end latency we measured for each. We worked along two axes: one scenario that can be fully deployed on-premises, and three hybrid scenarios that keep encoding or contribution on the ground and run the rest of the workflow on AWS Elemental Media Services in the cloud. We believe that this approach will help you visualize what to expect from your existing equipment and services, and learn how to combine them with AWS Elemental Media Services to reach a latency level that is compatible with your requirements.
The common parameters used in these tests are the encoding profiles and the DASH packaging parameters:
- Encoding: AVC 360p@700Kbps / 720p@3Mbps / 1080p@5Mbps
- DASH packaging:
  - SegmentTemplate/$Number%09d$
  - minBufferTime="PT2S"
  - suggestedPresentationDelay="PT1S"
  - timeShiftBufferDepth="PT3S"
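
To make the `$Number%09d$` identifier concrete, here is a minimal Python sketch of how a client might expand it into a segment file name. The template string `video_$Number%09d$.mp4` and the function name are illustrative assumptions; only the `$Number%09d$` token comes from the test setup.

```python
import re

# Matches the DASH template identifier $Number$ with an optional %0<width>d format tag.
_NUMBER_TOKEN = re.compile(r"\$Number(?:%0(\d+)d)?\$")


def expand_number_template(template, number):
    """Replace each $Number...$ token with the segment number, zero-padded if a width is given."""

    def substitute(match):
        width = match.group(1)
        return str(number).zfill(int(width)) if width else str(number)

    return _NUMBER_TOKEN.sub(substitute, template)


# Hypothetical media template; segment number 42 is padded to 9 digits.
print(expand_number_template("video_$Number%09d$.mp4", 42))
```

A fixed-width number keeps segment names sortable in lexical order on the origin, which is one plausible reason for choosing the 9-digit padding.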
 
In the tables, you will see different variants for the test streams:
- 3x1s means that the DVR playlist exposes 3 segments of 1 second each
- 3x2s means 3 segments of 2 seconds each
- 3600x1s means one hour of DVR with 1 second segments
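
The variant labels translate directly into a DVR window length. The sketch below, with hypothetical names, parses a label such as `3x1s` and multiplies the segment count by the segment duration.

```python
import re
from typing import NamedTuple


class DvrVariant(NamedTuple):
    segment_count: int
    segment_seconds: int

    @property
    def window_seconds(self) -> int:
        # Total duration exposed in the playlist or manifest.
        return self.segment_count * self.segment_seconds


def parse_variant(label: str) -> DvrVariant:
    """Parse labels of the form '<count>x<duration>s', for example '3x2s'."""
    match = re.fullmatch(r"(\d+)x(\d+)s", label)
    if match is None:
        raise ValueError(f"unrecognized variant label: {label!r}")
    return DvrVariant(int(match.group(1)), int(match.group(2)))


for label in ("3x1s", "3x2s", "3600x1s"):
    print(label, "->", parse_variant(label).window_seconds, "seconds of DVR window")
```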
The objectives were to find the most efficient strategies for producing short segments, and to observe how standard and optimized players behave over long playback durations such as 45 minutes. More tests are required with HEVC and fMP4/CMAF to complement traditional HLS and DASH with AVC. However, the current test results still provide a good overview of the latency that can be achieved using AWS Elemental solutions and various players.
Let’s start with the full on-premises scenario. It involves AWS Elemental Live for the encoding and AWS Elemental Delta for the packaging and origination. It corresponds to a typical Pay TV operator workflow for multiscreen distribution with all components deployed on-premises.
This scenario was tested with several types of ingest formats: HLS and RTMP inputs provide slightly better results than UDP inputs. In green we outline the combination that provides the best playback stability results. While 1 second segments can be produced on AWS Elemental Delta, the read/write performance of the storage solution used with it limits how fast those segments can be produced and delivered, and with regular storage that is not fast enough for playback with an acceptable amount of rebuffering. With 2 second segments in HLS, Safari mobile (11.65s) and, marginally, Exoplayer (10.19s) exceed the 10 second threshold, so that’s not entirely satisfactory.
| PLAYER | 3x1s HLS | 3x2s HLS | 3x1s DASH | 3x2s DASH |
|---|---|---|---|---|
| hls.js 0.8.7 | 6.86s | 8.44s | | |
| dash.js 2.6.5 | | | 5.94s | 8.23s |
| Safari mobile (iOS 11.2.2) | 6.04s | 11.65s | | |
| Exoplayer 2.6.0 (Android 6.0.1) | 6.14s | 10.19s | 5.49s | 7.14s |
With AWS Elemental Delta and a regular storage solution, our recommendation is therefore to package 2 second segments in HLS and DASH. Using 1 second segments requires a storage solution offering higher read/write performance.
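
As a rough way to read the on-premises table, the following sketch computes how much latency each player adds when moving from 1 second to 2 second segments, and whether the 2 second result exceeds the 10 second threshold mentioned earlier. The figures are copied from the table; the dictionary layout and function name are illustrative assumptions.

```python
THRESHOLD_SECONDS = 10.0

# player -> (latency with 3x1s, latency with 3x2s), in seconds, copied from the table above.
ON_PREMISES_RESULTS = {
    "hls.js 0.8.7": (6.86, 8.44),
    "dash.js 2.6.5": (5.94, 8.23),
    "Safari mobile (iOS 11.2.2)": (6.04, 11.65),
    "Exoplayer 2.6.0 (HLS)": (6.14, 10.19),
    "Exoplayer 2.6.0 (DASH)": (5.49, 7.14),
}


def summarize(results, threshold=THRESHOLD_SECONDS):
    """Return player -> (latency added by 2 s segments, True if the 2 s result exceeds the threshold)."""
    return {
        player: (round(two_s - one_s, 2), two_s > threshold)
        for player, (one_s, two_s) in results.items()
    }


for player, (added, too_slow) in summarize(ON_PREMISES_RESULTS).items():
    flag = "above" if too_slow else "within"
    print(f"{player}: +{added:.2f}s with 2 s segments, {flag} the 10 s threshold")
```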
The second scenario is close to the first one, but with a hybrid workflow between the ground and the cloud: the encoding is performed on the ground, and the packaging and origination are done on AWS with AWS Elemental MediaPackage. This corresponds to an evolution of an existing on-premises workflow to support ephemeral channels, such as those set up during major sporting events.
This scenario is very close to the first one in terms of latency. Exoplayer results in DASH are a bit behind in this test, and generally speaking we see less predictable latency with this player. The Index Duration seems to be the key factor here: presenting only 3 segments in DASH doesn’t seem to work equally well in all circumstances.
| PLAYER | 3x1s HLS | 3x2s HLS | 3x1s DASH | 3x2s DASH |
|---|---|---|---|---|
| hls.js 0.8.7 | 6.39s | 9.35s | | |
| dash.js 2.6.5 | | | 5.69s | 7.54s |
| Safari mobile (iOS 11.2.2) | 6.64s | 10.64s | | |
| Exoplayer 2.6.0 (Android 6.0.1) | 6.95s | 10.59s | 7.19s | 8.11s |
With AWS Elemental MediaPackage, our recommendation is to produce 2 second segments in HLS and DASH. If you use 1 second segments, expect to tune the player.
The third scenario is typical for simple live events in one format where the upload bandwidth is limited onsite and requires the use of a contribution encoder. The encoding of bitrate variants is offloaded to a cloud-based service, AWS Elemental MediaLive in our case, that packages the live streams in HLS, and publishes them to AWS Elemental MediaStore as the origin.
Here are the results that we gathered with such an architecture. With 1 second segments (and using a Buffer Size parameter set to half of the bitrate), hls.js and Safari mobile are slightly under 7 seconds of latency, while Exoplayer is closer to 9 seconds, so this can work well with most requirements. This is our recommendation in terms of segment length with this specific workflow.
| PLAYER | 3x1s HLS | 3x2s HLS | 
|---|---|---|
| hls.js 0.8.7 | 6.64s | 9.85s | 
| Safari mobile (iOS 11.2.2) | 6.68s | 12.56s | 
| Exoplayer 2.6.0 (Android 6.0.1) | 8.86s | 11.94s | 
With AWS Elemental MediaLive and AWS Elemental MediaStore combined, we also recommend the use of 1 second segments in HLS, as the design and performance of AWS Elemental MediaStore is specifically oriented towards low latency.
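
For the Buffer Size setting at half of the bitrate used in this scenario, the arithmetic for the three test renditions can be sketched as follows. This assumes the encoder expects the bitrate in bits per second and the buffer size in bits; the function and variable names are hypothetical.

```python
# Renditions from the common encoding profiles, as (label, video bitrate in bits per second).
RENDITIONS = [
    ("360p", 700_000),
    ("720p", 3_000_000),
    ("1080p", 5_000_000),
]


def half_bitrate_buffer(bitrate_bps: int) -> int:
    """Buffer size in bits when it is set to half of the bitrate (assumes the setting is in bits)."""
    return bitrate_bps // 2


for label, bitrate in RENDITIONS:
    buffer_bits = half_bitrate_buffer(bitrate)
    # At the target bitrate, a buffer of half the bitrate holds about 0.5 s of video.
    print(f"{label}: bitrate={bitrate} bps, buffer size={buffer_bits} bits "
          f"({buffer_bits / bitrate:.1f} s at the target bitrate)")
```

A smaller buffer constrains how far the instantaneous bitrate can drift from the target, which keeps short segments closer to a predictable size.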
This brings us naturally to the last hybrid scenario, where encoding with AWS Elemental Live on the ground is combined with origination on AWS through AWS Elemental MediaStore. This is a scenario where multiple formats are required with the least possible latency, and where there’s no hard constraint on upload bandwidth. For this scenario we tested only the 1 second segments packaging types, as this is our fastest reference architecture.
Compared to the previous tests, we added two new players in order to highlight the benefits of player optimizations: Shaka player 2.3.0 (standard) and Shaka player 2.3.0 (optimized). We also upgraded our dash.js test player to v2.7.0, as this release brings a brand new low latency mode (leveraging the fetch API), which in our tests lowers the latency by 0.6s (with 3 seconds of DVR) to 4.2s (with 1 hour of DVR) compared to v2.6.5. We introduced one new format that first appeared in AWS Elemental Live 2.12.3 during the tests, namely fragmented MP4 (fMP4) for HLS, also known as CMAF, which replaces the Transport Stream (TS) segments with MP4 segments. As the format is relatively new for live use cases in HLS, it’s interesting to look at how players perform and behave with it. The result is mixed, with only half of the players taking advantage of it.
We also introduced 1 hour DVR windows (3600x1s columns) in order to observe player behavior with a long Index Duration. Players behave quite well, apart from some exceptions where the resulting latency approaches 10 seconds, like Shaka in HLS.
| PLAYER | 3x1s HLS | 3600x1s HLS | 3x1s DASH | 3600x1s DASH |
|---|---|---|---|---|
| hls.js 0.8.7 with TS | 5.32s | 6.30s | | |
| hls.js 0.8.7 with fMP4 | 5.34s | Problems | | |
| dash.js 2.7.0 | | | 4.90s | 6.50s |
| Safari mobile (iOS 11.2.2) with TS | 5.64s | 5.34s | | |
| Safari mobile (iOS 11.2.2) with fMP4 | 7.50s | 9.47s | | |
| Exoplayer 2.6.0 (Android 6.0.1) with TS | 5.31s | 7.49s | 6.40s | 7.06s |
| Exoplayer 2.6.0 (Android 6.0.1) with fMP4 | 6.25s | 7.22s | | |
| Shaka 2.3.0 with TS | Problems | 10.12s | 6.94s | 6.44s |
| Shaka 2.3.0 with fMP4 | Problems | 9.40s | | |
| Shaka 2.3.0 (optimized) | Problems | 6.80s (TS) | 5.54s | 6.01s |
With fMP4, hls.js positions the playhead at the start of the DVR window, which explains the result in the 3600x1s HLS case. With Shaka, neither of the 3x1s HLS versions with demuxed audio works. With one hour DVR HLS (TS) with demuxed audio, sometimes the stream starts, sometimes it doesn’t. With the following two optimizations applied to Shaka, the latency is significantly decreased in HLS (TS) and in DASH with a short DVR window:
- config.streaming.bufferingGoal = 2
- config.streaming.rebufferingGoal = 2
We are currently working with both the hls.js and Shaka developer communities to address the problems found with each player. Overall the results are pretty good, and the players behave in quite a stable way with 1 second segments. However, the bitrate switching algorithms of some players are clearly challenged by such a short segment duration, and there is room for improvement with those players. There is also some optimization work left for fMP4 with almost all players, but it’s a good start considering how new this format is. The video player ecosystem is increasingly sensitive to latency issues, and we see growing development efforts dedicated to properly supporting low latency on all platforms.
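
To see which players benefit from fMP4 in HLS, the sketch below compares the TS and fMP4 figures from the last table, treating "Problems" entries as missing values. The data layout and function names are illustrative assumptions; the numbers are copied from the table.

```python
# player -> layout -> (TS latency, fMP4 latency) in seconds; None stands for "Problems".
HLS_RESULTS = {
    "hls.js 0.8.7": {"3x1s": (5.32, 5.34), "3600x1s": (6.30, None)},
    "Safari mobile (iOS 11.2.2)": {"3x1s": (5.64, 7.50), "3600x1s": (5.34, 9.47)},
    "Exoplayer 2.6.0": {"3x1s": (5.31, 6.25), "3600x1s": (7.49, 7.22)},
    "Shaka 2.3.0": {"3x1s": (None, None), "3600x1s": (10.12, 9.40)},
}


def fmp4_gain(ts_latency, fmp4_latency):
    """Seconds saved by fMP4 compared to TS, or None when either measurement is missing."""
    if ts_latency is None or fmp4_latency is None:
        return None
    return round(ts_latency - fmp4_latency, 2)


def players_faster_with_fmp4(results):
    """List the players for which fMP4 beats TS in at least one layout."""
    faster = []
    for player, layouts in results.items():
        gains = [fmp4_gain(ts, fmp4) for ts, fmp4 in layouts.values()]
        if any(gain is not None and gain > 0 for gain in gains):
            faster.append(player)
    return faster


print(players_faster_with_fmp4(HLS_RESULTS))
```

With these figures, only the one hour DVR measurements show a gain for fMP4, and only for two of the four players.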
Final Words
There is follow-up work that remains to be done, including the HEVC format tests and additional experiments with optimized open source players, which will generate complementary blog posts. But we’d like to highlight that high latency is not inevitable, and that you can efficiently minimize it, with some effort, using “standard” HLS, DASH, and CMAF/fMP4 technologies.
Optimized players enable latency down to 4 seconds with the current HLS and DASH workflows. This makes the quest for low latency scalable from technical and economic standpoints, as these technologies are fully compliant with HTTP/1.1 and HTTP/2; you don’t need to deploy costly UDP-based solutions to make low latency happen. If your business requires less than 4 seconds of latency, chunked CMAF is around the corner. It will bring an additional opportunity to lower the latency by letting the player consume the chunks of a segment while that segment is still being produced on the encoder and the origin: a sort of just-in-time end-to-end workflow that gets us very close to the concept of live TS in the broadcast world, but using HTTP.
As of now, 10 second end-to-end latency is the new standard that lets you address all your connected devices over HLS or DASH. And a stable 5 second latency is already possible today, if your business requires it. So, don’t wait any longer to experiment and break the 30 second barrier!