AWS for M&E Blog

Back to basics: Accessibility services for Media

In today’s world, accessibility isn’t just a feature; it transforms how many people consume content. While most are familiar with subtitles, accessibility services represent a broader set of solutions for television viewers of all abilities. At Amazon Web Services (AWS), we’re focused on making media accessible to everyone, and today we’ll explore the technologies that enable this.

Television, long the dominant source of entertainment, news and more in our lives, was one of the first media to focus on accessibility. Accessibility is the practice of making information, activities and environments sensible, meaningful and usable for as many people as possible.

Television accessibility services support the seven percent of the EU population who have sight impairments, with one in five people typically experiencing sight loss in their lifetime. Similarly, it is estimated that around 20 percent of the population have some form of hearing loss or deafness. Notably, a recent study found that younger generations without any hearing loss are four times more likely to consume content with subtitles enabled than older generations.

This is a two-part blog: the first part looks at accessibility standards for television (TV) services, and the second part looks at how this accessibility functionality can be supported for content delivered over the internet by AWS native services.

The role of captions and subtitles in accessibility

The accessibility feature that displays text on TV screens to represent dialogue and additional audio information exists in two forms: captions and subtitles. Technically, captions provide the full audio experience for deaf and hard of hearing viewers, while subtitles mainly focus on displaying or translating spoken dialogue to text. The ability for the viewer to enable the display of captions technically defines them as closed captions, but in some regions (such as the United Kingdom (UK)) they are also colloquially called subtitles.

A video frame with subtitle functionality enabled. The image is from AWS Elemental MediaLive, and the subtitles have started to display at the bottom of the screen: "AWS elemental media live is a."

Figure 1: Captions – English subtitles.

MPEG transport streams (MPEG TS) are a type of delivery container commonly used for broadcast of digital television and also Internet Protocol television (IPTV) distribution. The addition of captions and subtitles in transport streams empowers viewers who are deaf, hard of hearing, or have other audio limitations to read the dialogue and other environmental audio cues.

The most common formats for captions and subtitles in transport streams are:

  • Burn-in captions: Captions that are rendered as text directly into the picture in the video stream, so they are always visible and cannot be turned off.
  • EIA-608 (later CEA-608) captions: The standard format for closed captions carried in line 21 of the vertical blanking interval (VBI) alongside the video signal. This is common for standard-definition (SD) analog and digital TV.
  • EIA-708 (later CEA-708) captions: An updated digital captioning system for Advanced Television Systems Committee (ATSC) digital TV. These are inserted as digital data within the video transport stream. EIA-708 is more flexible, supporting more caption languages and styles.
  • SCTE-27 subtitles: A method of carrying bitmap-based subtitles as data packets in MPEG-2 transport streams, commonly used in digital cable systems.

Other formats for accessibility exist; for example, in Europe, Digital Video Broadcasting (DVB) Teletext and DVB picture/bitmap-based subtitles.

  • DVB Teletext is the standard for carrying interactive teletext services in DVB transport streams alongside the video program. Like traditional Teletext, it displays pages of text information that can be accessed by using a remote control to key in a page number. Teletext is useful for viewers to get news, sports updates, program guides and more. Carrying it digitally in transport streams retains this accessibility service in the digital age.

Figure 2: DVB Teletext page with Program guide.

  • DVB Teletext subtitles carry subtitles on a specific teletext magazine and page. A viewer can then quickly access this page directly by using their remote-control keys to select the right magazine page (such as 888, 777, and so on). Program-specific information (PSI) includes a teletext descriptor in the program map table (PMT) of the transport stream to announce the presence and format of the captions being carried (see Figure 3).

Figure 3: Example of how DVB Teletext is described in MPEG TS in PMT table.

If the current program carries DVB Teletext subtitles, the viewer will be able to see the text of the audio transcription overlaid over the video.
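To make this signaling concrete, the teletext descriptor in the PMT lists, for each teletext stream, a language code, a teletext type, and the magazine and page number where the subtitles live. The following is a minimal sketch (not production code) of parsing such a descriptor payload; the 5-byte entry layout follows the DVB service information specification (ETSI EN 300 468), and the sample payload is a hypothetical English subtitle service on page 888:

```python
# Minimal sketch: parse a DVB teletext_descriptor payload (tag 0x56, ETSI EN 300 468).
# Each entry is 5 bytes:
#   3 bytes ISO 639 language code,
#   5 bits teletext_type + 3 bits magazine number,
#   8 bits page number (BCD coded).

TELETEXT_TYPES = {
    0x01: "initial page",
    0x02: "subtitle page",
    0x03: "additional information page",
    0x04: "programme schedule page",
    0x05: "subtitle page for hearing impaired",
}

def parse_teletext_descriptor(payload: bytes):
    entries = []
    for i in range(0, len(payload), 5):
        lang = payload[i:i + 3].decode("ascii")
        ttype = payload[i + 3] >> 3
        magazine = payload[i + 3] & 0x07
        if magazine == 0:          # magazine number 0 means magazine 8 in the spec
            magazine = 8
        page_bcd = payload[i + 4]
        page = magazine * 100 + (page_bcd >> 4) * 10 + (page_bcd & 0x0F)
        entries.append((lang, TELETEXT_TYPES.get(ttype, "reserved"), page))
    return entries

# Hypothetical payload: English subtitles on page 888
# (magazine 8 encoded as 0, BCD page number 0x88).
payload = b"eng" + bytes([(0x02 << 3) | 0x00, 0x88])
print(parse_teletext_descriptor(payload))
# -> [('eng', 'subtitle page', 888)]
```

A receiver reading this descriptor knows to offer "subtitles on page 888" without scanning the teletext stream itself.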

Single frame from Tears of Steel with Teletext subtitle showing at the bottom of the screen. The frame shows 3 actors with one using a communication device. The Teletext subtitles read, "Well that's perfect."

Figure 4: From Tears of Steel: DVB Teletext subtitle over video – (CC) Blender Foundation | mango.blender.org

  • DVB bitmap subtitles are rendered as graphic objects, and allow the use of higher resolution fonts which can be proportionately spaced and placed over a black background to enhance legibility. Using this approach, bitmap subtitles for multiple languages can be embedded into a single transport stream as separate packet identifiers (PIDs). This allows viewers to enable subtitles in their selected language for dialogue and audio cues. The DVB subtitling specification, like DVB Teletext subtitles, supports a range of worldwide languages, with styling capabilities including text color and positioning.
Single frame from Tears of Steel with DVB Bitmap subtitle showing at the bottom of the screen. The frame shows 3 actors with one using a communication device. The DVB Bitmap subtitles read, "Well that's perfect."

Figure 5: DVB Bitmap subtitle using the font Tiresias – (CC) Blender Foundation | mango.blender.org

DVB bitmap subtitles are created using fonts selected by the broadcaster and are rendered as bitmaps by the decoder in a layer on top of the video. Font selection for legibility is an important consideration.

DVB bitmap subtitles can also be used to carry audio cues to add context (see Figure 6).

Image shows additional functionality that can be transmitted in DVB subtitles such as audio cues like clang, chatter, and phone alert.

Figure 6: DVB Bitmap subtitle for an audio cue.

The subtitling descriptor is a critical part of the PSI/PMT that allows receivers to recognize that the program carries accessibility data (see Figure 7).


Figure 7: Example of how DVB subtitles are described in MPEG TS in PMT table.
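For illustration, a minimal sketch of reading DVB subtitling descriptor entries might look like the following. The 8-byte entry layout (language code, subtitling type, composition and ancillary page IDs) follows ETSI EN 300 468; the sample payload is hypothetical:

```python
# Minimal sketch: parse a DVB subtitling_descriptor payload (tag 0x59, ETSI EN 300 468).
# Each entry is 8 bytes: 3-byte language code, 1-byte subtitling type,
# 2-byte composition page id, 2-byte ancillary page id (big-endian).

import struct

def parse_subtitling_descriptor(payload: bytes):
    entries = []
    for i in range(0, len(payload), 8):
        lang = payload[i:i + 3].decode("ascii")
        sub_type, comp_page, anc_page = struct.unpack_from(">BHH", payload, i + 3)
        entries.append({
            "language": lang,
            "type": sub_type,            # e.g. 0x10 = DVB subtitles (normal)
            "composition_page": comp_page,
            "ancillary_page": anc_page,
        })
    return entries

# Hypothetical payload: one English bitmap subtitle stream on page 1
payload = b"eng" + struct.pack(">BHH", 0x10, 1, 1)
print(parse_subtitling_descriptor(payload))
```

Because each language appears as its own entry (and its own PID), the receiver can present a language menu built purely from this descriptor.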

 

Audio description’s role in accessibility

Audio description is for those with visual impairments, providing additional context and relevant descriptions during natural pauses in the program content. Content sources with audio description may include:

  • Descriptive audio: An additional narration track intended for blind and low vision viewers to understand key visual details that are important for following the program’s plot or action.
  • Clean audio: A separate audio track with enhanced dialogue and reduced background noise or music. This improves clarity for those with limited hearing.
  • Audio description metadata: Data embedded in the stream to indicate the presence of descriptive audio and signal when it should be played based on program time codes.

Encoders can take these and produce a broadcast mix audio description track (also known as a pre-mixed audio description). This is where the main program audio is mixed with the descriptive audio, with the audio levels of the main program content adjusted, if required, to accommodate the audio description. Figure 8 shows how the transport stream indicates it has audio accessibility data.


Figure 8: Example of how audio description is described in MPEG TS PSI/PMT.
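One common way such an audio track is signaled is the ISO 639 language descriptor from MPEG-2 systems (ISO/IEC 13818-1), whose audio_type field can mark a track as "visual impaired commentary". A minimal, hypothetical sketch of reading it:

```python
# Minimal sketch: parse the ISO_639_language_descriptor (tag 0x0A, ISO/IEC 13818-1).
# Each 4-byte entry: 3-byte language code + 1-byte audio_type,
# where audio_type 0x03 marks an audio description track.

AUDIO_TYPES = {
    0x00: "undefined",
    0x01: "clean effects",
    0x02: "hearing impaired",
    0x03: "visual impaired commentary",
}

def parse_iso639_descriptor(payload: bytes):
    return [
        (payload[i:i + 3].decode("ascii"), AUDIO_TYPES.get(payload[i + 3], "reserved"))
        for i in range(0, len(payload), 4)
    ]

# Hypothetical payload: a main English track plus an English AD track
payload = b"eng\x00" + b"eng\x03"
print(parse_iso639_descriptor(payload))
# -> [('eng', 'undefined'), ('eng', 'visual impaired commentary')]
```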

The following screenshot shows an electronic program guide, including indication that the program content has audio description (AD), subtitles (S) and sign language (SL). The same signaling can also be made available through an over-the-top (OTT) stream.


Figure 9: Example of how AD indicators are shown in a TV menu.

 

Over The Top (OTT) era

In recent years, there has been a shift in TV delivery from broadcast over RF (cable, satellite and terrestrial) to consumption over IP (IPTV and OTT). You can learn more about this by reading Back to basics: HTTP video streaming. While broadcast TV offers accessibility services such as closed caption subtitling and audio description, IP-delivered services have typically provided video and audio services only.

European and UK regulations are driving an increase in the proportion of programs with accessibility services, including in-video signing for sign language consumers. At the same time, there is also a drive to move distribution from broadcast to IP.

Broadcasters are looking to match today’s accessibility experience in their new IP offerings, since viewers expect at least the same level of functionality that exists currently in broadcast TV.

Accessibility features therefore are essential for a migration to new OTT distribution platforms.

The image shows a person holding a remote in front of a TV displaying a menu of audio options.

Figure 10: Example of how accessibility options can be configured in a TV menu.

 

Captions in OTT

Captions remain a vital accessibility feature for OTT video, allowing people who are deaf or hard of hearing to access dialogue and other audio content.

Major OTT platforms and groups of national broadcasters have adopted formats like Timed Text Markup Language (TTML) and its profiles SMPTE Timed Text (SMPTE-TT), Internet Media Subtitles and Captions (IMSC), and EBU-TT-D (a TTML-based profile widely adopted in Europe), as well as Web Video Text Tracks (WebVTT), to deliver captions. These formats build on TV captioning standards, but adapt them for modern streaming.

Specifically, TTML and WebVTT captions are sidecar text files that synchronize captions to video time codes. They support styling like text color, position, and font style and size. On the player side, modern browsers and video SDKs offer full support to decode caption files and overlay the text on video during playback. This provides a seamless closed caption viewing experience.
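As an illustration of the sidecar model, a WebVTT file is just plain text: a WEBVTT header followed by timestamped cues. The sketch below, a simplified hypothetical helper, generates such a file of the kind a player would fetch alongside the video:

```python
# Minimal sketch: generate a WebVTT sidecar caption file.
# Cue timings use HH:MM:SS.mmm with a "-->" separator, per the WebVTT format.

def make_webvtt(cues):
    """cues: list of (start_seconds, end_seconds, text) tuples."""
    def ts(seconds):
        h, rem = divmod(seconds, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

    lines = ["WEBVTT", ""]            # mandatory file header, then a blank line
    for start, end, text in cues:
        lines.append(f"{ts(start)} --> {ts(end)}")
        lines.append(text)
        lines.append("")              # blank line terminates each cue
    return "\n".join(lines)

# Hypothetical captions for the opening seconds of a program
vtt = make_webvtt([
    (1.0, 4.0, "Well that's perfect."),
    (4.5, 7.0, "[phone alert]"),
])
print(vtt)
```

A player is then pointed at this file (for example via an HTML `<track>` element) and overlays each cue between its start and end timestamps.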

Example of captions showing at the top of a video. The image is of a speaker on a stage giving a presentation in their native language of English. Copy is displayed on a screen behind them stating: Amazon Transcribe, Automatic speech recognition, Available in preview today. Across the top of the image is closed captioning in German which reads: "Menschen ändern möchten. So bin ich aufgeregt, einen neuen Dienst."

Figure 11: Example of captions showing at the top of a video.

 

Audio accessibility in OTT

OTT platforms have various additional opportunities to increase accessibility for viewers with visual and hearing impairments. Some best practices include:

  • Control over the audio levels when mixing the audio description narration into main program audio.
  • Adding metadata like audio description indicators and volume normalization tags. This helps compatible devices know when and how to process special tracks.
  • Building custom audio settings like dialogue enhancers, left and right audio balancing and integrated volume controls into the video player interface.
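As a simplified illustration of the first point, mixing an audio description narration into the main program typically involves "ducking": attenuating the main audio while narration is active. The following is a hypothetical sample-level sketch, not a production mixer:

```python
# Hypothetical sketch of "ducking" the main program audio while an
# audio description (AD) narration segment is active, then summing the two.

def mix_with_ducking(main, narration, active, duck_gain=0.3):
    """main, narration: equal-length lists of samples in [-1.0, 1.0];
    active: list of booleans marking where narration is present."""
    mixed = []
    for m, n, is_active in zip(main, narration, active):
        gain = duck_gain if is_active else 1.0       # attenuate main under narration
        mixed.append(max(-1.0, min(1.0, m * gain + n)))  # clip to the valid range
    return mixed

main = [0.5, 0.5, 0.5, 0.5]
narration = [0.0, 0.4, 0.4, 0.0]
active = [False, True, True, False]
print(mix_with_ducking(main, narration, active))
```

In practice the "active" regions come from the audio description metadata described earlier, and the gain change is ramped rather than switched to avoid audible pumping.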

As OTT consumption grows, including accessibility services ensures content is available to the widest audience. If widely adopted, access features like descriptive audio tracks have great potential to expand accessibility.

Conclusion

In this first part of the series, we discussed the role that accessibility plays within society.

We reviewed the different accessibility services for broadcast and internet delivery, such as DVB Teletext, DVB bitmap subtitles and audio description. We also touched on the different standards and protocols used to bring this functionality to viewers, over broadcast RF, IP, and over-the-top platforms.

In the second part of the series, we will look at how these OTT accessibility requirements can be met using AWS Elemental Media services.

Contact an AWS Representative to learn more about how we can help accelerate your business.


Roman Chekmazov


Sr. Solution Architect for AWS Elemental

Ben Formesyn


Ben Formesyn is a Senior Specialist Solution Architect, Media Service and Edge at AWS. Ben has 20 years of experience in Broadcast and Content Delivery.