Amazon Nova Sonic

A speech-to-speech foundation model for conversational AI

What is Amazon Nova Sonic?

Amazon Nova Sonic delivers real-time, human-like voice conversations with leading price performance and low latency. Available in Amazon Bedrock via the bidirectional streaming API, the model understands streaming speech in various speaking styles and generates expressive speech responses that dynamically adapt to the prosody of input speech.

Amazon Nova Sonic supports expressive voices, both masculine-sounding and feminine-sounding, in English, Spanish, French, Italian, and German. The model can be used across a wide range of applications, including customer support call automation, outbound marketing, voice-enabled personal assistants and agents, and interactive education and language learning.

Key capabilities

Handles user interruptions and detects non-verbal cues (e.g., laughter, grunts, inter-sentential pauses, and hesitations) to enable human-like turn-taking in dialogues.

Nova Sonic’s unified architecture enables it to adapt speech responses to the user’s tone and sentiment.

Bidirectional streaming speech I/O with low user-perceived latency.

Accurately recognizes streaming speech across accents with robustness to background noise.

Amazon Nova Sonic supports English (including American and British accents), Spanish, French, Italian, and German.
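
The bidirectional streaming pattern described above can be illustrated with a small, self-contained sketch. This is not the Amazon Bedrock API: `fake_model`, `converse`, the `None` end-of-input sentinel, and the 1024-byte chunk size are all hypothetical stand-ins, chosen only to show how audio can be sent in chunks while responses are received concurrently instead of after the whole utterance has been uploaded.

```python
import asyncio

CHUNK_BYTES = 1024  # assumed chunk size for illustration; not a documented Nova Sonic value


def chunk_audio(pcm: bytes, size: int = CHUNK_BYTES) -> list[bytes]:
    """Split raw audio bytes into fixed-size chunks for streaming."""
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]


async def fake_model(inbound: asyncio.Queue, outbound: asyncio.Queue) -> None:
    """Hypothetical stand-in for the model: answers each chunk as it arrives."""
    while True:
        chunk = await inbound.get()
        if chunk is None:  # end-of-input sentinel (an assumption of this sketch)
            await outbound.put(None)
            return
        await outbound.put(f"response to {len(chunk)} bytes")


async def send_audio(chunks: list[bytes], inbound: asyncio.Queue) -> None:
    """Upstream half: push audio chunks, then signal end of input."""
    for chunk in chunks:
        await inbound.put(chunk)
        await asyncio.sleep(0)  # yield so the receiver can run between sends
    await inbound.put(None)


async def receive_responses(outbound: asyncio.Queue) -> list[str]:
    """Downstream half: collect responses until the stream is closed."""
    responses = []
    while True:
        item = await outbound.get()
        if item is None:
            return responses
        responses.append(item)


async def converse(pcm: bytes) -> list[str]:
    """Run the sender and receiver concurrently over two queues."""
    inbound, outbound = asyncio.Queue(), asyncio.Queue()
    model = asyncio.create_task(fake_model(inbound, outbound))
    sender = asyncio.create_task(send_audio(chunk_audio(pcm), inbound))
    responses = await receive_responses(outbound)
    await asyncio.gather(model, sender)
    return responses


if __name__ == "__main__":
    for line in asyncio.run(converse(b"\x00" * 2500)):
        print(line)
```

Because sending and receiving run as separate tasks, the first response can be handled before the last chunk is sent, which is the property that keeps perceived latency low in a streaming design. A real integration would replace the two queues with the bidirectional streaming connection exposed by Amazon Bedrock.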

Getting started with Amazon Nova Sonic

This video provides a step-by-step tutorial on how to use Amazon Nova Sonic in Amazon Bedrock to build your own voice-enabled bot.