
Overview
MARS6 by CAMB.AI is a multilingual text-to-speech (TTS) model built for efficient, real-time, context-aware speech synthesis. With only 80 million parameters, MARS6 produces lifelike, nuanced speech across languages, a level of quality previously possible only with larger models. Its autoregressive architecture provides expressive control over tone, speed, and emotion, enabling applications such as dynamic customer service and media localization. MARS6 is designed to be ethical and scalable, offering a low-bitrate solution (~1.7 kbps) that respects data integrity. Its advanced voice cloning features support highly personalized, native-sounding voice replication for virtual assistants and media dubbing. Designed for AWS customers, MARS6 aims to improve engagement, accessibility, and global reach in generative voice technology.
Highlights
- MARS6 by CAMB.AI delivers real-time, multilingual speech synthesis with remarkable realism using just 80 million parameters.
- It offers expressive control over tone, speed, and emotion, ideal for dynamic customer interactions and localization.
- MARS6 is both ethical and scalable, with efficient bandwidth (~1.7 kbps) and secure voice watermarking.
- With advanced voice cloning, it enables personalized, native-like voice replication for virtual assistants and media.
- Designed for AWS, MARS6 boosts engagement, accessibility, and global scalability in generative voice tech.
Details
Pricing
| Dimension | Description | Cost |
|---|---|---|
| ml.p3.2xlarge Inference (Batch), recommended | Model inference on the ml.p3.2xlarge instance type, batch mode | $30.00/host/hour |
| inference.count.m.i.c Inference Pricing | inference.count.m.i.c Inference Pricing | $0.0001/request |
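To make the two pricing dimensions concrete, here is a minimal sketch of the arithmetic. The function name and its parameters are hypothetical. The table does not say which inference mode the per-request price applies to, so the sketch simply adds both terms and leaves it to the caller to pass zero for a dimension that does not apply. AWS infrastructure charges are not included.

```python
# Prices copied from the pricing table above.
HOST_HOURLY_USD = 30.00    # ml.p3.2xlarge, batch mode, per host per hour
PER_REQUEST_USD = 0.0001   # inference.count.m.i.c, per request


def estimate_software_cost(hosts: int, hours: float, requests: int) -> float:
    """Rough software-cost estimate in USD.

    Assumption: the two dimensions are independent and additive.
    Pass 0 for whichever dimension does not apply to your usage.
    """
    if hosts < 0 or hours < 0 or requests < 0:
        raise ValueError("hosts, hours, and requests must be non-negative")
    total = hosts * hours * HOST_HOURLY_USD + requests * PER_REQUEST_USD
    return round(total, 4)


if __name__ == "__main__":
    # One host running for two hours, plus 10,000 billed requests.
    print(estimate_software_cost(hosts=1, hours=2, requests=10_000))
```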
Vendor refund policy
Email help@camb.ai for support
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
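As an illustration of creating a model from a model package, the sketch below builds the request for the boto3 SageMaker `create_model` call. The helper names, the model name, and both ARNs are placeholders, not values from this listing. Enabling network isolation is an assumption based on what Marketplace model packages commonly require.

```python
def build_create_model_request(model_name: str,
                               model_package_arn: str,
                               role_arn: str) -> dict:
    """Build keyword arguments for the SageMaker create_model call."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        # Reference the subscribed model package instead of a container image.
        "PrimaryContainer": {"ModelPackageName": model_package_arn},
        # Assumption: commonly required for Marketplace model packages.
        "EnableNetworkIsolation": True,
    }


def create_model(sm_client, model_name: str,
                 model_package_arn: str, role_arn: str) -> dict:
    """Create the model using a boto3 SageMaker client passed in by the caller."""
    request = build_create_model_request(model_name, model_package_arn, role_arn)
    return sm_client.create_model(**request)
```

With real credentials, `sm_client` would be `boto3.client("sagemaker")`. The resulting model can then back a real-time endpoint or a batch transform job.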
Version release notes
Reverted to plain request payloads; the Converse API format is no longer used.
Additional details
Inputs
- Summary
  - "text": Text to be spoken, up to 250 characters.
  - "max_tokens": Maximum output tokens, default is unlimited.
  - "temperature" and "top_p": Sampling parameters for randomness; defaults are 0.5 and 0.1, respectively.
  - "language": Language code, e.g., "en-us".
  - "audio_ref": Voice identifier.
  - Additional optional controls include emotion intensity ("emo_intensity_multiplier"), PAD model values ("arousal", "valence", "dominance"), and "deep_clone_mode".
- Input MIME type
  - application/json
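The input summary above can be sketched as a real-time request sent with the boto3 SageMaker Runtime `invoke_endpoint` call. The `synthesize` helper and the endpoint name are hypothetical. This page does not describe the response format, so the sketch returns the raw response bytes without interpreting them.

```python
import json


def synthesize(runtime_client, endpoint_name: str, text: str,
               language: str, audio_ref: str, **optional) -> bytes:
    """Send one TTS request as application/json and return the raw response body.

    runtime_client is expected to behave like boto3.client("sagemaker-runtime").
    Extra keyword arguments (e.g. temperature, top_p) are passed through as-is.
    """
    payload = {"text": text, "language": language, "audio_ref": audio_ref}
    payload.update(optional)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,          # placeholder: your deployed endpoint
        ContentType="application/json",      # input MIME type from the summary
        Body=json.dumps(payload),
    )
    # The response format is not documented here, so return the bytes untouched.
    return response["Body"].read()
```

A call might look like `synthesize(boto3.client("sagemaker-runtime"), "my-mars6-endpoint", "Hello there.", "en-us", "voice-reference-en-male-normal1")`, where the endpoint name is an assumption.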
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| text | The text content that the TTS model should synthesize into speech, up to 200 characters. | Type: FreeText | Yes |
| language | Language code specifying the language in which the text should be spoken, such as 'en-us' for English or 'ja-jp' for Japanese. | Type: FreeText | Yes |
| audio_ref | Voice identity reference that determines the specific voice characteristics used for speech synthesis. Current choices: voice-reference-en-female-excited1, voice-reference-en-male-caster1, voice-reference-jp-female1, voice-reference-en-female-highpitch1, voice-reference-en-male-normal1, voice-reference-jp-female2, voice-reference-en-female-normal1, voice-reference-en-male-normal2, voice-reference-en-female-uk1, voice-reference-en-male-slow1 | Type: FreeText | Yes |
| max_tokens | Maximum number of output tokens for the speech, defining the length limit of generated speech. | Default value: 1000; Type: Integer | No |
| temperature | Sampling temperature for controlling the diversity of output. Lower values produce more conservative outputs. | Default value: 0.3; Type: Continuous | No |
| top_p | Sampling probability for narrowing the output scope to the top percentage of choices. | Default value: 0.05; Type: Continuous | No |
| rep_presense_penalty | Penalty factor for repetition to encourage more varied output. | Default value: 0.16; Type: Continuous | No |
| sil_trim_db | dB level for trimming silence at the beginning and end of audio output. | Default value: 34; Type: Continuous | No |
| min_words_per_chunk | Minimum number of words to include in each chunk of the generated speech for smoother transitions. | Default value: 16; Type: Integer | No |
| emo_intensity_multiplier | Multiplier to control the intensity of the expressed emotion in speech output. | Default value: 1.0; Type: Continuous | No |
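The fields in the table can be checked on the client side before a request is sent. The following is a minimal sketch, not the vendor's validation logic: `build_payload` is a hypothetical helper, the defaults and voice names are copied from the table, and the 200-character limit is the one stated in the table (exposed as a parameter in case a different limit applies). Unknown optional keys are passed through unchanged.

```python
# Voice identifiers listed in the audio_ref row of the table.
KNOWN_VOICES = {
    "voice-reference-en-female-excited1", "voice-reference-en-male-caster1",
    "voice-reference-jp-female1", "voice-reference-en-female-highpitch1",
    "voice-reference-en-male-normal1", "voice-reference-jp-female2",
    "voice-reference-en-female-normal1", "voice-reference-en-male-normal2",
    "voice-reference-en-female-uk1", "voice-reference-en-male-slow1",
}

# Default values from the table's Constraints column.
TABLE_DEFAULTS = {
    "max_tokens": 1000,
    "temperature": 0.3,
    "top_p": 0.05,
    "rep_presense_penalty": 0.16,
    "sil_trim_db": 34,
    "min_words_per_chunk": 16,
    "emo_intensity_multiplier": 1.0,
}

INTEGER_FIELDS = ("max_tokens", "min_words_per_chunk")


def build_payload(text: str, language: str, audio_ref: str,
                  max_chars: int = 200, **overrides) -> dict:
    """Validate the required fields and merge overrides onto the table defaults."""
    if not text or len(text) > max_chars:
        raise ValueError(f"text must be 1 to {max_chars} characters long")
    if not language:
        raise ValueError("language is required, e.g. 'en-us'")
    if audio_ref not in KNOWN_VOICES:
        raise ValueError(f"unknown audio_ref: {audio_ref!r}")

    payload = {"text": text, "language": language, "audio_ref": audio_ref}
    payload.update(TABLE_DEFAULTS)
    payload.update(overrides)  # caller-supplied values win over defaults

    for name in INTEGER_FIELDS:
        value = payload[name]
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer")
    return payload
```

Sending the defaults explicitly is a design choice for this sketch: it makes each request self-describing, at the cost of a slightly larger JSON body.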
Support
Vendor support
Email help@camb.ai with the subject line "[MARS6 on AWS] Help Needed".
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.