Overview
voyage-multimodal-3.5 is a state-of-the-art multimodal embedding model capable of vectorizing not only text, images, and video individually, but also content that interleaves all three modalities. It delivers excellent performance for mixed-modality searches involving text and visual content such as PDF screenshots, figures, tables, videos, and more. Enabled by Matryoshka learning and quantization-aware training, voyage-multimodal-3.5 supports embeddings in 2048, 1024, 512, and 256 dimensions, with multiple quantization options.
Learn more about voyage-multimodal-3.5 here: https://blog.voyageai.com/2026/01/15/voyage-multimodal-3-5
Highlights
- State-of-the-art multimodal embedding model capable of vectorizing not only text, images, and video individually, but also content that interleaves all three modalities. It delivers excellent performance for mixed-modality searches involving text and visual content such as PDF screenshots, figures, tables, videos, and more.
- Supports embeddings of 2048, 1024, 512, and 256 dimensions and offers multiple embedding quantization options: float (32-bit floating point), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8).
- 32K token context length.
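To make the trade-off between dimensions and quantization concrete, the following sketch computes the raw storage size of one embedding for each combination. It relies only on the bit widths listed above; the names `BITS_PER_DIMENSION` and `embedding_bytes` are illustrative and not part of any API.

```python
# Bits used per embedding dimension for each output_dtype listed above.
# binary and ubinary are bit-packed, so 8 dimensions share one byte.
BITS_PER_DIMENSION = {
    "float": 32,
    "int8": 8,
    "uint8": 8,
    "binary": 1,
    "ubinary": 1,
}

SUPPORTED_DIMENSIONS = (2048, 1024, 512, 256)


def embedding_bytes(dimension: int, dtype: str) -> int:
    """Return the raw storage size in bytes of a single embedding."""
    if dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    if dtype not in BITS_PER_DIMENSION:
        raise ValueError(f"unsupported dtype: {dtype}")
    return dimension * BITS_PER_DIMENSION[dtype] // 8


if __name__ == "__main__":
    for dim in SUPPORTED_DIMENSIONS:
        sizes = {dtype: embedding_bytes(dim, dtype) for dtype in BITS_PER_DIMENSION}
        print(dim, sizes)
```

For example, a 1024-dimension float embedding takes 4,096 bytes, while the same dimension in binary takes 128 bytes.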
Pricing
Free trial
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g5.2xlarge Inference (Batch) (Recommended) | Model inference on the ml.g5.2xlarge instance type, batch mode | $3.03 |
| ml.g6.2xlarge Inference (Real-Time) (Recommended) | Model inference on the ml.g6.2xlarge instance type, real-time mode | $2.44 |
| ml.g5.xlarge Inference (Real-Time) | Model inference on the ml.g5.xlarge instance type, real-time mode | $2.82 |
| ml.g5.2xlarge Inference (Real-Time) | Model inference on the ml.g5.2xlarge instance type, real-time mode | $3.03 |
| ml.g5.4xlarge Inference (Real-Time) | Model inference on the ml.g5.4xlarge instance type, real-time mode | $4.06 |
| ml.g5.8xlarge Inference (Real-Time) | Model inference on the ml.g5.8xlarge instance type, real-time mode | $6.12 |
| ml.g6.xlarge Inference (Real-Time) | Model inference on the ml.g6.xlarge instance type, real-time mode | $2.25 |
| ml.g6.4xlarge Inference (Real-Time) | Model inference on the ml.g6.4xlarge instance type, real-time mode | $3.31 |
| ml.g6.8xlarge Inference (Real-Time) | Model inference on the ml.g6.8xlarge instance type, real-time mode | $5.04 |
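As a rough budgeting aid, the sketch below multiplies a listed hourly rate by the number of hosts and hours. The rates are copied from the pricing table; the function name `estimate_cost` is illustrative, and the estimate covers only the listed rate. Whether separate infrastructure charges apply is not stated on this page, so treat the result as a lower bound.

```python
# Listed rates in USD per host per hour, copied from the pricing table.
HOURLY_RATE_USD = {
    ("ml.g5.2xlarge", "batch"): 3.03,
    ("ml.g6.2xlarge", "real-time"): 2.44,
    ("ml.g5.xlarge", "real-time"): 2.82,
    ("ml.g5.2xlarge", "real-time"): 3.03,
    ("ml.g5.4xlarge", "real-time"): 4.06,
    ("ml.g5.8xlarge", "real-time"): 6.12,
    ("ml.g6.xlarge", "real-time"): 2.25,
    ("ml.g6.4xlarge", "real-time"): 3.31,
    ("ml.g6.8xlarge", "real-time"): 5.04,
}


def estimate_cost(instance_type: str, mode: str, hosts: int, hours: float) -> float:
    """Return listed rate x hosts x hours, rounded to cents."""
    rate = HOURLY_RATE_USD[(instance_type, mode)]
    return round(rate * hosts * hours, 2)


if __name__ == "__main__":
    # One real-time host running for a full day.
    print(estimate_cost("ml.g6.2xlarge", "real-time", hosts=1, hours=24))
```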
Vendor refund policy
Refunds are processed under the conditions specified in the EULA. Please contact aws-marketplace@mongodb.com for further assistance.
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
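Once the model package has been deployed to a real-time endpoint, a request could be sent along the lines of the sketch below. It uses the `invoke_endpoint` call of boto3's `sagemaker-runtime` client. The endpoint name `voyage-multimodal-endpoint` and the helper `invoke_embedding_endpoint` are hypothetical, and parsing the response body as JSON is an assumption, since the response schema is not described on this page.

```python
import json


def invoke_embedding_endpoint(endpoint_name: str, payload: dict, client=None):
    """Send a JSON payload to a SageMaker real-time endpoint and parse the reply.

    `client` can be injected for testing; by default a boto3
    sagemaker-runtime client is created.
    """
    if client is None:
        import boto3  # imported lazily so this module loads without AWS dependencies

        client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # the input MIME type listed for this model
        Accept="application/json",
        Body=json.dumps(payload),
    )
    # Assumption: the endpoint replies with a JSON document.
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    # Hypothetical endpoint name; requires AWS credentials and a deployed endpoint.
    request = {"inputs": [{"content": [{"type": "text", "text": "hello"}]}]}
    print(invoke_embedding_endpoint("voyage-multimodal-endpoint", request))
```

Passing a stub `client` makes it possible to exercise the request construction without AWS access.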
Version release notes
MongoDB is excited to announce the initial release of voyage-multimodal-3.5.
Additional details
Inputs
- Summary
  - inputs (List[dict]) – A list of multimodal inputs. Each input contains a content list of dictionaries with the following keys:
    - type (string): text, image_base64, or video_base64.
    - text (string): Text string (required if type is text).
    - image_base64 / video_base64 (string): Data URL format (e.g., data:image/jpeg;base64,...).
  - input_type (string, optional, default = null) – The role of the input: query, document, or null.
  - truncation (bool, optional, default = true) – Whether to truncate inputs to fit context limits.
  - output_encoding (string, optional, default = null) – Format of the embeddings: null (list of floats) or base64.
  - output_dimension (int, optional, default = 1024) – Supported dimensions: 2048, 1024, 512, 256.
  - output_dtype (string, optional, default = "float") – Data type for embeddings: float, int8, uint8, binary, or ubinary.
  - id (string, optional, default = null) – Batch request ID.
- Limitations for input type
  - Maximum inputs: 1,000 per request.
  - Per-input limit: 32,000 tokens.
  - Total request limit: 320,000 tokens across all inputs.
  - Image/video constraints:
    - Size: max 20 MB per file.
    - Image resolution: max 16 million pixels.
    - Token conversion: 560 pixels = 1 token (images); 1,120 pixels = 1 token (video).
- Input MIME type
  - application/json
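The request fields described above can be assembled into an application/json body as in the following sketch. The field names and defaults come from the list above; the helper names (`text_item`, `image_item`, `build_request`) are illustrative and not part of any SDK, and the sample image bytes are a placeholder rather than a real image.

```python
import base64
import json


def text_item(text: str) -> dict:
    """Content entry for a piece of text."""
    return {"type": "text", "text": text}


def image_item(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Content entry for an image, encoded as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_base64", "image_base64": f"data:{mime_type};base64,{encoded}"}


def build_request(
    inputs: list,
    input_type=None,
    truncation: bool = True,
    output_dimension: int = 1024,
    output_dtype: str = "float",
) -> dict:
    """Wrap each content list in an input and attach the optional fields."""
    request = {
        "inputs": [{"content": content} for content in inputs],
        "truncation": truncation,
        "output_dimension": output_dimension,
        "output_dtype": output_dtype,
    }
    if input_type is not None:  # null is the documented default, so omit it
        request["input_type"] = input_type
    return request


if __name__ == "__main__":
    # Placeholder bytes stand in for a real JPEG file's contents.
    content = [text_item("A chart of quarterly revenue"), image_item(b"\xff\xd8")]
    body = json.dumps(build_request([content], input_type="document", output_dimension=512))
    print(body)
```

Each element of `inputs` is one content list, so a single input can interleave text and images in order.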
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| inputs | A list of multimodal inputs. Each input contains a content list of dictionaries with the following keys: type (string): text, image_base64, or video_base64; text (string): text string (required if type is text); image_base64 / video_base64 (string): data URL format (e.g., data:image/jpeg;base64,...). | Maximum inputs: 1,000 per request. Per-input limit: 32,000 tokens. Total request limit: 320,000 tokens across all inputs. Image/video constraints: size max 20 MB per file; image resolution max 16 million pixels; token conversion 560 pixels = 1 token (images), 1,120 pixels = 1 token (video). | Yes |
| input_type | The role of the input: query, document, or null. | Default value: null. Type: string | No |
| truncation | Whether to truncate inputs to fit context limits. | Default value: true. Type: boolean | No |
| output_encoding | Format of the embeddings: null (list of floats) or base64. | Default value: null. Type: string | No |
| output_dimension | Supported dimensions: 2048, 1024, 512, 256. | Default value: 1024. Type: int | No |
| output_dtype | Data type for embeddings: float, int8, uint8, binary, or ubinary. | Default value: "float". Type: string | No |
| id | Batch request ID. | Default value: null. Type: string | No |
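The limits in the table can be checked before a request is sent. The sketch below estimates visual token counts from the stated pixel-to-token ratios and flags requests that exceed the input and token limits. Rounding up to a whole token is an assumption, as is how pixels are counted for a video (the page does not say), and the caller is expected to supply text token counts from a tokenizer of their choice. All names are illustrative.

```python
import math

MAX_INPUTS_PER_REQUEST = 1_000
MAX_TOKENS_PER_INPUT = 32_000
MAX_TOKENS_PER_REQUEST = 320_000
MAX_IMAGE_PIXELS = 16_000_000
PIXELS_PER_TOKEN = {"image": 560, "video": 1_120}


def visual_tokens(pixels: int, kind: str = "image") -> int:
    """Estimate tokens for a pixel count (assumption: partial tokens round up)."""
    return math.ceil(pixels / PIXELS_PER_TOKEN[kind])


def check_request(tokens_per_input: list[int]) -> list[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(tokens_per_input) > MAX_INPUTS_PER_REQUEST:
        problems.append(f"{len(tokens_per_input)} inputs exceeds {MAX_INPUTS_PER_REQUEST}")
    for index, tokens in enumerate(tokens_per_input):
        if tokens > MAX_TOKENS_PER_INPUT:
            problems.append(f"input {index}: {tokens} tokens exceeds {MAX_TOKENS_PER_INPUT}")
    total = sum(tokens_per_input)
    if total > MAX_TOKENS_PER_REQUEST:
        problems.append(f"total of {total} tokens exceeds {MAX_TOKENS_PER_REQUEST}")
    return problems


if __name__ == "__main__":
    width, height = 1000, 1000
    assert width * height <= MAX_IMAGE_PIXELS
    image_cost = visual_tokens(width * height)
    # One input made of a 1000x1000 image plus an assumed 200 text tokens.
    print(image_cost, check_request([image_cost + 200]))
```

Under these assumptions, an image at the 16-million-pixel maximum comes to about 28,572 tokens, which leaves limited room for accompanying text within the 32,000-token per-input limit.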
Support
Vendor support
Please email us at aws-marketplace@mongodb.com for inquiries and customer support.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.