
Overview
The Tabular Data Synthesizer by Synthesized brings the generative AI capabilities of Synthesized's Scientific Data Kit (SDK) to Amazon SageMaker.
Synthesized provides a comprehensive framework for generative modelling of structured data. The SDK helps you create compliant, statistics-preserving data snapshots for BI/Analytics and ML/AI applications.
Highlights
- **Improve data quality** - benefit from an uplift of up to 15% in ML/AI model performance with data rebalancing, data imputation, and high-quality synthetic data generation. The SDK helps increase revenue across conversion, fraud, revenue recovery, and more.
- **Ensure data privacy and data compliance** - codify complex data privacy requirements into concrete data transformations. Ensure compliance when using sensitive data in cloud initiatives. Migrate your data pipelines and workflows to the cloud faster.
- **Key Benefits**:
  - Increase market value of existing data
  - Improve model performance by up to 15%
  - Shorten model time to value from hours/days to minutes
  - Increase developer productivity by 20%+
  - Codified data privacy transformations for compliance
- **Key Features**:
  - Data rebalancing
  - Data snapshots
  - Synthetic data generation
  - Data anonymisation
  - JSON/YAML Configuration
Details
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.m5.2xlarge Inference (Batch) (Recommended) | Model inference on the ml.m5.2xlarge instance type, batch mode | $4.75 |
| ml.m5.2xlarge Inference (Real-Time) (Recommended) | Model inference on the ml.m5.2xlarge instance type, real-time mode | $4.75 |
| ml.m5.2xlarge Training (Recommended) | Algorithm training on the ml.m5.2xlarge instance type | $4.75 |
| ml.c5.2xlarge Inference (Batch) | Model inference on the ml.c5.2xlarge instance type, batch mode | $4.75 |
| ml.p3.2xlarge Inference (Batch) | Model inference on the ml.p3.2xlarge instance type, batch mode | $8.50 |
| ml.c5.xlarge Inference (Batch) | Model inference on the ml.c5.xlarge instance type, batch mode | $4.25 |
| ml.m5.xlarge Inference (Batch) | Model inference on the ml.m5.xlarge instance type, batch mode | $4.25 |
| ml.c5.2xlarge Inference (Real-Time) | Model inference on the ml.c5.2xlarge instance type, real-time mode | $4.75 |
| ml.p3.2xlarge Inference (Real-Time) | Model inference on the ml.p3.2xlarge instance type, real-time mode | $8.50 |
| ml.c5.xlarge Inference (Real-Time) | Model inference on the ml.c5.xlarge instance type, real-time mode | $4.25 |
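As a rough illustration of how the per-host hourly rates in the pricing table combine, the sketch below multiplies a listed rate by the number of hosts and the number of hours. The function name `estimate_cost` and the mode labels are hypothetical, and the estimate covers only the rates listed above; how usage is metered and whether other charges apply are outside what this page states.

```python
# Rates copied from the pricing table above (USD per host per hour).
# The (instance type, mode) keys are an illustrative convention, not an API.
RATES_USD_PER_HOST_HOUR = {
    ("ml.m5.2xlarge", "training"): 4.75,
    ("ml.m5.2xlarge", "batch"): 4.75,
    ("ml.m5.2xlarge", "real-time"): 4.75,
    ("ml.c5.2xlarge", "batch"): 4.75,
    ("ml.p3.2xlarge", "batch"): 8.50,
    ("ml.c5.xlarge", "batch"): 4.25,
    ("ml.m5.xlarge", "batch"): 4.25,
    ("ml.c5.2xlarge", "real-time"): 4.75,
    ("ml.p3.2xlarge", "real-time"): 8.50,
    ("ml.c5.xlarge", "real-time"): 4.25,
}


def estimate_cost(instance_type, mode, hosts, hours):
    """Return listed rate * hosts * hours for a listed instance type and mode."""
    if hosts < 1 or hours < 0:
        raise ValueError("hosts must be >= 1 and hours must be >= 0")
    # A KeyError here means the combination is not in the table above.
    rate = RATES_USD_PER_HOST_HOUR[(instance_type, mode)]
    return rate * hosts * hours


if __name__ == "__main__":
    # Two ml.m5.2xlarge hosts running a batch job for three hours.
    cost = estimate_cost("ml.m5.2xlarge", "batch", hosts=2, hours=3)
    print(f"${cost:.2f}")
```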
Vendor refund policy
Please contact support@synthesized.io.
Delivery details
Amazon SageMaker algorithm
An Amazon SageMaker algorithm is a machine learning model that requires your training data to make predictions. Use the included training algorithm to generate your unique model artifact. Then deploy the model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
Version release notes
This is the first release of the Tabular Synthesizer Algorithm on AWS powered by Synthesized's SDK.
Additional details
Inputs
- Summary: The model input is a JSON config specifying the number of rows of data to generate, along with some optional modifications. See the synthesis section of our documentation for more details on the schema of the generation config.
- Input MIME type: application/json
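A minimal sketch of sending such a JSON config to a deployed real-time endpoint is shown below. It assumes a client object exposing `invoke_endpoint`, such as the one returned by `boto3.client("sagemaker-runtime")`; the helper name `request_synthetic_data` and the endpoint name are hypothetical. The response format is not described on this page, so the sketch returns the raw response bytes without parsing them.

```python
import json


def request_synthetic_data(runtime_client, endpoint_name, num_rows):
    """Send a generation config to an endpoint and return the raw response bytes.

    runtime_client is assumed to behave like boto3.client("sagemaker-runtime").
    """
    # The input is a JSON config; num_rows is the number of rows to generate.
    body = json.dumps({"num_rows": num_rows})
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,      # hypothetical name chosen at deployment
        ContentType="application/json",  # input MIME type listed above
        Body=body,
    )
    # The response body is a stream; its format is not assumed here.
    return response["Body"].read()
```

With boto3 installed and credentials configured, a call might look like `request_synthetic_data(boto3.client("sagemaker-runtime"), "my-synthesizer-endpoint", 1000)`, where the endpoint name is a placeholder.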
Input data descriptions
The following table describes supported input data fields for real-time inference and batch transform.
| Field name | Description | Constraints | Required |
|---|---|---|---|
| num_rows | Integer specifying the number of rows of data to generate. If not supplied, the synthetic data will have the same number of rows as the original. | Default value: number of rows of the original DataFrame. Type: Integer. Minimum: 1. | No |
| produce_nans | Whether to generate synthetic data with missing values in the same proportion as the original data. | Default value: "true". Type: FreeText. Limitations: one of "true" or "false". | No |
| rebalance | A list of dictionaries, each specifying a column name along with a desired output distribution. For example, a 50-50 split of true/false for a column named "fraud_column": `[{"name": "fraud_column", "marginals": {"false": 0.5, "true": 0.5}}]` | Default value: []. Type: FreeText. Limitations: the sum of the marginals must be equal to 1.0. | No |
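The constraints in the table can be checked before a request is sent. The sketch below is one way to do that; the helper name `build_generation_config` is hypothetical. It assumes that `produce_nans` is sent as the string "true" or "false" (the table lists its type as FreeText) and that `rebalance` is sent as a nested JSON list, which this page does not confirm.

```python
import json
import math


def build_generation_config(num_rows=None, produce_nans=None, rebalance=None):
    """Build a JSON generation config, checking the constraints from the table."""
    config = {}
    if num_rows is not None:
        # Type: Integer, Minimum: 1 (bool is rejected even though it subclasses int).
        if isinstance(num_rows, bool) or not isinstance(num_rows, int) or num_rows < 1:
            raise ValueError("num_rows must be an integer >= 1")
        config["num_rows"] = num_rows
    if produce_nans is not None:
        # Assumption: the field takes the strings "true" or "false".
        config["produce_nans"] = "true" if produce_nans else "false"
    if rebalance:
        for entry in rebalance:
            total = sum(entry["marginals"].values())
            # The marginals of each column must sum to 1.0.
            if not math.isclose(total, 1.0, abs_tol=1e-9):
                raise ValueError(
                    f"marginals for {entry['name']!r} sum to {total}, expected 1.0"
                )
        # Assumption: rebalance is sent as a nested JSON list.
        config["rebalance"] = rebalance
    return json.dumps(config)


if __name__ == "__main__":
    print(build_generation_config(
        num_rows=1000,
        produce_nans=False,
        rebalance=[{"name": "fraud_column",
                    "marginals": {"false": 0.5, "true": 0.5}}],
    ))
```

Omitted fields are left out of the config so that the defaults listed in the table apply.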
Support
Vendor support
For any questions, please contact support@synthesized.io or submit a ticket in the Synthesized support portal.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.