AWS Partner Network (APN) Blog

Zilliz Cloud Enterprise Vector Search Powers High-Performance AI on AWS

By: Jiang Chen, Chief Solution Architect – Zilliz
By: Fei Lang, Principal Partner Solution Architect – AWS
By: Yuantao Zhang, Senior Partner Solution Architect – AWS

As organizations implement retrieval-augmented generation (RAG) architectures and multimodal artificial intelligence (AI) applications, high-performance vector search has become a critical requirement. Enterprises building on Amazon Bedrock and other foundation model services need scalable, robust vector search infrastructure to match their AI ambitions. Zilliz Cloud on Amazon Web Services (AWS) delivers a fully managed vector database service offering secure, compliant, and fast AI-powered search at enterprise scale. Many solutions work well in development, but production environments demand consistent performance under heavy workloads while meeting strict security requirements.

This post explores how Zilliz Cloud on AWS meets these requirements, offering performance, security, and scalability for enterprise AI initiatives.

Scaling Vector Search to Billions with Zilliz Cloud

Purpose-built for mission-critical AI workloads on AWS, Zilliz Cloud eliminates operational complexity while delivering the performance, security, and scalability that enterprise customers demand.

Cardinal, our proprietary vector search engine, powers the service and sets new performance benchmarks in the industry. Cardinal delivers unprecedented efficiency with up to 10x higher query throughput and 3x faster index building compared to open-source alternatives. This breakthrough performance enables AWS customers to scale their vector search operations to billions of embeddings while optimizing both cost and latency. Whether you’re implementing enterprise-wide RAG systems or real-time similarity search, Zilliz Cloud provides the production-ready infrastructure to power your most demanding AI initiatives.

3-Layer Optimization

Cardinal achieves its exceptional performance and efficiency through a vertically integrated stack with three key layers of optimization:

  • Advanced Index Algorithms: By combining Inverted File (IVF) indexing with graph-based approaches like Hierarchical Navigable Small World (HNSW), Cardinal delivers high recall and low latency—even under complex scenarios such as filtered search and search-while-ingesting.
  • Meticulous Engineering: Cardinal implements custom memory allocators, NUMA-aware scheduling, and multi-threaded execution pipelines that are fine-tuned for high-throughput and low-latency performance in production environments.
  • Hardware-Aware Kernel Enhancements: Optimized for ARM-based AWS Graviton processors, Cardinal uses Single Instruction, Multiple Data (SIMD) acceleration, CPU pinning, and intelligent I/O scheduling to significantly reduce CPU cycles and improve throughput.
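
To make the Inverted File (IVF) idea from the first item more concrete, here is a minimal, pure-Python sketch. It is not Cardinal's implementation, and it does not show HNSW or how Cardinal combines the two. The function names, the toy centroid selection, and the `nprobe` parameter name are assumptions chosen for illustration. The sketch partitions vectors into inverted lists by nearest centroid, then scans only the lists closest to the query.

```python
import random

def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k=3, nprobe=2):
    """Scan only the nprobe inverted lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

# Toy data: 200 random 8-dimensional vectors (illustrative only).
random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(200)]
# Toy centroid choice; real systems typically train centroids with k-means.
centroids = random.sample(vectors, 4)
lists = build_ivf(vectors, centroids)

query = [random.random() for _ in range(8)]
approx = ivf_search(query, vectors, centroids, lists, k=3, nprobe=2)
exact = ivf_search(query, vectors, centroids, lists, k=3, nprobe=len(centroids))
```

When `nprobe` equals the number of centroids, the search scans every list and becomes an exact brute-force scan. Smaller values scan fewer vectors and trade some recall for lower latency, which is the basic tradeoff an IVF index exposes.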

Built-in Intelligence for Zero-DevOps

Zilliz Cloud offers both high performance and intelligent automation features:

  • AutoIndex: Leveraging machine learning, AutoIndex automatically selects the optimal index type and configuration based on data characteristics and system state. This ensures the best balance between search accuracy and latency without human intervention.
  • Auto-Scaling: Compute and storage scale elastically based on real-time workload demands, ensuring seamless handling of traffic spikes and massive data ingests.
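
As a rough illustration of what elastic scaling on workload demand can look like, the sketch below applies a proportional, target-tracking rule (the same shape as the rule used by the Kubernetes Horizontal Pod Autoscaler). This is an assumption for illustration only: the post does not describe how Zilliz Cloud actually makes scaling decisions, and the function name, parameters, and default thresholds are hypothetical.

```python
import math

def desired_units(current_units, observed_util, target_util=0.7,
                  min_units=1, max_units=32):
    """Proportional scaling rule: size capacity so utilization returns to target.

    All parameter names and defaults are hypothetical, not Zilliz Cloud settings.
    """
    if current_units < 1:
        raise ValueError("current_units must be at least 1")
    if not 0 < target_util <= 1:
        raise ValueError("target_util must be in (0, 1]")
    if observed_util < 0:
        raise ValueError("observed_util must be non-negative")
    raw = math.ceil(current_units * observed_util / target_util)
    # Clamp so a metric spike or an idle period cannot scale without bound.
    return max(min_units, min(max_units, raw))

# Utilization at twice the target doubles capacity: 4 units become 8.
print(desired_units(4, 0.5, target_util=0.25))
```

The clamp to `min_units` and `max_units` is the important design choice: it keeps a noisy metric from scaling the cluster to zero or to an unbounded size.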

Enterprise-Ready Features

  • High Availability: Our high availability system design distributes queries automatically across replicas, reducing latency while ensuring continuous operation during zone outages. Replicas are intelligently synchronized across AWS availability zones for maximum resilience.
  • Comprehensive Observability: Deep integration with Prometheus enables real-time monitoring through 41 alerts across 26 metrics, covering everything from infrastructure health to data operations. Teams can proactively manage their vector search infrastructure with full visibility into performance patterns.
  • Seamless Data Migration: Well-defined migration paths allow Zilliz Cloud users to migrate data from other sources, such as Pinecone and Elasticsearch, with one click. The migration feature also supports automatic schema transformation to preserve data integrity while unlocking Zilliz Cloud’s enhanced capabilities.
  • Global Infrastructure and Security: Available in 7 AWS regions across the US, Europe, and APAC, Zilliz Cloud delivers low-latency performance globally. Security is built in, with Auth0-based authentication supporting enterprise SSO through Okta, GitHub, and Google OAuth.
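
The High Availability item above describes distributing queries across replicas and continuing to serve during a zone outage. The following is a minimal sketch of that routing idea, assuming simple round-robin selection over replicas whose zone is healthy. The `Replica` and `ReplicaRouter` names, the zone labels, and the round-robin policy are all hypothetical; the post does not describe Zilliz Cloud's actual routing logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Replica:
    name: str
    zone: str  # hypothetical zone label, e.g. "az-a"

class ReplicaRouter:
    """Round-robin over replicas, skipping any replica in a zone marked down."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._replicas = list(replicas)
        self._down_zones = set()
        self._next = 0

    def mark_zone_down(self, zone):
        self._down_zones.add(zone)

    def mark_zone_up(self, zone):
        self._down_zones.discard(zone)

    def pick(self):
        """Return the next healthy replica, or raise if every zone is down."""
        healthy = [r for r in self._replicas if r.zone not in self._down_zones]
        if not healthy:
            raise RuntimeError("no healthy replicas available")
        replica = healthy[self._next % len(healthy)]
        self._next += 1
        return replica

router = ReplicaRouter([
    Replica("replica-1", "az-a"),
    Replica("replica-2", "az-b"),
    Replica("replica-3", "az-c"),
])
first = router.pick()          # served by replica-1
router.mark_zone_down("az-b")  # simulate a zone outage
after_outage = [router.pick().zone for _ in range(4)]  # never includes "az-b"
```

Because replicas live in different zones, losing one zone only shrinks the healthy set; queries keep flowing to the remaining replicas.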

With its cloud-native architecture and proprietary Cardinal index engine, Zilliz Cloud delivers speed, elasticity, and simplicity for vector search at billion-vector scale.

Deeply Integrated with AWS Technology

The reliable infrastructure and cloud services that AWS provides are key to Zilliz Cloud’s ability to serve billion-scale vector search in production without having to manage low-level hardware details. Zilliz Cloud is deeply integrated with AWS technology to enhance performance, reliability, and security:

  • Amazon EKS: Milvus is Kubernetes-native. The fully managed Milvus on Zilliz Cloud deploys the microservices used for vector search, indexing, and metadata management on Amazon EKS, the managed Kubernetes environment of AWS, for simplified deployment and high availability.
  • AWS Graviton Processors: Deliver a superior performance-to-cost ratio using ARM-based architecture optimized for compute-intensive workloads such as index building.
  • AWS PrivateLink: Provides secure, private connectivity between the client’s VPC and the vector database servers in the Zilliz Cloud VPC, without crossing the public internet.
  • AWS Global Infrastructure: Leverages AWS’s global network of regions and availability zones to deliver low-latency search experiences worldwide.

Flexible Deployment for Every Security Requirement

Zilliz Cloud’s fully managed SaaS offering supports most enterprise workloads. However, organizations in highly regulated industries often require stricter control over data residency and infrastructure access. With the Zilliz Cloud Bring Your Own Cloud (BYOC) offering, high-performance vector search is deployed directly into your AWS account and VPC, ensuring complete data sovereignty and avoiding any exposure to shared infrastructure or public endpoints.

Why Zilliz Cloud BYOC?

While generative AI has delivered major gains in productivity and personalization, regulatory constraints can make traditional SaaS deployments untenable. On-premises alternatives, meanwhile, often come with prohibitive operational complexity.

Zilliz Cloud BYOC bridges that gap—giving enterprises the power to run vector search close to their data, within their own secure cloud environment. It eliminates compliance friction without sacrificing performance or scalability. Key BYOC benefits include:

  • Data remains within your AWS environment, meeting even the strictest data sovereignty and residency requirements.
  • Operations are managed securely through AWS-native services, including PrivateLink, IAM, and VPC peering.
  • No public data egress: AI workloads run where the data resides, eliminating the security concerns of data crossing public networks.

How Zilliz Cloud BYOC Works on AWS

Figure 1 shows how the Zilliz Cloud BYOC architecture balances operational control with data sovereignty.

Figure 1 – Zilliz Cloud BYOC on AWS Architecture

The Control Plane, managed by Zilliz and hosted in the Zilliz VPC within Zilliz’s own AWS account, orchestrates operational tasks such as software upgrades and scaling. Meanwhile, the Data Plane, deployed in the customer’s AWS account, runs all vector search services, ensuring complete isolation and providing full visibility to the customer.

This design is made possible by AWS security capabilities:

  • AWS PrivateLink ensures that communication between Zilliz’s control plane and your data plane remains private and secure, without crossing the public internet.
  • Cross-account IAM roles enable secure, least-privilege access for provisioning and scaling.
  • Amazon S3 is used for storing audit logs and operational metadata within your environment, ensuring compliance with internal data governance policies.
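
To illustrate the cross-account IAM role item above, the sketch below builds a standard IAM trust policy that lets one AWS account assume a role in another, gated by an external ID. This is the general AWS pattern for third-party access, not Zilliz's actual policy: the helper name, the account ID, and the external ID are placeholders, and the post does not say whether Zilliz Cloud BYOC uses an external ID condition.

```python
import json

def cross_account_trust_policy(trusted_account_id, external_id):
    """Build a trust policy allowing `trusted_account_id` to assume the role.

    Hypothetical helper: both arguments are placeholders supplied by the caller.
    """
    if not (trusted_account_id.isdigit() and len(trusted_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID guards against the confused-deputy problem.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }

# Placeholder values for illustration only.
policy = cross_account_trust_policy("123456789012", "example-external-id")
print(json.dumps(policy, indent=2))
```

The trust policy only controls who may assume the role. Least privilege comes from the separate permissions policy attached to that role, which should list only the actions needed for provisioning and scaling.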

Real-World Enterprise Impact

Zilliz Cloud’s high-performance vector search capabilities provide significant business impact for AWS enterprise customers. With search latency under 10 ms across billions of vectors and support for strict compliance requirements, organizations can run AI applications at very large scale.

Filevine, a top US-based legal AI SaaS company, uses Zilliz Cloud to make huge volumes of legal documents quickly searchable, cutting research time from hours to minutes. This improvement is driven by Cardinal’s up to 10x higher query throughput and by features like AutoIndex optimization.

By choosing Zilliz Cloud on AWS, organizations can confidently scale their vector search operations and focus on innovation, rather than infrastructure management.

Get Started with Zilliz Cloud Vector Database on AWS

Whether you’re building your first AI application or scaling existing systems to handle billions of vectors, Zilliz Cloud provides the performance, reliability, and security that enterprise AI demands. Start using Zilliz Cloud today through AWS Marketplace, where you can start a free trial, explore BYOC options, or contact Zilliz through AWS. Transform your AI applications with vector search that’s built for enterprise scale, delivering the performance your users expect with the security your business requires.

Zilliz is thrilled to be among the first launch partners for AWS Marketplace’s new AI agents and tools, bringing powerful vector database capabilities to AWS customers.

Zilliz – AWS Partner Spotlight

Zilliz is an AWS Advanced Technology Partner and AWS Competency Partner that provides vector database solutions for enterprise-grade AI built on Milvus, the popular open-source vector database. Typical applications include image retrieval, video analysis, NLP, recommendation engines, targeted ads, customized search, intelligent customer service, fraud detection, network security, new drug discovery, and more.

Contact Zilliz | Partner Overview | AWS Marketplace