Containers

Managing Team Workloads in a Shared Amazon EKS Cluster using Loft vCluster and Argo CD for better cost optimization and operational efficiency

This blog was authored by Adam Issaoui, Cloud Support Engineer – Containers, Asif Khan, Senior Solutions Architect, and Sébastien Allamand, Sr. Solution Architect Specialist – Containers. 

Introduction

Amazon Elastic Kubernetes Service (Amazon EKS) has emerged as a fundamental platform for modern container orchestration. It enables organizations to optimize their application deployment and management processes by providing a fully-managed, certified Kubernetes conformant service that streamlines building, securing, operating, and maintaining Kubernetes clusters on Amazon Web Services (AWS).

EKS clusters are often shared by multiple teams within an organization, allowing them to efficiently use available resources. Furthermore, Amazon EKS is used to deliver applications to end-users, necessitating strong segmentation and isolation of resources among various teams. Securely sharing the Amazon EKS control plane and worker node resources allows teams to enhance productivity and achieve cost efficiencies. This post demonstrates how to use vCluster for separating workloads on a shared EKS cluster.

Achieving Kubernetes multi-tenancy: strategies for isolation and efficiency

Kubernetes multi-tenancy allows multiple tenants to share a cluster’s resources. However, Kubernetes lacks built-in multi-tenancy, thus administrators must implement isolation strategies using quotas and limits. Two main patterns exist:

  • Hard isolation: Dedicated clusters per tenant/workload, which streamlines isolation but leads to poorer resource usage and increased management overhead.
  • Soft isolation: Using Kubernetes constructs like namespaces to share a cluster while maintaining logical separation. This improves resource usage but necessitates more setup through RBAC, network policies, and other configurations.

Kubernetes provides three main multi-tenancy strategies:

  1. Dedicated clusters: Streamlines isolation but leads to poorer resource usage and increased management overhead.
  2. Namespaces: Improves resource usage but necessitates more complex setup using RBAC, network policies, and other configurations. The Hierarchical Namespace Controller (HNC) addresses some namespace management challenges, but does not solve all multi-tenancy issues, particularly those related to cluster-wide resources.
  3. Shared control plane with virtual clusters: Balances efficiency and isolation through solutions such as Loft vCluster or Kamaji.

What is vCluster?

vCluster on Amazon EKS offers a range of benefits, streamlining operations and reducing costs. It allows organizations to significantly lower their infrastructure expenses while simplifying cluster management. The solution provides control plane isolation, enabling more efficient development and testing processes, and it enhances security in continuous integration and continuous deployment (CI/CD) workflows. Furthermore, vCluster’s lightweight, isolated virtual clusters make training environments more cost-effective. For those looking to expand their Kubernetes capabilities, vCluster can be seamlessly integrated with Crossplane, allowing users to create and test Custom Resource Definitions (CRDs) in isolated environments.

Virtual clusters are functional Kubernetes clusters nested within a host cluster, enhancing resource sharing and flexibility.

Figure 1: EKS cluster and vClusters communication for pod scheduling

The vCluster control plane pod contains:

  1. API Server: Gateway for Kubernetes API requests, supporting various distributions.
  2. Controller Manager: Tracks and manages Kubernetes resources.
  3. Data Store: Stores resource definitions and state, with options including SQLite, etcd, or a managed database such as Amazon Relational Database Service (Amazon RDS) for MySQL or PostgreSQL.
  4. Syncer: Synchronizes resources between virtual and host clusters.

The Syncer maintains bidirectional synchronization, scheduling virtual pods on host nodes and propagating changes. This allows vCluster resources to remain isolated while low-level pod resources are synchronized, enabling the virtual cluster to function within the host cluster.
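
Once the walkthrough's vClusters exist, you can observe the Syncer directly: a pod created inside a vCluster also appears in that vCluster's host namespace under a translated name. A minimal sketch, assuming the Team A vCluster created later in this post (vcteam-a, running in the host namespace vcteam-a) and an app-product namespace inside it:

# View from inside the virtual cluster
vcluster connect vcteam-a -- kubectl -n app-product get pods

# View from the host cluster: the same pods, synced into the vCluster's host namespace
# (synced names are typically suffixed with the virtual namespace and vCluster name)
kubectl get pods -n vcteam-a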

Figure 2: Architecture overview

Solution overview

The EKS cluster in the following solution hosts multiple virtual clusters using vCluster, with each virtual cluster running a different Kubernetes version. Argo CD deploys application components to the virtual clusters.

Kyverno automatically adds new virtual clusters to the Argo CD managed list, streamlining infrastructure expansion. This setup centrally and automatically manages application deployments across virtual clusters. vCluster creates isolated Kubernetes environments, while Argo CD provides a GitOps approach to deployment management across the virtual clusters.

Walkthrough

The following steps outline the process described in this post:

  1. Create EKS cluster.
  2. Install Amazon Elastic Block Store (Amazon EBS) CSI driver add-on.
  3. Install Argo CD and Kyverno.
  4. Create vClusters for Teams A and B with different Kubernetes versions.
  5. Set up Kyverno cluster policy to add vClusters to Argo CD.
  6. Deploy apps to Team B vCluster using Argo CD.
  7. Verify workload isolation between Teams A and B.

Prerequisites

The following prerequisites are necessary to complete this solution:

  • vCluster CLI (0.19.6); installation instructions are available here.
  • An AWS account with the AWS CLI configured, plus eksctl, kubectl, Helm, and jq, all of which are used in the walkthrough.

Step-by-step guidance

Step 0: Setup environment variables

Open a terminal, update the variables as needed, and run the following:

export AWS_REGION="us-east-1"  #change it with desired AWS region
export ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export CLUSTER_NAME="eks-blog-vcluster"  #change it with desired Amazon EKS Cluster name
export VPC_CNI_ROLE="AmazonEKSVPCCNIRole"
export VPC_CNI_VERSION="v1.19.2-eksbuild.1"  #check compatibility

Step 1: Set up your EKS cluster

Before proceeding, you need an active EKS cluster. You can create one using the AWS Management Console, AWS CLI, or eksctl command-line tool. For a basic setup, you might use the following command:

eksctl create cluster --name $CLUSTER_NAME --region $AWS_REGION --version 1.32
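
If you prefer a declarative setup, an equivalent eksctl configuration file might look like the following; the node group name, instance type, and sizing are illustrative assumptions, not values from this post:

# cluster.yaml -- hypothetical eksctl configuration
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: eks-blog-vcluster
  region: us-east-1
  version: "1.32"
managedNodeGroups:
  - name: default
    instanceType: m5.large
    desiredCapacity: 3

You would then create the cluster with eksctl create cluster -f cluster.yaml.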

Step 2: Set up VPC CNI plugin

To install the CNI add-on, run the following command:

curl -sL https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/hostcluster/deployment/cni/vpc_cni.sh | bash

This command downloads and runs a script from the aws-samples GitHub repository to install the VPC CNI add-on on your EKS cluster with network policy enabled.
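
Under the hood, the script amounts to installing or updating the managed VPC CNI add-on with the network policy feature enabled. A minimal sketch of the idea, assuming the add-on does not exist yet (the script in the repository, which also wires up the IAM role referenced by $VPC_CNI_ROLE, is the authoritative source):

aws eks create-addon \
  --cluster-name "$CLUSTER_NAME" \
  --addon-name vpc-cni \
  --addon-version "$VPC_CNI_VERSION" \
  --configuration-values '{"enableNetworkPolicy":"true"}' \
  --region "$AWS_REGION"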

Step 3: Setting up Amazon EKS Pod Identity and Amazon EBS CSI Driver EKS managed add-on

The Amazon EBS CSI managed EKS add-on version must be v1.26.0 or later to support EKS Pod Identity. Refer to this link for more details and other considerations.

Installation steps

  1. Install the EBS CSI Driver with Pod Identity:

Run the following command in your terminal:

curl -sL https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/hostcluster/deployment/csi/csi.sh | bash

This script sets up the EBS CSI Driver with Pod Identity on your EKS cluster.
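
Roughly, the script pairs the managed add-on with an EKS Pod Identity association for the driver's controller service account. A hedged sketch, assuming an IAM role named AmazonEKS_EBS_CSI_DriverRole already exists with a trust policy for pods.eks.amazonaws.com and the AmazonEBSCSIDriverPolicy attached (the repository script is authoritative):

aws eks create-addon \
  --cluster-name "$CLUSTER_NAME" \
  --addon-name aws-ebs-csi-driver \
  --region "$AWS_REGION"

aws eks create-pod-identity-association \
  --cluster-name "$CLUSTER_NAME" \
  --namespace kube-system \
  --service-account ebs-csi-controller-sa \
  --role-arn "arn:aws:iam::${ACCOUNT_ID}:role/AmazonEKS_EBS_CSI_DriverRole" \
  --region "$AWS_REGION"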

  2. Create the default storage class:

After the EBS CSI Driver is installed, apply the default storage class configuration:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/refs/heads/main/automation/vcluster/virtualclusters/ebs-storage-class.yaml

This creates a default storage class for your EKS cluster using the EBS CSI Driver.
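
The applied manifest is likely close in shape to the following gp3 StorageClass (a sketch only; the file in the repository is authoritative):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer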

Step 4: Install Argo CD and Kyverno

Argo CD is used to manage Kubernetes resources and applications within your cluster in a declarative way, while Kyverno serves as the policy engine to define cluster policies for automation purposes. Both can be installed using Helm with the following commands:

Step 4.a: Installing Argo CD

helm repo add argo https://argoproj.github.io/argo-helm

helm install argocd argo/argo-cd --version 7.8.10 -n argocd --create-namespace
kubectl get pod -n argocd

Step 4.b: Installing Kyverno

helm repo add kyverno https://kyverno.github.io/kyverno/
helm install kyverno kyverno/kyverno --version 3.3.7 -n kyverno --set features.policyExceptions.enabled=true --set features.policyExceptions.namespace="*" --values https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/refs/heads/main/automation/vcluster/kyverno/values.yaml --create-namespace

Step 5: Create team-a and team-b vClusters

On the EKS 1.32 cluster you created earlier, you now set up two vClusters:

  • Team A: Kubernetes v1.30
  • Team B: Kubernetes v1.31

This allows teams to use different Kubernetes versions in isolated environments.

vCluster isolation mode is enabled, which adds workload and network isolation for each virtual cluster within the host cluster.

Argo CD creates the vClusters using the vCluster Helm chart, enabling a GitOps approach.
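
For orientation, an Argo CD Application that wraps the vCluster Helm chart is roughly of the following shape. This is a hedged sketch rather than the manifest from the repository; the chart version and the k3s image tag that pins the virtual cluster's Kubernetes version are illustrative:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vcteam-a
  namespace: argocd
spec:
  project: vcluster
  source:
    repoURL: https://charts.loft.sh
    chart: vcluster
    targetRevision: 0.19.6                  # illustrative chart version
    helm:
      values: |
        vcluster:
          image: rancher/k3s:v1.30.5-k3s1   # pins the virtual cluster's Kubernetes version
        isolation:
          enabled: true
  destination:
    server: https://kubernetes.default.svc
    namespace: vcteam-a
  syncPolicy:
    automated: {}
    syncOptions:
      - CreateNamespace=true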

For vCluster’s persistent volumes, you need a gp3 StorageClass. If you did not already apply it in Step 3, apply it with this command:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/refs/heads/main/automation/vcluster/virtualclusters/ebs-storage-class.yaml

After setting up the StorageClass, the next step is to create a vCluster project in Argo CD to organize applications and define deployment, destination, and object type restrictions.

kubectl create -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/vcluster/virtualclusters/vcluster-project.yaml

To create the Team A vCluster:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/vcluster/virtualclusters/vcluster-team-a-application.yaml

To create the Team B vCluster:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/vcluster/virtualclusters/vcluster-team-b-application.yaml

vClusters are exposed using Classic Load Balancer services for external access. Alternative access methods include Ingress or a LoadBalancer Service with ingress controllers; more details can be found in the documentation.

To list the created vClusters:

vcluster list --output json | jq '.[] | {Name, Namespace, Status}'

Figure 3: vCluster status

Furthermore, you can check the status of the vCluster application in the Argo CD UI. The vClusters are associated with the vcluster Argo CD project.

To access the Argo CD server from outside the cluster, use the kubectl port-forward command. Refer to this link if you would like to expose it using ingress resources:

kubectl port-forward service/argocd-server -n argocd 8080:443 &

This exposes the argocd-server cluster-IP service on local port 8080, allowing access to the Argo CD UI at https://localhost:8080.

To log in to the Argo CD web UI, use the default admin user:

1. Retrieve the initial admin password from the argocd-initial-admin-secret secret:

kubectl get secret argocd-initial-admin-secret -n argocd -o jsonpath="{.data.password}" | base64 -d

2. Use the retrieved password to authenticate with the admin username in the Argo CD web UI.
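
If you prefer the command line, you can also log in through the same port-forward using the argocd CLI (assuming it is installed locally):

ARGOCD_PWD=$(kubectl get secret argocd-initial-admin-secret -n argocd -o jsonpath="{.data.password}" | base64 -d)
argocd login localhost:8080 --username admin --password "$ARGOCD_PWD" --insecure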

Figure 4: Argo CD vCluster applications

Step 6: Configure ClusterPolicy for automated Argo CD cluster configuration

Our goal is to automate adding new virtual Kubernetes clusters to Argo CD management, improving deployment efficiency and scalability. Argo CD stores cluster details in Kubernetes Secrets labeled with argocd.argoproj.io/secret-type: cluster. vCluster manages cluster credentials by storing them in Secrets within dedicated namespaces. The Secret name is derived by prefixing the cluster name with vc-. For example, the Secret name for the vcteam-a cluster is vc-vcteam-a.

We use Kyverno ClusterPolicies for automation. Kyverno can generate additional Kubernetes objects when matching objects are created or updated. The generateExisting attribute controls whether the policy also applies to existing resources; when set to true, Kyverno generates Argo CD secrets for the existing vClusters. Argo CD reaches each vCluster internally through the Kubernetes Service that shares the vCluster's name.
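
For reference, a declarative Argo CD cluster Secret of the kind the policy generates looks roughly like the following. The TLS values are placeholders; the actual policy copies the credentials out of the vc-vcteam-a Secret, and the server address assumes the vCluster Service vcteam-a in the namespace vcteam-a:

apiVersion: v1
kind: Secret
metadata:
  name: vcteam-a
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: vcteam-a
  server: https://vcteam-a.vcteam-a.svc:443   # in-cluster Service of the vCluster
  config: |
    {
      "tlsClientConfig": {
        "caData": "<base64-encoded CA>",
        "certData": "<base64-encoded client certificate>",
        "keyData": "<base64-encoded client key>"
      }
    }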

To create Kyverno ClusterPolicy, you can use the following command:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/vcluster/kyverno/vcluster-kyverno-policy.yaml

Starting with Kyverno v1.13, you need to grant specific permissions for the configured policies. To do this, apply the following RBAC:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/refs/heads/main/automation/vcluster/kyverno/keyverno-secret-clusterroles.yaml

When you create the Kyverno ClusterPolicy, you should see output similar to the following:

Figure 5: Kyverno ClusterPolicy

If you examine the secrets in the argocd namespace using the command kubectl get secrets -n argocd, you will find that Kyverno has automatically created two secrets named vcteam-a and vcteam-b.

kubectl get secrets -n argocd | grep vcteam

Figure 6: vCluster secrets

If you describe the vcteam-a secret in the argocd namespace using the command kubectl describe secret vcteam-a -n argocd, the output confirms that the secret was created by Kyverno through the sync-secret cluster policy applied in the previous steps.

Figure 7: vCluster secret annotations

Step 7: Verify that vClusters are added to the Argo CD clusters settings

You created the Kyverno ClusterPolicy to generate Argo CD secrets in Step 6, so you can verify that all the vClusters have been successfully added by navigating to Settings > Clusters in the Argo CD UI.

Figure 8: Argo CD registered Kubernetes clusters

Step 8: Deploy applications to the vClusters using Argo CD

In this solution, you test isolation from two perspectives:

  • Isolating the networking connectivity between the vClusters.
  • Demonstrating vCluster's key feature: isolation of cluster-scoped resources.

Step 8.a: Deploy product and sale applications to the team-a and team-b vClusters

You use two API services (named Product API and Sale API), both implemented using NGINX images for testing purposes. You have the flexibility to modify these application manifests and build your own custom Docker images to suit your specific requirements.

To create both applications on the team-a and team-b vClusters, use the following command:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/gitops/all-applicationset.yaml

When you have completed these steps, you can check the status of the applications in the Argo CD UI.

Figure 9: Teams Argo CD applications

Step 8.b: Deploy the cert-manager operator application to the team-b vCluster

You use Argo CD's ApplicationSet to deploy cert-manager across multiple vClusters while maintaining isolation. The goal is to install cert-manager with CRDs enabled for the team-b vCluster only, excluding the host cluster and the team-a vCluster. ApplicationSet generates multiple Argo CD Applications based on defined criteria, which is useful for managing deployments across various Kubernetes environments.

The ApplicationSet does the following:

  • Deploys cert-manager to a specific vCluster
  • Excludes the host cluster and the team-a vCluster

This approach tests vCluster isolation with Argo CD by validating three key aspects:

  • Cert-manager deploys only to the intended environment
  • CRDs exist only in the designated vCluster
  • The host EKS cluster and other vClusters remain unaffected
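
The applied ApplicationSet is roughly of the following shape, using the cluster generator with a label selector so that only the vcteam-b cluster Secret is matched. This is a hedged sketch; the label selector, chart version, and application naming are illustrative, and the manifest in the repository is authoritative:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cert-manager
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            team: vcteam-b               # hypothetical label on the cluster Secret
  template:
    metadata:
      name: '{{name}}-cert-manager'
    spec:
      project: vcluster
      source:
        repoURL: https://charts.jetstack.io
        chart: cert-manager
        targetRevision: v1.14.4          # illustrative chart version
        helm:
          values: |
            installCRDs: true
      destination:
        server: '{{server}}'
        namespace: cert-manager
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=true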

To create cert-manager operator on team-b vCluster, you can use the following command:

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/gitops/cert-manager-applicationset.yaml

Navigate to the Argo CD UI and view the applications. You can see that the cert-manager controller has been deployed to the vcteam-b vCluster in the cert-manager namespace.

Figure 10: Cert-manager Argo CD application

Drilling down into the vcteam-b-cert-manager application shows that all the resources, such as services, pods, and replicasets, are deployed and healthy, all running in isolation within the vcteam-b vCluster.

Figure 11: Cert-manager Argo CD application details

Step 9: Test the multi-team isolation

To test isolation between the virtual clusters, you restrict connectivity between them using network policies. This setup allows you to assess both networking isolation and the isolation of Kubernetes cluster-scoped resources between the vClusters.

Step 9.a: Networking isolation

By default, pods can access APIs across vClusters due to Kubernetes’ open pod networking. Network policies are needed to restrict access and achieve isolation between vClusters.

Figure 12: Inter-vClusters default communication

Test network isolation between:

  • vcteam-a: Product API to Sale API
  • vcteam-a to vcteam-b: Product APIs
  • vcteam-b to vcteam-a: Sale APIs
  • vcteam-b: Product API to Sale API

To enable cross-vCluster pod API calls, you need the IP addresses and names of all four pods in the vcteam-a and vcteam-b vClusters. Use the following command to retrieve this information:

curl -s https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/refs/heads/main/automation/vars/set_pod_vars.sh | bash
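
If you want to see where those values come from, a single variable can be retrieved manually along these lines; the app=sale label selector is an assumption for illustration, and the repository script remains the authoritative source:

# Hypothetical manual equivalent for one of the variables set by the script
export TEAMA_SALE_IP=$(vcluster connect vcteam-a -- \
  kubectl -n app-sale get pod -l app=sale -o jsonpath='{.items[0].status.podIP}')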

Test the connectivity:

Team A internal access

vcluster connect vcteam-a -- kubectl -n app-product exec $TEAMA_PRODUCT_POD -- curl http://$TEAMA_SALE_IP

Result: ACCESS WORKS

Team A to Team B access

vcluster connect vcteam-a -- kubectl -n app-product exec $TEAMA_PRODUCT_POD -- curl http://$TEAMB_PRODUCT_IP

Result: ACCESS WORKS

Team B to Team A access

vcluster connect vcteam-b -- kubectl -n app-sale exec $TEAMB_SALE_POD -- curl http://$TEAMA_SALE_IP

Result: ACCESS WORKS

Team B internal access

vcluster connect vcteam-b -- kubectl -n app-product exec $TEAMB_PRODUCT_POD -- curl http://$TEAMB_SALE_IP

Result: ACCESS WORKS

You’ve observed inter-vCluster pod communication, which is typically undesired. To enhance isolation, implement network segmentation.

Deploy a network policy to “vcteam-a” and “vcteam-b” namespaces on the host cluster. This does the following:

  • Prevent “product” and “sale” applications from communicating across namespaces
  • Restrict connections to stay within their respective virtual clusters

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-eks-multitenancy-with-loft-virtualcluster/main/automation/gitops/team-a-b-networkpolicy.yaml
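
Conceptually, the applied manifest restricts each team's host namespace to intra-namespace traffic. A per-namespace sketch of the idea (the repository manifest defines the exact rules; a matching policy applies to vcteam-b):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: vcteam-a
spec:
  podSelector: {}              # selects all pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}      # only pods from this same namespace may connect
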
Figure 13: Inter-vClusters restricted communication

Re-testing the four scenarios yields these results:

  • vcteam-a: Product → Sale API: SUCCESS (intra-vCluster)
  • vcteam-a → vcteam-b: Product APIs: FAILS (policy-denied)
  • vcteam-b → vcteam-a: Sale APIs: FAILS (policy-denied)
  • vcteam-b: Product → Sale API: SUCCESS (intra-vCluster)

These results confirm vCluster isolation: applications can only communicate within the same vCluster, not across vClusters.

Step 9.b: Cluster scoped resources isolation

Verify the pods running on specific vClusters using the vCluster CLI. Alternatively, connect to vClusters as you would to normal Kubernetes clusters using a shared kubeconfig. For details, see the Accessing vCluster guide.

To connect to the EKS cluster and access the vcteam-b vCluster, run the following:

vcluster connect vcteam-b

Figure 14: vCluster connect output

Because the vCluster is exposed, the vcluster connect command uses the load balancer endpoint.

Checking the resources shows that the pods in the cert-manager namespace are running and healthy:

kubectl get po -n cert-manager

Figure 15: Cert-manager pods

vClusters allow teams to use their own CRDs, namespaces, and cluster roles without affecting the host cluster or other vClusters.

In vCluster vcteam-b, cert-manager.io CRDs exist:

vcluster connect vcteam-b -- kubectl get crd | grep cert-manager.io

Figure 16: Cert-manager custom resource definitions (CRD)

In vCluster vcteam-a, no cert-manager CRDs are found:

vcluster connect vcteam-a -- kubectl get crd

Figure 17: No Cert-manager custom resource definitions

Compute isolation

By default, vClusters share the host cluster's nodes. You can enhance isolation by using the --enforce-node-selector flag on the vCluster syncer to schedule workloads on specific nodes based on labels. For managed node groups, create multiple groups with specific selectors for each vCluster.

Karpenter integration options with vCluster:

  1. Designated NodePool for vCluster workload pods (see the sketch after this list):
    • Uses a NodePool taint keyed with the vCluster name
    • Adds the matching toleration through the syncer
    • Makes sure that pods synced by the vCluster are scheduled on the designated NodePool's nodes
  2. Target NodePool based on vCluster pod labels:
    • Uses the Exists operator in the Karpenter NodePool spec requirements
    • Allows fine-grained control over which NodePool a vCluster's pods are scheduled on
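
As a sketch of option 1, a dedicated Karpenter NodePool can carry a taint keyed by the vCluster name, with the syncer configured to add the matching toleration to every synced pod. The taint key, NodePool name, and EC2NodeClass reference below are illustrative assumptions, not values from this post:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: vcteam-a-pool
spec:
  template:
    spec:
      taints:
        - key: vcluster/team             # hypothetical taint key
          value: vcteam-a
          effect: NoSchedule
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                    # assumes an existing EC2NodeClass

On the vCluster side, the syncer then needs the corresponding toleration so that synced pods land on these nodes; consult the vCluster documentation for the exact flag or Helm value.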

These approaches optimize resource usage and cost savings by provisioning nodes based on each vCluster’s workload requirements.

Congratulations on successfully combining vCluster for isolation and Argo CD for deployment management in a multi-tenant EKS cluster.

Cleaning up

First, delete vClusters, which cleans up namespaces, resources, and load balancers:

kubectl patch app vcteam-a -n argocd -p '{"metadata": {"finalizers": ["resources-finalizer.argocd.argoproj.io"]}}' --type merge
kubectl delete app vcteam-a -n argocd

kubectl patch app vcteam-b -n argocd -p '{"metadata": {"finalizers": ["resources-finalizer.argocd.argoproj.io"]}}' --type merge
kubectl delete app vcteam-b -n argocd

Then, delete the EKS cluster:

eksctl delete cluster --name $CLUSTER_NAME --region $AWS_REGION

Conclusion

Kubernetes multi-tenancy, combined with Amazon EKS, Loft's vCluster, and Argo CD, delivers a comprehensive solution for hosting multiple isolated applications within a shared Amazon EKS environment. This integrated approach enables application-level isolation, cost savings through streamlined infrastructure management, and automated deployments, all while harnessing the scalability and reliability of the Amazon EKS platform.

Consolidating multiple virtual clusters on a single EKS cluster allows organizations to optimize infrastructure expenditure and streamline management, without compromising on application isolation or deployment agility. The GitOps workflow facilitated by Argo CD further enhances the reliability and reproducibility of application deployments across these virtual clusters.