Deploying on AWS EKS

This guide walks through deploying Testkube On-Prem on an existing Amazon EKS cluster. It covers prerequisites, S3 storage configuration with two authentication methods (EKS Pod Identity and IRSA), MongoDB Atlas connectivity, ingress setup, and production hardening.

info

A ready-to-use reference repository with all configuration files, IAM templates, and install scripts is available at testkube-aws-deployment. You can clone it and customise the values files for your environment.

Prerequisites

Requirement | Version
Amazon EKS | 1.21+ (1.24+ for Pod Identity)
Helm | 3+
kubectl | configured for the target cluster
cert-manager (recommended) | 1.11+
NGINX Ingress Controller (recommended) | 1.8+
MongoDB Atlas (when using external MongoDB) | Atlas cluster reachable from the EKS VPC

IMPORTANT

Use the community kubernetes/ingress-nginx chart — not nginx/nginx-ingress from NGINX Inc. Using the wrong chart causes Dex or API Ingresses to be silently ignored when they share the same hostname.

Cluster Sizing

  • At least 3 nodes
  • At least 2 CPU cores per node
  • At least 8 GB RAM per node

1. Configure kubectl

aws eks update-kubeconfig --region <AWS_REGION> --name <EKS_CLUSTER_NAME>

2. Install Dependencies

cert-manager

helm repo add jetstack https://charts.jetstack.io && helm repo update
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager --create-namespace \
--set crds.enabled=true

NGINX Ingress Controller

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx && helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx --create-namespace

3. Create Kubernetes Secrets

Create the target namespace first:

kubectl create namespace testkube

License key:

kubectl create secret generic testkube-license \
--from-literal=LICENSE_KEY=<YOUR_LICENSE_KEY> \
-n testkube

Master password for credentials encryption:

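# Generate the password once and keep the value so it can be stored safely
MASTER_PASSWORD="$(openssl rand -base64 48)"
kubectl create secret generic testkube-master-password \
  --from-literal=password="$MASTER_PASSWORD" \
  -n testkube
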
warning

The master password cannot be recovered if lost. Store it in a secrets manager such as AWS Secrets Manager or Parameter Store before proceeding.

4. Configure Helm Values

Start from the base values file in the reference repository and customise at minimum:

global:
  enterpriseLicenseSecretRef: "testkube-license"

  domain: "testkube.example.com"

  ingress:
    enabled: true

  credentials:
    masterPassword:
      secretKeyRef:
        name: testkube-master-password
        key: password

  certificateProvider: "cert-manager"
  certManager:
    issuerRef: "letsencrypt-prod"

Configure your identity provider connector under dex.configTemplate.additionalConfig. See SSO / Identity Providers for detailed examples.

If you use MongoDB Atlas instead of the chart-managed MongoDB, configure global.mongo.dsn with your Atlas connection string. See Configure MongoDB Atlas below.

5. Configure S3 Storage

Using AWS S3 instead of the default in-cluster MinIO is recommended for production EKS deployments. Two authentication methods are available — choose one:

Method | When to use
EKS Pod Identity (recommended) | EKS 1.24+. Simpler setup, no OIDC provider needed.
IRSA (IAM Roles for Service Accounts) | EKS 1.21+, legacy clusters, or when the Pod Identity Agent cannot be installed.

Common Steps

Create the S3 Bucket

aws s3 mb s3://<S3_BUCKET_NAME> --region <AWS_REGION>

Create the IAM Policy

Save the following policy as iam-policy-s3.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::<S3_BUCKET_NAME>/*"
    }
  ]
}

Then create the policy:

aws iam create-policy \
  --policy-name TestkubeS3Access \
  --policy-document file://iam-policy-s3.json
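
To avoid hand-editing the bucket name in two places, the policy document can be generated from a variable. The following is a minimal sketch, not part of the reference repository; the function name render_s3_policy is hypothetical, and the policy body is the one shown above.

```shell
# Hypothetical helper: prints the S3 access policy for one bucket to stdout.
# Usage: render_s3_policy <bucket-name>
render_s3_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

Redirecting the output, for example render_s3_policy <S3_BUCKET_NAME> > iam-policy-s3.json, produces the file that the create-policy command expects.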

Configure CORS

To allow the Dashboard to retrieve workflow artifacts directly from S3:

aws s3api put-bucket-cors --bucket <S3_BUCKET_NAME> --cors-configuration '{
"CORSRules": [{
"AllowedOrigins": ["https://dashboard.testkube.example.com"],
"AllowedMethods": ["GET", "OPTIONS"],
"AllowedHeaders": ["*"],
"ExposeHeaders": ["Content-Length", "Content-Type", "ETag"],
"MaxAgeSeconds": 3600
}]
}'

Helm Values for S3

Both methods share the same storage configuration. Add or merge this into your values:

global:
  storage:
    endpoint: "s3.amazonaws.com"
    region: "<AWS_REGION>"
    outputsBucket: "<S3_BUCKET_NAME>"
    secure: true
    accessKeyId: ""
    secretAccessKey: ""

minio:
  enabled: false

note

accessKeyId and secretAccessKey must be set to "" (empty string, not omitted) so that the SDK falls back to IAM-based authentication.

Option A — EKS Pod Identity

EKS Pod Identity eliminates the need for OIDC provider configuration and service account annotations. The Pod Identity Agent runs as a DaemonSet and injects credentials directly into pods.

Use this option when Testkube pods need AWS credentials for S3, or when MongoDB Atlas users authenticate with AWS IAM using authMechanism=MONGODB-AWS.

Step 1 — Install the Pod Identity Agent addon:

aws eks create-addon \
--cluster-name <EKS_CLUSTER_NAME> \
--addon-name eks-pod-identity-agent

Verify the addon is active:

aws eks describe-addon \
--cluster-name <EKS_CLUSTER_NAME> \
--addon-name eks-pod-identity-agent \
--query 'addon.status' \
--output text

kubectl get pods -n kube-system -l app.kubernetes.io/name=eks-pod-identity-agent

Step 2 — Create the IAM Role:

The trust policy only needs to trust the pods.eks.amazonaws.com service — no cluster-specific OIDC ID required:

Save the following trust policy as iam-trust-policy-pod-identity.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "pods.eks.amazonaws.com"
      },
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }
  ]
}

Then create the role:

aws iam create-role \
  --role-name TestkubeS3Role \
  --assume-role-policy-document file://iam-trust-policy-pod-identity.json

aws iam attach-role-policy \
--role-name TestkubeS3Role \
--policy-arn arn:aws:iam::<AWS_ACCOUNT_ID>:policy/TestkubeS3Access

If the same role is also used for MongoDB Atlas AWS IAM authentication, configure a MongoDB Atlas database user for the IAM role ARN. AWS IAM authenticates the pod to AWS; Atlas database roles still control what that identity can do inside MongoDB.

For Testkube migrations and runtime access, grant the Atlas database user at least:

  • readWrite on the Testkube database
  • dbAdmin on the Testkube database, or an equivalent custom role that allows index and collection modification commands such as collMod

Step 3 — Create Pod Identity Associations:

aws eks create-pod-identity-association \
--cluster-name <EKS_CLUSTER_NAME> \
--namespace testkube \
--service-account testkube-enterprise-api \
--role-arn arn:aws:iam::<AWS_ACCOUNT_ID>:role/TestkubeS3Role

aws eks create-pod-identity-association \
--cluster-name <EKS_CLUSTER_NAME> \
--namespace testkube \
--service-account testkube-worker-service \
--role-arn arn:aws:iam::<AWS_ACCOUNT_ID>:role/TestkubeS3Role

No service account annotations are needed — Pod Identity handles credential injection through the associations.

Verify the associations:

aws eks list-pod-identity-associations \
--cluster-name <EKS_CLUSTER_NAME> \
--namespace testkube \
--query 'associations[].{serviceAccount:serviceAccount,roleArn:roleArn,associationId:associationId}' \
--output table

The service account names in the associations must match the Helm values:

testkube-cloud-api:
  serviceAccount:
    name: testkube-enterprise-api

testkube-worker-service:
  serviceAccount:
    name: testkube-worker-service

Step 4 — Verify Pod Identity from inside the cluster:

After deployment, the API and worker pods should have Pod Identity environment variables injected:

kubectl -n testkube exec deploy/testkube-enterprise-api -- env | grep AWS_
kubectl -n testkube exec deploy/testkube-enterprise-worker-service -- env | grep AWS_

For EKS Pod Identity, expect variables similar to:

AWS_CONTAINER_CREDENTIALS_FULL_URI=http://169.254.170.23/v1/credentials
AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE=/var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
AWS_REGION=<AWS_REGION>

If these variables are missing, check the addon, namespace, service account name, and Pod Identity association.
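
The check for missing variables can be scripted. The sketch below assumes the two AWS_CONTAINER_* variables listed above are the ones that matter; the function name check_pod_identity_env is hypothetical. It reads an env dump on standard input and reports which of the two variables are absent.

```shell
# Hypothetical helper: reads `env` output on stdin and reports missing
# Pod Identity variables. Returns 0 when both are present, 1 otherwise.
check_pod_identity_env() {
  env_dump="$(cat)"
  missing=0
  for var in AWS_CONTAINER_CREDENTIALS_FULL_URI AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE; do
    if ! printf '%s\n' "$env_dump" | grep -q "^${var}="; then
      echo "missing: ${var}"
      missing=1
    fi
  done
  return "$missing"
}
```

It can be fed from the command shown earlier, for example: kubectl -n testkube exec deploy/testkube-enterprise-api -- env | check_pod_identity_env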

Test role assumption from the API service account:

kubectl -n testkube run aws-identity-test \
--rm -i --restart=Never \
--image=amazon/aws-cli:2 \
--overrides='{"spec":{"serviceAccountName":"testkube-enterprise-api"}}' \
-- sts get-caller-identity

The returned ARN should be the IAM role associated with the service account.

Test S3 access from the API service account:

kubectl -n testkube run s3-test \
--rm -i --restart=Never \
--image=amazon/aws-cli:2 \
--overrides='{"spec":{"serviceAccountName":"testkube-enterprise-api"}}' \
-- s3 ls s3://<S3_BUCKET_NAME>

Option B — IRSA (IAM Roles for Service Accounts)

IRSA uses the cluster's OIDC provider to establish trust between Kubernetes service accounts and IAM roles.

Step 1 — Get the OIDC provider ID:

aws eks describe-cluster --name <EKS_CLUSTER_NAME> \
--query "cluster.identity.oidc.issuer" --output text | cut -d/ -f5
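
The cut -d/ -f5 step works because the issuer URL has the form https://oidc.eks.<AWS_REGION>.amazonaws.com/id/<OIDC_ID>, so the ID is the fifth slash-separated field. A small sketch with two hypothetical helper functions shows the extraction on its own:

```shell
# Hypothetical helpers operating on an OIDC issuer URL.

# Prints the trailing OIDC ID (fifth "/"-separated field of the URL).
oidc_id_from_issuer() {
  printf '%s\n' "$1" | cut -d/ -f5
}

# Prints the provider path used in IAM ARNs and condition keys
# (the issuer URL without the https:// scheme).
oidc_provider_from_issuer() {
  printf '%s\n' "${1#https://}"
}
```

The issuer passed to these functions would be the raw output of the describe-cluster query, without the cut step.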

Step 2 — Create the IAM Role with an OIDC trust policy:

Save the following trust policy as iam-trust-policy-s3.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<AWS_ACCOUNT_ID>:oidc-provider/oidc.eks.<AWS_REGION>.amazonaws.com/id/<OIDC_ID>"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.<AWS_REGION>.amazonaws.com/id/<OIDC_ID>:sub": [
            "system:serviceaccount:testkube:testkube-enterprise-api",
            "system:serviceaccount:testkube:testkube-worker-service"
          ],
          "oidc.eks.<AWS_REGION>.amazonaws.com/id/<OIDC_ID>:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}

Then create the role:

aws iam create-role \
  --role-name TestkubeS3Role \
  --assume-role-policy-document file://iam-trust-policy-s3.json

aws iam attach-role-policy \
--role-name TestkubeS3Role \
--policy-arn arn:aws:iam::<AWS_ACCOUNT_ID>:policy/TestkubeS3Access
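
The OIDC provider string appears three times in the trust policy, and a mismatch in any one of them makes role assumption fail. As a sketch (the function name render_irsa_trust_policy is hypothetical and not taken from the reference repository), the document can be rendered from the account ID, region, and OIDC ID so that all three occurrences stay consistent:

```shell
# Hypothetical helper: prints the IRSA trust policy to stdout.
# Usage: render_irsa_trust_policy <account-id> <region> <oidc-id>
render_irsa_trust_policy() {
  account_id="$1"
  region="$2"
  oidc_id="$3"
  provider="oidc.eks.${region}.amazonaws.com/id/${oidc_id}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/${provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${provider}:sub": [
            "system:serviceaccount:testkube:testkube-enterprise-api",
            "system:serviceaccount:testkube:testkube-worker-service"
          ],
          "${provider}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
}
```

Redirecting the output to iam-trust-policy-s3.json yields the file referenced by the create-role command.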

Step 3 — Annotate service accounts in your Helm values:

testkube-cloud-api:
  serviceAccount:
    create: true
    name: testkube-enterprise-api
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/TestkubeS3Role"

testkube-worker-service:
  serviceAccount:
    create: true
    name: testkube-worker-service
    annotations:
      eks.amazonaws.com/role-arn: "arn:aws:iam::<AWS_ACCOUNT_ID>:role/TestkubeS3Role"

6. Configure MongoDB Atlas

Testkube requires MongoDB for control-plane data. For AWS deployments, MongoDB Atlas can be reached through public networking with Atlas IP access lists or through Atlas PrivateLink. PrivateLink is recommended for production.

MongoDB Atlas Connection String

Set the MongoDB DSN in your Helm values:

global:
  mongo:
    dsn: "mongodb+srv://<ATLAS_PRIVATE_ENDPOINT_HOST>/?authSource=%24external&authMechanism=MONGODB-AWS&connectTimeoutMS=30000&serverSelectionTimeoutMS=30000&socketTimeoutMS=30000&waitQueueTimeoutMS=30000"

Use authMechanism=MONGODB-AWS when Atlas database users are mapped to AWS IAM roles. The role must be available to the pod through EKS Pod Identity or IRSA.
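
The DSN is long and easy to mistype, in particular the %24external value, which is the URL-encoded form of $external. The sketch below assembles the same connection string from a host name; the function name build_atlas_dsn and the single shared timeout argument are illustrative assumptions, not part of the chart.

```shell
# Hypothetical helper: prints a MONGODB-AWS connection string for the host.
# Usage: build_atlas_dsn <atlas-host> [timeout-ms]   (timeout defaults to 30000)
build_atlas_dsn() {
  host="$1"
  timeout_ms="${2:-30000}"
  # %%24 prints a literal "%24", the URL-encoded "$" in "$external".
  printf 'mongodb+srv://%s/?authSource=%%24external&authMechanism=MONGODB-AWS&connectTimeoutMS=%s&serverSelectionTimeoutMS=%s&socketTimeoutMS=%s&waitQueueTimeoutMS=%s\n' \
    "$host" "$timeout_ms" "$timeout_ms" "$timeout_ms" "$timeout_ms"
}
```

The printed value can be pasted into global.mongo.dsn or reused in the mongosh check, so both use an identical string.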

If you use a database name other than the chart default, also override the chart MongoDB database value for your deployment.

If using Atlas PrivateLink, verify DNS from inside the testkube namespace:

kubectl -n testkube run dns-test \
--rm -i --restart=Never \
--image=busybox:1.36 \
-- nslookup <ATLAS_PRIVATE_ENDPOINT_HOST>

The lookup must return private endpoint records. NXDOMAIN means the Atlas private endpoint DNS name is not resolvable from the cluster.

Verify TCP connectivity to the Atlas hosts and ports:

kubectl -n testkube run mongo-tcp-test \
--rm -i --restart=Never \
--image=nicolaka/netshoot \
-- nc -vz <ATLAS_PRIVATE_ENDPOINT_HOST> 1024

Repeat for the Atlas ports shown in the private endpoint connection string.

MongoDB AWS IAM Authentication Check

Run mongosh with the same service account used by the API:

kubectl -n testkube run mongosh-api-test \
--rm -i --restart=Never \
--image=mongodb/mongodb-community-server:8.0-ubi8 \
--overrides='{"spec":{"serviceAccountName":"testkube-enterprise-api"}}' \
--command -- bash -lc \
'mongosh "mongodb+srv://<ATLAS_PRIVATE_ENDPOINT_HOST>/?authSource=%24external&authMechanism=MONGODB-AWS&connectTimeoutMS=30000&serverSelectionTimeoutMS=30000&socketTimeoutMS=30000&waitQueueTimeoutMS=30000" --eval "db.runCommand({ ping: 1 })"'

Expected result:

{ ok: 1 }

Repeat the same test with serviceAccountName set to testkube-worker-service if worker pods also connect to MongoDB Atlas.

If authentication succeeds but migrations fail with not authorized ... collMod, update the Atlas database user's roles. That error means the AWS identity authenticated successfully, but Atlas did not grant enough MongoDB privileges for schema or index migration.

7. Deploy

helm upgrade --install \
--create-namespace \
--namespace testkube \
-f values.yaml \
testkube oci://us-east1-docker.pkg.dev/testkube-cloud-372110/testkube/testkube-enterprise
tip

The reference repository includes an install.sh script that supports composable flags:

./install.sh --with-pod-identity --production
./install.sh --with-s3 --with-alb --production

8. DNS Setup

Create DNS records (CNAME or Alias) pointing to your NGINX Ingress load balancer for each service:

Record | Default subdomain
Dashboard | dashboard.<domain>
REST API | api.<domain>
gRPC API | agent.<domain>
WebSockets | websockets.<domain>
Storage | storage.<domain>

Get the load balancer hostname:

kubectl get svc -n ingress-nginx ingress-nginx-controller \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
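
Once the load balancer hostname is known, the full set of records can be listed for review before creating them in your DNS provider. This is a sketch only: the function name print_dns_records is hypothetical, and it assumes the default subdomains from the table above and plain CNAME records.

```shell
# Hypothetical helper: prints one CNAME line per default Testkube subdomain.
# Usage: print_dns_records <domain> <load-balancer-hostname>
print_dns_records() {
  domain="$1"
  lb_hostname="$2"
  for sub in dashboard api agent websockets storage; do
    printf '%s.%s CNAME %s\n' "$sub" "$domain" "$lb_hostname"
  done
}
```

The output is plain text for review; it does not create any records.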

9. Verify the Installation

kubectl get pods -n testkube
kubectl get ingress -n testkube

All pods should reach Running status. The Dashboard should be accessible at https://dashboard.<domain>.
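
With many pods, it can help to list only those that have not reached the expected status. The following sketch filters the output of kubectl get pods --no-headers, assuming the default column order (NAME, READY, STATUS, RESTARTS, AGE); the function name list_unready_pods is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints every pod whose STATUS column is neither Running nor Completed.
list_unready_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " is " $3 }'
}
```

For example: kubectl get pods -n testkube --no-headers | list_unready_pods. Empty output means every pod is Running or Completed.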

Using AWS ALB Instead of NGINX

If you prefer the AWS Load Balancer Controller over NGINX Ingress, you need to configure ALB annotations for each Ingress resource. Testkube exposes the gRPC endpoint (agent.<domain>) through a separate Ingress from the REST API, so it needs its own ALB configuration with backend-protocol-version: "GRPC".

testkube-cloud-api:
  # REST API Ingress (api.<domain>)
  ingress:
    className: "alb"
    annotations:
      alb.ingress.kubernetes.io/scheme: "internet-facing"
      alb.ingress.kubernetes.io/target-type: "ip"
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: "<ACM_CERTIFICATE_ARN>"
      alb.ingress.kubernetes.io/ssl-policy: "ELBSecurityPolicy-TLS13-1-2-2021-06"

  # gRPC Ingress (agent.<domain>): requires the GRPC backend protocol version
  grpcIngress:
    enabled: true
    annotations:
      alb.ingress.kubernetes.io/scheme: "internet-facing"
      alb.ingress.kubernetes.io/target-type: "ip"
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: "<ACM_CERTIFICATE_ARN>"
      alb.ingress.kubernetes.io/ssl-policy: "ELBSecurityPolicy-TLS13-1-2-2021-06"
      alb.ingress.kubernetes.io/backend-protocol-version: "GRPC"

testkube-cloud-ui:
  ingress:
    className: "alb"
    annotations:
      alb.ingress.kubernetes.io/scheme: "internet-facing"
      alb.ingress.kubernetes.io/target-type: "ip"
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: "<ACM_CERTIFICATE_ARN>"
      alb.ingress.kubernetes.io/ssl-policy: "ELBSecurityPolicy-TLS13-1-2-2021-06"

dex:
  ingress:
    className: "alb"
    annotations:
      alb.ingress.kubernetes.io/scheme: "internet-facing"
      alb.ingress.kubernetes.io/target-type: "ip"
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
      alb.ingress.kubernetes.io/certificate-arn: "<ACM_CERTIFICATE_ARN>"
      alb.ingress.kubernetes.io/ssl-policy: "ELBSecurityPolicy-TLS13-1-2-2021-06"

warning

The grpcIngress section is critical — without backend-protocol-version: "GRPC", ALB defaults to HTTP/1.1 which breaks gRPC communication. Agents will fail to connect to the control plane.

TLS Certificates with ALB

ALB does not read TLS certificates from Kubernetes Secrets, so cert-manager cannot be used for TLS termination at the ALB. You must use AWS Certificate Manager (ACM) instead.

Two approaches are available:

Option 1 — Explicit ACM ARN (shown above)

Provision a certificate in ACM (or request one via DNS/email validation), then reference it by ARN in each Ingress annotation:

alb.ingress.kubernetes.io/certificate-arn: "arn:aws:acm:<REGION>:<ACCOUNT>:certificate/<CERT_ID>"
tip

A single ACM wildcard certificate (e.g. *.testkube.example.com) can be shared across all Ingress resources — use the same ARN for the API, gRPC, UI, and Dex Ingresses.

Option 2 — ACM certificate auto-discovery

When the certificate-arn annotation is omitted, the AWS Load Balancer Controller looks up ACM certificates whose domain names match the Ingress hostnames and attaches them to the ALB listener. To use this approach, remove the certificate-arn annotation and ensure a matching ACM certificate exists for your domain:

# Request a wildcard certificate
aws acm request-certificate \
--domain-name "*.testkube.example.com" \
--validation-method DNS \
--region <AWS_REGION>

Then complete the DNS validation. Once issued, ALB will pick it up automatically — no ARN annotations needed.

note

When using ALB, disable certificateProvider in your base values to prevent the Helm chart from creating unnecessary cert-manager Certificate resources:

global:
  certificateProvider: ""

Production Hardening

For production deployments, consider the following settings. A complete production overlay is provided in the reference repository.

Replicas and Pod Disruption Budgets:

testkube-cloud-api:
  replicaCount: 2
  podDisruptionBudget:
    enabled: true
    minAvailable: 1

testkube-worker-service:
  replicaCount: 2
  podDisruptionBudget:
    enabled: true
    minAvailable: 1

Pod anti-affinity to spread replicas across nodes:

testkube-cloud-api:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values: ["testkube-cloud-api"]
            topologyKey: "kubernetes.io/hostname"

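Storage class: Use gp3 for EBS-backed PersistentVolumes (MongoDB, NATS). The value is the name of a StorageClass, so a StorageClass named gp3 must already exist in the cluster: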

mongodb:
  persistence:
    storageClass: "gp3"
    size: 50Gi

Troubleshooting

Pods not starting:

kubectl describe pod <pod-name> -n testkube
kubectl logs <pod-name> -n testkube

S3 permission errors (Pod Identity):

# Verify the addon is running
kubectl get ds -n kube-system eks-pod-identity-agent

# Check associations
aws eks list-pod-identity-associations \
--cluster-name <EKS_CLUSTER_NAME> --namespace testkube

# Verify credentials are injected into the API pod
kubectl -n testkube exec deploy/testkube-enterprise-api -- env | grep AWS_CONTAINER

If the environment variables are present, run the aws sts get-caller-identity test pod from Option A — EKS Pod Identity to confirm the service account can assume the expected IAM role.

S3 permission errors (IRSA):

# Verify the annotation is present
kubectl get sa testkube-enterprise-api -n testkube -o yaml

# Verify the OIDC provider ID matches
aws eks describe-cluster --name <EKS_CLUSTER_NAME> \
--query "cluster.identity.oidc.issuer" --output text

gRPC connection issues:

  • Verify HTTP/2 is supported end-to-end through your ingress / load balancer.
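  • If using ALB, confirm that the gRPC Ingress (agent.<domain>) carries the alb.ingress.kubernetes.io/backend-protocol-version: "GRPC" annotation and that the resulting target group uses the gRPC protocol version.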

MongoDB Atlas connection errors:

  • lookup ... no such host or NXDOMAIN: verify the Atlas PrivateLink DNS name and VPC DNS settings from inside the cluster.
  • server selection error or timeout: verify PrivateLink endpoint status, security groups, subnet routing, and Atlas endpoint approval.
  • NoCredentialProviders with MONGODB-AWS: verify Pod Identity or IRSA credentials are injected into the pod.
  • NoCredentialProviders even though Pod Identity variables are injected: verify the Testkube image supports EKS Pod Identity for MongoDB AWS IAM authentication.
  • not authorized ... collMod: grant the Atlas database user migration privileges such as dbAdmin on the Testkube database.

License issues:

kubectl get secret testkube-license -n testkube \
-o jsonpath='{.data.LICENSE_KEY}' | base64 -d