Running Redis in production Kubernetes environments requires careful planning to ensure high availability, data persistence, and optimal performance. While a single Redis instance might work for development, production workloads demand a Redis Cluster that can handle failures gracefully, scale horizontally, and maintain data consistency—similar to how we set up high-availability PostgreSQL with operators.
In this comprehensive guide, we’ll walk through setting up a production-ready Redis Cluster on Kubernetes with high availability, covering everything from basic concepts to advanced configurations that you can deploy in your own cluster.
## Prerequisites

Before we begin, ensure you have:
- A running Kubernetes cluster (version 1.20 or later)
- kubectl configured to communicate with your cluster
- Basic understanding of Kubernetes concepts (Pods, Services, StatefulSets)
- At least 6GB of available memory across your cluster nodes
- A storage provisioner for PersistentVolumes (e.g., local-path, AWS EBS, GCP PD)
## Understanding Redis Cluster Architecture

### Why Redis Cluster?

Redis Cluster provides several advantages over standalone Redis, much like how MySQL Operator deployments enhance database reliability in Kubernetes:
- **Automatic Sharding**: Distributes data across multiple nodes for scalability.
- **High Availability**: Replicates data across nodes with automatic failover.
- **Scalability**: Easily add or remove nodes to adjust capacity.
- **No Single Point of Failure**: Decentralized architecture ensures resilience.
### Redis Cluster Topology

A minimal production Redis Cluster requires:
- **3 master nodes**: Minimum for cluster quorum and automatic failover
- **3 replica nodes**: One replica per master for redundancy
- **Total: 6 nodes**: The standard minimal configuration for production
Each master handles roughly a third of the 16384 hash slots (5461 or 5462 each), and each replica continuously synchronizes data from its master.
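The key-to-slot mapping itself is simple to reproduce. The sketch below implements the CRC16-based algorithm from the Redis Cluster specification—including hash-tag handling—as a standalone illustration; it is not code from this deployment:

```python
# Sketch of Redis Cluster's key-to-slot mapping: CRC16/XMODEM modulo 16384,
# as described in the Redis Cluster specification.

def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0x0000), the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # non-empty tag: hash only the tag contents
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land in the same slot, so they can participate
# in multi-key operations together:
print(hash_slot("{user1000}.following") == hash_slot("{user1000}.followers"))  # True
```

This is also why multi-key commands in a cluster require all keys to hash to the same slot—hash tags are the standard way to arrange that.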
## Setting Up Redis Cluster

### Step 1: Create Namespace

First, create a dedicated namespace for Redis:
```bash
kubectl create namespace redis-cluster
```
### Step 2: Create ConfigMap for Redis Configuration

Create a ConfigMap with the Redis cluster configuration:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-cluster-config
  namespace: redis-cluster
data:
  redis.conf: |
    port 6379
    cluster-enabled yes
    cluster-config-file /data/nodes.conf
    cluster-node-timeout 5000
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    dir /data
    maxmemory 512mb
    maxmemory-policy allkeys-lru
    protected-mode no
    tcp-backlog 511
    timeout 0
    tcp-keepalive 300
    save 900 1
    save 300 10
    save 60 10000
    loglevel notice
    logfile ""
```
Apply the ConfigMap:
```bash
kubectl apply -f redis-configmap.yaml
```
### Step 3: Create Headless Service

A headless service enables the direct pod-to-pod communication required by Redis Cluster:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster
  namespace: redis-cluster
  labels:
    app: redis-cluster
spec:
  clusterIP: None
  ports:
    - port: 6379
      targetPort: 6379
      name: client
    - port: 16379
      targetPort: 16379
      name: gossip
  selector:
    app: redis-cluster
---
apiVersion: v1
kind: Service
metadata:
  name: redis-cluster-external
  namespace: redis-cluster
  labels:
    app: redis-cluster
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
      name: client
  selector:
    app: redis-cluster
```
Apply the services:
```bash
kubectl apply -f redis-service.yaml
```
### Step 4: Deploy Redis StatefulSet

StatefulSets are ideal for Redis Cluster because they provide stable network identities and ordered deployment—concepts we explored in our Istio service mesh setup guide. They offer:
- Stable network identities (predictable DNS names)
- Ordered deployment and scaling
- Persistent storage that follows pods
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
  namespace: redis-cluster
spec:
  serviceName: redis-cluster
  replicas: 6
  selector:
    matchLabels:
      app: redis-cluster
  template:
    metadata:
      labels:
        app: redis-cluster
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - redis-cluster
                topologyKey: kubernetes.io/hostname
      containers:
        - name: redis
          image: redis:7.2-alpine
          ports:
            - containerPort: 6379
              name: client
            - containerPort: 16379
              name: gossip
          command:
            - redis-server
            - /conf/redis.conf
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            tcpSocket:
              port: 6379
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            exec:
              command:
                - redis-cli
                - ping
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 1
            failureThreshold: 3
          volumeMounts:
            - name: conf
              mountPath: /conf
            - name: data
              mountPath: /data
      volumes:
        - name: conf
          configMap:
            name: redis-cluster-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```
Apply the StatefulSet:
```bash
kubectl apply -f redis-statefulset.yaml
```
Wait for all pods to be ready:
```bash
kubectl get pods -n redis-cluster -w
```
You should see output like:
```
NAME              READY   STATUS    RESTARTS   AGE
redis-cluster-0   1/1     Running   0          2m
redis-cluster-1   1/1     Running   0          2m
redis-cluster-2   1/1     Running   0          1m
redis-cluster-3   1/1     Running   0          1m
redis-cluster-4   1/1     Running   0          1m
redis-cluster-5   1/1     Running   0          30s
```
### Step 5: Initialize Redis Cluster

Once all pods are running, initialize the cluster:
```bash
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster create \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-1.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-2.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-3.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-4.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-5.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  --cluster-replicas 1
```
When prompted, type `yes` to accept the configuration.
Expected output:
```
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica redis-cluster-4:6379 to redis-cluster-0:6379
Adding replica redis-cluster-5:6379 to redis-cluster-1:6379
Adding replica redis-cluster-3:6379 to redis-cluster-2:6379
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node redis-cluster-0:6379)
M: [node-id] redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
```
### Step 6: Verify Cluster Status

Check cluster information:
```bash
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli cluster info
```
Output should show:
```
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
```
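In an automated health check you might parse this output yourself. A minimal sketch, with the sample output above hardcoded for illustration—in practice you would capture the text from `redis-cli cluster info` via a client library or subprocess:

```python
# Hedged sketch: parse CLUSTER INFO output and decide whether the cluster
# is healthy. SAMPLE mirrors the expected output shown above.

SAMPLE = """\
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
"""

def parse_cluster_info(raw: str) -> dict:
    # CLUSTER INFO returns one "key:value" pair per line
    info = {}
    for line in raw.strip().splitlines():
        key, _, value = line.partition(":")
        info[key] = value.strip()
    return info

def cluster_healthy(info: dict) -> bool:
    # Healthy: state is ok and all 16384 slots are being served
    return info.get("cluster_state") == "ok" and info.get("cluster_slots_ok") == "16384"

print(cluster_healthy(parse_cluster_info(SAMPLE)))  # True
```

A check like this fits naturally into a liveness script or an alerting probe alongside the Prometheus exporter described later.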
List all cluster nodes:
```bash
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli cluster nodes
```
## Testing High Availability

### Test 1: Basic Read/Write Operations

```bash
# Write a key through one node
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -c SET mykey "Hello Redis Cluster"

# Read it back from a different node
kubectl exec -it redis-cluster-1 -n redis-cluster -- redis-cli -c GET mykey
```
The `-c` flag enables cluster mode, allowing automatic redirection to the node that owns the key.
### Test 2: Verify Data Distribution

```bash
# Write 100 keys; they will be sharded across the three masters
for i in {1..100}; do
  kubectl exec redis-cluster-0 -n redis-cluster -- redis-cli -c SET key$i value$i
done

# Inspect how keys and slots are distributed
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster check \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379
```
### Test 3: Simulate Node Failure

Delete a master pod to test automatic failover:

```bash
# Confirm the pod's current role
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli role

# Delete the pod to simulate a failure
kubectl delete pod redis-cluster-0 -n redis-cluster
```
Watch the cluster recover:
```bash
kubectl get pods -n redis-cluster -w
```
Verify the cluster promoted a replica to master:
```bash
kubectl exec -it redis-cluster-1 -n redis-cluster -- redis-cli cluster nodes
```
You should see that one of the replicas has been promoted to master, and when redis-cluster-0 comes back online, it becomes a replica.
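A quick way to confirm the new topology programmatically is to parse `CLUSTER NODES` output, whose third whitespace-separated field is a comma-separated flags list. A sketch with illustrative sample data—the node IDs and IPs below are made up:

```python
# Hedged sketch: count masters and replicas from CLUSTER NODES output.
# SAMPLE is a hand-written example in the documented line format.

SAMPLE = """\
a1 10.244.1.5:6379@16379 myself,master - 0 0 1 connected 0-5460
b2 10.244.2.5:6379@16379 master - 0 1 2 connected 5461-10922
c3 10.244.3.5:6379@16379 master - 0 1 3 connected 10923-16383
d4 10.244.1.6:6379@16379 slave a1 0 1 1 connected
e5 10.244.2.6:6379@16379 slave b2 0 1 2 connected
f6 10.244.3.6:6379@16379 slave c3 0 1 3 connected
"""

def count_roles(raw: str) -> dict:
    # Field 3 of each line is a flags list like "myself,master" or "slave"
    counts = {"master": 0, "slave": 0}
    for line in raw.strip().splitlines():
        flags = line.split()[2].split(",")
        role = "master" if "master" in flags else "slave"
        counts[role] += 1
    return counts

print(count_roles(SAMPLE))  # {'master': 3, 'slave': 3}
```

After a failover you would expect the same 3/3 split, just with different pods holding the master flag.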
### Test 4: Data Persistence

```bash
# Write a key
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -c SET persistent-key "test-value"

# Delete all pods
kubectl delete pods -n redis-cluster --all

# Wait for the pods to come back
kubectl wait --for=condition=ready pod -l app=redis-cluster -n redis-cluster --timeout=120s

# Verify the data survived the restart
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli -c GET persistent-key
```
## Monitoring Redis Cluster

### Using Redis CLI

Monitor cluster health:

```bash
# Overall cluster health and slot coverage
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster check \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379

# Memory usage
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli INFO memory

# Throughput and hit/miss statistics
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli INFO stats
```
### Deploy Redis Exporter for Prometheus

Create a monitoring deployment to expose Redis metrics—similar to how we monitor microservices with Traefik:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-exporter
  namespace: redis-cluster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis-exporter
  template:
    metadata:
      labels:
        app: redis-exporter
    spec:
      containers:
        - name: redis-exporter
          image: oliver006/redis_exporter:latest
          ports:
            - containerPort: 9121
          env:
            - name: REDIS_ADDR
              value: "redis://redis-cluster-external:6379"
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis-exporter
  namespace: redis-cluster
  labels:
    app: redis-exporter
spec:
  ports:
    - port: 9121
      targetPort: 9121
      name: metrics
  selector:
    app: redis-exporter
```
Apply the exporter:
```bash
kubectl apply -f redis-exporter.yaml
```
## Scaling the Cluster

### Adding New Nodes

To add a new master-replica pair to handle increased load—similar to scaling WordPress on Kubernetes:
```bash
# Scale the StatefulSet from 6 to 8 pods
kubectl scale statefulset redis-cluster --replicas=8 -n redis-cluster

# Wait for the new pods to become ready
kubectl wait --for=condition=ready pod -l app=redis-cluster -n redis-cluster --timeout=120s

# Add redis-cluster-6 as a new (empty) master
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster add-node \
  redis-cluster-6.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379

# Add redis-cluster-7 as a replica
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster add-node \
  redis-cluster-7.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  --cluster-slave

# Move hash slots onto the new, empty master
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli --cluster rebalance \
  redis-cluster-0.redis-cluster.redis-cluster.svc.cluster.local:6379 \
  --cluster-use-empty-masters
```
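The rebalance aims at an even slot split: with 4 masters, each ends up with 16384 / 4 = 4096 slots. A small sketch of that arithmetic—assigning the remainder slots to the first masters is an assumption for illustration, not redis-cli's exact algorithm:

```python
# Hedged sketch: how many hash slots each master holds after an even split.

TOTAL_SLOTS = 16384  # fixed by the Redis Cluster design

def slots_per_master(num_masters):
    base, extra = divmod(TOTAL_SLOTS, num_masters)
    # Illustrative choice: the first `extra` masters take one extra slot
    return [base + 1 if i < extra else base for i in range(num_masters)]

print(slots_per_master(3))  # [5462, 5461, 5461]
print(slots_per_master(4))  # [4096, 4096, 4096, 4096]
```

This also shows why scaling from 3 to 4 masters moves roughly a quarter of all data: about 1365 slots migrate from each existing master to the new one.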
## Security Best Practices

### Enable Authentication

Update the ConfigMap to add password protection:
```yaml
data:
  redis.conf: |
    # ... existing config ...
    requirepass YourStrongPasswordHere
    masterauth YourStrongPasswordHere
```
Store the password in a Secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: redis-password
  namespace: redis-cluster
type: Opaque
stringData:
  password: YourStrongPasswordHere
```
Update the StatefulSet to use the secret:
```yaml
env:
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: redis-password
        key: password
```
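Application code can then pick the password up from the same environment variable. A minimal sketch—the helper name and the `redis-cluster-external` default host are illustrative, not part of any client library:

```python
import os

# Hedged sketch: assemble Redis connection settings from the Secret-backed
# env var. REDIS_PASSWORD matches the name injected via secretKeyRef above.

def redis_connection_kwargs(host="redis-cluster-external"):
    password = os.environ.get("REDIS_PASSWORD")
    if not password:
        raise RuntimeError("REDIS_PASSWORD is not set; check the secretKeyRef")
    # These keyword arguments suit most Redis client constructors (e.g. redis-py)
    return {"host": host, "port": 6379, "password": password}
```

Keeping the password out of the ConfigMap and image, and reading it only from the environment, means rotating it is a Secret update plus a rolling restart.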
### Network Policies

Restrict network access to Redis using Kubernetes network policies—similar to how we enforce security with Kyverno:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: redis-cluster-policy
  namespace: redis-cluster
spec:
  podSelector:
    matchLabels:
      app: redis-cluster
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: application-namespace
      ports:
        - protocol: TCP
          port: 6379
    - from:
        - podSelector:
            matchLabels:
              app: redis-cluster
      ports:
        - protocol: TCP
          port: 6379
        - protocol: TCP
          port: 16379
```
## Backup and Disaster Recovery

### Automated Backups

Create a CronJob for regular backups:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: redis-backup
  namespace: redis-cluster
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: redis:7.2-alpine
              command:
                - /bin/sh
                - -c
                - |
                  # Trigger a background save on each master over the network
                  # (the redis image has no kubectl, so talk to the pods directly)
                  for i in 0 1 2; do
                    redis-cli -h redis-cluster-$i.redis-cluster.redis-cluster.svc.cluster.local BGSAVE
                  done
                  # Give the background saves time to complete
                  sleep 60
                  # Archive the resulting dump files and upload to S3 or other
                  # storage (requires mounting the data volumes or a sidecar)
          restartPolicy: OnFailure
```
## Optimize for Your Workload

Adjust the memory policy based on your use case:
```
# Pure cache: evict any key, least recently used first
maxmemory-policy allkeys-lru

# Mixed cache and persistent data: evict only keys with a TTL
maxmemory-policy volatile-lru

# Data store: never evict; writes fail when memory is full
maxmemory-policy noeviction
```
### Configure Transparent Huge Pages

Redis actually recommends restricting transparent huge pages on the host (to `madvise`, or disabling them entirely), since THP can cause latency spikes and extra memory usage during forked background saves:

```bash
echo 'madvise' | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo 'madvise' | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
```
## Troubleshooting Common Issues

### Issue: Cluster formation fails

```bash
# Check pod logs for errors
kubectl logs redis-cluster-0 -n redis-cluster

# Verify the node responds
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli PING

# Verify pod-to-pod DNS resolution through the headless service
kubectl exec -it redis-cluster-0 -n redis-cluster -- nslookup redis-cluster-1.redis-cluster.redis-cluster.svc.cluster.local
```
### Issue: Pods stuck in CrashLoopBackOff

```bash
# Inspect events and container status
kubectl describe pod redis-cluster-0 -n redis-cluster

# Check that the PersistentVolumeClaims are bound
kubectl get pvc -n redis-cluster

# Verify the mounted configuration
kubectl exec -it redis-cluster-0 -n redis-cluster -- cat /conf/redis.conf
```
### Issue: Data loss after restart

```bash
# Confirm AOF persistence is enabled
kubectl exec -it redis-cluster-0 -n redis-cluster -- redis-cli CONFIG GET appendonly

# Check that persistence files exist on the volume
kubectl exec -it redis-cluster-0 -n redis-cluster -- ls -lh /data/
```
## Conclusion

You now have a production-ready Redis Cluster running on Kubernetes with high availability, automatic failover, and data persistence. This setup can handle node failures gracefully, scale horizontally, and maintain data consistency across your cluster.
The Redis Cluster architecture we’ve implemented provides:
- **Automatic sharding** across multiple masters for horizontal scalability
- **Built-in replication** with automatic failover when masters fail
- **Data persistence** using both RDB snapshots and AOF logs
- **Zero downtime** during node failures and rolling updates
- **Monitoring capabilities** for observability and alerting
As your application grows, you can easily scale the cluster by adding more master-replica pairs and rebalancing hash slots. For advanced traffic management and routing, consider integrating with a service mesh using Istio Gateway API. Remember to regularly test your disaster recovery procedures and monitor cluster health to ensure optimal performance.
Happy coding!