PostgreSQL is a powerful, open-source relational database management system known for its robustness and extensibility. When deploying PostgreSQL on Kubernetes, a PostgreSQL Operator can simplify the management of database clusters, including tasks such as provisioning, scaling, backup, and recovery. In this guide, we will walk through installing a PostgreSQL Operator on a Kubernetes cluster and deploying a highly available database with it.

Prerequisites

  • A running Kubernetes cluster (a recent version supported by the operator).
  • kubectl configured to interact with your Kubernetes cluster.
  • At least one available storage class in your Kubernetes cluster.

There are several PostgreSQL Operators available, such as CrunchyData PostgreSQL Operator, Zalando Postgres Operator, and CloudNativePG. For this guide, we will use the CloudNativePG Operator due to its ease of use and active community support.

Installing CloudNativePG Operator

Step 1: Install the CloudNativePG Operator

# Using kubectl (recommended)
kubectl apply -f \
https://raw.githubusercontent.com/cloudnative-pg/cloudnative-pg/release-1.22/releases/cnpg-1.22.0.yaml

# Verify installation
kubectl get deployment -n cnpg-system cnpg-controller-manager

Step 2: Verify the Installation

# Check operator pods
kubectl get pods -n cnpg-system

# Validate Custom Resource Definitions
kubectl get crds | grep postgresql

After running these commands, you should see the operator pods in a Running state and the PostgreSQL Custom Resource Definitions (CRDs) listed.

Expected CRDs:

  • backups.postgresql.cnpg.io
  • clusters.postgresql.cnpg.io
  • poolers.postgresql.cnpg.io
  • scheduledbackups.postgresql.cnpg.io

Understanding PostgreSQL High Availability Architecture

PostgreSQL clusters use streaming replication for high availability:

Architecture Components:

┌─────────────────────────────────────────────────────────────┐
│                      Application Layer                       │
│                     (Your Applications)                      │
└────────────┬─────────────────────────────────┬──────────────┘
             │                                 │
    ┌────────▼────────┐               ┌────────▼────────┐
    │   PgBouncer 1   │               │   PgBouncer 2   │
    │   (Connection   │               │   (Connection   │
    │     Pooling)    │               │     Pooling)    │
    └────────┬────────┘               └────────┬────────┘
             │                                 │
             └───────────────┬─────────────────┘
                             │
                ┌────────────▼────────────┐
                │   Service (ClusterIP)   │
                └────────────┬────────────┘
                             │
           ┌─────────────────┼─────────────────┐
           │                 │                 │
      ┌────▼─────┐      ┌────▼─────┐      ┌────▼─────┐
      │ Primary  │      │ Replica  │      │ Replica  │
      │   Pod    │─────▶│  Pod 1   │      │  Pod 2   │
      │          │      │          │      │          │
      └────┬─────┘      └────┬─────┘      └────┬─────┘
           │                 │                 │
     ┌─────▼─────┐     ┌─────▼─────┐     ┌─────▼─────┐
     │    PVC    │     │    PVC    │     │    PVC    │
     │ (Primary) │     │(Replica 1)│     │(Replica 2)│
     └───────────┘     └───────────┘     └───────────┘

Traffic Flow:

  1. Write Operations → Routed to Primary pod only
  2. Read Operations → Distributed across Primary and Replica pods (see the service listing below)
  3. Streaming Replication → Primary continuously streams WAL to replicas
  4. Automatic Failover → If primary fails, a replica is promoted
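
CloudNativePG implements this routing with per-cluster Services: <cluster>-rw always points at the current primary (writes), <cluster>-ro targets the replicas only, and <cluster>-r targets any instance. Once the cluster from the next section is running, you can list them (postgres-cluster is the cluster name used later in this guide):

# List the Services the operator created for the cluster
kubectl get svc | grep postgres-cluster

# Expected entries:
#   postgres-cluster-rw  -> primary only (writes)
#   postgres-cluster-ro  -> replicas only (reads)
#   postgres-cluster-r   -> any instance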

Deploying Your First PostgreSQL Cluster

Step 1: Create a Basic PostgreSQL Cluster

# postgres-cluster.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
spec:
  instances: 3

  # PostgreSQL configuration
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "1GB"
      work_mem: "16MB"
      maintenance_work_mem: "128MB"

  # Storage configuration
  storage:
    size: 20Gi
    storageClass: standard

  # Bootstrap configuration
  bootstrap:
    initdb:
      database: appdb
      owner: appuser
      secret:
        name: postgres-credentials

  # Resource limits
  resources:
    requests:
      memory: "1Gi"
      cpu: "500m"
    limits:
      memory: "2Gi"
      cpu: "1000m"

Step 2: Create Credentials Secret

# postgres-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
type: kubernetes.io/basic-auth
stringData:
  username: appuser
  password: "YourSecurePassword123!"
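
If you prefer not to keep the password in a manifest file, an equivalent secret can be created imperatively (a sketch using standard kubectl; substitute your own password):

kubectl create secret generic postgres-credentials \
  --type=kubernetes.io/basic-auth \
  --from-literal=username=appuser \
  --from-literal=password='YourSecurePassword123!'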

Step 3: Deploy the Cluster

# Create the secret
kubectl apply -f postgres-secrets.yaml

# Deploy PostgreSQL cluster
kubectl apply -f postgres-cluster.yaml

# Monitor deployment progress
kubectl get pods -l cnpg.io/cluster=postgres-cluster -w

# Check cluster status
kubectl get cluster postgres-cluster

Step 4: Verify Cluster Status

# Describe the cluster
kubectl describe cluster postgres-cluster

# Check replication status
kubectl exec -it postgres-cluster-1 -- psql -U postgres -c \
"SELECT * FROM pg_stat_replication;"

Testing PostgreSQL Connectivity

Step 1: Connect to the Database

# Port-forward to access locally
kubectl port-forward svc/postgres-cluster-rw 5432:5432

# Connect using psql client
psql -h localhost -U appuser -d appdb
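
psql will prompt for the password stored in the postgres-credentials secret created earlier; if you have misplaced it, you can read it back from the Secret:

# Decode the application user's password
kubectl get secret postgres-credentials -o jsonpath='{.data.password}' | base64 -d; echo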

Step 2: Run Test Operations

-- Create a test table
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2),
    stock INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Insert test data
INSERT INTO products (name, description, price, stock) VALUES
    ('Laptop', 'High-performance laptop', 1299.99, 50),
    ('Mouse', 'Wireless mouse', 29.99, 200),
    ('Keyboard', 'Mechanical keyboard', 89.99, 150);

-- Query data
SELECT * FROM products WHERE price > 50;

-- Test JSONB capabilities (a PostgreSQL strength)
CREATE TABLE user_profiles (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50),
    metadata JSONB
);

INSERT INTO user_profiles (username, metadata) VALUES
    ('john_doe', '{"age": 30, "city": "New York", "interests": ["coding", "gaming"]}'),
    ('jane_smith', '{"age": 28, "city": "San Francisco", "interests": ["design", "photography"]}');

-- Query JSON data
SELECT username, metadata->>'city' AS city
FROM user_profiles
WHERE metadata @> '{"interests": ["coding"]}';

Step 3: Test Replication

# Connect to a replica (e.g. postgres-cluster-2, assuming it is not the current primary)
kubectl exec -it postgres-cluster-2 -- psql -U postgres -d appdb

-- Inside psql, verify that the data has been replicated
SELECT COUNT(*) FROM products;
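
To confirm that you are connected to a replica rather than the primary (useful after a failover, when pod numbering no longer tells you the role), run this standard PostgreSQL check in the same psql session:

-- Returns true on a replica (in recovery), false on the primary
SELECT pg_is_in_recovery();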

Setting Up Automated Backups for PostgreSQL

CloudNativePG provides built-in backup capabilities, using Barman Cloud to ship WAL archives and base backups to object storage.

Step 1: Configure Backup Storage

# postgres-cluster-with-backup.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-cluster
spec:
  instances: 3

  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"

  storage:
    size: 20Gi
    storageClass: standard

  # Backup configuration
  backup:
    barmanObjectStore:
      destinationPath: s3://my-postgres-backups/
      s3Credentials:
        accessKeyId:
          name: aws-credentials
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-credentials
          key: SECRET_ACCESS_KEY
      wal:
        compression: gzip
        maxParallel: 4
    retentionPolicy: "30d"

  bootstrap:
    initdb:
      database: appdb
      owner: appuser
      secret:
        name: postgres-credentials
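
The backup configuration above references an aws-credentials secret holding the S3 access keys, which is not created elsewhere in this guide. A minimal sketch (the key names match the references above; the values are placeholders):

# aws-credentials.yaml (placeholder values)
apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials
stringData:
  ACCESS_KEY_ID: "<your-access-key-id>"
  SECRET_ACCESS_KEY: "<your-secret-access-key>"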

Step 2: Create Scheduled Backups

# postgres-scheduled-backup.yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgres-daily-backup
spec:
  # Six-field cron schedule (seconds first): daily at 2 AM
  schedule: "0 0 2 * * *"
  backupOwnerReference: self
  cluster:
    name: postgres-cluster

Step 3: Apply Backup Configuration

# Apply backup configuration
kubectl apply -f postgres-cluster-with-backup.yaml

# Create scheduled backup
kubectl apply -f postgres-scheduled-backup.yaml

# List backups
kubectl get backups

# Check backup status
kubectl describe backup <backup-name>
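
Besides the schedule, an on-demand backup can be triggered at any time by creating a Backup resource that points at the cluster (a minimal sketch; the resource name is arbitrary):

# postgres-ondemand-backup.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: postgres-manual-backup
spec:
  cluster:
    name: postgres-cluster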

Step 4: Test Backup Restoration

# postgres-restore.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres-restored
spec:
  instances: 3

  storage:
    size: 20Gi

  bootstrap:
    recovery:
      source: postgres-cluster
      recoveryTarget:
        targetTime: "2024-01-15 10:00:00"

  externalClusters:
    - name: postgres-cluster
      barmanObjectStore:
        destinationPath: s3://my-postgres-backups/
        s3Credentials:
          accessKeyId:
            name: aws-credentials
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: aws-credentials
            key: SECRET_ACCESS_KEY
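
To run the restore, apply the manifest and watch the new cluster come up alongside the original one:

kubectl apply -f postgres-restore.yaml
kubectl get pods -l cnpg.io/cluster=postgres-restored -w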

Troubleshooting Common Issues

Issue 1: Pod Stuck in Pending State

Diagnosis:

kubectl describe pod <pod-name>

Common Causes & Solutions:

Cause                      Solution
Insufficient resources     Scale nodes or reduce resource requests
Storage unavailable        Check the StorageClass and CSI driver (see checks below)
Node affinity rules        Review affinity/anti-affinity settings
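
For the storage-related cause above, two quick checks:

# Confirm a default or matching StorageClass exists
kubectl get storageclass

# Check whether the cluster's PersistentVolumeClaims are bound
kubectl get pvc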

Issue 2: Database Connection Failures

Diagnosis:

# Check service endpoints
kubectl get endpoints postgres-cluster-rw

# Test connection from pod
kubectl run -it --rm debug --image=postgres:15 --restart=Never -- \
psql -h postgres-cluster-rw -U appuser -d appdb

Issue 3: Replication Lag

Check replication lag on the primary:

SELECT
    client_addr,
    state,
    sync_state,
    replay_lag
FROM pg_stat_replication;

Issue 4: Backup Failures

Diagnosis Steps:

# Check the status of the failed Backup resource
kubectl describe backup <backup-name>

# Inspect the operator logs for backup errors
kubectl logs -n cnpg-system deployment/cnpg-controller-manager

# Inspect the PostgreSQL instance logs
kubectl logs -l cnpg.io/cluster=postgres-cluster -c postgres

PostgreSQL Performance Tips

The parameters below go under spec.postgresql.parameters in the Cluster manifest; treat them as starting points and size them to the memory and storage of your nodes.

postgresql:
  parameters:
    # Memory settings
    shared_buffers: "4GB"
    effective_cache_size: "12GB"
    work_mem: "64MB"
    maintenance_work_mem: "1GB"

    # Checkpoint settings
    checkpoint_completion_target: "0.9"
    wal_buffers: "16MB"

    # Query planner
    random_page_cost: "1.1"  # For SSD storage
    effective_io_concurrency: "200"

    # Write performance
    synchronous_commit: "on"
    wal_compression: "on"
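
Parameter changes are applied by updating the Cluster manifest: CloudNativePG reloads settings that can change at runtime and performs a rolling restart of the instances when a restart is required. To apply the change and watch the rollout:

kubectl apply -f postgres-cluster.yaml
kubectl get pods -l cnpg.io/cluster=postgres-cluster -w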

Conclusion

By following this guide, you have successfully installed the CloudNativePG PostgreSQL Operator on your Kubernetes cluster, deployed a highly available PostgreSQL cluster, and set up automated backups. With the operator managing your PostgreSQL instances, you can focus on developing your applications while ensuring your database is robust, scalable, and resilient.

Happy Coding