CalypsoAI On-Prem Database Backups
Introduction
Regularly backing up your CalypsoAI platform database is essential for business continuity and disaster recovery. This article walks you through best practices for backing up and restoring your database, whether you’re running it inside your EKS cluster or using an external solution like AWS RDS.
Backup Strategy
In-Cluster Database
If your database is hosted within the CalypsoAI EKS cluster, you’ll use the `pg_dump` utility. Here’s how to get started:
1. Install `pg_dump`
- macOS:
  brew install postgresql@15
- Ubuntu:
  sudo apt-get update
  sudo apt-get install postgresql-client-15
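Before moving on, it can help to confirm which client version is actually on your PATH, because `pg_dump` will not dump from a server whose major version is newer than its own. The snippet below is a minimal sketch: `pg_major_version` is a hypothetical helper name, and it assumes the usual `pg_dump (PostgreSQL) 15.x` output format.

```shell
# Hypothetical helper: extract the major version from `pg_dump --version` output,
# e.g. "pg_dump (PostgreSQL) 15.4 (Homebrew)" -> 15.
pg_major_version() {
  printf '%s\n' "$1" | sed -n 's/^.*(PostgreSQL) \([0-9][0-9]*\).*$/\1/p'
}

# Warn when the installed client is not the version installed above.
if command -v pg_dump >/dev/null 2>&1; then
  major="$(pg_major_version "$(pg_dump --version)")"
  if [ "$major" != "15" ]; then
    echo "pg_dump major version is '$major', expected 15" >&2
  fi
fi
```

If `pg_dump` is not found on macOS, note that Homebrew does not link versioned formulae such as `postgresql@15` into your PATH by default, so you may need to add the formula's bin directory yourself.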
2. Prepare Your Environment
Create a `.env` file in your working directory with the following variables (adjust values as needed):
PGHOST=localhost
PGPORT=5432
PGUSER=postgres
PGPASSWORD=change_me
NAMESPACE=cai-moderator
POD_NAME=cai-moderator-postgres-postgresql-0
BACKUP_DIR=./pg_backups
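Both scripts in this article stop with a fairly terse "unbound variable" error if one of these settings is missing, so a quick check up front can save some guesswork. The following is a sketch only: `missing_vars` is a hypothetical helper, and it relies on bash's indirect expansion (`${!name}`), so it needs bash rather than plain sh.

```shell
# Hypothetical helper: print the name of every listed variable that is unset
# or empty, and return non-zero if any were printed.
missing_vars() {
  local name status=0
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  return "$status"
}

# Example usage after loading .env (set -a; source .env; set +a):
#   missing_vars PGHOST PGPORT PGUSER PGPASSWORD NAMESPACE POD_NAME BACKUP_DIR \
#     || { echo "Set the variables listed above in .env" >&2; exit 1; }
```

Because the file holds `PGPASSWORD`, restricting it to your own user (`chmod 600 .env`) is a sensible precaution.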
3. Run the Backup Script
Save the following script in the same directory as your `.env` file and make it executable (`chmod +x`):
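#!/bin/bash
set -euo pipefail

# Export every variable defined in .env
set -a
source .env
set +a

mkdir -p "$BACKUP_DIR"

# Forward the Postgres port from the pod; the trap stops the forward when the script exits
kubectl -n "$NAMESPACE" port-forward pod/"$POD_NAME" "$PGPORT":"$PGPORT" >/dev/null 2>&1 &
PF_PID=$!
trap "kill $PF_PID" EXIT
sleep 10  # give the port-forward time to come up

for DB in moderator keycloak moderator_metrics; do
  DUMP_FILE="$BACKUP_DIR/${DB}_$(date +%F).dump"
  echo "→ Backing up $DB to $DUMP_FILE"
  # -F c writes pg_dump's custom archive format, which pg_restore reads
  pg_dump \
    -h "$PGHOST" \
    -p "$PGPORT" \
    -U "$PGUSER" \
    -F c \
    -f "$DUMP_FILE" \
    "$DB"
done

echo "Backup Complete"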
This will generate a backup file for each database in your specified backup directory.
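A backup is only useful if it can be read back, so you may want to check each archive right after it is written. One lightweight option, sketched below, is to ask `pg_restore --list` to print an archive's table of contents without restoring anything. `verify_dumps` is a hypothetical helper name, and the check assumes the custom-format `.dump` files produced by the backup script.

```shell
# Hypothetical helper: check that every .dump file in a directory has a
# readable table of contents. Prints each unreadable archive and returns
# non-zero if any check fails.
verify_dumps() {
  local dir="$1" dump status=0
  for dump in "$dir"/*.dump; do
    [ -e "$dump" ] || continue   # the directory held no .dump files
    if ! pg_restore --list "$dump" >/dev/null 2>&1; then
      printf 'unreadable: %s\n' "$dump"
      status=1
    fi
  done
  return "$status"
}

# Example usage:
#   verify_dumps ./pg_backups && echo "All dumps are readable"
```

This confirms only that the archive is structurally intact. It does not replace an occasional test restore into a scratch environment.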
Externally Hosted Database (e.g., AWS RDS)
If your database is hosted externally (like in AWS RDS), use your cloud provider’s native backup solutions. For AWS RDS, refer to Amazon’s documentation on automated backups.
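Alongside automated backups, you might also take a manual snapshot before a risky change such as an upgrade. The sketch below uses the AWS CLI's `aws rds create-db-snapshot` command. It assumes the CLI is installed and configured, that the database is a single RDS instance (Aurora clusters use `create-db-cluster-snapshot` instead), and that `calypso-db` stands in for your own instance identifier. Both function names are hypothetical.

```shell
# Hypothetical helper: build a snapshot identifier such as
# "calypso-db-manual-2024-01-31" from an instance name and a date stamp.
snapshot_id() {
  printf '%s-manual-%s\n' "$1" "$2"
}

# Take a manual snapshot of one RDS instance (assumes a configured AWS CLI).
create_manual_snapshot() {
  local instance="$1"
  aws rds create-db-snapshot \
    --db-instance-identifier "$instance" \
    --db-snapshot-identifier "$(snapshot_id "$instance" "$(date -u +%Y-%m-%d-%H%M%S)")"
}

# Example (the instance name is a placeholder):
#   create_manual_snapshot calypso-db
```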
Restoring From Backup
In-Cluster Database
To restore an in-cluster database:
- Scale Down the Application
  kubectl -n cai-moderator scale deployment/cai-moderator --replicas=0
- Run the Restore Script
  Use the same `.env` file as above. Save and execute the following script:
#!/bin/bash
set -euo pipefail

# Export every variable defined in .env
set -a
source .env
set +a

# Forward the Postgres port from the pod; the trap stops the forward when the script exits
kubectl -n "$NAMESPACE" port-forward pod/"$POD_NAME" "$PGPORT":"$PGPORT" >/dev/null 2>&1 &
PF_PID=$!
trap "kill $PF_PID" EXIT
sleep 10  # give the port-forward time to come up

DATABASES="moderator keycloak moderator_metrics"
DUMP_FILES=""

echo "→ Checking for backup files..."
for DB in $DATABASES; do
  echo "→ Looking for latest dump for $DB in $BACKUP_DIR …"
  # "|| true" keeps "set -e" from aborting before the error message below when no file matches
  LATEST_DUMP=$(ls -1t "${BACKUP_DIR}/${DB}_"[0-9]*.dump 2>/dev/null | head -n1 || true)
  if [[ -z "$LATEST_DUMP" ]]; then
    echo "❌ No dump file found for '$DB'"
    echo "Cannot proceed with restore - missing backup files"
    exit 1
  fi
  DUMP_FILES="$DUMP_FILES$DB:$LATEST_DUMP "
  echo "✓ Found dump file for $DB: $LATEST_DUMP"
done
echo "✓ All backup files found - proceeding with restore"

for DB in $DATABASES; do
  CURRENT_DUMP=$(echo "$DUMP_FILES" | tr ' ' '\n' | grep "^$DB:" | cut -d: -f2)
  echo "→ Restoring database: $DB (with original ownership)"
  echo "→ Using dump file: $CURRENT_DUMP"
  # --clean with --create drops and recreates the database before restoring into it.
  # "|| true" lets the script continue when pg_restore exits non-zero,
  # so review the verbose output for genuine failures.
  pg_restore \
    -h "$PGHOST" \
    -p "$PGPORT" \
    -U "$PGUSER" \
    -d postgres \
    --create \
    --clean \
    --verbose \
    "$CURRENT_DUMP" \
    || true

  echo "→ Configuring TimescaleDB for $DB"
  psql -h "$PGHOST" -p "$PGPORT" -U "$PGUSER" -d "$DB" <<'EOF'
CREATE EXTENSION IF NOT EXISTS timescaledb;
SELECT timescaledb_post_restore();
EOF
done

echo "Restore Complete"
- Scale the Application Back Up
  kubectl -n cai-moderator scale deployment/cai-moderator --replicas=1
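The steps above scale back to a single replica. If your deployment normally runs more than one, you may prefer to record the original count before scaling down and reuse it afterwards. The sketch below wraps standard `kubectl` commands in two hypothetical helpers; the namespace and deployment name are taken from the commands above.

```shell
# Hypothetical helpers wrapping standard kubectl commands.

# Print the replica count currently requested for a deployment.
current_replicas() {
  kubectl -n "$1" get deployment "$2" -o jsonpath='{.spec.replicas}'
}

# Scale a deployment, then wait for it to report the new state.
scale_and_wait() {
  local namespace="$1" deployment="$2" replicas="$3"
  kubectl -n "$namespace" scale "deployment/$deployment" --replicas="$replicas"
  kubectl -n "$namespace" rollout status "deployment/$deployment" --timeout=300s
}

# Example:
#   ORIGINAL_REPLICAS="$(current_replicas cai-moderator cai-moderator)"
#   scale_and_wait cai-moderator cai-moderator 0
#   ... run the restore script ...
#   scale_and_wait cai-moderator cai-moderator "$ORIGINAL_REPLICAS"
```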
Externally Hosted Database
For externally hosted databases, follow your cloud provider’s standard restore procedure. For AWS RDS, see restoring from a snapshot.
Note: On AWS RDS, restoring from a snapshot creates a new database instance with its own endpoint. After restoring, update your deployment’s database connection settings (e.g., set the `CAI_MODERATOR_DB_HOST` environment variable to the new cluster endpoint).
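How you change that setting depends on how the platform was deployed. If the variable is set directly on the deployment, `kubectl set env` is one way to update it, as sketched below. If your installation is managed through Helm values or another configuration layer, change the value there instead, because a direct edit may be overwritten on the next release. `set_db_host` is a hypothetical helper, and the endpoint shown is a placeholder.

```shell
# Hypothetical helper: point a deployment at a new database endpoint by
# updating the CAI_MODERATOR_DB_HOST environment variable.
set_db_host() {
  local namespace="$1" deployment="$2" endpoint="$3"
  kubectl -n "$namespace" set env "deployment/$deployment" \
    "CAI_MODERATOR_DB_HOST=$endpoint"
}

# Example (placeholder endpoint):
#   set_db_host cai-moderator cai-moderator restored-db.example.internal
```

Changing an environment variable on a deployment triggers a rolling restart of its pods, so they pick up the new endpoint without a separate restart step.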
Questions?
If you need additional assistance, please contact your CalypsoAI support representative.