Backup & Disaster Recovery
The platform implements an automated “Snapshot & Ship” strategy for disaster recovery.
💾 Backup Strategy
A daily cron job (11:00 PM) executes a backup.sh script that performs the following steps:
- Database Dump: Runs pg_dumpall inside the PostgreSQL container.
- Certificates: Snapshots the Traefik acme.json file to preserve SSL certificates and avoid Let’s Encrypt rate limits on restore.
- Bundling: Collects all project directories and .env files.
- Cloud Storage: Uploads the compressed .tar.gz archive to a secure AWS S3 bucket.
- Cleanup: Deletes the local archive after a successful upload to save disk space.
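The five steps above could be sketched roughly as follows. This is not the actual backup.sh: the container name, database user, host paths, bucket name, and the run_backup function are all assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of backup.sh. Every name and path below is an assumption.
set -eu

DB_CONTAINER="${DB_CONTAINER:-db_container}"          # assumed container name
DB_USER="${DB_USER:-user}"                            # assumed PostgreSQL superuser
ACME_FILE="${ACME_FILE:-/root/traefik/acme.json}"     # assumed host path of acme.json
PROJECTS_DIR="${PROJECTS_DIR:-/root/projects}"        # assumed project root (holds .env files)
S3_BUCKET="${S3_BUCKET:-s3://example-backup-bucket}"  # placeholder bucket
WORK_DIR="${WORK_DIR:-/tmp}"                          # where the archive is built

run_backup() {
  local stamp staging archive
  stamp="$(date +%Y-%m-%d)"
  staging="$(mktemp -d "${WORK_DIR}/backup.XXXXXX")"
  archive="${WORK_DIR}/backup-${stamp}.tar.gz"

  # 1. Database dump: run pg_dumpall inside the PostgreSQL container.
  docker exec "$DB_CONTAINER" pg_dumpall -U "$DB_USER" > "${staging}/dump.sql"

  # 2. Certificates: snapshot Traefik's acme.json.
  cp "$ACME_FILE" "${staging}/acme.json"

  # 3. Bundling: dump.sql and acme.json at the archive root,
  #    followed by the whole project directory (dotfiles included).
  tar -czf "$archive" \
    -C "$staging" . \
    -C "$(dirname "$PROJECTS_DIR")" "$(basename "$PROJECTS_DIR")"

  # 4 and 5. Upload, then delete the local copy only if the upload succeeded.
  if aws s3 cp "$archive" "${S3_BUCKET}/"; then
    rm -rf "$staging" "$archive"
  else
    echo "upload failed, keeping ${archive}" >&2
    return 1
  fi
}
```

With a final run_backup line appended, a crontab entry such as `0 23 * * * /root/backup.sh` (script path assumed) would match the 11:00 PM schedule. Keeping the archive when the upload fails means a failed night never leaves the host without a local copy.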
🚑 Restoration Procedure
To restore the environment on a new host:
- Extract the backup archive to /root.
- Restore SSL certificates: docker cp acme.json traefik_container:/letsencrypt/acme.json
- Start up the core infrastructure.
- Restore Database: cat dump.sql | docker exec -i db_container psql -U user -d db
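The procedure above might be scripted along these lines. Treat it as a sketch under several assumptions: the restore_from_backup function and the COMPOSE_DIR path are hypothetical, the core infrastructure is assumed to be managed with Docker Compose, and the archive is assumed to hold dump.sql and acme.json at its root. Because docker cp needs an existing container as its target, the sketch first creates the containers without starting them, and it waits for PostgreSQL to accept connections before loading the dump. Neither of those two steps is stated in the procedure.

```shell
#!/usr/bin/env bash
# Hypothetical restore sketch. Paths and the Compose layout are assumptions.
set -eu

RESTORE_DIR="${RESTORE_DIR:-/root}"
COMPOSE_DIR="${COMPOSE_DIR:-/root/infrastructure}"  # assumed location of the compose file
TRAEFIK_CONTAINER="${TRAEFIK_CONTAINER:-traefik_container}"
DB_CONTAINER="${DB_CONTAINER:-db_container}"
DB_USER="${DB_USER:-user}"
DB_NAME="${DB_NAME:-db}"

restore_from_backup() {
  local archive="$1"

  # 1. Extract the backup archive.
  tar -xzf "$archive" -C "$RESTORE_DIR"

  # Assumption: create the containers without starting them,
  # so that docker cp has a target on a fresh host.
  (cd "$COMPOSE_DIR" && docker compose up --no-start)

  # 2. Restore SSL certificates.
  docker cp "${RESTORE_DIR}/acme.json" "${TRAEFIK_CONTAINER}:/letsencrypt/acme.json"

  # 3. Start up the core infrastructure.
  (cd "$COMPOSE_DIR" && docker compose up -d)

  # Assumption: wait until PostgreSQL accepts connections.
  until docker exec "$DB_CONTAINER" pg_isready -q -U "$DB_USER"; do
    sleep 2
  done

  # 4. Restore the database from the dump.
  docker exec -i "$DB_CONTAINER" psql -U "$DB_USER" -d "$DB_NAME" < "${RESTORE_DIR}/dump.sql"
}
```

A call would look like `restore_from_backup /root/backup-2024-01-01.tar.gz`, where the file name is only an example. Redirecting dump.sql into psql is equivalent to the cat pipeline in the procedure.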
⚙️ Log Rotation Policy
To prevent disk exhaustion, a global Docker policy is enforced via /etc/docker/daemon.json:
- Max size: 100MB per file.
- Max files: 3 files per container.
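A daemon.json expressing this policy would plausibly look like the file written below. The write_log_policy helper is hypothetical, and the sketch assumes the default json-file logging driver, which the policy does not name explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a log rotation policy to the given path.
# Assumes the json-file logging driver; review before replacing a live daemon.json.
set -eu

write_log_policy() {
  local target="$1"
  cat > "$target" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
EOF
}
```

For example, `write_log_policy /tmp/daemon.json` produces a file that can be reviewed and merged into /etc/docker/daemon.json by hand. The Docker daemon has to be restarted for the change to load, and the new limits apply only to containers created afterwards, so existing containers need to be recreated to pick them up.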
Source: Internal Infrastructure Manual