Backup and Restore
The operator uses rclone to sync PVC data to and from an S3-compatible backend. All backup operations are driven by a single Secret named s3-backup-credentials in the operator namespace (the namespace where the operator pod runs, typically openclaw-operator-system).
S3 Credentials Secret
Section titled “S3 Credentials Secret”Create the Secret before enabling any backup or restore feature:
apiVersion: v1kind: Secretmetadata: name: s3-backup-credentials namespace: openclaw-operator-system # must match the operator namespacestringData: S3_ENDPOINT: "https://s3.us-east-1.amazonaws.com" # or any S3-compatible URL S3_BUCKET: "my-openclaw-backups" S3_ACCESS_KEY_ID: "<key-id>" S3_SECRET_ACCESS_KEY: "<secret-key>" # S3_REGION: "us-east-1" # optional - see belowS3_ENDPOINT and S3_BUCKET are required. The operator uses rclone’s S3 backend, which is compatible with AWS S3, Backblaze B2, MinIO, Cloudflare R2, Wasabi, Google Cloud Storage (S3-compatible), and any other S3-compatible service.
| Key | Required | Description |
|---|---|---|
S3_ENDPOINT | Yes | S3-compatible endpoint URL (e.g., https://s3.us-east-1.amazonaws.com) |
S3_BUCKET | Yes | Bucket name for backups |
S3_ACCESS_KEY_ID | No | Access key ID. When omitted (together with S3_SECRET_ACCESS_KEY), rclone uses --s3-env-auth=true to authenticate via the provider’s native credential chain. |
S3_SECRET_ACCESS_KEY | No | Secret access key. When omitted (together with S3_ACCESS_KEY_ID), rclone uses --s3-env-auth=true. |
S3_REGION | No | S3 region (e.g., us-east-1). Required for MinIO instances configured with a custom region. Without this, rclone defaults to us-east-1, which causes authentication failures on providers using a different region. |
S3_PROVIDER | No | rclone S3 provider (default: Other). Set to AWS for native AWS credential chain, GCS for Google Cloud Storage, Ceph for Ceph/RadosGW, etc. Setting the correct provider enables provider-specific auth flows and optimizations. See rclone S3 providers. |
When backups run automatically
Section titled “When backups run automatically”| Trigger | Condition |
|---|---|
| Pre-delete backup | Always runs when a CR is deleted, unless openclaw.rocks/skip-backup: "true" annotation is set or persistence is disabled. Subject to spec.backup.timeout (default: 30m) - if the backup does not complete within the timeout, it is skipped and deletion proceeds. |
| Pre-update backup | Runs before each auto-update when spec.autoUpdate.backupBeforeUpdate: true (the default). |
| Periodic (scheduled) | Runs on a cron schedule when spec.backup.schedule is set. See Periodic / scheduled backups below. |
If the s3-backup-credentials Secret does not exist, pre-delete and pre-update backups are silently skipped (deletion and updates proceed normally), and the periodic backup CronJob is not created (a ScheduledBackupReady=False condition is set with reason S3CredentialsMissing).
Backup path format
Section titled “Backup path format”Backups are stored at:
s3://<bucket>/backups/<tenantId>/<instanceName>/<timestamp>Where:
<tenantId>is the value of theopenclaw.rocks/tenantlabel on the instance, or derived from the namespace (e.g., namespaceoc-tenant-abcyieldsabc), or the namespace name itself.<instanceName>ismetadata.nameof theOpenClawInstance.<timestamp>is an RFC3339 timestamp at the time the backup job runs.
The path of the last successful backup is recorded in status.lastBackupPath.
Backup timeout
Section titled “Backup timeout”Pre-delete backups are subject to a configurable timeout (default: 30 minutes). If the backup does not complete within the timeout — whether due to a stuck Job, pod termination issues, or S3 credential errors — the operator logs a warning, emits a BackupTimedOut event, sets the BackupComplete=False condition with reason BackupTimedOut, and proceeds with deletion.
Configure the timeout via spec.backup.timeout:
spec: backup: timeout: "1h" # Allow up to 1 hour for pre-delete backups (default: 30m, min: 5m, max: 24h)Skip backup on delete
Section titled “Skip backup on delete”To delete an instance immediately without waiting for a backup (e.g., the S3 backend is unavailable):
kubectl annotate openclawinstance my-agent openclaw.rocks/skip-backup=truekubectl delete openclawinstance my-agentRestoring an instance
Section titled “Restoring an instance”Set spec.restoreFrom to an existing backup path. The operator runs an rclone restore job to populate the PVC before starting the StatefulSet, then clears the field automatically:
spec: restoreFrom: "backups/my-tenant/my-agent/2026-01-15T10:30:00Z"To find available backups, list the S3 bucket directly (e.g., aws s3 ls s3://my-openclaw-backups/backups/). The status.lastBackupPath field on any existing instance shows its last backup path.
Restore behavior:
- The restore Job runs before the StatefulSet is created (reconcile order: PVC -> restore Job -> StatefulSet)
spec.restoreFromis cleared automatically after a successful restore and the original path is recorded instatus.restoredFrom- The restore Job uses
spec.backup.serviceAccountNamewhen set, so workload identity (IRSA/Pod Identity) works for restores - If the restore Job fails, the operator sets
RestoreComplete=Falseand retries. Delete the failed Job to trigger a fresh attempt
Clone / migrate an instance
Section titled “Clone / migrate an instance”spec.restoreFrom works on new instances (with empty PVCs), not just existing ones. This enables cloning and cross-namespace migration workflows.
Example - clone instance A from namespace X to namespace Y:
# 1. Check the source instance's last backup path:# kubectl get openclawinstance my-agent -n ns-x -o jsonpath='{.status.lastBackupPath}'# -> backups/tenant-x/my-agent/2026-03-15T02:00:00Z
# 2. Create a new instance in the target namespace with restoreFrom:apiVersion: openclaw.rocks/v1alpha1kind: OpenClawInstancemetadata: name: my-agent-clone namespace: ns-yspec: image: repository: ghcr.io/openclaw/openclaw tag: latest restoreFrom: "backups/tenant-x/my-agent/2026-03-15T02:00:00Z" backup: serviceAccountName: "openclaw-backup" # if using workload identityThe operator creates the PVC, runs the restore Job (rclone sync from S3 to the new PVC), then starts the StatefulSet with the restored data. The new instance gets a fresh gateway token - the source instance is unaffected.
ArgoCD integration: The operator auto-clears spec.restoreFrom after a successful restore. To prevent ArgoCD from detecting this as drift, add it to ignoreDifferences:
apiVersion: argoproj.io/v1alpha1kind: Applicationspec: ignoreDifferences: - group: openclaw.rocks kind: OpenClawInstance jsonPointers: - /spec/restoreFromPeriodic / scheduled backups
Section titled “Periodic / scheduled backups”Set spec.backup.schedule to a cron expression to enable periodic backups:
spec: backup: schedule: "0 2 * * *" # Daily at 2 AM UTC historyLimit: 3 # Successful job runs to retain (default: 3) failedHistoryLimit: 1 # Failed job runs to retain (default: 1)The operator creates a Kubernetes CronJob (<instance>-backup-periodic) that:
- Mounts the PVC with fsGroup matching the StatefulSet (hot backup - no downtime or StatefulSet scale-down)
- Uses pod affinity to co-locate on the same node as the StatefulSet pod (required for RWO PVCs)
- Stores each run under a unique timestamped path:
backups/<tenantId>/<instanceName>/periodic/<YYYYMMDDTHHMMSSz> - Uses
ConcurrencyPolicy: Forbidto prevent overlapping backup runs - Runs with the same rclone image and security context (UID/GID 1000) as on-delete backups
Requirements: persistence must be enabled and the s3-backup-credentials Secret must exist. If either is missing, the CronJob is not created and a ScheduledBackupReady=False condition is set.
Removing the schedule: set spec.backup.schedule to an empty string (or remove the backup section entirely) and the CronJob is automatically deleted.
Workload Identity (cloud-native auth)
Section titled “Workload Identity (cloud-native auth)”Instead of static credentials, you can use your cloud provider’s workload identity to authenticate backup Jobs:
- AWS EKS: IRSA or EKS Pod Identity with
S3_PROVIDER=AWS - GKE: Workload Identity Federation with
S3_PROVIDER=GCS(using GCS S3-compatible endpoint) - AKS: Workload Identity with static HMAC keys or a compatible S3 provider
The setup has three parts: (1) a ServiceAccount with provider-specific annotations, (2) the s3-backup-credentials Secret without static keys, and (3) spec.backup.serviceAccountName on the instance.
Example (AWS IRSA):
- Create an IRSA-annotated ServiceAccount in the instance namespace:
apiVersion: v1kind: ServiceAccountmetadata: name: openclaw-backup namespace: oc-tenant-my-tenant annotations: eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/openclaw-backup-role- Omit
S3_ACCESS_KEY_IDandS3_SECRET_ACCESS_KEYfrom the credentials Secret and setS3_PROVIDER:
apiVersion: v1kind: Secretmetadata: name: s3-backup-credentials namespace: openclaw-operator-systemstringData: S3_ENDPOINT: "https://s3.us-east-1.amazonaws.com" S3_BUCKET: "my-openclaw-backups" S3_REGION: "us-east-1" S3_PROVIDER: "AWS" # enables AWS-native credential chain- Reference the ServiceAccount in the instance spec:
spec: backup: schedule: "0 2 * * *" serviceAccountName: "openclaw-backup"When S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY are omitted, the operator passes --s3-env-auth=true to rclone, which uses the provider’s native credential chain. The serviceAccountName is set on all backup and restore Job pods so they inherit the cloud IAM role.
Setting S3_PROVIDER to the correct value (e.g., AWS, GCS) enables provider-specific optimizations in rclone. When left unset, it defaults to Other which works with any S3-compatible backend using static credentials.