How to back up and restore your Kubernetes cluster resources and persistent volumes?
Kubernetes, as we know, coordinates a highly available cluster of computers that are connected to work as a single unit. Kubernetes contains a number of abstractions that allow containerized applications to be deployed to the cluster without tying them to individual machines.
In short, Kubernetes is:
- Portable: public, private, hybrid, multi-cloud
- Extensible: modular, pluggable, hookable, composable
- Self-healing: auto-placement, auto-restart, auto-replication, auto-scaling
In this post, we are going to take a look at the steps to back up and restore your Kubernetes cluster resources and persistent volumes using Velero, an open-source tool.
Why Backup?
First off, let’s understand the typical scenarios where you would need a backup:
- In cases of recovery from disaster, it can reduce time for recovery.
- Migration of Kubernetes resources from one cluster to another or to a newer version of Kubernetes.
- Replication of the environment for debugging, development, etc.
Now that we are clear on why backup is needed, we can list out objects to back up in the next section.
What to Backup?
- Kubernetes resources are stored in etcd store. etcd is a consistent and highly-available key-value store used as Kubernetes’ backing store for all cluster data. You can find in-depth information about etcd in the official documentation.
- Application data i.e., persistent volumes, for stateful applications running on your cluster.
How Velero Works
Velero (formerly Heptio Ark) gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes. Velero consists of:
- A server that runs on your cluster
- A command-line client that runs locally
Each Velero operation, for example, on-demand backup, scheduled backup, restore, etc., is a custom resource, defined with a Kubernetes Custom Resource Definition (CRD) and stored in etcd store.
When you run the command velero backup create test-backup:
- The Velero client makes a call to the Kubernetes API server to create a Backup object.
- The BackupController notices the new Backup object and performs validation.
- The BackupController begins the backup operation. It collects the data to back up by querying the API server for resources.
- The BackupController makes a call to the object storage service, e.g., AWS S3, to upload the backup file.
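Since each operation is stored as a custom resource, the Backup object created by the command above can be inspected with kubectl. This is a minimal sketch: the function name show_backup_object is ours, and it assumes Velero is installed in its default velero namespace.

```shell
# Show the Backup custom resource behind a Velero backup.
# Assumption: Velero runs in the "velero" namespace (its default).
# Usage: show_backup_object [backup-name]
show_backup_object() {
  backup_name="${1:-test-backup}"
  kubectl get backups.velero.io "$backup_name" --namespace velero -o yaml
}
```

Calling show_backup_object after the backup has been created should print the object's spec and status as YAML.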
In the next section, we will take a look at the steps to back up and restore your Kubernetes cluster resources and persistent volumes.
Prerequisites
The following prerequisites are required for this quick start:
- A Kubernetes cluster with the latest stable release of Kubernetes
- kubectl CLI
Before the installation, let us check if we have got the right Kubernetes version.
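One way to do this check is with kubectl itself. The wrapper below is only a sketch (the function name check_kubernetes_version is ours) and assumes kubectl is already configured to talk to your cluster.

```shell
# Print the kubectl client version and the Kubernetes server version.
# Assumption: the current kubectl context points at the target cluster.
check_kubernetes_version() {
  kubectl version
}
```

Run check_kubernetes_version and compare the reported server version with the release you expect.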
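Step #1. Download Velero
Download a release of Velero with the command below, replacing <VERSION> with a release tag from the Velero releases page (for example, v1.1.0):
curl -LO https://github.com/vmware-tanzu/velero/releases/download/<VERSION>/velero-<VERSION>-linux-amd64.tar.gz
I’m using Linux; for other platforms, see the releases page. Untar the downloaded file and move the velero executable to /usr/local/bin or another directory on your PATH.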
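The untar-and-move step could be scripted roughly as follows. This is a sketch under a few assumptions: the helper name install_velero_client is hypothetical, the archive is expected to contain an executable named velero, and writing to /usr/local/bin may require root privileges on your machine.

```shell
# Extract a Velero release tarball and copy the client binary onto the PATH.
# Usage: install_velero_client <tarball> [destination-dir]
install_velero_client() {
  tarball="$1"
  dest="${2:-/usr/local/bin}"
  workdir="$(mktemp -d)" || return 1
  # Unpack into a scratch directory so nothing is left behind.
  tar -xzf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }
  # Locate the velero executable inside the unpacked archive.
  binary="$(find "$workdir" -type f -name velero | head -n 1)"
  if [ -z "$binary" ]; then
    echo "velero executable not found in $tarball" >&2
    rm -rf "$workdir"
    return 1
  fi
  cp "$binary" "$dest/velero" && chmod 0755 "$dest/velero"
  status=$?
  rm -rf "$workdir"
  return "$status"
}
```

For example, install_velero_client with the path of the downloaded tarball and "$HOME/bin" as the second argument installs the client into a user-owned directory, after which velero version --client-only should print the client version.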
Create a Velero-specific credentials file (credentials-velero) in your local directory:
echo "[default]
aws_access_key_id = minio
aws_secret_access_key = minio123" > credentials-velero
Velero needs a storage provider to store backup and snapshot data. For this demo, we are going to use Minio, an S3-compatible storage service that runs locally on the cluster. The credentials above are the ones used by the Minio instance. Refer to the list of supported storage providers for other options.
Now that we have credentials and cluster ready, we can install the Velero server.
Step #2. Install Velero Server
In the steps below, we start the Velero server and the local storage service. Run the following commands from the extracted Velero folder.
Deploy the local storage service with the command below:
kubectl apply -f examples/minio/00-minio-deployment.yaml
Install and start the Velero server with the command below:
velero install \
--provider aws \
--bucket velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero.svc:9000
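For now, we are assuming Velero is running within a local cluster without a volume provider capable of snapshots. Note that from Velero v1.2 onward, provider plugins are no longer bundled with the server, so velero install also needs a --plugins flag naming the provider plugin image (for AWS and Minio, velero/velero-plugin-for-aws).
Check that the Velero deployments were created successfully.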
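A quick way to run this check is to list the deployments in the namespace Velero installs into. The sketch below assumes the default velero namespace, where the example Minio deployment also lives; the function name is ours.

```shell
# List the deployments created by the Minio manifest and `velero install`.
# Assumption: both live in the "velero" namespace (the default).
check_velero_deployments() {
  kubectl get deployments --namespace velero
}
```

Both the minio and velero deployments should eventually report their replicas as available.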
Step #3. Deploy Sample Application
The next step is to deploy the sample nginx application on the cluster with the following command, again run from the Velero folder:
kubectl apply -f examples/nginx-app/base.yaml
Check that the sample application deployments were created successfully.
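The check could look like the sketch below. It assumes the sample manifest creates its resources in a namespace called nginx-example, as the Velero example does at the time of writing; adjust the namespace if your copy of base.yaml differs. The function name is ours.

```shell
# List the deployments of the sample nginx application.
# Assumption: base.yaml deploys into the "nginx-example" namespace.
check_sample_app() {
  kubectl get deployments --namespace nginx-example
}
```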
Step #4. Backup
Now we will create a backup of every object that matches the app=nginx label selector:
velero backup create nginx-backup --selector app=nginx
If you want to back up all objects except those labeled backup=ignore, use the --selector 'backup notin (ignore)' option instead.
There are also options for creating scheduled backups based on a cron expression.
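As an illustration, a scheduled backup of the same nginx objects might be defined as below. The schedule name nginx-daily and the 01:00 daily cron expression are example values, and the wrapper function name is ours.

```shell
# Create a schedule that backs up everything labeled app=nginx once a day.
# The cron expression "0 1 * * *" means 01:00 every day.
schedule_nginx_backup() {
  velero schedule create nginx-daily --schedule="0 1 * * *" --selector app=nginx
}
```

Backups created by a schedule are named after the schedule with a timestamp suffix, and velero schedule get lists the schedules that exist.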
To verify that the backup has completed, use the describe command as below:
velero backup describe nginx-backup
With the backup complete, we can test the restore operation by deleting the namespace that holds the sample application.
Then verify that the Nginx service and deployment are deleted.
It usually takes a few minutes for the namespace to be fully cleaned up.
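The simulated disaster and the follow-up check can be sketched as two small helpers. As before, the nginx-example namespace is an assumption taken from the Velero sample manifest, and the function names are hypothetical.

```shell
# Delete the sample application's namespace to simulate a disaster.
# Assumption: the sample app lives in the "nginx-example" namespace.
simulate_disaster() {
  kubectl delete namespace nginx-example
}

# Confirm that the deployment, service and namespace are gone.
check_namespace_gone() {
  kubectl get deployments --namespace nginx-example
  kubectl get services --namespace nginx-example
  kubectl get namespace nginx-example
}
```

Once cleanup has finished, the first two queries should find no resources and the last one should report that the namespace does not exist.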
Step #5. Restore
To list the backups we have created, use the command below:
velero backup get
To restore the backup we have created, use the below command:
velero restore create --from-backup nginx-backup
After the restore finishes, you can check that the restored deployments are back in the namespace.
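A possible check, again assuming the sample application's nginx-example namespace, lists the restored deployments and services in one call. The function name is ours.

```shell
# List the restored deployments and services of the sample application.
# Assumption: the backup was taken from the "nginx-example" namespace.
check_restored_app() {
  kubectl get deployments,services --namespace nginx-example
}
```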
If there are errors or warnings during the restore operation, you can use the below command to check the details:
velero restore describe <RESTORE_NAME>
Congrats! We have successfully created a backup and restored from it.
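If some limitation prevents you from using Velero, you can always use the kubectl CLI to export resource definitions from your existing Kubernetes cluster and then apply them to your target cluster. The following command exports Deployment objects. Note that the --export flag used in older guides was deprecated in Kubernetes 1.14 and removed in 1.18, so the output includes cluster-specific fields such as status and resourceVersion that you may want to strip before applying it elsewhere.
kubectl get deployment -o=yaml > deployments.yaml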
Limitations
Below are the known limitations of Velero:
- Velero currently supports a single set of credentials per provider. It’s not yet possible to use different credentials for different locations.
- Volume snapshots are limited by where your provider allows you to create snapshots. For example, AWS and Azure do not allow you to create a volume snapshot in a different region than where the volume is.
- Each Velero backup has one BackupStorageLocation, and one VolumeSnapshotLocation per volume provider. It is not possible to send a single Velero backup to multiple backup storage locations simultaneously, or a single volume snapshot to multiple locations simultaneously.
- Cross-provider snapshots are not supported.
Useful Resources
- Documentation
- Troubleshooting docs
- Supported storage providers
- Examples