Aerospike Backup Service (ABS)
Overview
The Aerospike Backup Service (ABS) runs on a virtual machine or Docker container and provides a set of REST API endpoints to back up and restore an Aerospike Database cluster. You can perform full and incremental backups and set different backup policies and schedules. There are also several monitoring endpoints to check backup information and the health of the service in general.
See Install for installation and setup instructions to get ABS connected to a sample database for testing.
ABS reads configurations from a YAML file provided when the service is launched. This file arranges the source database clusters, the destination storage, one or more backup policies, and one or more backup routines. See Configuration for details about these configuration files.
The ABS REST API can update the configuration and trigger backups on demand. Restore operations are performed with calls to the REST API. See API Usage Examples to learn how to use the REST API to perform common backup and restore tasks.
Use the OpenAPI generation script included in the GitHub repository to generate an OpenAPI specification for the service that you can host locally. A pre-built OpenAPI specification explaining all configuration and API parameters is also available in Swagger format hosted on GitHub here.
Backups made with ABS are compatible with the `asbackup` and `asrestore` tools.
To restore backups that were made with the `asbackup` tool, wherever they are stored, follow the instructions in Direct restore using a specific backup.
Monitoring
The service exposes a wide variety of system metrics that Prometheus can scrape, including the following application metrics:
Name | Description |
---|---|
aerospike_backup_service_runs_total | Successful backup runs counter |
aerospike_backup_service_incremental_runs_total | Successful incremental backup runs counter |
aerospike_backup_service_skip_total | Full backup skip counter |
aerospike_backup_service_incremental_skip_total | Incremental backup skip counter |
aerospike_backup_service_failure_total | Full backup failure counter |
aerospike_backup_service_incremental_failure_total | Incremental backup failure counter |
aerospike_backup_service_duration_millis | Full backup duration in milliseconds |
aerospike_backup_service_incremental_duration_millis | Incremental backup duration in milliseconds |
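
As an illustration of how the metrics in the table above might be consumed outside of Prometheus, the following minimal sketch parses text in the Prometheus exposition format and keeps only the series whose names start with `aerospike_backup_service_`. The sample payload, its values, and the `routine` label are hypothetical and are not taken from real ABS output. The parser is deliberately simplified and is not a full implementation of the exposition format.

```python
# Hypothetical scrape payload; the values and the "routine" label are
# made up for illustration and may not match real ABS output.
SAMPLE = """
# HELP aerospike_backup_service_runs_total Successful backup runs counter
# TYPE aerospike_backup_service_runs_total counter
aerospike_backup_service_runs_total{routine="daily"} 12
aerospike_backup_service_skip_total 0
aerospike_backup_service_duration_millis{routine="daily"} 5321
go_goroutines 42
"""


def parse_abs_metrics(text, prefix="aerospike_backup_service_"):
    """Map each matching series (name plus optional labels) to its value."""
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labeled sample: everything up to the last "}" is the series.
            head, _, tail = line.rpartition("}")
            series = head + "}"
        else:
            # Unlabeled sample: the series is the first space-separated token.
            series, _, tail = line.partition(" ")
        if not series.startswith(prefix):
            continue
        fields = tail.split()
        if not fields:
            continue
        try:
            # The first field is the value; an optional timestamp may follow.
            samples[series] = float(fields[0])
        except ValueError:
            continue
    return samples


if __name__ == "__main__":
    for series, value in sorted(parse_abs_metrics(SAMPLE).items()):
        print(series, value)
```

In practice, a Prometheus server would scrape and store these series directly. A parser like this is mainly useful for ad hoc checks or smoke tests.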
The service also provides the following endpoints:
- `/metrics` exposes metrics on port 8000 for Prometheus to check performance of the backup service. See Prometheus documentation for instructions.
- `/health` allows monitoring systems to check the service health.
- `/ready` checks whether the service is able to handle requests.
- `/api-docs` serves the API documentation in Swagger UI format.
See the official Kubernetes documentation on liveness and readiness probes for more information.
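
For environments without Kubernetes probes, a small script can poll the `/health` and `/ready` endpoints directly. The sketch below assumes that both endpoints answer a plain HTTP GET with a 2xx status when the check passes. The response codes and the base URL in the usage note are assumptions, not documented ABS behavior. The `opener` parameter is a hypothetical hook that lets a stand-in replace the network call during testing.

```python
import urllib.error
import urllib.request


def probe(base_url, path, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if a GET on base_url + path answers with a 2xx status."""
    url = base_url.rstrip("/") + path
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx replies;
        # refused connections and timeouts surface as URLError or OSError.
        return False


def service_status(base_url, opener=urllib.request.urlopen):
    """Check both probe endpoints and report the results by name."""
    return {
        "healthy": probe(base_url, "/health", opener=opener),
        "ready": probe(base_url, "/ready", opener=opener),
    }
```

A call such as `service_status("http://localhost:8000")` would return a dictionary like `{"healthy": True, "ready": True}` when both checks pass. The host and port here are placeholders, so substitute the address where your ABS instance listens.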
FAQ
What happens when a backup doesn’t finish before another starts in the same routine?
Full Backups:
- Full backups may be run in parallel.
- Full backups always take priority over incremental backups. If an incremental backup is running when a full backup is scheduled, the full backup will start as planned, and the incremental backup will continue running without interruption.
Incremental Backups:
- Incremental backups are skipped if any other backup, full or incremental, is still running.
- Incremental backups will not run until at least one full backup has been successfully completed.
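
The rules above can be summarized as a small decision function. This is only a model of the behavior described in this FAQ, not the ABS implementation. The `RoutineState` type and the `decide` function are hypothetical names, and treating "will not run" as a skip is an interpretation.

```python
from dataclasses import dataclass


@dataclass
class RoutineState:
    """Snapshot of one backup routine at the moment a backup is scheduled."""
    full_running: bool = False
    incremental_running: bool = False
    full_completed_once: bool = False


def decide(kind, state):
    """Return "start" or "skip" for a scheduled backup of the given kind."""
    if kind == "full":
        # Full backups start as planned, even while another backup runs.
        return "start"
    if kind == "incremental":
        # Incremental backups need at least one completed full backup.
        if not state.full_completed_once:
            return "skip"
        # Incremental backups are skipped while any other backup is running.
        if state.full_running or state.incremental_running:
            return "skip"
        return "start"
    raise ValueError(f"unknown backup kind: {kind!r}")
```

For example, `decide("incremental", RoutineState(full_running=True, full_completed_once=True))` returns `"skip"`, while `decide("full", RoutineState(incremental_running=True))` returns `"start"`.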
Can multiple backup routines be performed simultaneously?
Yes, multiple backup routines can run in parallel. Furthermore, it is possible to back up different namespaces from the same cluster using separate routines with different schedules, all running simultaneously.
To manage resource utilization, you can configure the `aerospike-clusters.CLUSTER_NAME.max-parallel-scans` property to limit the number of read threads operating on a single cluster.
Which storage providers are supported?
The backup service supports the following storage providers:
- AWS S3 (or compatible services such as MinIO)
- Microsoft Azure
- Google Cloud Storage
- Local storage (files stored on the same machine where the backup service is running)