Run Aerospike backup
This page describes how to use the Aerospike backup tool (asbackup).
Get started
To run asbackup, specify the following options:
--host: Cluster to back up.
--namespace: Namespace to back up; asbackup backs up one namespace at a time.
--directory: Local directory for the backup files.
Example
A cluster contains a node with IP address 1.2.3.4.
To back up the test namespace on this cluster to the directory backup_2024_08_24, run the following command:
asbackup --host 1.2.3.4 --namespace test --directory backup_2024_08_24
Data is stored in multiple files with the .asb file extension.
By default, each backup file is limited to 250 MiB.
When this limit is reached, asbackup creates a new file.
For the exact backup file format, see the file format specification at the Backup File Format repository on GitHub.
asbackup --host 1.2.3.4 --namespace test --directory ./output_dir --compress zstd
In this example, asbackup writes data from the test namespace as multiple files to the output_dir directory, compressing them using the ZSTD algorithm.
Run with the -Z or --help option to see an overview of all supported command-line options.
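When backups are launched from a script, it can help to assemble the command as an argument list instead of a single string. The following is a minimal sketch that uses only the flags shown above; the helper name build_asbackup_command is hypothetical and not part of any Aerospike tooling.

```python
import shlex


def build_asbackup_command(host, namespace, directory, compress=None):
    """Return the asbackup argument list for a backup to a directory."""
    args = [
        "asbackup",
        "--host", host,
        "--namespace", namespace,
        "--directory", directory,
    ]
    # --compress is optional; the text above names zstd as the algorithm.
    if compress is not None:
        args += ["--compress", compress]
    return args


cmd = build_asbackup_command("1.2.3.4", "test", "./output_dir", compress="zstd")
# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where asbackup is installed.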
Back up to a single file
You can back up the cluster to a single file rather than a directory:
asbackup --host HOST --namespace NAME --output-file FILENAME
Before you back up data to a single file, asbackup runs an estimate on the namespace being backed up and uses it to calculate a 99.9% confidence upper bound on the total size of the resulting file.
The number of estimate samples taken can be controlled with --estimate-samples, with the default being 10000, just as in backup-to-directory size estimate mode.
Connection options
The following options are available when specifying a cluster for backup:
Option | Default | Description |
---|---|---|
-h HOST1:TLSNAME1:PORT1,... or --host HOST1:TLSNAME1:PORT1,... | 127.0.0.1 | Host that acts as the entry point to the cluster. Any nodes in the cluster can be specified. The remaining nodes are discovered automatically. |
-p PORT or --port PORT | 3000 | Port to connect to. |
-U USER or --user USER | - | User name with read permission. Mandatory if the server has security enabled. |
-P PASSWORD or --password | - | Password to authenticate the given user. The first form passes the password on the command line. The second form prompts for the password. |
-A or --auth | INTERNAL | Sets the authentication mode when user and password are defined. The available modes are INTERNAL, EXTERNAL, EXTERNAL_INSECURE, and PKI; the default is INTERNAL. The mode must be set to EXTERNAL when using LDAP. |
-l or --node-list ADDR1:TLSNAME1:PORT1,... | localhost:3000 | While --host and --port automatically discover all cluster nodes, --node-list backs up a subset of cluster nodes by first calculating the subset of partitions owned by the listed nodes, and then backing up that list of partitions. This option is mutually exclusive with --partition-list and --after-digest . |
--parallel N | 1 | Maximum number of scans to run in parallel. If only one partition range is given, or the entire namespace is being backed up, the range of partitions is evenly divided by this number to be processed in parallel. Otherwise, each filter cannot be parallelized individually, so you may only achieve as much parallelism as there are partition filters. |
--tls-enable | disabled | Indicates a TLS connection should be used. |
-S or --services-alternate | false | Set this to true to connect to Aerospike node's alternate-access-address . |
--prefer-racks RACKID1,... | disabled | A comma separated list of rack IDs to prefer when reading records for a backup. This is useful for limiting cross datacenter network traffic. |
Timeout options
The following options control timeouts and retries during a backup:
Option | Default | Description |
---|---|---|
--socket-timeout MS | 10000 | Socket timeout in milliseconds. If this value is 0, it is set to total-timeout. If both are 0, there is no socket idle time limit. |
--total-timeout MS | 0 | Total socket timeout in milliseconds. Default is 0, that is, no timeout. |
--max-retries N | 5 | Maximum number of retries before aborting the current transaction. |
--sleep-between-retries MS | 0 | The amount of time to sleep between retries. |
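The interaction between --socket-timeout and --total-timeout described in the table can be summarized in a few lines. This is only an illustrative sketch of the documented rule, not asbackup's actual code; the function name is hypothetical.

```python
def effective_socket_timeout(socket_timeout_ms=10000, total_timeout_ms=0):
    """Return the socket idle limit in ms, or None when there is no limit."""
    # A socket timeout of 0 falls back to the total timeout.
    if socket_timeout_ms == 0:
        socket_timeout_ms = total_timeout_ms
    # If both values are 0, there is no socket idle time limit.
    return socket_timeout_ms or None


print(effective_socket_timeout())            # defaults from the table
print(effective_socket_timeout(0, 30000))    # falls back to total timeout
print(effective_socket_timeout(0, 0))        # no limit
```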
TLS options
The following security options are available for authentication:
Option | Default | Description |
---|---|---|
--tls-cafile=TLS_CAFILE | - | Path to a trusted CA certificate file. |
--tls-capath=TLS_CAPATH | - | Path to a directory of trusted CA certificates. |
--tls-name=TLS_NAME | - | Default TLS name used to authenticate each TLS socket connection. This must match the cluster name. |
--tls-protocols=TLS_PROTOCOLS | - | Set the TLS protocol selection criteria. This format is the same as Apache's SSL Protocol. If not specified, asbackup uses TLSv1.2 if supported. Otherwise it uses -all +TLSv1 . |
--tls-cipher-suite=TLS_CIPHER_SUITE | - | Set the TLS cipher selection criteria. The format is the same as OpenSSL's Cipher List Format. |
--tls-keyfile=TLS_KEYFILE | - | Path to the key for mutual authentication (if the Aerospike cluster supports it). |
--tls-keyfile-password=TLS_KEYFILE_PASSWORD | - | Password to load a protected TLS keyfile. Can be one of the following: 1) Environment variable: env:VAR 2) File: file:PATH 3) String: PASSWORD. If --tls-keyfile-password is specified and no password is given, asbackup prompts for it on the command line. |
--tls-certfile=TLS_CERTFILE <path> | - | Path to the chain file for mutual authentication if the Aerospike cluster supports it. |
--tls-cert-blacklist <path> | - | Path to a certificate blocklist file. The file should contain one line for each blocklisted certificate. Each line starts with the certificate serial number expressed in hex. Each entry may optionally specify the issuer name of the certificate (serial numbers are only required to be unique per issuer). Example: 867EC87482B2 /C=US/ST=CA/O=Acme/OU=Engineering/CN=TestChainCA |
--tls-crl-check | - | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in TLS_CAPATH . |
--tls-crl-checkall | - | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in TLS_CAPATH . |
--tls-log-session-info | - | Enable logging session information for each TLS connection. |
TLS_NAME is only used when connecting with a secure TLS enabled server.
The following example creates a backup with the following parameters:
- Cluster nodes 1.2.3.4 and 5.6.7.8
- Port 3000
- Namespace test
- Output directory backup_2015_08_24
- TLS enabled
HOST is "HOST1:TLSNAME1:PORT1,...".
asbackup --host 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace test --directory backup_2015_08_24 --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem
Output options
The following options are available for the backup files that the backup tool creates:
Option | Default | Description |
---|---|---|
-d PATH or --directory PATH | - | Directory to store the .asb backup files in. If the directory does not exist, it will be created before use. Mandatory, unless --output-file or --estimate is given. |
-o PATH or --output-file PATH | - | Single file to write the backup to. - means stdout . Mandatory, unless --directory or --estimate is given. |
-q DESIRED-PREFIX or --output-file-prefix DESIRED-PREFIX | - | A desired prefix for all output files. Must be used with the --directory option. |
-e or --estimate | - | Specified in lieu of --directory or --output-file , estimates the average size of a single record in the backup file. Useful for estimating the expected size of a backup before actually starting it. Multiply the returned value by the number of records in the namespace and add 10% for overhead. This option is mutually exclusive to --remove-artifacts and --continue . |
--estimate-samples N | 10000 | Sets the number of record samples to take in a backup estimate. This also sets the number of estimate samples taken for the estimate run before backup-to-file. |
-F LIMIT or --file-limit LIMIT | 250 MiB | File size limit (in MiB) for --directory . If a .asb backup file crosses this size threshold, asbackup will switch to a new file. |
-r or --remove-files | - | Clear directory or remove output file. By default, asbackup refuses to write to a non-empty directory or to overwrite an existing backup file. This option clears the given --directory or removes an existing --output-file . Mutually exclusive to --continue . |
--remove-artifacts | - | Clear directory or remove output file, like --remove-files , without running a backup. This option is mutually exclusive to --continue and --estimate . |
-C or --compact | - | Do not base-64 encode BLOB values. For better readability of backup files, asbackup base-64 encodes BLOB values by default. This option disables the encoding step, which saves space in the backup file. However, be prepared to encounter odd-looking binary data in your backup files. |
-N BANDWIDTH or --nice BANDWIDTH | - | Throttles asbackup 's write operations to the backup file(s) to not exceed the given bandwidth in MiB/s. Effectively also throttles the scan on the server side as asbackup refuses to accept more data than it can write. |
-y ENCRYPTION-ALG or --encrypt ENCRYPTION-ALG | none | Encryption algorithm to be used on backup files as they are written. The options available are aes128 and aes256 . This option must be accompanied by either --encryption-key-file or --encryption-key-env . See compression and encryption. |
-z COMPRESSION-ALG or --compress COMPRESSION-ALG | none | Compression algorithm to be used on backup files as they are written. The only available option is zstd . See compression and encryption. |
--compression-level N | 3 | zstd compression level to be used. See the zstd manual for more information. |
-v or --verbose | disabled | Output considerably more information about the running backup. |
-m or --machine PATH | - | Output machine-readable status updates to the given path, typically a FIFO. |
-L or --records-per-second RPS | 0 | Available only for Aerospike Database 4.7 and later. Limit total returned records per second (RPS). If RPS is zero (the default), a records-per-second limit is not applied. |
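The sizing rule given for --estimate (average record size times record count, plus 10% for overhead) can be worked through as follows. The numbers are made-up inputs, and the file count is only a rough guide, because asbackup switches files after a file crosses the --file-limit threshold.

```python
MIB = 1024 * 1024


def estimate_backup_bytes(avg_record_bytes, record_count, overhead_percent=10):
    """Apply the rule from the --estimate description using integer math."""
    raw = avg_record_bytes * record_count
    return raw + raw * overhead_percent // 100


def approx_file_count(total_bytes, file_limit_mib=250):
    """Rough number of .asb files for a --directory backup (ceiling division)."""
    limit = file_limit_mib * MIB
    return -(-total_bytes // limit)


# Hypothetical inputs: 512-byte average record, 10 million records.
total = estimate_backup_bytes(avg_record_bytes=512, record_count=10_000_000)
print(total, "bytes")
print(approx_file_count(total), "files at the default 250 MiB limit")
```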
Specify incremental backup
Use a timestamp argument in the format YYYY-MM-DD_HH:MM:SS to specify how the backup tool creates incremental backups:
-a or --modified-after YYYY-MM-DD_HH:MM:SS backs up keys time-stamped after the argument.
-b or --modified-before YYYY-MM-DD_HH:MM:SS backs up keys time-stamped before the argument.
You may also back up partitions to create incremental backups. Refer to Partition list.
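A script that schedules incremental backups needs to render the timestamp in the format the flags expect. The sketch below shows one way to do that with the standard library; the helper name is hypothetical, and the datetime is assumed to be in the local timezone of the machine running asbackup.

```python
from datetime import datetime

# strftime pattern for YYYY-MM-DD_HH:MM:SS
ASBACKUP_TIME_FORMAT = "%Y-%m-%d_%H:%M:%S"


def modified_after_args(since):
    """Return the --modified-after flag and its formatted timestamp."""
    return ["--modified-after", since.strftime(ASBACKUP_TIME_FORMAT)]


# Example: back up everything updated after the previous backup's start time.
previous_backup_start = datetime(2024, 8, 24, 13, 5, 0)
print(modified_after_args(previous_backup_start))
```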
Namespace data selection options
The following options are available to specify the target namespace:
Option | Default | Description |
---|---|---|
-n NAMESPACE or --namespace NAMESPACE | - | Namespace to back up. Mandatory. |
-s SETS or --set SETS | All sets | The set(s) to back up. May pass in a comma-separated list of sets to back up. Starting with asbackup 3.9.0, Database 5.2 or later is required for multi-set backup. Note: multi-set backup cannot be used with --filter-exp . |
-B BIN1,BIN2,... or --bin-list BIN1,BIN2,... | All bins | The bins to back up. |
-x or --no-bins | - | Only back up record metadata (digest, TTL, generation count, key). WARNING: No data (bin contents) is backed up. This is not meant for restoration, only testing. This command is unrelated to the legacy single-bin option in the Aerospike database configuration file for Database versions 6.4 and earlier. |
-R or --no-records | - | Do not back up any record data (metadata or bin data). By default, asbackup includes record data, secondary index definitions, and UDF modules. |
-I or --no-indexes | - | Do not back up any secondary index definitions. |
-u or --no-udfs | - | Do not back up any UDF modules. |
-M or --max-records N | 0 = all records. | An approximate limit for the number of records to process. Note: this option is mutually exclusive to --partition-list and --after-digest . |
-a YYYY-MM-DD_HH:MM:SS or --modified-after YYYY-MM-DD_HH:MM:SS | - | Back up data with last-update-time after the specified date-time. The system's local timezone applies. Starting with asbackup 3.9.0, Database 5.2 or later is required. |
-b YYYY-MM-DD_HH:MM:SS or --modified-before YYYY-MM-DD_HH:MM:SS | - | Back up data with last-update-time before the specified date-time. The system's local timezone applies. Starting with asbackup 3.9.0, Database 5.2 or later is required. |
--no-ttl-only | - | Include only records that have no TTL; that is, persistent records. Starting with asbackup 3.9.0, Database 5.2 or later is required. |
Use compression and encryption during backup
You can compress and encrypt backup file data before it is written to the backup file with --compress and --encrypt.
Enable an option by passing it to asbackup together with your chosen algorithm.
Compression
ZSTD, from the Facebook libzstd repository on GitHub, is the only compression algorithm available for asbackup.
For example:
asbackup --host HOST --namespace NAME --compress zstd --compression-level 3
The compression level, set with the optional --compression-level flag, is an integer described in the zstd manual.
If you omit the flag, asbackup uses level 3, which corresponds to zstd's ZSTD_CLEVEL_DEFAULT.
Encryption
There are two available encryption algorithms:
Algorithm | Description |
---|---|
aes128 | AES 128-bit key-digest encryption, which uses the CTR128 algorithm to encrypt data. The SHA256 hash of the encryption key generates the key used by CTR128. |
aes256 | AES 256-bit key-digest encryption, which uses a 256-bit digest of the key for encryption and AES256 as the base encryption algorithm. |
For encryption, you must provide a private key.
The private encryption key may be in PEM format (with --encryption-key-file), or a base64 encoded key passed in through an environment variable (with --encryption-key-env).
For example, using an encryption key file:
asbackup --host HOST --namespace NAME --encrypt aes128 --encryption-key-file KEY.PEM
Using an environment variable:
export PRIVATE_KEY='PRIVATE KEY'
asbackup --host HOST --namespace NAME --encrypt aes256 --encryption-key-env PRIVATE_KEY
Replace the placeholder value 'PRIVATE KEY' with the contents of your private key file between the header and footer.
In the following example the key starts with b3Blb and ends with eNfNpA=:
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAACmFlczI1Ni1jdHIAAAAGYmNyeXB0AAAAGAAAABDWTq8LwB
zXg7xnGj4VNY3GAAAAEAAAAAEAAAAzAAAAC3NzaC1lZDI1NTE5AAAAIHuu8YsX03XGjJ1L
YFbehI4Ha7g8EVybKB3dAAPt/iFq3u9eNfNpA=
-----END OPENSSH PRIVATE KEY-----
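Extracting the part of a key file between the header and footer can be scripted. The following sketch strips the marker lines and joins the remaining lines into one string; joining the lines without newlines is an assumption made here for convenience, and the helper name pem_body is hypothetical.

```python
def pem_body(pem_text):
    """Return the base64 text between the PEM header and footer lines."""
    lines = [line.strip() for line in pem_text.splitlines()]
    # Drop blank lines and the -----BEGIN ...----- / -----END ...----- markers.
    body = [line for line in lines if line and not line.startswith("-----")]
    # Assumption: the key is passed as a single line without newlines.
    return "".join(body)


sample = """-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAACmFlczI1Ni1jdHIAAAAGYmNyeXB0AAAAGAAAABDWTq8LwB
zXg7xnGj4VNY3GAAAAEAAAAAEAAAAzAAAAC3NzaC1lZDI1NTE5AAAAIHuu8YsX03XGjJ1L
YFbehI4Ha7g8EVybKB3dAAPt/iFq3u9eNfNpA=
-----END OPENSSH PRIVATE KEY-----
"""

key = pem_body(sample)
print(key[:5], "...", key[-8:])
```

The resulting string can be placed in the environment variable named by --encryption-key-env, for example through the env argument of subprocess.run.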
When restoring compressed or encrypted backup files with asrestore, you must provide the same compression or encryption values that were used to create the backup.
Partition scanning backup options
Partition list
Back up a list of partition filters using -X, --partition-list LIST. Partition filters can be ranges, individual partitions, or records after a specific digest within a single partition.
This option is mutually exclusive with the -D, --after-digest option described in After specific digest, --node-list, and --max-records.
Default: partitions 0 to 4095, that is, all partitions.
LIST format: FILTER1,FILTER2,...
FILTER format: BEGIN-PARTITION[-PARTITION-COUNT]|DIGEST
- BEGIN-PARTITION: 0 to 4095.
- Either the optional PARTITION-COUNT: 1 to 4096. Default: 1
- Or the optional DIGEST: Base64-encoded string of the desired digest to start at in the specified partition.
When using multiple partition filters, each partition filter is a single scan call and cannot be parallelized with the --parallel option.
For more parallelism, break up the partition filters or run a backup using only one partition filter.
When backing up only a single partition range, the range is automatically divided into N segments of near-equal size, where N is the --parallel value, and the segments are backed up in parallel.
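To make the idea of near-equal segments concrete, the sketch below divides a partition range into a given number of pieces. It illustrates even division in general; the exact boundaries asbackup chooses are not documented here, and the function name is hypothetical.

```python
def split_partition_range(begin, count, parallel):
    """Split `count` partitions starting at `begin` into near-equal segments.

    Returns a list of (begin_partition, partition_count) tuples.
    """
    # Never create more segments than there are partitions.
    n = min(parallel, count)
    base, extra = divmod(count, n)
    segments = []
    start = begin
    for i in range(n):
        # The first `extra` segments take one additional partition.
        size = base + (1 if i < extra else 0)
        segments.append((start, size))
        start += size
    return segments


print(split_partition_range(0, 4096, 4))
print(split_partition_range(361, 10, 3))
```

Each tuple maps onto a BEGIN-PARTITION-PARTITION-COUNT filter, which is one way to break a large filter into smaller ones by hand.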
Examples
-X 361
- Back up only partition 361.
-X 361,529,841
- Back up partitions 361, 529, and 841.
-X 361-10
- Back up 10 partitions, starting with 361 and including 370.
-X VSmeSvxNRqr46NbOqiy9gy5LTIc=
- Back up all records after the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in its partition (which in this case is partition 2389).
-X 0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U=
- Back up partitions 0 to 999 (1000 partitions starting from 0).
- Then back up partition 2222.
- Then back up all records after the digest EjRWeJq83vEjRRI0VniavN7xI0U= in its partition.
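The filter syntax in these examples can be checked before a backup is launched. The following sketch parses a LIST string into ranges and digests. Two points are assumptions rather than documented behavior: that a range must not extend past partition 4095, and that a digest's partition is the low 12 bits of its first two bytes read as a little-endian integer (this reproduces partition 2389 for the digest in the example above). All function names are hypothetical.

```python
import base64
import re

N_PARTITIONS = 4096
DIGEST_BYTES = 20  # Aerospike record digests are 20 bytes long.


def digest_partition(digest_b64):
    """Return the partition ID for a base64-encoded digest (assumed mapping)."""
    raw = base64.b64decode(digest_b64, validate=True)
    if len(raw) != DIGEST_BYTES:
        raise ValueError(f"digest must decode to {DIGEST_BYTES} bytes")
    return int.from_bytes(raw[:2], "little") & (N_PARTITIONS - 1)


def parse_partition_list(text):
    """Parse FILTER1,FILTER2,... into ('range', begin, count) or
    ('digest', partition, digest) tuples."""
    filters = []
    for item in text.split(","):
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", item)
        if match:
            begin = int(match.group(1))
            count = int(match.group(2) or 1)  # PARTITION-COUNT defaults to 1
            if not (0 <= begin < N_PARTITIONS and 1 <= count <= N_PARTITIONS - begin):
                raise ValueError(f"partition range out of bounds: {item}")
            filters.append(("range", begin, count))
        else:
            filters.append(("digest", digest_partition(item), item))
    return filters


print(parse_partition_list("0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U="))
```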
After specific digest
-D, --after-digest DIGEST
Back up records after the specified record digest in that record's partition and all succeeding partitions.
This option is mutually exclusive with the -X, --partition-list option described in Partition list, --max-records, and --node-list.
DIGEST format: Base64-encoded string of the desired digest. This is the same encoding used for backup of digests, so you can copy-and-paste digest identifiers from backup files to use as the command-line argument with -D.
Example
-D EjRWeJq83vEjRRI0VniavN7xI0U=
Filter expression
Backups can be made of only a subset of data matching a provided Aerospike Expression.
You must provide the base-64 encoding of the filter expression, which can be generated using the C client (as_exp_build_b64) or the Java client (Expression.getBytes()).
This option is mutually exclusive with multi-set backup, which is triggered by passing --set with more than one set specified.
To build an expression that filters for bin "name" = "bob", first build the expression in the C client and print out its base-64 encoding:
as_exp_build_b64(b64_exp, as_exp_cmp_eq(as_exp_bin_str("name"), as_exp_str("bob")));
printf("%s\n", b64_exp);
This should print kwGTUQOkbmFtZaQDYm9i. Then, to run a backup with this filter expression, run:
asbackup --filter-exp kwGTUQOkbmFtZaQDYm9i ...
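Before passing a string to --filter-exp, a quick sanity check can confirm that it is well-formed base64 and, for a simple expression like the one above, that the decoded bytes contain the expected bin name and value. This sketch does not validate the expression itself; the helper name is hypothetical.

```python
import base64


def decode_filter_exp(b64_exp):
    """Decode a base64 filter expression, rejecting malformed input."""
    return base64.b64decode(b64_exp, validate=True)


raw = decode_filter_exp("kwGTUQOkbmFtZaQDYm9i")
# For this example, the bin name and the compared value appear as literal bytes.
print(b"name" in raw, b"bob" in raw)
```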
Backup resumption
If a backup job is interrupted, for example if you stop the backup with Ctrl-C, or it fails for any reason other than a failure to write to the disk, the backup state is saved to a .state file.
Pass the path to this .state file to the --continue flag to resume the backup.
All of the same command line arguments, except --remove-files, must be used when continuing a backup.
Option | Default | Description |
---|---|---|
--continue STATE-FILE | disabled | Enables the resumption of an interrupted backup from provided state file. All other command line arguments should match those used in the initial run (except --remove-files , which is mutually exclusive with --continue ). |
--state-file-dst | see below | Specifies where to save the backup state file to. If this points to a directory, the state file is saved within the directory using the same naming convention as backup-to-directory state files. If this does not point to a directory, the path is treated as a path to the state file. |
Default backup state file location
For backups to a file, the backup state is saved to a file with the same name and location as the backup file with .state appended to the filename.
For backups to a directory, the backup state is saved in the directory with name NAMESPACE.asb.state.
If you supply a prefix with --output-file-prefix, the prefix is used in place of NAMESPACE.
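A wrapper script that resumes interrupted backups needs to know where to look for the state file. The sketch below encodes the default naming rules described above; it does not cover --state-file-dst or backups written to stdout, and the function name is hypothetical.

```python
import posixpath


def default_state_file(namespace, output_file=None, directory=None, prefix=None):
    """Return the default .state path for a backup, per the rules above."""
    if output_file is not None:
        # Backup to a single file: same name and location, with .state appended.
        return output_file + ".state"
    if directory is None:
        raise ValueError("either output_file or directory is required")
    # Backup to a directory: NAMESPACE.asb.state, or PREFIX.asb.state
    # when --output-file-prefix was supplied.
    return posixpath.join(directory, (prefix or namespace) + ".asb.state")


print(default_state_file("test", output_file="backups/test.asb"))
print(default_state_file("test", directory="backup_2024_08_24"))
print(default_state_file("test", directory="backup_2024_08_24", prefix="nightly"))
```

The returned path is the value to pass to --continue when rerunning the original command.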
Back up individual hosts
Use --node-list NODE1:PORT,NODE2:PORT to back up data on specific hosts on a partition basis. PORT is the Aerospike service port, by default 3000. The --node-list flag is particularly useful when running multiple asbackup processes, for example one per Aerospike host.
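Running one asbackup process per host could be scripted along the following lines. This is a sketch under assumptions: each process writes to its own directory, the node_N directory naming is invented for illustration, and the function name is hypothetical.

```python
def per_node_commands(nodes, namespace, base_dir):
    """Build one asbackup argument list per node, each with its own directory."""
    commands = []
    for index, node in enumerate(nodes):
        commands.append([
            "asbackup",
            "--node-list", node,          # NODE:PORT, as described above
            "--namespace", namespace,
            # Hypothetical layout: a separate output directory per node.
            "--directory", f"{base_dir}/node_{index}",
        ])
    return commands


for cmd in per_node_commands(["1.2.3.4:3000", "5.6.7.8:3000"], "test", "backup"):
    print(" ".join(cmd))
```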
Throttle data backup
If asbackup can retrieve data from the database faster than it can write data, you may need to throttle the retrieval rate. Use the --nice RATE flag to restrict the rate at which data is written. The rate is specified in MiB/s.
Write to stdout and piping
Instead of a file name, pass - to --output-file to write the backup data to stdout. This is useful for pipes. The following example writes backup data to stdout with --output-file -, and pipes the output to gzip to create a compressed file:
asbackup --host HOST --namespace NAME --output-file - | gzip > FILENAME.GZ
The gzip utility is single-threaded. Using gzip can cause single-CPU core saturation and create a bottleneck. To take advantage of multi-core archive utilities, consider using xz instead.
You can use the --compress runtime option to compress backup data. See Use compression and encryption during backup for more information.
Configure asbackup with configuration files
You can configure asbackup using a configuration file. See Aerospike Tools Configuration for more information.
The following options control configuration file behavior:
Option | Default | Description |
---|---|---|
--no-config-file | disabled | Do not read any configuration file. The configuration file options --no-config-file and --only-config-file are mutually exclusive. |
--instance SUFFIX | - | In the configuration file, you can specify a group of clusters that share a common suffix with the --instance option. Refer to Instances for more information. |
--config-file PATH | - | Read this file after default configuration file. |
--only-config-file PATH | - | Read only this configuration file. The configuration file options --no-config-file and --only-config-file are mutually exclusive. |