Aerospike Benchmark (asbench)

The Aerospike Benchmark tool is a C-based tool that measures the performance of an Aerospike cluster. It can mimic real-world workloads with configurable record structures, access patterns, and UDF calls.

Info: You can use any port between 1024 and 65535 for Aerospike, as long as the port is not in use by an existing process.

Usage

The --help option of asbench gives an overview of all supported command line options. The -V or --version option prints the current version of asbench.

asbench --help

Connection options

| Option | Default | Description |
|---|---|---|
| -h or --hosts HOST_1:TLSNAME_1:PORT_1[,...] | 127.0.0.1 | List of seed hosts. TLSNAME is only used when connecting with a secure TLS-enabled server. If the port is not specified, the default port is used. IPv6 addresses must be enclosed in square brackets. |
| -p or --port PORT | 3000 | Default port on which to connect to Aerospike. |
| -U or --user USER | | User name. This is mandatory if security is enabled on the server. |
| -P | | User's password for Aerospike servers that require authentication. The system responds by asking you to enter the password. |
| --auth MODE | INTERNAL | Set the authentication mode when user/password is defined. Replace MODE with one of the following: INTERNAL, EXTERNAL, EXTERNAL_INSECURE, or PKI. This mode must be set to EXTERNAL when using LDAP. |
| -tls or --tls-enable | disabled | Use TLS/SSL sockets. |
| --services-alternate | false | Connect through alternate-access-address. Use this when the cluster nodes publish IP addresses through access-address that are not accessible over WAN, and publish WAN-accessible IP addresses through alternate-access-address. |

TLS options

| Option | Default | Description |
|---|---|---|
| --tls-cafile=TLS_CAFILE | | Path to a trusted CA certificate file. |
| --tls-capath=TLS_CAPATH | | Path to a directory of trusted CA certificates. |
| --tls-name=TLS_NAME | | Default TLS name used to authenticate each TLS socket connection. This must match the cluster name. |
| --tls-protocols=TLS_PROTOCOLS | -all +TLSv1.2 if the connection supports TLSv1.2. -all +TLSv1 if it doesn't. | TLS protocol selection criteria. Uses Apache's SSLProtocol format. |
| --tls-cipher-suite=TLS_CIPHER_SUITE | | TLS cipher selection criteria. Uses OpenSSL's Cipher List Format. |
| --tls-keyfile=TLS_KEYFILE | | Path to the key for mutual authentication, if the Aerospike cluster supports it. |
| --tls-keyfile-password=TLS_KEYFILE_PASSWORD | | Password to load a protected TLS keyfile. Replace TLS_KEYFILE_PASSWORD with one of the following: an environment variable (env:VAR), the path to a file (file:PATH), or a string (PASSWORD). If --tls-keyfile-password is specified and no password is provided, the system responds by asking you to enter the user's password. |
| --tls-certfile=TLS_CERTFILE PATH | | Path to the chain file for mutual authentication, if the Aerospike cluster supports it. |
| --tls-cert-blacklist PATH | | Path to a certificate blacklist file. The file should contain one line for each blacklisted certificate. Each line starts with the certificate serial number expressed in hexadecimal format. Serial numbers are only required to be unique per issuer. Each entry may optionally specify the issuer name of the certificate. Example: 86EC7A484 /C=US/ST=CA/O=Acme/OU=Eng/CN=TestChainCA |
| --tls-crl-check | | Enable CRL checking for the leaf certificate. An error occurs if a valid CRL file cannot be found in tls_capath. |
| --tls-crl-check-all | | Enable CRL checking for the entire certificate chain. An error occurs if a valid CRL file cannot be found in tls_capath. |
| --tls-log-session-info | | Log TLS connected session info. |
| --tls-login-only | | Use TLS for node login only. |

The TLS name is only used when connecting with a secure TLS-enabled server.

The following example runs the default benchmark on a cluster of nodes 1.2.3.4 and 5.6.7.8 using:

  • The default Aerospike port of 3000.
  • TLS configured.
  • Namespace TEST.
asbench --hosts 1.2.3.4:cert1:3000,5.6.7.8:cert2:3000 --namespace TEST --tls-enable --tls-cafile /cluster_name.pem --tls-protocols TLSv1.2 --tls-keyfile /cluster_name.key --tls-certfile /cluster_name.pem

Global options

| Option | Default | Description |
|---|---|---|
| -z or --threads THREAD_COUNT | 16 | Number of threads used to perform synchronous read/write commands. Replace THREAD_COUNT with the number of threads. |
| --compress | disabled | Enable binary data compression through the Aerospike client. Internally, this sets the compression policy to true. |
| --socket-timeout MS | 30000 | Read/Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
| --read-socket-timeout MS | 30000 | Read socket timeout in milliseconds. Replace MS with the number of milliseconds. |
| --write-socket-timeout MS | 30000 | Write socket timeout in milliseconds. Replace MS with the number of milliseconds. |
| -T or --timeout MS | 0 | Read/Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
| --read-timeout MS | 0 | Read total timeout in milliseconds. Replace MS with the number of milliseconds. |
| --write-timeout MS | 0 | Write total timeout in milliseconds. Replace MS with the number of milliseconds. |
| --max-retries RETRIES_COUNT | 1 | Maximum number of retries before aborting the current transaction. Replace RETRIES_COUNT with the number of retries. |
| -d or --debug | disabled | Run benchmark in debug mode. |
| -S or --shared | disabled | Use shared memory cluster tending. |
| -C or --replica REPLICA_TYPE | master | Which replica to use for reads. Replace REPLICA_TYPE with one of the following: master (always use the node containing the master partition), any (distribute reads across master and proles in round-robin fashion), sequence (always try master first; if master fails, try proles in sequence), or preferRack (always try a node on the same rack as the benchmark first; if there are no nodes on the same rack, use sequence). The preferRack option requires you to set rack-id. |
| --rack-id RACK_ID | | Which rack this instance of asbench resides on. Required with the preferRack replica policy. Replace RACK_ID with the rack's ID. |
| -N or --read-mode-ap READ_MODE | one | Read mode for AP (availability) namespaces. Replace READ_MODE with either one or all. |
| -B or --read-mode-sc SC_READ_MODE | session | Read mode for SC (strong consistency) namespaces. Replace SC_READ_MODE with one of the following: session, linearize, allowReplica, or allowUnavailable. |
| -M or --commit-level LEVEL | all | Write commit guarantee level. Replace LEVEL with either all or master. |
| -Y or --conn-pools-per-node POOLS_COUNT | 1 | Number of connection pools per node. Replace POOLS_COUNT with the number of connection pools. |
| -D or --durable-delete | disabled | All transactions set the durable-delete flag, which tells the server to generate a tombstone for the deleted record if the transaction results in a delete. |
| --send-key | disabled | Enables the key policy AS_POLICY_KEY_SEND, which sends the key value in addition to the key digest. |
| --sleep-between-retries | 0 | Enables sleep between retries if a transaction fails and the timeout was not exceeded. |
| -c or --async-max-commands COMMAND_COUNT | 50 | Maximum number of concurrent asynchronous commands that are active at any time. Replace COMMAND_COUNT with the number of commands. |
| -W or --event-loops THREAD_COUNT | 1 | Number of event loops (or selector threads) when running in asynchronous mode. Replace THREAD_COUNT with the number of threads. |

Namespace and record format options

| Option | Default | Description |
|---|---|---|
| -n or --namespace NAMESPACE_NAME | test | Aerospike namespace to perform all operations under. Replace NAMESPACE_NAME with the name of the namespace. |
| -s or --set SET_NAME | testset | Aerospike set to perform all operations in. Replace SET_NAME with the name of the set. |
| -b or --bin BIN_NAME | testbin | Base name to use for bins. Replace BIN_NAME with the name you want to use. The first bin is named BIN_NAME, the second is BIN_NAME_2, the third BIN_NAME_3. |
| -K or --start-key KEY_STARTING_VALUE | 0 | Set the starting value of the working set of keys. Replace KEY_STARTING_VALUE with the starting value. If you are using an insert workload, start-key indicates the first value to write. Otherwise, start-key indicates the smallest value in the working set of keys. |
| -k or --keys KEYS_COUNT | 1000000 | Set the number of keys the client is dealing with. Replace KEYS_COUNT with the number of keys. If you are using an insert workload, the client writes this number of keys, starting from value = start-key. Otherwise, the client reads and updates randomly across the values between start-key and start-key + KEYS_COUNT. |
| -o or --object-spec OBJECT_TYPE | I4 | Set the bin specifications. Replace OBJECT_TYPE with a comma-separated list of bin specifications. See object spec for more details. |
| --compression-ratio RATIO | 1 | Sets the compression ratio for binary data. Replace RATIO with the desired ratio. This option causes the benchmark tool to generate binary data which is roughly compressed by this proportion. Note: this is only applied to B<n> binary data, not to any of the other types of record data. |
| -e or --expiration-time | 0 | Set the TTL of all records written in write transactions. Available options are: -1 (no TTL, never expire), -2 (do not modify the record TTL with this write transaction), 0 (adopt the default TTL value from the namespace), and >0 (TTL of the record in seconds). |
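
The bin naming rule described for --bin can be illustrated with a small sketch. The helper name bin_names is hypothetical and is not part of asbench; it only reproduces the naming pattern stated above (BIN_NAME, BIN_NAME_2, BIN_NAME_3, and so on).

```python
def bin_names(base, count):
    """Return the names of `count` bins derived from a base bin name."""
    # The first bin uses the base name unchanged; later bins append _2, _3, ...
    return [base if i == 1 else "{}_{}".format(base, i) for i in range(1, count + 1)]


if __name__ == "__main__":
    # With the default base name and a three-bin object spec:
    print(bin_names("testbin", 3))
```

This can help when writing --read-bins or --write-bins lists, since those options refer to bins by position while the server stores them by name.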

Object spec

The object spec is a flexible way to describe how to structure records being written to the database. The object spec is a comma-separated list of bin specs. Each bin spec is one of the following:

Variable Scalars:

| Type | Format | Description |
|---|---|---|
| Boolean | b | A random boolean bin/value. |
| Integer | I<n> | A random integer with the lower n bytes randomized (and the rest remaining 0). n can range from 1 to 8. Note: the nth byte is guaranteed to not be 0, except in the case n=1. |
| Double | D | A random double bin/value (8 bytes). |
| String | S<n> | A random string of length n of either lowercase letters a-z or numbers 0-9. |
| Binary Data | B<n> | Random binary data of length n bytes. Note: if --compression-ratio is set, only the first ratio * n bytes are random. The rest are 0. |
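
To make the I<n> and B<n> rules concrete, here is a minimal Python sketch of values that satisfy them. It is not asbench's generator: the function names are hypothetical, the integer is treated as unsigned, and rounding ratio * n down is an assumption.

```python
import random


def random_int_bin(n):
    """Random integer whose lower n bytes are randomized (1 <= n <= 8)."""
    if not 1 <= n <= 8:
        raise ValueError("n must be between 1 and 8")
    if n == 1:
        return random.randrange(0, 256)  # a single byte is allowed to be 0
    top = random.randrange(1, 256)  # the nth byte is never 0
    low = random.randrange(0, 1 << (8 * (n - 1)))  # the remaining n-1 bytes
    return (top << (8 * (n - 1))) | low


def random_binary_bin(n, compression_ratio=1.0):
    """n bytes where only the first ratio * n bytes are random; the rest are 0."""
    # Assumption: round down and clamp to the range [0, n].
    random_len = max(0, min(n, int(compression_ratio * n)))
    head = bytes(random.getrandbits(8) for _ in range(random_len))
    return head + bytes(n - random_len)


if __name__ == "__main__":
    print(hex(random_int_bin(4)))
    print(random_binary_bin(10, 0.5))
```

Because the nth byte is non-zero, an I4 value always needs all four bytes, which keeps the encoded record size predictable.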

Constant Scalars:

| Type | Format | Example | Notes |
|---|---|---|---|
| Const Boolean | true/T or false/F | true | The full-word forms are case-insensitive, but T and F must be capitalized. |
| Const Integer | A decimal, hex (0x...), or octal (0...) number | 123 | |
| Const Double | A decimal number with a . | 123.456 | Const doubles may optionally be followed by "f" or "F", but must always contain a decimal. |
| Const String | A backslash-escaped string enclosed in double quotes | "this -> \" is a double quote\n" | Strings are backslash-escaped, but so are most terminals, so make sure to escape your backslashes with backslashes when writing object specs in a command line argument. Additionally, double quotes are often special characters, so escape those too. |

Collection Bins:

| Type | Format | Notes |
|---|---|---|
| List | [BIN_SPEC,...] | A list of one or more bin specs separated by commas. |
| Map | {SCALAR_BIN_SPEC:BIN_SPEC,...} | A list of one or more mappings from a scalar bin spec to a bin spec. A scalar bin spec is anything but a List or Map. These mappings describe the key-value pairs that the Map contains. |

Multipliers

Multipliers are positive integer constants, followed by an asterisk *, preceding a bin spec.

In the root-level object spec, multipliers indicate how many times to repeat a bin spec across separate bins. For example, the following object specs are equivalent:

I, I, I, I, S10, S10, S10         = 4*I, 3*S10
123, 123, 123, "string", "string" = 3*123, 2*"string"

In a list, multipliers indicate how many times to repeat a bin spec in the list. The following are equivalent:

[I, I, I, I, S10, S10, S10] = [4*I, 3*S10]

In a map, multipliers must precede variable scalar keys. Multipliers indicate how many unique key-value pairs of the given format to insert into the map. Multipliers may not precede const key bin specs or value bin specs in a key-value mapping. The following are equivalent:

{I:B10, I:B10, I:B10} = {3*I:B10}
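
The root-level equivalences above can be checked with a short sketch that expands multipliers into one entry per bin. This is an illustration only, not asbench's parser: the function names are hypothetical, and only root-level multipliers are expanded (multipliers inside lists and maps are left as written).

```python
def split_top_level(spec):
    """Split an object spec on commas that are not inside [], {}, or quotes."""
    parts, current = [], []
    depth, in_quotes, i = 0, False, 0
    while i < len(spec):
        ch = spec[i]
        if in_quotes:
            current.append(ch)
            if ch == "\\" and i + 1 < len(spec):
                current.append(spec[i + 1])  # keep the escaped character
                i += 1
            elif ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
            current.append(ch)
        elif ch in "[{":
            depth += 1
            current.append(ch)
        elif ch in "]}":
            depth -= 1
            current.append(ch)
        elif ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    return parts


def expand_root_multipliers(spec):
    """Expand N*SPEC at the root level into N separate bin specs."""
    expanded = []
    for part in split_top_level(spec):
        count, star, rest = part.partition("*")
        if star and count.strip().isdigit():
            expanded.extend([rest.strip()] * int(count))
        else:
            expanded.append(part)  # no leading multiplier on this bin spec
    return expanded


if __name__ == "__main__":
    print(expand_root_multipliers("4*I, 3*S10"))
    print(expand_root_multipliers('3*123, 2*"string"'))
```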

Workloads

The benchmark tool uses workloads to interact with the Aerospike database. The workload types are:

  • I: Linear Insert. Runs over the range of keys specified and inserts a record with that key.
  • RU,READ_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update. Randomly picks keys, and either writes a record with that key or reads a record from database with that key, with probability according to the given read percentage.
    • Replace READ_PCT with a number between 0 and 100. 0 means only do writes. 100 means only do reads.
    • Starting with asbench 1.5 (Tools 7.1), you may optionally provide READ_PCT_ALL_BINS and WRITE_PCT_ALL_BINS to set the percentage of reads and writes that operate on the entire record. The remaining reads and writes operate on only the first bin. The default is 100.
  • RR,READ_PCT[,READ_PCT_ALL_BINS[,REPLACE_PCT_ALL_BINS]]: Random Read/Replace. Same as RU, except it replaces records instead of updating them.
  • RUF,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Function. Same as RU, except may also perform an apply command on the random key with a given UDF function.
    • The percentage of operations that are function calls (UDFs) is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked for at initialization.
  • RUD,READ_PCT,WRITE_PCT[,READ_PCT_ALL_BINS[,WRITE_PCT_ALL_BINS]]: Random Read/Update/Delete. Same as RU, except may also perform deletes on a random key.
    • The percentage of operations that are deletes is 100 - READ_PCT - WRITE_PCT. This value must not be negative, which is checked for at initialization.
  • DB: Delete bins. Same as I, but deletes the record with the given key from the database.
    • In order for DB to delete entire records, it must delete every bin that the record contains. Since bin names are based on their position in the object spec, when running this workload, verify you are using the same object spec you used to generate the records being deleted.
    • If you only want to delete a subset of bins, use write-bins to select which bins to delete. DB deletes all bins by default and uses write-bins to determine which bins to delete.
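
The percentage arithmetic for the random workloads can be sketched as follows. The helper operation_mix is hypothetical and only mirrors the rules stated in the list above; it ignores the optional all-bins percentages, and accepting fractional percentages is an assumption.

```python
def operation_mix(workload):
    """Return the operation percentages implied by a workload string."""
    kind, *args = [field.strip() for field in workload.split(",")]
    if kind in ("RU", "RR"):
        if len(args) < 1:
            raise ValueError("READ_PCT is required")
        read = float(args[0])
        if not 0 <= read <= 100:
            raise ValueError("READ_PCT must be between 0 and 100")
        return {"read": read, "write": 100 - read}
    if kind in ("RUF", "RUD"):
        if len(args) < 2:
            raise ValueError("READ_PCT and WRITE_PCT are required")
        read, write = float(args[0]), float(args[1])
        remainder = 100 - read - write  # share left for UDF calls or deletes
        if remainder < 0:
            raise ValueError("READ_PCT + WRITE_PCT must not exceed 100")
        third = "udf" if kind == "RUF" else "delete"
        return {"read": read, "write": write, third: remainder}
    raise ValueError("not a random workload: " + kind)


if __name__ == "__main__":
    print(operation_mix("RU,80"))
    print(operation_mix("RUF,50,30"))
```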

Workload options

| Option | Default | Description |
|---|---|---|
| --read-bins | all bins | Specifies which bins from the object-spec to load from the database on read transactions. Must be given as a comma-separated list of increasing bin numbers, starting from 1. |
| --write-bins | all bins | Specifies which bins from the object-spec to generate and store in the database on write transactions. Must be given as a comma-separated list of bin numbers, starting from 1. |
| -R or --random | disabled | Use dynamically-generated random bin values for each write transaction instead of fixed values (one per thread) created at the beginning of the workload. |
| -t or --duration SECONDS | 10 for RU,RUF workloads, 0 for I,DB workloads | Specifies the minimum amount of time the benchmark runs for. Replace SECONDS with the number of seconds. For random workloads with no finite amount of work needing to be done, this value must be above 0 for anything to happen. For workloads with a finite amount of work, like linear insertion/deletion, set this value to 0. |
| -w or --workload WORKLOAD_TYPE | RU,50 | Desired workload. Replace WORKLOAD_TYPE with the workload type. |
| --workload-stages PATH/TO/WORKLOAD_STAGES.yaml | disabled | Accepts a path to a workload stages YAML file, which should contain a list of workload stages to run through. See workload stages. |
| -g or --throughput TPS | 0 | Throttle transactions per second to a maximum value. Replace TPS with transactions per second. If transactions per second is zero, throughput is not throttled. |
| --batch-size SIZE | 1 | Enable all batch modes with a number of records to process in each batch call. Replace SIZE with the number of records. Batch mode is applied to the read, write, and delete transactions in I, RU, RR, RUF, and RUD workloads. If batch size is 1, batch mode is disabled. |
| --batch-read-size SIZE | 1 | Enable batch read mode with a number of records to process in each batch read call. Replace SIZE with the number of records. Batch read mode is applied to the read transactions in RU, RR, RUF, and RUD workloads. If batch read size is 1, batch read mode is disabled. Batch read size takes precedence over batch size. |
| --batch-write-size SIZE | 1 | Enable batch write mode with a number of records to process in each batch write call. Replace SIZE with the number of records. Batch write mode is applied to the write transactions in I, RU, RUF, and RUD workloads. If batch write size is 1, batch write mode is disabled. Batch write size takes precedence over batch size. |
| --batch-delete-size SIZE | 1 | Enable batch delete mode with a number of records to process in each batch delete call. Replace SIZE with the number of records. Batch delete mode is applied to the delete transactions in RUD and DB workloads. If batch delete size is 1, batch delete mode is disabled. Batch delete size takes precedence over batch size. |
| -a or --async | disabled | Enable asynchronous mode, which uses the asynchronous variant of every Aerospike C Client method for transactions. |

Workload stages

You can run multiple different workloads in sequence using the --workload-stages option with a workload stage configuration file, which is in YAML format. The configuration file should only be a list of workload stages in the following format:

- stage: 1
  # required arguments
  workload: <workload type>
  # optional arguments
  duration: <seconds>
  tps: max possible with 0 (default), or specified transactions per second
  object-spec: Object spec for the stage. Otherwise, inherits from the previous stage, with the first stage inheriting the global object spec.
  key-start: Key start, otherwise inheriting from the global context
  key-end: Key end, otherwise inheriting from the global context
  read-bins: Which bins to read if the workload includes reads
  write-bins: Which bins to write to if the workload includes writes
  pause: max number of seconds to pause before the stage starts. Waits a random number of seconds between 1 and the pause.
  async: when true/yes, uses asynchronous commands for this stage. Default is false
  random: when true/yes, randomly generates new objects for each write. Default is false
  batch-size: specifies the batch size for all batch transactions for this stage. Default is 1
  batch-read-size: specifies the batch size of reads for this stage. Takes precedence over batch-size. Default is 1
  batch-write-size: specifies the batch size of writes for this stage. Takes precedence over batch-size. Default is 1
  batch-delete-size: specifies the batch size of deletes for this stage. Takes precedence over batch-size. Default is 1
- stage: 2
  ...

Each stage must begin with stage: STAGE_NUMBER, where STAGE_NUMBER is the position of the stage in the list. The stages must appear in order.

When arguments say they inherit from the global context, the value they inherit either comes from a command line argument, or is the default value if no command line argument for that value was given.

Below is an example workload stages file, call it workload.yml.

- stage: 1
  duration: 60
  workload: RU,80
  object-spec: I2,S12,[3*I1]
- stage: 2
  duration: 30
  workload: I
  object-spec: {5*S1:I1}

To use workload.yml with asbench, run the following.

asbench --workload-stages=workload.yml

Latency histograms

There are multiple ways to record latencies measured throughout a benchmark run. All latencies are recorded in microseconds.

| Option | Default | Description |
|---|---|---|
| --output-file | stdout | Specifies an output file to write periodic latency data, which enables tracking of transaction latencies in microseconds in a histogram. Currently uses a default layout. The file is opened in append mode. |
| -L or --latency | disabled | Enables the periodic HDR histogram summary of latency data. |
| --percentiles P_1[,P_2[,P_3...]] | "50,90,99,99.9,99.99" | Specifies the latency percentiles to display in the periodic latency histogram. |
| --output-period SECONDS | 1 | Specifies the period between successive snapshots of the periodic latency histogram. Replace SECONDS with the period in seconds. |
| --hdr-hist PATH_TO_OUTPUT | disabled | Enables the cumulative HDR histogram and specifies the directory to dump the cumulative HDR histogram summary. |

Periodic latency histogram

Periodic latency data is stored in the --output-file specified and recorded in histograms with three ranges of fixed bucket sizes. The three ranges are not configurable. There is one histogram for reads, one for writes, and one for UDF calls.

The three ranges are:

  • 100us to 4000us, bucket width 100us
  • 4000us to 64000us, bucket width 1000us
  • 64000us to 128000us, bucket width 4000us
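
As a rough illustration of the three fixed ranges, the sketch below maps a latency in microseconds to the lower bound of its bucket. The function name is hypothetical. Treating each range as half-open (lower bound included, upper bound excluded) and returning None for latencies outside 100us to 128000us are assumptions, since the text above does not say how those cases are handled.

```python
# (start_us, end_us, bucket_width_us) for the three fixed ranges
RANGES = [
    (100, 4000, 100),
    (4000, 64000, 1000),
    (64000, 128000, 4000),
]


def bucket_lower_bound(latency_us):
    """Return the lower bound of the bucket holding latency_us, or None."""
    for start, end, width in RANGES:
        if start <= latency_us < end:
            # Snap down to the nearest bucket boundary within this range.
            return start + ((latency_us - start) // width) * width
    return None  # outside the ranges covered by this sketch


if __name__ == "__main__":
    for sample in (250, 4500, 70000):
        print(sample, bucket_lower_bound(sample))
```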

Format of the histogram output file:

HIST_NAME UTC_TIME, PERIOD_TIME, TOTAL_LATENCIES, BUCKET_1_LOWER_BOUND:BUCKET_1_LATENCIES, ...
  • HIST_NAME: Name of the histogram, either read_hist, write_hist, or udf_hist.
  • UTC_TIME: UTC time of the end of the interval.
  • PERIOD_TIME: Length of the interval in seconds.
  • TOTAL_LATENCIES: Total number of transaction latencies recorded in the interval.
  • BUCKET_1_LOWER_BOUND: Lower bound of the bucket. Only buckets with at least one latency recorded are displayed, in ascending order of lower bound.
  • BUCKET_1_LATENCIES: Number of transaction latencies falling within the bucket's range.
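
A line in this format can be split into its fields with a few string operations. The sketch below is a hypothetical helper, not part of asbench. The sample line is made up for illustration, and the exact rendering of UTC_TIME and PERIOD_TIME is an assumption, so both are kept as raw strings.

```python
def parse_hist_line(line):
    """Parse one periodic histogram line into a dictionary."""
    name, rest = line.strip().split(" ", 1)
    fields = [field.strip() for field in rest.split(",")]
    buckets = {}
    for field in fields[3:]:
        if not field:
            continue  # tolerate a trailing comma
        lower_bound, count = field.rsplit(":", 1)
        buckets[int(lower_bound)] = int(count)
    return {
        "name": name,
        "utc_time": fields[0],  # kept as text; the timestamp format is assumed
        "period": fields[1],    # kept as text; may carry a unit suffix
        "total": int(fields[2]),
        "buckets": buckets,
    }


if __name__ == "__main__":
    # Made-up sample line that follows the documented field order.
    sample = "read_hist 2024-01-01T00:00:01Z, 1, 5, 100:3, 200:2"
    print(parse_hist_line(sample))
```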

HDR histogram

Transaction latencies can also be recorded in an HDR histogram. There is one HDR histogram for reads, one for writes, and one for UDF calls.

Use one of the following to enable HDR histograms:

  • --latency: Displays select percentiles from the HDR histograms every output-period seconds.
    • The percentiles that are printed when --latency is enabled can be configured with --percentiles followed by a comma-separated list of percentiles. This list must be in ascending order. No percentile can be less than 0, or greater than or equal to 100.
  • --hdr-hist: Writes the full HDR histograms to the given directory in both a human-readable text format (.txt) and a binary encoding of the HDR histogram (.hdrhist).
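
The constraints on the --percentiles list can be expressed as a small validation sketch. The function name is hypothetical, and reading "ascending order" as strictly ascending (no repeated values) is an assumption.

```python
def parse_percentiles(arg):
    """Validate a comma-separated percentile list and return it as floats."""
    values = [float(item) for item in arg.split(",")]
    for value in values:
        if not 0 <= value < 100:
            raise ValueError("percentile out of range [0, 100): %s" % value)
    for previous, current in zip(values, values[1:]):
        if current <= previous:  # assumption: duplicates are also rejected
            raise ValueError("percentiles must be in ascending order")
    return values


if __name__ == "__main__":
    print(parse_percentiles("50,90,99,99.9,99.99"))
```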

UDFs

UDF calls are made in RUF (read/update/function) workloads, being the "function" part of that workload. A key is chosen at random from the range of keys given, and an Aerospike apply call is made on that key with the given UDF function (--udf-function-name) from the given UDF package (--udf-package-name). Optionally, --udf-function-values may be supplied, which takes an object spec describing the arguments passed to the UDF.

Note: The UDF function arguments follow the same rules as the object spec used on records. They randomly generate for every call only if --random is supplied as an argument.

| Option | Default | Description |
|---|---|---|
| -upn or --udf-package-name PACKAGE_NAME | - | The package name for the UDF to be called. Replace PACKAGE_NAME with the package name. |
| -ufn or --udf-function-name FUNCTION_NAME | - | Name of the UDF function in the package to be called. Replace FUNCTION_NAME with the function name. |
| -ufv or --udf-function-values FUNCTION_VALUES | none | Arguments to be passed to the UDF when called, which are given as an object spec (see object spec). Replace FUNCTION_VALUES with the arguments. |