Validation tool (asvalidation)

This page describes Aerospike’s validation tool (asvalidation), which scans all records in a namespace and validates bins with Collection Data Type (CDT) values (List and Map bins). Optionally, it attempts to repair any damage it detects. Records with unrecoverable CDT errors are written to output if an output file is specified. Records that contain no CDTs, or whose CDTs have no detected errors, are ignored.

Download asvalidation

Download asvalidation, or visit its repo aerospike/aerospike-tools-validation.

Modes of operation

You can run asvalidation in validation mode or fix mode. Aerospike recommends first running in validation mode, limited by --max-records, to see the kinds of errors it discovers. Afterwards, run it in fix mode.

  • Validation mode discovers problems and produces a report. This is the default mode.
  • Fix mode attempts to correct discovered problems where possible. It is triggered by the --cdt-fix-ordered-list-unique option. Fix counts should report zero in the summary printout.
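
The two-pass workflow described above can be sketched as a small Python helper that only assembles the command lines and does not execute anything. It uses options documented on this page (-n, -o, --max-records, --cdt-fix-ordered-list-unique). The helper name build_asvalidation_cmd, the sample limit of 1000 records, and the second output file name are illustrative assumptions.

```python
from typing import List, Optional


def build_asvalidation_cmd(namespace: str, output_file: str,
                           max_records: Optional[int] = None,
                           fix: bool = False) -> List[str]:
    """Assemble an asvalidation argument list (hypothetical helper)."""
    cmd = ["asvalidation", "-n", namespace, "-o", output_file]
    if max_records is not None:
        # Approximate limit on the number of records to process.
        cmd += ["--max-records", str(max_records)]
    if fix:
        # Without this option the tool stays in validation mode.
        cmd.append("--cdt-fix-ordered-list-unique")
    return cmd


# Pass 1: validation mode on a limited sample (1000 is an arbitrary example).
validate_cmd = build_asvalidation_cmd(
    "myNamespace", "asvalidationOutput.txt", max_records=1000)

# Pass 2: fix mode, written to a separate (hypothetical) output file so the
# first report is not overwritten.
fix_cmd = build_asvalidation_cmd(
    "myNamespace", "asvalidationFixOutput.txt", fix=True)

print(" ".join(validate_cmd))
print(" ".join(fix_cmd))
```

Keeping the two passes as separate command lines makes it easy to review the validation report before anything is changed on the server.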

Run in validation mode

In the following example, myNamespace is checked and output is written to asvalidationOutput.txt.

  • asvalidation was run in validation mode, because the --cdt-fix-ordered-list-unique option was not specified.
  • It found 100 lists: 100 Lists.
  • It found no Maps: 0 Maps.
  • 10 of the lists were corrupted: 10 Need Fix under Lists.
  • The reason for corruption is shown as out of order: 10 Order.
  • Because it was run in validation mode, no fixes were applied: 0 Fixed.
> asvalidation -n myNamespace -o asvalidationOutput.txt
...
2024-05-01 22:12:28 GMT [INF] [24662] Found 10 invalid record(s) from 1 node(s), 2620 byte(s) in total (~262 B/rec)
2024-05-01 22:12:28 GMT [INF] [24662] CDT Mode: validate
2024-05-01 22:12:28 GMT [INF] [24662] 100 Lists
2024-05-01 22:12:28 GMT [INF] [24662] 0 Unfixable
2024-05-01 22:12:28 GMT [INF] [24662] 0 Has non-storage
2024-05-01 22:12:28 GMT [INF] [24662] 0 Corrupted
2024-05-01 22:12:28 GMT [INF] [24662] 0 Invalid Keys
2024-05-01 22:12:28 GMT [INF] [24662] 10 Need Fix
2024-05-01 22:12:28 GMT [INF] [24662] 0 Fixed
2024-05-01 22:12:28 GMT [INF] [24662] 0 Fix failed
2024-05-01 22:12:28 GMT [INF] [24662] 10 Order
2024-05-01 22:12:28 GMT [INF] [24662] 0 Padding
2024-05-01 22:12:28 GMT [INF] [24662] 0 Maps
2024-05-01 22:12:28 GMT [INF] [24662] 0 Unfixable
2024-05-01 22:12:28 GMT [INF] [24662] 0 Has duplicate keys
2024-05-01 22:12:28 GMT [INF] [24662] 0 Has non-storage
2024-05-01 22:12:28 GMT [INF] [24662] 0 Corrupted
2024-05-01 22:12:28 GMT [INF] [24662] 0 Invalid Keys
2024-05-01 22:12:28 GMT [INF] [24662] 0 Need Fix
2024-05-01 22:12:28 GMT [INF] [24662] 0 Fixed
2024-05-01 22:12:28 GMT [INF] [24662] 0 Fix failed
2024-05-01 22:12:28 GMT [INF] [24662] 0 Order
2024-05-01 22:12:28 GMT [INF] [24662] 0 Padding
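
For scripted checks, the summary at the end of the log can be turned into per-type counters. The following is a minimal sketch that assumes the summary lines keep the shape shown above (a count followed by a label after the [INF] [pid] prefix, with Lists and Maps opening each group). The function name parse_summary and the "Total" key are illustrative, not part of the tool.

```python
import re
from typing import Dict

# Matches "<count> <label>" directly after the "[INF] [pid]" prefix.
SUMMARY_RE = re.compile(r"\[INF\] \[\d+\] (\d+) (.+?)\s*$")


def parse_summary(log_text: str) -> Dict[str, Dict[str, int]]:
    """Group summary counters under their 'Lists' or 'Maps' heading."""
    summary: Dict[str, Dict[str, int]] = {}
    section = None
    for line in log_text.splitlines():
        match = SUMMARY_RE.search(line)
        if not match:
            continue  # progress lines, "CDT Mode: ...", and so on
        count, label = int(match.group(1)), match.group(2)
        if label in ("Lists", "Maps"):
            section = label
            summary[section] = {"Total": count}
        elif section is not None:
            summary[section][label] = count
    return summary


# Excerpt of the validation-mode output shown above.
sample = """\
2024-05-01 22:12:28 GMT [INF] [24662] CDT Mode: validate
2024-05-01 22:12:28 GMT [INF] [24662] 100 Lists
2024-05-01 22:12:28 GMT [INF] [24662] 10 Need Fix
2024-05-01 22:12:28 GMT [INF] [24662] 0 Fixed
2024-05-01 22:12:28 GMT [INF] [24662] 10 Order
2024-05-01 22:12:28 GMT [INF] [24662] 0 Maps
2024-05-01 22:12:28 GMT [INF] [24662] 0 Need Fix
"""

summary = parse_summary(sample)
print(summary["Lists"]["Need Fix"], summary["Lists"]["Fixed"])
```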

Run in fix mode

In the following example, myNamespace is checked and output is written to asvalidationOutput.txt.

  • Fixing is triggered by the --cdt-fix-ordered-list-unique option.

  • Fixes are applied to the server. Failed fixes can be due to (but not limited to) an unsupported server version.

asvalidation -n myNamespace -o asvalidationOutput.txt -r --cdt-fix-ordered-list-unique
validation of 127.0.0.1 (namespace: myNamespace, set: [all], bins: [all], after: [none], before: [none]) to asvalidationOutput.txt
2024-05-01 22:08:25 GMT [INF] [10999] [src/main/aerospike/as_cluster.c:132][as_cluster_add_nodes_copy] Add node BB909000027000A 127.0.0.1:3000
2024-05-01 22:08:25 GMT [INF] [10999] Processing 1 node(s)
2024-05-01 22:08:25 GMT [INF] [10999] Node ID Objects Replication
2024-05-01 22:08:25 GMT [INF] [10999] BB909000027000A 130 1
2024-05-01 22:08:25 GMT [INF] [10999] Namespace contains 130 record(s)
2024-05-01 22:08:25 GMT [INF] [10999] Created new output file temp.txt
2024-05-01 22:08:25 GMT [INF] [11018] Starting validation for node BB909000027000A
2024-05-01 22:08:25 GMT [INF] [11018] Completed validation for node BB909000027000A, records: 30, size: 9200 (~306 B/rec)
2024-05-01 22:08:26 GMT [INF] [11017] 23% complete (~8 KiB/s, ~30 rec/s, ~306 B/rec)
2024-05-01 22:08:26 GMT [INF] [11017] ~3s remaining
2024-05-01 22:08:26 GMT [INF] [11017] Found 30 invalid record(s) from 1 node(s), 9200 byte(s) in total (~306 B/rec)
2024-05-01 22:08:26 GMT [INF] [11017] CDT Mode: fix
2024-05-01 22:08:26 GMT [INF] [11017] 110 Lists
2024-05-01 22:08:26 GMT [INF] [11017] 0 Unfixable
2024-05-01 22:08:26 GMT [INF] [11017] 0 Has non-storage
2024-05-01 22:08:26 GMT [INF] [11017] 0 Corrupted
2024-05-01 22:08:26 GMT [INF] [11017] 0 Invalid Keys
2024-05-01 22:08:26 GMT [INF] [11017] 10 Need Fix
2024-05-01 22:08:26 GMT [INF] [11017] 10 Fixed
2024-05-01 22:08:26 GMT [INF] [11017] 0 Fix failed
2024-05-01 22:08:26 GMT [INF] [11017] 10 Order
2024-05-01 22:08:26 GMT [INF] [11017] 0 Padding
2024-05-01 22:08:26 GMT [INF] [11017] 20 Maps
2024-05-01 22:08:26 GMT [INF] [11017] 10 Unfixable
2024-05-01 22:08:26 GMT [INF] [11017] 10 Has duplicate keys
2024-05-01 22:08:26 GMT [INF] [11017] 0 Has non-storage
2024-05-01 22:08:26 GMT [INF] [11017] 0 Corrupted
2024-05-01 22:08:26 GMT [INF] [11017] 0 Invalid Keys
2024-05-01 22:08:26 GMT [INF] [11017] 10 Need Fix
2024-05-01 22:08:26 GMT [INF] [11017] 0 Fixed
2024-05-01 22:08:26 GMT [INF] [11017] 0 Fix failed
2024-05-01 22:08:26 GMT [INF] [11017] 10 Order
2024-05-01 22:08:26 GMT [INF] [11017] 0 Padding

Minimal options for asvalidation

Following is the minimal set of asvalidation options.

| Option | Description |
| --- | --- |
| --cdt-fix-ordered-list-unique | Fix lists where elements were not stored in order and remove duplicate elements. Without this option, the tool only validates and does not fix. |
| --no-cdt-check-map-keys | Do not check CDT Map keys. |
| -o | Output file name for the validation report. |
| -d | Output directory. |
| --help | Get a comprehensive list of options for the tool. |

Namespace data selection options

| Option | Default | Description |
| --- | --- | --- |
| -n NAMESPACE or --namespace NAMESPACE | - | Mandatory. Namespace to validate. |
| -s SETS or --set SETS | All sets | Set(s) to validate. Accepts a comma-separated list of sets. |
| -B BIN1,BIN2,... or --bin-list BIN1,BIN2,... | All bins | Bins to validate. |
| -M N or --max-records N | 0 (all records) | Approximate limit for the number of records to process. |

Partition scanning validation options

Scan a list of partition filters. Partition filters can be ranges, individual partitions, or records after a specific digest within a single partition.

This option is mutually exclusive with --node-list.

| Option | Default |
| --- | --- |
| -X, --partition-list LIST | not used |

Default partitions to scan: 0 to 4095 (all partitions)

  • LIST format: FILTER1,FILTER2,…
  • FILTER format: BEGIN-PARTITION[-PARTITION-COUNT]|DIGEST
    • BEGIN-PARTITION: 0 to 4095.
    • Either the optional PARTITION-COUNT: 1 to 4096. Default: 1
    • Or the optional DIGEST: Base64-encoded string of desired digest to start at in specified partition.

| Example | Description |
| --- | --- |
| -X 361 | Validate only partition 361. |
| -X 361-10 | Validate 10 partitions, starting with 361 and including 370. |
| -X VSmeSvxNRqr46NbOqiy9gy5LTIc= | Validate all records after the digest VSmeSvxNRqr46NbOqiy9gy5LTIc= in its partition. |
| -X 0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U= | Validate partitions 0 to 999 (1000 partitions starting from 0), partition 2222, and all records after the digest EjRWeJq83vEjRRI0VniavN7xI0U= in its partition. |
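
To make the LIST and FILTER grammar concrete, the sketch below parses a --partition-list value into ranges and digests and applies the documented bounds. It is an illustration only, not the tool's own parser. The name parse_partition_list is hypothetical, and the check that a decoded digest is 20 bytes long is an assumption based on Aerospike's 20-byte record digests.

```python
import base64
import re
from typing import List, Tuple, Union

PartitionRange = Tuple[int, int]          # (begin_partition, partition_count)
Filter = Union[PartitionRange, bytes]     # bytes = decoded digest

# BEGIN-PARTITION with an optional "-PARTITION-COUNT" suffix.
RANGE_RE = re.compile(r"^(\d+)(?:-(\d+))?$")


def parse_partition_list(spec: str) -> List[Filter]:
    """Parse 'FILTER1,FILTER2,...' into ranges and digests (sketch)."""
    filters: List[Filter] = []
    for token in spec.split(","):
        match = RANGE_RE.match(token)
        if match:
            begin = int(match.group(1))
            count = int(match.group(2)) if match.group(2) else 1  # default 1
            if not 0 <= begin <= 4095:
                raise ValueError(f"BEGIN-PARTITION out of range: {begin}")
            if not 1 <= count <= 4096:
                raise ValueError(f"PARTITION-COUNT out of range: {count}")
            filters.append((begin, count))
        else:
            # Anything else is treated as a Base64-encoded digest.
            digest = base64.b64decode(token, validate=True)
            if len(digest) != 20:  # assumption: digests are 20 bytes
                raise ValueError(f"unexpected digest length: {len(digest)}")
            filters.append(digest)
    return filters


filters = parse_partition_list("0-1000,2222,EjRWeJq83vEjRRI0VniavN7xI0U=")
for f in filters:
    if isinstance(f, tuple):
        begin, count = f
        print(f"partitions {begin} to {begin + count - 1}")
    else:
        print(f"records after digest {f.hex()}")
```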

Connection options

| Option | Default | Description |
| --- | --- | --- |
| -h HOST1:TLSNAME1:PORT1,... or --host HOST1:TLSNAME1:PORT1,... | 127.0.0.1 | The host that acts as the entry point to the cluster. Any node in the cluster can be specified. The remaining nodes are discovered automatically. |
| -p PORT or --port PORT | 3000 | Port to connect to. |
| -U USER or --user USER | - | User name with read permission. Mandatory if the server has security enabled. |
| -P PASSWORD or --password | - | Password to authenticate the given user. The first form passes the password on the command line. The second form prompts for the password. |
| -A or --auth | INTERNAL | Set the authentication mode when user and password are defined. Modes are INTERNAL, EXTERNAL, EXTERNAL_INSECURE, and PKI. This mode must be set to EXTERNAL when using LDAP. |
| --parallel N | 1 | Maximum number of scans to run in parallel. If only one partition range is given, or the entire namespace is being validated, the range of partitions is evenly divided by this number to be processed in parallel. Otherwise, each filter cannot be parallelized individually, so you may only achieve as much parallelism as there are partition filters. |
| -l HOST1:[TLSNAME1:]PORT1,... or --node-list HOST1:[TLSNAME1:]PORT1,... | - | Validate the given cluster nodes only. Mutually exclusive with --partition-list. |
| --tls-enable | disabled | Indicates a TLS connection should be used. |
| -S or --services-alternate | false | Set to true to connect to the Aerospike node’s alternate-access-address. |
| --prefer-racks RACKID1,... | disabled | A comma-separated list of rack IDs to prefer when reading records. This is useful for limiting cross-datacenter network traffic. |
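
The --parallel description says a single partition range is evenly divided across the parallel scans. The sketch below shows one way such a split could work; how asvalidation handles a remainder is not documented on this page, so giving the extra partitions to the first chunks is an assumption, and split_partitions is a hypothetical name.

```python
from typing import List, Tuple


def split_partitions(begin: int, count: int,
                     parallel: int) -> List[Tuple[int, int]]:
    """Split one partition range into up to `parallel` (begin, count) chunks."""
    base, extra = divmod(count, parallel)
    chunks = []
    start = begin
    for i in range(parallel):
        # Assumption: the first `extra` chunks absorb the remainder.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more scans requested than partitions available
        chunks.append((start, size))
        start += size
    return chunks


# Whole namespace (partitions 0 to 4095) with --parallel 4.
chunks = split_partitions(0, 4096, 4)
print(chunks)
```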

Output options

| Option | Default | Description |
| --- | --- | --- |
| -d PATH or --directory PATH | - | Directory to store the .asb validation files. If the directory does not exist, it is created before use. Mandatory, unless --output-file or --estimate is given. |
| -o PATH or --output-file PATH | - | The single file to write to. - means stdout. Mandatory, unless --directory or --estimate is given. |
| -q DESIRED-PREFIX or --output-file-prefix DESIRED-PREFIX | - | Must be used with the --directory option. A desired prefix for all output files. |
| -F LIMIT or --file-limit LIMIT | 250 MB | File size limit (in MiB) for --directory. If a .asb validation file crosses this size threshold, asvalidation switches to a new file. |
| -r or --remove-files | - | Clear directory or remove output file. By default, asvalidation refuses to write to a non-empty directory or to overwrite an existing validation file. This option clears the given --directory or removes an existing --output-file. Mutually exclusive with --continue. |
| --remove-artifacts | - | Clear directory or remove output file, like --remove-files, without running a validation. Mutually exclusive with --continue and --estimate. |
| -N BANDWIDTH or --nice BANDWIDTH | - | Throttles asvalidation’s write operations to the validation file(s) so they do not exceed the given bandwidth in MiB/s. This effectively also throttles the scan on the server side, because asvalidation refuses to accept more data than it can write. |

Other options

| Option | Default | Description |
| --- | --- | --- |
| -v or --verbose | disabled | Output considerably more information about the running validation. |
| -m PATH or --machine PATH | - | Output machine-readable status updates to the given path, typically a FIFO. |
| -L RPS or --records-per-second RPS | 0 | Limit total returned records per second (RPS). If RPS is zero, no records-per-second limit is applied. |
| -V or --version | - | Print asvalidation version information. |
| -C or --compact | disabled | Do not apply Base64 encoding to BLOBs; results in smaller output files. |

Configuration file options

Configurations are read from the following files in the given order:

  • /etc/aerospike/astools.conf
  • ~/.aerospike/astools.conf

| Option | Default | Description |
| --- | --- | --- |
| --no-config-file | disabled | Do not read any config file. |
| --instance NAME | - | Sections with this instance suffix are read. For example, if instance a is specified, the sections cluster_a and asvalidation_a are read. |
| --config-file PATH | - | Read this file after the default configuration files. |
| --only-config-file PATH | - | Read only this configuration file. |
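
The lookup rules above can be summarized in a short sketch. It models only what this page states: the two default files in order, --config-file read after them, --only-config-file replacing them, and --no-config-file disabling all of them. The function names are hypothetical, and the precedence applied when several of these options are combined is an assumption.

```python
from typing import List, Optional

# Default files, in the order they are read.
DEFAULT_CONFIG_FILES = [
    "/etc/aerospike/astools.conf",
    "~/.aerospike/astools.conf",
]


def config_files_to_read(no_config_file: bool = False,
                         config_file: Optional[str] = None,
                         only_config_file: Optional[str] = None) -> List[str]:
    """Return the configuration files in read order (sketch)."""
    if no_config_file:
        return []  # assumption: this option wins over the other two
    if only_config_file is not None:
        return [only_config_file]
    files = list(DEFAULT_CONFIG_FILES)
    if config_file is not None:
        files.append(config_file)  # read after the defaults
    return files


def sections_for_instance(instance: str) -> List[str]:
    """Section names read for --instance NAME, per the table above."""
    return [f"cluster_{instance}", f"asvalidation_{instance}"]


print(config_files_to_read(config_file="/tmp/extra.conf"))
print(sections_for_instance("a"))
```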

Error descriptions

| Reason | Description | Disposition |
| --- | --- | --- |
| Has non-storage | The bin contains an infinite or wildcard element which is not allowed as storage. | Requires manual intervention to fix. |
| Has duplicate keys | A map bin has duplicate key entries. | Unfixable without manual intervention. |
| Corrupted | A problem not attributable to any other error categories. | Requires manual intervention to fix. |
| Invalid Keys | The bin has a Map with at least one invalid key. | Requires manual intervention to fix. |
| Order | The bin has elements out of order. | Can be fixed by reordering the list with the --cdt-fix-ordered-list-unique option. |
| Padding | The bin has garbage bytes after the valid List or Map. | Can be fixed by truncating the extra bytes. |
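
When reviewing a summary, it can help to separate the reasons the table marks as fixable from those that need manual intervention. The following sketch encodes the dispositions listed above; the function name triage and the input dictionary shape are illustrative assumptions. The totals count reported reasons, not distinct records.

```python
from typing import Dict

# Dispositions taken from the error description table.
MANUAL_REASONS = {"Has non-storage", "Has duplicate keys",
                  "Corrupted", "Invalid Keys"}
FIXABLE_REASONS = {"Order", "Padding"}


def triage(counts: Dict[str, int]) -> Dict[str, int]:
    """Sum reason counters by disposition (sketch)."""
    return {
        "fixable": sum(counts.get(r, 0) for r in FIXABLE_REASONS),
        "manual": sum(counts.get(r, 0) for r in MANUAL_REASONS),
    }


# Reason counters for Maps from the fix-mode example on this page.
maps_counts = {"Has duplicate keys": 10, "Has non-storage": 0,
               "Corrupted": 0, "Invalid Keys": 0, "Order": 10, "Padding": 0}
result = triage(maps_counts)
print(result)
```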