Database Log Reference

Arenax

`could not allocate xxxxxxxxxx-byte arena stage xxx: No space left on device`

WARNING

message

could not allocate xxxxxxxxxx-byte arena stage xxx: No space left on device

description

Indicates for the index-type flash configuration that the mount points have run out of space. You may need to delete the arena files manually and run fsck on the disk partitions.

context

arenax

introduced

-

removed

-

As

`allowing x fill-migrations after y seconds delay`

INFO

message

allowing x fill-migrations after y seconds delay

description

This message appears after a recluster event, if a value is set for the migrate-fill-delay and if the recluster event has caused fill-migrations to be scheduled.

context

as

introduced

4.3.1

removed

-

`finished clean shutdown - exiting`

INFO

message

finished clean shutdown - exiting

description

This message is the last one of a sequence of messages logged during Aerospike server shutdown. The message signifies that Aerospike was shut down with “trusted” status which is a necessary condition for a subsequent fast restart of a namespace that is configured with storage-engine device. See ASD shutdown process.

context

as

introduced

3.0

removed

-

`waiting for storage: 1569063 objects, 1819777 scanned`

INFO

message

waiting for storage: 1569063 objects, 1819777 scanned

description

Objects and scanned values diverge for various reasons, such as scanning records that were previously expired or expired while the system was down.

context

as

introduced

-

removed

-

occurs

When cold starting a node that has data on persistent storage, such as SSD.

parameters

objects

Number of retained objects from storage device.

scanned

Number of objects that have been scanned on the storage device.

Batch

`abandoned batch from 11.22.33.44 with 23 transactions after 30000 ms.`

WARNING

message

abandoned batch from 11.22.33.44 with 23 transactions after 30000 ms.

description

context

batch

introduced

4.1

removed

-

occurs

When a batch-index transaction is abandoned due to one or more delays, where its total time exceeds the allowed threshold of either the client’s total timeout or 30 seconds, if the total timeout is not set by the client. Each occurrence also increments the batch_index_error statistic.

Prior to 6.4, this stat increments when the total time exceeds twice the client’s total timeout or 30 seconds if the total timeout is not set by the client.

parameters

IP Address

The client originating IP address for the transaction.

Number of transactions

The number of batch sub transactions in the impacted batch index transaction.

Abandoned time

The total time the batch index transaction had been running before being abandoned.

`failed to find active batch queue that is not full`

WARNING

message

failed to find active batch queue that is not full

description

All batch index queues are greater than the batch-max-buffers-per-queue limit, so batch requests are being rejected. Consult Tuning Batches. To save log space, this message is logged with a (repeated: nnn) prefix just once per ticker-interval during periods of repetition.

context

batch

introduced

4.1

removed

-

Bin

`{NAMESPACE} bin-name quota full - can’t add new bin-name`

WARNING

message

{NAMESPACE} bin-name quota full - can’t add new bin-name

description

Max number of bins per namespace (65535) has been reached. Prior to Database 4.9: Max number of bins per namespace (32767) has been reached. See How to clear up bin names when they exceed the limits.

context

bin

introduced

-

removed

-

parameters

NAMESPACE

The namespace that has reached the maximum number of bins.

Clustering

`evicted from cluster by principal node BB9030011AC4202`

WARNING

message

evicted from cluster by principal node BB9030011AC4202

description

The paxos principal node has determined that this node is not a valid cluster member. See the log on the principal for more information.

context

clustering

introduced

-

removed

-

parameters

principal

Node ID of the paxos principal node

`ignoring node: bb9030011ac4202 - exceeding maximum supported cluster size 1`

WARNING

message

ignoring node: bb9030011ac4202 - exceeding maximum supported cluster size 1

description

The cluster has exceeded the asdb-cluster-nodes-limit in the features.conf file. Reduce the number of nodes attempting to cluster or install the correct license.

context

clustering

introduced

-

removed

-

parameters

node

Node that is attempting to start cluster.

`ignoring paxos accepted from node BB9030011AC4202 - it is not in acceptor list`

WARNING

message

ignoring paxos accepted from node BB9030011AC4202 - it is not in acceptor list

description

A paxos clustering message from another node was ignored. This can result from network stability issues.

context

clustering

introduced

-

removed

-

parameters

node

Node sending the invalid paxos message

`ignoring paxos accepted from node BB9030011AC4202 with invalid proposal id`

WARNING

message

ignoring paxos accepted from node BB9030011AC4202 with invalid proposal id

description

A paxos clustering message from another node was ignored. This can result from network stability issues.

context

clustering

introduced

-

removed

-

parameters

node

Node sending the invalid paxos message

`ignoring stale join request from node BB9030011AC4202 - delay estimate 83108(ms)`

INFO

message

ignoring stale join request from node BB9030011AC4202 - delay estimate 83108(ms)

description

A paxos clustering message from another node was ignored. This can result from network stability issues causing delayed delivery of packets.

context

clustering

introduced

-

removed

-

parameters

node

Node sending the invalid paxos message

delay estimate

Estimated delay of the message in milliseconds.

Config

`CRITICAL (config): (features_ee.c:184) trailing garbage in /etc/aerospike/features.conf, line 21`

CRITICAL

message

CRITICAL (config): (features_ee.c:184) trailing garbage in /etc/aerospike/features.conf, line 21

description

The feature-key-file has been tampered with. Replace the file with the original provided by Aerospike. See Aerospike fails to start due to corrupted features.conf file - trailing garbage error.

context

config

introduced

-

removed

-

`failed CONFIG_CHECK check - MESSAGE`

WARNING

message

failed CONFIG_CHECK check - MESSAGE

description

A configuration best-practice was violated at startup.

context

config

introduced

5.7

removed

-

parameters

CONFIG_CHECK

Name of the check that was violated, which could be one of the following:

sMessage

Description of how the best-practice was violated.

`failed best-practices checks - see 'https://docs.aerospike.com/operations/install/linux/bestpractices'`

CRITICAL

message

failed best-practices checks - see 'https://docs.aerospike.com/operations/install/linux/bestpractices'

description

Indicates failed best practices. This message follows a set of warnings for each best practice that was violated. Becomes a WARNING when enforce-best-practices is set to false.

context

config

introduced

5.7

removed

-

`invalid feature key signature`

CRITICAL

message

invalid feature key signature

description

The key file has been modified and no longer matches the digital signature. The key file must be exactly the same as when it was downloaded. Seemingly harmless operations, like cutting and pasting the file’s contents, can change the file and invalidate it.

context

config

introduced

-

removed

-

Drv_mem

`bad set-id x`

CRITICAL

message

bad set-id x

description

Indicates an issue with the specified set ID, likely indicating a configuration or referencing error.

context

drv_mem

introduced

7.0

removed

-

parameters

x

set ID

`bad size x`

CRITICAL

message

bad size x

description

Indicates an issue with the specified size, likely outside acceptable parameters.

context

drv_mem

introduced

7.0

removed

-

parameters

x

size

`bad void-time x`

CRITICAL

message

bad void-time x

description

Indicates an issue with the specified void-time, likely indicating an error in expiration or deletion timing.

context

drv_mem

introduced

7.0

removed

-

parameters

x

void time

`bad wblock-id x`

CRITICAL

message

bad wblock-id x

description

Indicates an issue with the specified write block ID.

context

drv_mem

introduced

7.0

removed

-

parameters

x

write block ID

`can't get set index x from vmap`

CRITICAL

message

can't get set index x from vmap

description

Indicates a failure to retrieve the specified set index from the version map.

context

drv_mem

introduced

7.0

removed

-

parameters

x

set index

`community edition called resume_devices()`

CRITICAL

message

community edition called resume_devices()

description

The Community Edition of Aerospike does not allow the resume_devices() function.

context

drv_mem

introduced

7.0

removed

-

`device encrypted but no encryption key file configured`

CRITICAL

message

device encrypted but no encryption key file configured

description

Indicates a device is encrypted but lacks configuration for an encryption key file.

context

drv_mem

introduced

7.0

removed

-

`device has AP partition versions but 'strong-consistency' is configured`

CRITICAL

message

device has AP partition versions but 'strong-consistency' is configured

description

Indicates a configuration issue where AP partitions exist despite strong consistency being enabled.

context

drv_mem

introduced

7.0

removed

-

`device has CP partition versions but 'strong-consistency' is not configured`

CRITICAL

message

device has CP partition versions but 'strong-consistency' is not configured

description

Indicates a configuration issue where CP partitions exist without strong consistency enabled.

context

drv_mem

introduced

7.0

removed

-

`device has 'single-bin' data but 'single-bin' is no longer supported`

CRITICAL

message

device has 'single-bin' data but 'single-bin' is no longer supported

description

Indicates a configuration issue where the device contains data in a format no longer supported.

context

drv_mem

introduced

7.0

removed

-

`device not encrypted but encryption key file x is configured`

CRITICAL

message

device not encrypted but encryption key file x is configured

description

Indicates a non-encrypted device has an encryption key file configured unnecessarily.

context

drv_mem

introduced

7.0

removed

-

parameters

x

path to encryption key file

`devices are smaller than memory stripes`

CRITICAL

message

devices are smaller than memory stripes

description

Device memory is less than the stripe size

context

drv_mem

introduced

7.0

removed

-

`drive-name: Aerospike device has old format - must erase device to upgrade`

CRITICAL

message

drive-name: Aerospike device has old format - must erase device to upgrade

description

Aerospike device is using an old format and needs to be erased for an upgrade.

context

drv_mem

introduced

7.0

removed

-

`drive-name: bad device-id x`

CRITICAL

message

drive-name: bad device-id x

description

Indicates an issue with the device ID on the Aerospike device.

context

drv_mem

introduced

7.0

removed

-

parameters

x

device ID

`drive-name: bad n-devices x`

CRITICAL

message

drive-name: bad n-devices x

description

Indicates an issue with the number of devices specified for the Aerospike device.

context

drv_mem

introduced

7.0

removed

-

parameters

x

number of devices

`drive-name: bad pristine offset x`

CRITICAL

message

drive-name: bad pristine offset x

description

Indicates an issue with the pristine offset on the Aerospike device. Pristine blocks are blocks the Aerospike server has never written to before.

context

drv_mem

introduced

7.0

removed

-

parameters

x

pristine offset

`drive-name: can't change write-block-size from x to y`

CRITICAL

message

drive-name: can't change write-block-size from x to y

description

Indicates an issue changing the write block size on the Aerospike device.

context

drv_mem

introduced

7.0

removed

-

parameters

x

previous size

y

target size

`drive-name: previous namespace x now y - check config or erase device`

CRITICAL

message

drive-name: previous namespace x now y - check config or erase device

description

Indicates a namespace name change on the Aerospike device and suggests checking configuration or erasing the device.

context

drv_mem

introduced

7.0

removed

-

parameters

x

previous namespace name

y

current namespace name

`drive-name: random signature is 0`

CRITICAL

message

drive-name: random signature is 0

description

Indicates a problem where the random signature expected on the Aerospike device is 0.

context

drv_mem

introduced

7.0

removed

-

`drive-name: unknown version x`

CRITICAL

message

drive-name: unknown version x

description

Aerospike device is using an unknown version.

context

drv_mem

introduced

7.0

removed

-

parameters

x

version number

`encryption key or algorithm mismatch`

CRITICAL

message

encryption key or algorithm mismatch

description

Indicates a mismatch between the encryption key or algorithm configured and what is required.

context

drv_mem

introduced

7.0

removed

-

`existing memory stripe size differs from config`

CRITICAL

message

existing memory stripe size differs from config

description

Indicates a mismatch between the configured memory stripe size and the existing memory stripe size. To resolve this, configure data-size.

context

drv_mem

introduced

7.0

removed

-

`generation 0`

CRITICAL

message

generation 0

description

Displays when a record is deleted and removed from the tree but not freed.

context

drv_mem

introduced

7.0

removed

-

`hit stop-writes limit before drive scan completed`

CRITICAL

message

hit stop-writes limit before drive scan completed

description

The stop-writes limit is a user-configurable limit to prevent more data being written than is available on the device.

context

drv_mem

introduced

7.0

removed

-

`mprotect (address, length, protection) failed: x, (y)`

CRITICAL

message

mprotect (address, length, protection) failed: x, (y)

description

Indicates a failure in setting memory protection for a region of memory.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

y

error description

`{namespace} can't add record to index`

CRITICAL

message

{namespace} can't add record to index

description

Indicates a failure to add a record to the index in the specified namespace.

context

drv_mem

introduced

7.0

removed

-

`{namespace} could not allocate x-byte shmem stripe`

CRITICAL

message

{namespace} could not allocate x-byte shmem stripe

description

Indicates a failure to allocate the specified amount of shared memory for a stripe in the namespace.

context

drv_mem

introduced

7.0

removed

-

parameters

x

number of bytes

`namespace drive set with unmatched headers - devices x & y have different device counts`

CRITICAL

message

namespace drive set with unmatched headers - devices x & y have different device counts

description

Indicates a configuration issue where two devices in a namespace drive set have different device counts.

context

drv_mem

introduced

7.0

removed

-

parameters

x

device name

y

device name

`namespace drive set with unmatched headers - devices x & y have different signatures`

CRITICAL

message

namespace drive set with unmatched headers - devices x & y have different signatures

description

Indicates a configuration issue where two devices in a namespace drive set have different signatures.

context

drv_mem

introduced

7.0

removed

-

parameters

x

device name

y

device name

`{namespace} loaded: objects 12345 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 device-pcts (20, 30, 25)

description

Objects that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_mem

introduced

removed

-

`{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)

description

Objects and tombstones that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_mem

introduced

removed

-

`{namespace}: no keys available for namespace`

CRITICAL

message

{namespace}: no keys available for namespace

description

Indicates a lack of available keys for the specified namespace, preventing operations.

context

drv_mem

introduced

7.0

removed

-

`{namespace} no keys available to create stripe`

CRITICAL

message

{namespace} no keys available to create stripe

description

Indicates a lack of available keys to create a stripe in the specified namespace.

context

drv_mem

introduced

7.0

removed

-

`{namespace} shmem stripe and device shadow-name have different device counts`

CRITICAL

message

{namespace} shmem stripe and device shadow-name have different device counts

description

Indicates a device count mismatch between shared memory stripe and device in the specified namespace.

context

drv_mem

introduced

7.0

removed

-

`{namespace} shmem stripe and device shadow-name have different signatures`

CRITICAL

message

{namespace} shmem stripe and device shadow-name have different signatures

description

Indicates a signature mismatch between shared memory stripe and device in the specified namespace.

context

drv_mem

introduced

7.0

removed

-

`{namespace_name}: could not allocate 17179869184-byte shmem stripe`

CRITICAL

message

{namespace_name}: could not allocate 17179869184-byte shmem stripe

description

Aerospike server is attempting to pre-allocate in-memory data storage in shared memory, either as stripes (1/8th of data-size) or as a mirror of the storage-backed persistence (filesize or device size), and the kernel.shmmax or kernel.shmall is set too low. See configuring namespace storage

context

drv_mem

introduced

7.0

removed

-

`random key generation failed`

CRITICAL

message

random key generation failed

description

Indicates a failure in generating a random key.

context

drv_mem

introduced

7.0

removed

-

`s: not an Aerospike device but not erased - check config or erase device`

CRITICAL

message

s: not an Aerospike device but not erased - check config or erase device

description

Warns that the device is not recognized as an Aerospike device but has not been erased.

context

drv_mem

introduced

7.0

removed

-

`shadow-name: DEVICE FAILED open: errno x (y)`

CRITICAL

message

shadow-name: DEVICE FAILED open: errno x (y)

description

Indicates a failure to open the specified device with an error number and description.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

y

error description

`shadow-name: DEVICE FAILED write: errno x (y)`

CRITICAL

message

shadow-name: DEVICE FAILED write: errno x (y)

description

Indicates a write failure on the specified device with an error number and description.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

y

error description

`shadow-name: read failed at all sizes from x to y bytes`

CRITICAL

message

shadow-name: read failed at all sizes from x to y bytes

description

Indicates a failure to read the shadow-name at any size within the specified byte range.

context

drv_mem

introduced

7.0

removed

-

parameters

x

byte range start

y

byte range end

`shadow-name: read failed: errno x (y)`

CRITICAL

message

shadow-name: read failed: errno x (y)

description

Indicates a read failure on the shadow-name device, providing an error number and description.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

y

error description

`unable to open device x: error y`

CRITICAL

message

unable to open device x: error y

description

Displays the error code giving information about the reason the device could not be opened.

context

drv_mem

introduced

7.0

removed

-

parameters

x

device name

y

error number

`unable to open file x: error y`

CRITICAL

message

unable to open file x: error y

description

Displays the error code giving information about the reason the file could not be opened.

context

drv_mem

introduced

7.0

removed

-

parameters

x

path to file

y

error number

`unable to truncate file: errorno x`

CRITICAL

message

unable to truncate file: errorno x

description

This error appears when a shadow file could not be sized properly. See ftruncate() for more information.

context

drv_mem

introduced

7.0

removed

-

parameters

x

ftruncate() error code

`x - load bins failed`

CRITICAL

message

x - load bins failed

description

Indicates a failure in loading bins for the specified process or identifier.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

`x - unpack bins failed`

CRITICAL

message

x - unpack bins failed

description

Indicates a failure in unpacking bins for the specified process or identifier.

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

`x found all y devices fresh during warm restart`

CRITICAL

message

x found all y devices fresh during warm restart

description

Shown when fresh headers were found during warm restart

context

drv_mem

introduced

7.0

removed

-

parameters

x

error number

y

number of devices

`x size y must be greater than header size z`

CRITICAL

message

x size y must be greater than header size z

description

Shown when the file size is less than or equal to the drive header size. To resolve this, configure filesize.

context

drv_mem

introduced

7.0

removed

-

parameters

x

Either “file” or “usable device” depending on whether files or devices are used for backing storage in the memory namespace.

y

file size

z

header size

Drv_pmem

`get_key: failed pmem_read_record()`

WARNING

message

get_key: failed pmem_read_record()

description

Aerospike was not able to read the record from storage, which may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_pmem

introduced

-

removed

-

`load_bins: failed pmem_read_record()`

WARNING

message

load_bins: failed pmem_read_record()

description

Aerospike was not able to read the record from storage, which may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_pmem

introduced

-

removed

-

`load_n_bins: failed pmem_read_record()`

WARNING

message

load_n_bins: failed pmem_read_record()

description

Aerospike was not able to read the record from storage, which may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_pmem

introduced

-

removed

5.1.0

`{namespace} loaded: objects 12345 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 device-pcts (20, 30, 25)

description

Objects that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_pmem

introduced

removed

-

`{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)

description

Objects and tombstones that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_pmem

introduced

removed

-

`(namespace): (namespace.c:550) set-id 1 - negative device bytes!`

WARNING

message

(namespace): (namespace.c:550) set-id 1 - negative device bytes!

description

A statistic was not correctly initialized during a warm restart, and Aerospike server believes the PMEM storage has a negative size. This error may be printed once for every record held on the storage during the warm restart. This error poses no risk to the data stored on the device, and has been resolved in hotfix AER-6478, which is available in Aerospike Database 5.2.0.36, 5.3.0.26, 5.4.0.21, 5.5.0.19, 5.6.0.13, and later.

context

drv_pmem

introduced

4.9

removed

-

`{namespace_name} out of space`

WARNING

message

{namespace_name} out of space

description

Indicates a shortage of free storage blocks. See How do I recover from Available Percent Zero?. To save log space, this message is logged with a (repeated: nnn) prefix just once per ticker-interval during periods of repetition.

context

drv_pmem

introduced

5.2

removed

-

parameters

namespace

Namespace being written to

`{namespace_name} write: size 9437246 - rejecting 1142f0217ababf9fda5b1a4de66e6e8d4e51765e`

DETAIL

message

{namespace_name} write: size 9437246 - rejecting 1142f0217ababf9fda5b1a4de66e6e8d4e51765e

description

Most likely appearing as a result of exceeding the write-block-size. The record’s digest is the last item in the log entry.

context

drv_pmem

introduced

5.2

removed

-

parameters

namespace

Namespace being written to

size

Total size of the record that was rejected

Drv_ssd

`(arenax): (arenax_ee.c:98) too many chunks`

CRITICAL

message

(arenax): (arenax_ee.c:98) too many chunks

description

This error occurs when the primary index has been misconfigured partition-tree-sprigs for an all-flash namespace. To size all-flash installations adequately, you must account for the number of partition-tree-sprigs when you estimate the size of the index. Each sprig uses no more than a single 4 KiB chunk. Also, verify that all index entries are stored at the desired fill fraction, which defines the level to which a sprig is filled. This allows for some expansion without overfilling and consuming more than one chunk per sprig. See the Capacity Planning Guide for more information.

context

drv_ssd

introduced

4.2.0.2

removed

-

`can't add record to index`

CRITICAL

message

can't add record to index

description

For the index-type flash configuration, indicates that the mount points have run out of space. You may need to delete the arena files manually and run fsck on the disk partitions.

context

drv_ssd

introduced

-

removed

-

`defrag_move_record: couldn’t get swb`

WARNING

message

defrag_move_record: couldn’t get swb

description

Indicates the node has a shortage of free storage blocks. See How do I recover from Available Percent Zero?.

context

drv_ssd

introduced

-

removed

5.7, 5.6.0.13, 5.4.0.21, 5.3.0.26, 5.2.0.36, 5.1.0.42

`device /dev/sdb - swb buf valloc failed`

WARNING

message

device /dev/sdb - swb buf valloc failed

description

Indicates a shortage of memory. Make sure the nodes have enough memory.

context

drv_ssd

introduced

-

removed

-

parameters

device

Device that Aerospike was trying to read or write to at the time of the error.

`device /dev/sdb: read complete: UNIQUE 20401681 (REPLACED 5619021) (OLDER 11905062) (EXPIRED 0) (EVICTED 0) (UNPARSABLE 4) records`

INFO

message

device /dev/sdb: read complete: UNIQUE 20401681 (REPLACED 5619021) (OLDER 11905062) (EXPIRED 0) (EVICTED 0) (UNPARSABLE 4) records

description

context

drv_ssd

introduced

4.5.1.5

removed

-

parameters

device

Name of the device for which the following stats apply. The stats are a summary of a cold start that read the entire device.

UNIQUE

Total number of unique records loaded from this device.

REPLACED

Number of records that replaced a version loaded earlier during the device scan (won the conflict resolution).

OLDER

Number of records that were skipped because a newer version was loaded earlier during the device scan (lost the conflict resolution).

EXPIRED

Number of records skipped because they were expired.

EVICTED

Number of records skipped because they were evicted.

UNPARSABLE

Number of records skipped because they were unparsable.

`device /dev/sdb: read_complete: added 0 expired 0`

INFO

message

device /dev/sdb: read_complete: added 0 expired 0

description

context

drv_ssd

introduced

4.5.1.5

removed

-

parameters

device

Name of the device for which the following stats apply. The stats are a summary of a cold start that read the entire device.

added

Total number of unique records loaded from this device.

expired

Number of records skipped because they were expired.

`device /dev/sdd defrag: rblock_id 952163461 generation mismatch (4:3) :0xc0b12ffb353e0385179f39c85edc7791264f11aa`

WARNING

message

device /dev/sdd defrag: rblock_id 952163461 generation mismatch (4:3) :0xc0b12ffb353e0385179f39c85edc7791264f11aa

description

This can occur when the generation value for the index has been advanced while the generation value for the record on the drive has not advanced. This happened in some very old Aerospike Database versions. The error indicates that defrag found this discrepancy, and it should appear only once as the defrag process resolves the discrepancy.

context

drv_ssd

introduced

-

removed

-

parameters

device

Disk where this occurred.

rblock_id

Identifier of the disk block being defragged.

mismatch

Generation of the record in the index and on disk, respectively.

digest

Digest of the record in question.

`device has AP partition versions but 'strong-consistency' is configured`

CRITICAL

message

device has AP partition versions but 'strong-consistency' is configured

description

A namespace has data that was written while it was in AP mode, but is being started in SC mode. This is not permitted. See strong-consistency for details.

context

drv_ssd

introduced

4.0.0

removed

-

`device not encrypted but encryption key file /etc/aerospike/key-256.dat is configured`

CRITICAL

message

device not encrypted but encryption key file /etc/aerospike/key-256.dat is configured

description

This happens when a block device has its header configured using asd without encryption enabled. This can happen if you:

zeroized the device.
started asd without enabling encryption in aerospike.conf.
took down asd.
restarted asd with encryption enabled in aerospike.conf.

To resolve this and use encryption, zeroize again and reinitialize the drive:

Take down asd.
Zeroize the device again.
Start asd with encryption enabled in aerospike.conf.

context

drv_ssd

introduced

-

removed

-

`/dev/nvme0n1p1: bad device-id 3192497567`

CRITICAL

message

/dev/nvme0n1p1: bad device-id 3192497567

description

Appears in cases of device corruption or if a device was not erased before starting the Aerospike service. See SSD Initialization and SSD Setup for details and recommended steps. In case of system log corruption, the device might need to be replaced.

context

drv_ssd

introduced

4.0

removed

-

`/dev/nvme0n1p1: read failed: errno 5 (Input/output error)`

CRITICAL

message

/dev/nvme0n1p1: read failed: errno 5 (Input/output error)

description

Indicates an error during a system call to the storage device. Depending on the transaction path where this occurs, the database could abort if the integrity of the underlying data is not known (typically on write transactions). In case of corruption seen on system logs, the firmware version should be checked or the device might need to be replaced.

context

drv_ssd

introduced

-

removed

-

`/dev/nvme3n1 init defrag profile: 3,1,1,2,0,0,1,0,3,1,0,0,0,0,1,1,2,0,0,1,1,1,0,0,0,0,2,1,3,4,2,1,1,2,2,3,1,4,3,3,3,4,2,1,3,3,2,0,4,2`

INFO

message

/dev/nvme3n1 init defrag profile: 3,1,1,2,0,0,1,0,3,1,0,0,0,0,1,1,2,0,0,1,1,1,0,0,0,0,2,1,3,4,2,1,1,2,2,3,1,4,3,3,3,4,2,1,3,3,2,0,4,2

description

At startup, the initial defragmentation profile for a device. This represents the number of blocks at each percentage all the way to the defrag-lwm-pct. In this example, 50 buckets (default defrag-lwm-pct set to 50) are shown, including: 3 blocks that are less than 1% full, 1 that is between 1 and 2% full, all the way to 2 blocks that are between 49 and 50% full (last number). Blocks that are more than 50% full are not be eligible to be defragmented.

context

drv_ssd

introduced

-

removed

-

`/dev/nvme3n1 init wblocks: pristine-id 1464407 pristine 109155 free-q 591298, defrag-q 75`

INFO

message

/dev/nvme3n1 init wblocks: pristine-id 1464407 pristine 109155 free-q 591298, defrag-q 75

description

At startup, the status of unwritten blocks, free blocks, and blocks on the defrag queue. The sum of free-q and pristine blocks indicates the total space available for writing, which would match the number of free wblocks).

context

drv_ssd

introduced

-

removed

-

parameters

pristine-id

The ID of the first unwritten (pristine) block on the disk.

pristine

The number of completely unwritten (pristine) blocks on the disk.

free-q

The number of blocks that have been through the defragmentation process and are available to be re-written.

defrag-q

The number of blocks that are awaiting defragmentation on the defrag queue.

`(drv_ssd.c:2436) bad end marker for {digest}`

WARNING

message

(drv_ssd.c:2436) bad end marker for {digest}

description

This can occur on a subsequent cold restart after the initial upgrade to Database 6.x if you zeroized only the first 8 MB of the headers and didn’t erase the full device before upgrading. This error is generally not critical as any records would be previous versions of records that should have been written anew during the first migration on the node after the upgrade.

context

drv_ssd

introduced

-

removed

-

`encryption key or algorithm mismatch`

WARNING

message

encryption key or algorithm mismatch

description

At startup, this error indicates that the encryption key file used previously to encrypt data on the storage device does not match the file currently provided.

context

drv_ssd

introduced

-

removed

-

`error: block extends over read size: foff 5242880 boff 1047552 blen 1392`

WARNING

message

error: block extends over read size: foff 5242880 boff 1047552 blen 1392

description

The device likely has a bad sector. If this issue occurs frequently, replace the device.

context

drv_ssd

introduced

-

removed

-

parameters

foff

Offset of file containing malformed block.

boff

Offset of malformed block.

blen

Length of malformed block.

`get_key: failed as_storage_record_read_ssd()`

WARNING

message

get_key: failed as_storage_record_read_ssd()

description

Symptom of having run out of storage space. Resolved by a cold start.

context

drv_ssd

introduced

-

removed

-

`get_key: failed ssd_read_record()`

WARNING

message

get_key: failed ssd_read_record()

description

Aerospike was not able to read the record from storage. This may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_ssd

introduced

-

removed

-

`load_bins: failed ssd_read_record()`

WARNING

message

load_bins: failed ssd_read_record()

description

Aerospike was not able to read the record from storage. This may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_ssd

introduced

-

removed

-

`load_n_bins: failed ssd_read_record()`

WARNING

message

load_n_bins: failed ssd_read_record()

description

Aerospike was not able to read the record from storage. This may indicate a hardware failure. See What is the expected behaviour when an Aerospike node experiences an SSD hardware failure? for more information.

context

drv_ssd

introduced

-

removed

5.1.0

`metadata mismatch - removing <DIGEST_ID>`

WARNING

message

metadata mismatch - removing <DIGEST_ID>

description

This message indicates the change in a primary index bit and was introduced as a fix to AER-6335. The issue would occur for XDR-enabled namespaces upgrading from a 4.9 Database prior to 4.9.0.19. The workaround is to proceed with a rolling cold-restart.

context

drv_ssd

introduced

4.9.0.19

removed

-

`namespace NS waiting for defrag: 5 pct available, waiting for 10 ...`

INFO

message

namespace NS waiting for defrag: 5 pct available, waiting for 10 ...

description

The node is stuck in a defrag loop at startup where it cannot defragment enough to get device_free_pct below defrag-startup-minimum.

context

drv_ssd

introduced

-

removed

-

parameters

namespace

The namespace impacted.

avail pct

Current available percent.

required pct

Target available percent to reach in order to start up.

`{namespace} loaded: objects 12345 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 device-pcts (20, 30, 25)

description

Objects that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_ssd

introduced

removed

-

`{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)`

INFO

message

{namespace} loaded: objects 12345 tombstones 77 device-pcts (20, 30, 25)

description

Objects and tombstones that have been loaded during a cold start, and the percentage of each device that has been scanned. The device percentages are listed in the same order as the devices are defined in the config file.

context

drv_ssd

introduced

removed

-

`{namespace} out of space`

WARNING

message

{namespace} out of space

description

Indicates a shortage of free storage blocks. See How do I recover from Available Percent Zero?. To save log space, this message is logged with a (repeated: nnn) prefix just once per ticker-interval during periods of repetition.

context

drv_ssd

introduced

5.2

removed

-

parameters

{namespace}

Namespace being written to.

`{namespace_name} defrag: drive Drive1 totally full - waiting for vacated wblocks to be freed`

WARNING

message

{namespace_name} defrag: drive Drive1 totally full - waiting for vacated wblocks to be freed

description

Indicates that the node has no free storage blocks. In this case, the defragmentation process waits until a free block is available. See [How to Recover Contiguous Free Blocks]How do I recover from Available Percent Zero? for more information. To save log space, this message is logged with a (repeated: nnn) prefix just once per ticker-interval during periods of repetition.

context

drv_ssd

introduced

5.7, 5.6.0.13, 5.4.0.21, 5.3.0.26, 5.2.0.36, 5.1.0.42

removed

-

parameters

{namespace_name}

Affected namespace name.

drive

Affected storage device name.

`{namespace_name} device /dev/sda prior shutdown not clean`

INFO

message

{namespace_name} device /dev/sda prior shutdown not clean

description

Indicates that the previous shutdown was not trusted. The node must perform a cold start.

context

drv_ssd

introduced

-

removed

-

`{namespace_name} /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)`

INFO

message

{namespace_name} /dev/sda: used-bytes 296160983424 free-wblocks 885103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)

description

context

drv_ssd

introduced

3.10

removed

-

parameters

{namespace}

Name of the namespace the device and stats belong to.

/dev/sda

Name of the device for which the following stats apply.

used-bytes

Number of bytes on this device that are in use. Corresponds to the storage-engine.device[ix].used_bytes statistic.

free-wblocks

The number of wblocks that are free (the device_available_pct). Corresponds to the storage-engine.device[ix].free_wblocks statistic.

write-q

Number of write buffers pending to be written to the SSD. When this reaches the max-write-cache configured value (default 64M), ‘device overload’ errors are returned and ‘queue too deep’ warnings are printed on the server log. Corresponds to the storage-engine.device[ix].write_q statistic.

write

Total number of SSD write buffers written to this device since the Aerospike server started, including defragmentation, and the number of write buffers written per second. Corresponds to the storage-engine.device[ix].writes statistic. Does not include partial flushes at flush-max-ms.

defrag-q

Number of wblocks pending defragmentation. These are blocks that have fallen below the defrag-lwm-pct and are waiting to be read and have their relevant content recombined in a fresh streaming write buffer. The defrag-sleep setting controls the sleep period in between each block being read (default 1ms). Corresponds to the storage-engine.device[ix].defrag_q statistic.

defrag-read

Total number of write blocks that have been sent to the defragmentation queue (defrag-q) and read by the defragmentation thread on this device, and the normalization to the average number of wblocks processed per second during the interval at which this message is logged. Usually, defrag-q is at 0 and wblocks are read as they are put on the defrag-q. In such cases, defrag-read represents the number of wblocks read by the defragmentation thread. Corresponds to the storage-engine.device[ix].defrag_reads statistic.

defrag-write

Total number of write blocks written by defragmentation on this device since the Aerospike server started, and the number of wblocks written per second (subset of write). Corresponds to the storage-engine.device[ix].defrag_writes statistic.

shadow-write-q

Number of write buffers pending to be written to the shadow device (only printed when a shadow device is configured). When this reaches the configured max-write-cache value (default 64M), ‘device overload’ errors are returned and queue too deep warnings are printed to the server log. Corresponds to the storage-engine.device[ix].shadow_write_q statistic.

tomb-raider-read

Total number of blocks read by the tomb-raider in the current cycle, and the current number of wblocks read per second. Only printed when the tomb-raider is active.

`{namespace_name} durable delete fail: queue too deep: exceeds max 544`

WARNING

message

{namespace_name} durable delete fail: queue too deep: exceeds max 544

description

Indicates that the disks are not keeping up with the load placed upon them, although the disks themselves are not necessarily faulty or nearing end of life. See Why do I see warning - queue too deep for more information.

context

drv_ssd

introduced

5.7

removed

-

`{namespace_name} immigrate fail: queue too deep: exceeds max 576`

WARNING

message

{namespace_name} immigrate fail: queue too deep: exceeds max 576

description

Indicates that the disks are not keeping up with the load placed upon them, although the disks themselves are not necessarily faulty or nearing end of life. See Why do I see warning - queue too deep for more information.

context

drv_ssd

introduced

5.7

removed

-

`{namespace_name} read_ssd: invalid rblock_id :0x0c0008d663318a674a7bd379f6efd3bb1f55141d`

WARNING

message

{namespace_name} read_ssd: invalid rblock_id :0x0c0008d663318a674a7bd379f6efd3bb1f55141d

description

Indicates that a record with an invalid read-block is being read. This can happen if a node runs out of memory and swb cannot be allocated. It can also happen on some earlier database versions if a node runs out of device_available_pct before checking upfront for the available free blocks. A cold start of the node should resolve the issue.

context

drv_ssd

introduced

-

removed

-

parameters

NAMESPACE

Namespace the record with invalid read-block resides in.

rblock_id

Digest of the record with invalid read-block.

`{namespace_name} udf fail: queue too deep: exceeds max 512`

WARNING

message

{namespace_name} udf fail: queue too deep: exceeds max 512

description

All UDF writes fail by design.

context

drv_ssd

introduced

5.7

removed

-

`{namespace_name} write fail: queue too deep: exceeds max 512`

WARNING

message

{namespace_name} write fail: queue too deep: exceeds max 512

description

Indicates that the disks are not keeping up with the load placed upon them, although the disks themselves are not necessarily faulty or nearing end of life. See Why do I see warning - queue too deep for more information.

context

drv_ssd

introduced

3.0

removed

-

`{namespace_name} write: size 9437246 - rejecting 1142f0217ababf9fda5b1a4de66e6e8d4e51765e`

DETAIL

message

{namespace_name} write: size 9437246 - rejecting 1142f0217ababf9fda5b1a4de66e6e8d4e51765e

description

Most likely appears as a result of exceeding the write-block-size. The record’s digest is the last item in the log entry.

context

drv_ssd

introduced

5.2

removed

-

parameters

namespace

Namespace being written to.

size

Total size of the record that was rejected.

`read: bad block magic offset 303403269632`

WARNING

message

read: bad block magic offset 303403269632

description

The SSD is corrupted and the expected value of a block does not match the actual value. This issue has three potential causes:

The names of raw devices were swapped on server reboot. For example, the storage pointed to by /dev/sda is now /dev/sdb. See How to configure storage device to use the disk WWID? to learn about configuring persistent device names.
The storage-engine configuration in aerospike.conf has changed to reorder devices.
Hardware failure and data corruption.

context

drv_ssd

introduced

-

removed

-

parameters

offset

Location in the device where the mismatch was noticed.

`read failed: expected 512 got -1: fd 9900 data 0x7f277981b000 errno 5`

WARNING

message

read failed: expected 512 got -1: fd 9900 data 0x7f277981b000 errno 5

description

Aerospike tried to read 512 bytes from the device and the read() call returned -1 (error) with errno 5, which is an I/O error. This indicates a potential hardware issue.

context

drv_ssd

introduced

-

removed

-

parameters

expected

Size of block requested from disk.

fd

File descriptor used for read.

data

Offset of block.

`read_all: failed as_storage_record_read_ssd()`

INFO

message

read_all: failed as_storage_record_read_ssd()

description

Result of having run out of storage space. Resolved by a cold start.

context

drv_ssd

introduced

-

removed

-

`ssd_read: record b98946e3d616790 has no block associated, fail`

WARNING

message

ssd_read: record b98946e3d616790 has no block associated, fail

description

This is a result of having run out of space. The record was written to the index but not flushed to disk, creating this inconsistency. You can resolve this issue with a cold start.

context

drv_ssd

introduced

-

removed

-

parameters

record

The record’s hashed key.

`write bins: couldn’t get swb`

WARNING

message

write bins: couldn’t get swb

description

Indicates a shortage of free storage blocks. See How do I recover from Available Percent Zero?.

context

drv_ssd

introduced

-

removed

5.2

`write: size 9437246 - rejecting digest:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13`

DETAIL

message

write: size 9437246 - rejecting digest:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13

description

Appears with the WARNING message about failed as_storage_record_write() for exceeding the write-block-size.

context

drv_ssd

introduced

3.16

removed

5.2

parameters

size

Total size of the record that was rejected.

digest

Digest of the record that was rejected.

Exchange

`blocking client transactions in orphan state!`

WARNING

message

blocking client transactions in orphan state!

description

The node is not currently part of any cluster and will not accept any transactions from clients that are connected to it, meaning clients cannot access any partitions on the node.

context

exchange

introduced

-

removed

-

`error sending exchange data`

WARNING

message

error sending exchange data

description

Failure exchanging partition maps with another node because the node is down or disconnected from the network and unable to receive messages.

context

exchange

introduced

-

removed

-

`received duplicate exchange data from node 783f4ac2fbb57e81`

INFO

message

received duplicate exchange data from node 783f4ac2fbb57e81

description

Another node has resent exchange data because it did not receive an acknowledgment from this node within half the heartbeat interval. This is most likely due to a networking issue.

context

exchange

introduced

-

removed

-

parameters

node

NodeID of the origin of the unacknowledged exchange data

`received duplicate ready to commit message from node 783f4ac2fbb57e81`

INFO

message

received duplicate ready to commit message from node 783f4ac2fbb57e81

description

A node has resent the ready to commit message because it did not receive an acknowledgment from this node within half the heartbeat interval. This is most likely due to a networking issue between nodes in the cluster. Those ready to commit messages are sent by each node to the principal node when the exchange data has been completed on the node. The principal node also follows this pattern and sends itself such a ready to commit message. On a given node, the exchange data is done when the node has successfully sent its partition map to all the nodes in the cluster as well as received the partition map from each node in the cluster. This requires each node to ack back that it has received the exchange data. Only then would a node tell the principal that it is ready to commit. While the principal is waiting to receive the ready to commit from some nodes, other nodes would keep sending their ready to commit as they are ready and waiting. Therefore, the nodes for which such message is seen are the nodes that are ready and the other nodes would be the ones having potential issues completed their exchange data.

context

exchange

introduced

-

removed

-

parameters

node

Node ID of the node that is continuing to send the ready to commit message as it is waiting for the principal to acknowledge.

Exp

`build_cond - error 4 mismatched type <n> (TYPE_NAME) expected type <m> (<TYPE_NAME>) at default condition`

WARNING

message

build_cond - error 4 mismatched type <n> (TYPE_NAME) expected type <m> (<TYPE_NAME>) at default condition

description

Example: WARNING (exp): (exp.c:2168) build_cond - error 4 mismatched type 5 (map) expected type 0 (nil) at default condition.

In this example the type is map and the type is nil.

cond expression was illegally invoked by the client with mixed return types. The cond expression requires that all return types be the same with the exception of the unknown type.

context

exp

introduced

5.6

removed

-

`predexp deprecated - use new expressions API`

WARNING

message

predexp deprecated - use new expressions API

description

Indicates use of the deprecated Predicate Expressions API, which was replaced in Database 5.2 by the Aerospike Expressions API. The deprecated API is removed in Database 6.0. This warning is logged no more than once per log ticker cycle.

context

exp

introduced

5.6

removed

-

Fabric

`error creating fabric published endpoint list`

CRITICAL

message

error creating fabric published endpoint list

description

This issue can occur in the case of a failure in network interface initialization. Check the system logs for time of issue and verify the fabric network interface was initialized properly.

context

fabric

introduced

-

removed

-

`msg_read: could not deliver message type 1`

INFO

message

msg_read: could not deliver message type 1

description

Heartbeat message handler is not registered on the node. The message should disappear soon after the node joins the cluster. Otherwise, restart the asd service on the node.

context

fabric

introduced

-

removed

-

parameters

message type

Internal code.

`no IPv4 addresses configured for fabric`

WARNING

message

no IPv4 addresses configured for fabric

description

This issue can occur in the case of a failure in network interface initialization. Check the system logs for time of issue and verify the fabric network interface was initialized properly.

context

fabric

introduced

-

removed

-

`node b5 fds {via_connect={rw=8 ctrl=1 bulk=2 meta=1} all=24} live 1 q {rw=0 ctrl=0 bulk=0 meta=0}`

INFO

message

node b5 fds {via_connect={rw=8 ctrl=1 bulk=2 meta=1} all=24} live 1 q {rw=0 ctrl=0 bulk=0 meta=0}

description

Provides details on the number of fabric connections to each node in the cluster as well as pending outbound messages for each.

context

fabric

introduced

-

removed

-

occurs

This log line is generated when the info command dump-fabric:verbose=true is issued.

parameters

node

The node to which this node is connected through fabric.

fds rw= ctrl= bulk= meta=

The number of file descriptors to the node, broken up by channel (replica writes — includes proxies and duplicate resolution, control — for clustering, bulk — for migrations and meta — for system meta data or SMD)

fds all=

The total number of file descriptors to the node, typically double the sum of rw, ctrl, bulk and meta as each node establishes connections for each channel from each side.

q rw= ctrl= bulk= meta=

The number of pending outbound messages to the node, broken up by channel.

`r_msg_sz > sizeof(fc->r_membuf) 1048582`

WARNING

message

r_msg_sz > sizeof(fc->r_membuf) 1048582

description

This is from the run_fabric_accept thread rather than the run_fabric_recv thread. This thread is responsible for accepting new connections. These messages are only expected to store the NodeID and the ChannelID, which would always be only a few bytes and never expected to be over 1MiB. This WARNING is likely to indicate non-Aerospike traffic on the fabric port. Verify that the fabric port (default 3001) is not exposed. parameters:

name: “r_msg_sz” description: | Internal code.

context

fabric

introduced

-

removed

-

Flat

`record too small 0`

WARNING

message

record too small 0

description

Indicates that an inbound record for migration is corrupt. Appears with the WARNING message about handle insert: got bad record and is documented in Why Do I See Stalled Migrations And "record too small" Errors In The Log?

context

flat

introduced

-

removed

-

Hardware

`(skew_monitor.c:685) node bb90adf81f64eb0 HLC not in sync - hlc 112495319903567872 self-hlc 112644426202057622 diff 2275181556`

INFO

message

 (skew_monitor.c:685) node bb90adf81f64eb0 HLC not in sync - hlc 112495319903567872 self-hlc 112644426202057622 diff 2275181556

description

A strong consistency namespace can hit stop writes if clock_skew_stop_writes is in effect when cluster_clock_skew_ms is above the cluster_clock_skew_stop_writes_sec threshold.

context

hardware

introduced

4.0

removed

-

`failed to submit command to /dev/nvme0: x109`

WARNING

message

failed to submit command to /dev/nvme0: x109

description

Indicates that the underlying NVMe device does not support the health information check. To suppress this message, set log level to critical for the hardware context.

context

hardware

introduced

4.3

removed

-

Hb

`Broken pipe (under normal network conditions)`

INFO

message

Broken pipe (under normal network conditions)

description

This message appears before any namespace-specific shutdown messages in the normal shutdown sequence for mesh network nodes. It indicates that the node’s heartbeat is being stopped to prepare for prompt removal of the node from the cluster in case the shutdown process is lengthy and the node was not quiesced.

context

hb

introduced

5.3

removed

-

`Found a socket 0x7f0979812460 without an associated channel.`

WARNING

message

Found a socket 0x7f0979812460 without an associated channel.

description

This warning occurs in rare cases where a network failure causes two error events on the same socket. No action is required.

context

hb

introduced

-

removed

-

parameters

socket

HEX number identifying the socket within the OS.

`Timeout while connecting`

WARNING

message

Timeout while connecting

description

A peer node could not be reached on the heartbeat port. The node may be down or experiencing network issues.

context

hb

introduced

-

removed

-

`closing mesh heartbeat sockets`

INFO

message

closing mesh heartbeat sockets

description

This message appears early in the normal shutdown sequence for mesh network nodes, before any namespace-specific shutdown messages. It indicates that the node’s heartbeat is being stopped so the node is promptly removed from the cluster in case the shutdown process is lengthy and the node was not quiesced.

context

hb

introduced

5.3

removed

-

`closing multicast heartbeat sockets`

INFO

message

closing multicast heartbeat sockets

description

This message appears early in the normal shutdown sequence for multicast network nodes, before any namespace-specific shutdown messages. It indicates that the node’s heartbeat is being stopped so the node is promptly removed from the cluster in case the shutdown process is lengthy and the node was not quiesced.

context

hb

introduced

5.3

removed

-

`could not create heartbeat connection to node - 10.219.136.101 {10.219.136.101:3012}`

WARNING

message

could not create heartbeat connection to node - 10.219.136.101 {10.219.136.101:3012}

description

This warning can indicate that a cluster node is down. It is also possible that IP addresses in the cluster have changed, which can occur following a node restart. If the indicated node is expected to be part of the cluster, troubleshoot as normal. Otherwise, follow the tip-clear and services-alumni-reset steps from the node removal instructions to clear the errors.

context

hb

introduced

-

removed

-

occurs

When a node is expected to be part of the cluster and cannot be reached on the heartbeat port.

parameters

IP:port

IP and heartbeat port of the host that could not be reached.

`error allocating space for [multicast/mesh] recv buffer of size 1195725862 on fd 773`

WARNING

message

error allocating space for [multicast/mesh] recv buffer of size 1195725862 on fd 773

description

Indicates that non-Aerospike traffic may be sending data to the heartbeat port on this machine and some bits have been misinterpreted in the Aerospike protocol as a large buffer size. Such requests can lead to denial of service or disruption of cluster traffic if they are frequent. We recommend that you block access to the heartbeat port from outside the cluster.

context

hb

introduced

-

removed

-

parameters

size

Size of the message received (or the interpretation of a size).

fd

File descriptor of the socket the non-Aerospike message was received on.

`heartbeat TLS client handshake failed - 10.219.136.101 {10.219.136.101:3012}`

WARNING

message

heartbeat TLS client handshake failed - 10.219.136.101 {10.219.136.101:3012}

description

The other node could be reached, but there was an error in setting up the TLS connection. Check the other log messages near this one for details.

context

hb

introduced

-

removed

-

parameters

address:ports

Address(es) of the peer where the handshake failed.

`heartbeat TLS server handshake with 10.11.12.13:3012 failed`

WARNING

message

heartbeat TLS server handshake with 10.11.12.13:3012 failed

description

The other node could be reached, but there was an error in setting up the TLS connection. Check the other log messages near this one for details.

context

hb

introduced

-

removed

-

parameters

address:port

Address of the peer where the handshake failed.

`ignoring delayed heartbeat - expected timestamp less than 1483213248012 but was 1483213252345 from node: BB9020011AC4202`

WARNING

message

ignoring delayed heartbeat - expected timestamp less than 1483213248012 but was 1483213252345 from node: BB9020011AC4202

description

A received heartbeat message was not generated during the last heartbeat interval, which may indicate clock skew across the cluster.

context

hb

introduced

-

removed

-

parameters

expected ts

Latest timestamp expected (in milliseconds since the Aerospike Epoch of 2010-01-01 00:00:00)

actual ts

The actual timestamp in the message.

node

Source of the message.

`ignoring message from BB9030011AC4202 with different cluster name(dev_cluster)`

WARNING

message

ignoring message from BB9030011AC4202 with different cluster name(dev_cluster)

description

A node that is not part of the cluster is sending heartbeats. It is possible that the sending node has this node listed incorrectly as a seed node in the network.heartbeat stanza of its aerospike.conf.

context

hb

introduced

-

removed

-

parameters

node ID

ID of the node sending heartbeats.

cluster name

The cluster name that the sending node is presenting, which should help in identifying the node.

`mesh size recv failed fd 361: Connection timed out`

WARNING

message

mesh size recv failed fd 361: Connection timed out

description

An incomplete heartbeat message was received because there was a timeout in the socket connection.

context

hb

introduced

-

removed

-

parameters

fd

ID number of the file descriptor the message was received on.

error message

Error message returned from the OS.

`sending mesh message to BB9030011AC4202 on fd 361 failed : Broken pipe`

WARNING

message

sending mesh message to BB9030011AC4202 on fd 361 failed : Broken pipe

description

The node sent a heartbeat message to a peer node after the connection broke. The first parameter may be 0 rather than a node ID if the node has not been able to retrieve the remote node ID.

context

hb

introduced

-

removed

-

occurs

Under rare circumstances, particularly in large clusters after a cluster event, the server may continue to use a heartbeat connection that is no longer usable, which repeatedly generates the Broken pipe message. In this situation, the error occurs on the same file descriptor after the following error:

sending mesh message to bb9030011ac4202 on fd 361 failed : Connection reset by peer

Note: if the error keeps occurring on the same file descriptor, the node is not able to re-join the cluster and must be restarted.

parameters

node

Node that should have received the message.

fd

ID number of the file descriptor the message was sent on.

error message

Error message returned from the OS.

`sending mesh message to BB9030011AC4202 on fd 361 failed : No route to host`

WARNING

message

sending mesh message to BB9030011AC4202 on fd 361 failed : No route to host

description

Tried to send a heartbeat message to a peer, but the socket had some problem.

context

hb

introduced

-

removed

-

parameters

node

Node that should have received the message.

fd

ID number of the file descriptor the message was sent on.

error message

Error message returned from the OS.

`unable to parse heartbeat message on fd 361`

WARNING

message

unable to parse heartbeat message on fd 361

description

Received a malformed message on the heartbeat port, possibly due to a non-Aerospike process unintentionally or maliciously trying to connect on the port.

context

hb

introduced

-

removed

-

parameters

fd

ID number of the file descriptor the message was received on.

`updating mesh endpoint address from {10.219.136.101:3002} to {10.219.136.101:3002,10.219.148.101:3002}`

INFO

message

updating mesh endpoint address from {10.219.136.101:3002} to {10.219.136.101:3002,10.219.148.101:3002}

description

This info message indicates that the local node connected to one IP but received two in return. This is common in situations where nodes have multiple NICs and an address is not specified in the network.heartbeat context.

context

hb

introduced

-

removed

-

parameters

original address

Address connected to initially.

new addresses

List of addresses advertised by the node connected to.

Index

`could not allocate 1073741824-byte arena stage 13: Cannot allocate memory`

WARNING

message

could not allocate 1073741824-byte arena stage 13: Cannot allocate memory

description

Aerospike Enterprise edition allocates memory in 1GiB arenas for the primary index. This message indicates that the process failed to allocate the 13th arena. A contiguous 1GiB of memory should be available. The process continues to serve read and update transactions, but all new writes fail. This message is always be followed by the following warning:

WARNING (index): (index.c:737)(repeated:xxx)arenax alloc failed

context

index

introduced

-

removed

-

Info

`NODE-ID bb97f1d46894206 CLUSTER-SIZE 12 CLUSTER-NAME myCluster`

INFO

message

NODE-ID bb97f1d46894206 CLUSTER-SIZE 12 CLUSTER-NAME myCluster

description

context

info

introduced

3.9

removed

-

parameters

NODE-ID

The generated node ID, based on the MAC address and the service port.

CLUSTER-SIZE

Number of nodes recognized by this node as being in the cluster.

CLUSTER-NAME

Name of the cluster. Appears as null for unnamed clusters.

`batch-index: batches (234,0,0) delays 0`

INFO

message

batch-index: batches (234,0,0) delays 0

description

context

info

introduced

3.9

removed

-

occurs

Displayed periodically, every 10 seconds by default, and only if batch-index transactions have been issued on this node. Message details are aggregated across all namespaces.

parameters

batches

Number of batch-index jobs since the server started (Success, Error, Timed out). Success means all the sub-transactions for the batch-index job were dispatched successfully. The sub-transactions for the batch-index job could still error or time out individually, even if the parent batch-index job reported a success status. Also, an unsuccessful parent batch-index job could have some of its sub-transactions processed with any resulting status. No correlation can be made between the status of a parent batch-index job and and the statuses of its sub-transactions.

Related metrics:

delays

Number of times the job’s response buffer has been delayed by the sending process’s WOULDBLOCK to avoid overflowing the buffer.

Related metric:

batch_index_delay

`early-fail: demarshal 0 tsvc-client 1 tsvc-from-proxy 0 tsvc-from-proxy-batch-sub 0`

INFO

message

early-fail: demarshal 0 tsvc-client 1 tsvc-from-proxy 0 tsvc-from-proxy-batch-sub 0

description

context

info

introduced

4.5.1

removed

-

occurs

Displayed periodically, every 10 seconds by default, and only when any transaction failed early on this node. Cumulative since asd start and aggregated across all namespaces. Single-digit counts are likely not problematic. Prior to Database 7.2, the log line read early-fail: demarshal 0 tsvc-client 1 tsvc-from-proxy 0 tsvc-batch-sub 0 tsvc-from-proxy-batch-sub 0 tsvc-udf-sub 0 tsvc-ops-sub 0

parameters

demarshal

Failure during the demarshal phase of a transaction. Metric: demarshal_error.

tsvc-client

Indicates a failure for client initiated transactions, before getting to the namespace part. This can be due to authentication failure at the socket level, a missing or bad namespace, or an initial partition imbalance caused by a node that just started and has not yet joined the cluster. The latter would result in an unavailable error back to the client. Metric: early_tsvc_client_error.

tsvc-from-proxy

Indicates a failure for proxied transactions, before getting to the namespace part. This can be due to authentication failure at the socket level, a missing or bad namespace, or an initial partition imbalance caused by a node that just started and has not yet joined the cluster. The latter would result in an unavailable error. Metric: early_tsvc_from_proxy_error.

tsvc-batch-sub

Part of a batch sub transaction. Metric: early_tsvc_batch_sub_error.

tsvc-from-proxy-batch-sub

Part of a proxied batch sub transaction. Metric: early_tsvc_from_proxy_batch_sub_error.

tsvc-udf-sub

Part of a UDF sub transaction. Metric: early_tsvc_udf_sub_error.

tsvc-ops-sub

Part of a scan/query background ops sub transaction. Metric: early_tsvc_ops_sub_error.

`fabric-bytes-per-second: bulk (1525,7396) ctrl (33156,46738) meta (42,42) rw (128,128)`

INFO

message

fabric-bytes-per-second: bulk (1525,7396) ctrl (33156,46738) meta (42,42) rw (128,128)

description

context

info

introduced

3.11.1.1

removed

-

occurs

Fabric traffic statistics, displayed every 10 seconds by default.

parameters

bulk

Current transmit and receive rate for fabric-channel-bulk. This channel is used for record migrations during rebalance.

ctrl

Current transmit and receive rate for fabric-channel-ctrl. This channel distributes cluster membership change events and partition migration control messages.

`fds: proto (38553,57711444,57672891) heartbeat (27,553,526) fabric (648,2686,2038)`

INFO

message

fds: proto (38553,57711444,57672891) heartbeat (27,553,526) fabric (648,2686,2038)

description

context

info

introduced

3.9

removed

-

parameters

proto

Client connections statistics, in order:

Include connections that are reaped after idle (reaped connections correspond to reaped_fds), properly shutdown by the client (initiated a proper socket close) or preliminary packet parsing errors like unexpected headers, most of these would have a WARNING in the logs.

heartbeat

Heartbeat connections statistics, in order:

fabric

Fabric connections statistics, in order:

`heartbeat-received: self 887075 : foreign 35456447`

INFO

message

heartbeat-received: self 887075 : foreign 35456447

description

context

info

introduced

3.9

removed

-

parameters

self

Number of heartbeats the current node has received from itself (should be 0 for mesh).

foreign

Number of heartbeats the current node has received from all other nodes combined.

`histogram dump: {ns-name}-{hist-name} (1344911766 total) msec (00: 1262539302) (01: 0049561831) (02: 0013431778) (03: 0007273116) (04: 0004299011) (05: 0003086466) (06: 0002182478) (07: 0001854797) (08: 0000312272) (09: 0000370715)`

INFO

message

histogram dump: {ns-name}-{hist-name} (1344911766 total) msec (00: 1262539302) (01: 0049561831) (02: 0013431778) (03: 0007273116) (04: 0004299011) (05: 0003086466) (06: 0002182478) (07: 0001854797) (08: 0000312272) (09: 0000370715)

description

In the example, 05:0003086466 implies 3,086,466 data points took between 16 and 32 milliseconds. You can access additional histograms by enabling microbenchmarks or storage-benchmarks, statically or dynamically, in the service context of the configuration. See Monitoring latencies for monitoring latencies and details about the histograms.

context

info

introduced

3.9

removed

-

occurs

Periodically printed to the logs, every 10 seconds by default.

parameters

histogram dump

Name of the histogram to follow for the {ns-name} namespace

total

Number of data points represented by this histogram (since the server started)

N

Number of data points within units (e.g. msec or bytes) greater than 2^(N-1) and less than 2^N for N>0, between 0 and 1 for N=0

`in-progress: info-q 5 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0`

INFO

message

in-progress: info-q 5 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0

description

context

info

introduced

3.9

removed

-

parameters

tsvc-q

Removed in 4.7.0.2. Number of transactions siting in the transaction queue, waiting to be picked up by a transaction thread. Corresponds to the tsvc_queue statistic.

info-q

Number of transactions on the info transaction queue. Corresponds to the info_queue statistic.

nsup-delete-q

Removed in 4.5.1. Number of records queued up for deletion by the nsup thread.

rw-hash

Number of transactions that are parked on the read write hash. This is used for transactions that have to be processed on a different node. For example, prole writes, or read duplicate resolutions (when requested through client policy). Corresponds to the rw_in_progress statistic.

proxy-hash

Number of transactions on the proxy hash waiting for transmission on the fabric. Corresponds to the proxy_in_progress statistic.

rec-refs

Removed in 3.10. Number of references to a primary key.

tree-gc-q

Introduced in 3.10. This is the number of trees queued up, ready to be completely removed (partitions drop). Corresponds to the tree_gc_queue statistic.

long-queries

Introduced in version 6.1. Number of long queries currently active. Corresponds to the long_queries_active statistic.

`{<namespace>} data-usage: used-bytes 1073741824000 avail-pct 20`

INFO

message

{<namespace>} data-usage: used-bytes 1073741824000 avail-pct 20

description

context

info

introduced

7.0

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default. This is a break down of the device usage if a storage-engine device has been configured for the namespace.

parameters

used-bytes

Number of bytes used on disk for {ns_name} on the local node.

avail-pct

Minimum percentage of contiguous disk space in {ns_name} on the local node across all devices. Corresponds to the device_available_pct statistic. Appears for storage-engine device.

cache-read-pct

Percentage of reads from the post-write cache instead of disk. Only applicable when {ns_name} is not configured for data in memory. Corresponds to the cache_read_pct statistic.

`{namespace_name} index-flash-usage: used-bytes 5502926848 used-pct 1 alloc-bytes 16384000 alloc-pct 92`

INFO

message

{namespace_name} index-flash-usage: used-bytes 5502926848 used-pct 1 alloc-bytes 16384000 alloc-pct 92

description

context

info

introduced

4.3.0.2

removed

7.0

occurs

Displayed periodically for each namespace configured with ‘index-type flash’, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the device and stats belongs to.

index-flash-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the primary index used by this namespace on this node.

used-pct

Percentage of the mount in use for the primary index used by this namespace on this node.

alloc-bytes

Total bytes allocated on the mount for the primary index used by this namespace on this node. This statistic represents entire 4KiB chunks that have at least one element in use. This statistic was introduced in 5.6. Corresponds to the index_flash_alloc_bytesstatistic.

alloc-pct

Percentage of the mount allocated for the primary index used by this namespace on this node. This statistic represents entire 4KiB chunks that have at least one element in use. This statistic was introduced in 5.6. Corresponds to the index_flash_alloc_pct statistic.

`{namespace_name} index-pmem-usage: used-bytes 5502926848 used-pct 1`

INFO

message

{namespace_name} index-pmem-usage: used-bytes 5502926848 used-pct 1

description

context

info

introduced

4.5

removed

7.0

occurs

Displayed periodically for each namespace configured with ‘index-type pmem’, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the index and stats belongs to.

index-pmem-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the primary index used by this namespace on this node.

used-pct

Percentage of the mount in use for the primary index used by this namespace on this node.

`{namespace_name} index-usage: used-bytes 5502926848`

INFO

message

{namespace_name} index-usage: used-bytes 5502926848

description

context

info

introduced

7.0

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the device and stats belongs to.

index-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the primary index used by this namespace on this node.

used-pct

Percentage of the mount in use for the primary index used by this namespace on this node.

`{namespace_name} set-index-usage: used-bytes 5502926848 used-pct 1`

INFO

message

{namespace_name} set-index-usage: used-bytes 5502926848 used-pct 1

description

context

info

introduced

7.0

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the index and stats belongs to.

set-index-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the primary index used by this namespace on this node.

used-pct

Percentage of the mount in use for the primary index used by this namespace on this node.

`{namespace_name} sindex-pmem-usage: used-bytes 12345 used-pct 43`

INFO

message

{namespace_name} sindex-pmem-usage: used-bytes 12345 used-pct 43

description

context

info

introduced

6.3

removed

7.0

occurs

Displayed periodically for each namespace configured with ‘sindex-type pmem’, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the index and stats belongs to.

sindex-pmem-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the secondary indexes used by this namespace on this node.

used-pct

Percentage of the mount in use for the secondary indexes used by this namespace on this node.

`{namespace_name} sindex-usage: used-bytes 12345 used-pct 43`

INFO

message

{namespace_name} sindex-usage: used-bytes 12345 used-pct 43

description

context

info

introduced

7.0

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the index and stats belongs to.

sindex-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the secondary indexes used by this namespace on this node.

used-pct

Percentage of the mount in use for the secondary indexes used by this namespace on this node.

`{ns_name} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)`

INFO

message

{ns_name} batch-sub: tsvc (0,0) proxy (0,0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)

description

context

info

introduced

3.9

removed

-

occurs

Batch index transaction statistics for each namespace, displayed every 10 seconds by default, and only if batch index transactions hit this namespace on this node. No correlation can be made between the status of a parent batch-index job and the statuses of its sub-transactions. See the batch-index log entry for details.

parameters

tsvc

Number of batch-index read sub transactions that failed in the transaction service (Error,Timed out). Corresponds to the batch_sub_tsvc_error and batch_sub_tsvc_timeout statistics.

proxy

Number of proxied batch-index read sub transactions (Success,Error,Timed out). Corresponds to the batch_sub_proxy_complete, batch_sub_proxy_error, and batch_sub_proxy_timeout statistics.

read

Number of batch-index read sub transactions (Success,Error,Timed out,Not found,Filtered out). Corresponds to the batch_sub_read_success, batch_sub_read_error, batch_sub_read_timeout, batch_sub_read_not_found, and batch_sub_read_filtered_out statistics.

write

Number of batch-index write sub transactions (Success,Error,Timed out,Filtered out). Corresponds to the batch_sub_write_success, batch_sub_write_error, batch_sub_write_timeout, and batch_sub_write_filtered_out statistics. Displayed as of Database 6.0.

delete

Number of batch-index delete sub transactions (Success,Error,Timed out,Not found,Filtered out). Corresponds to the batch_sub_delete_success, batch_sub_delete_error, batch_sub_delete_timeout, batch_sub_delete_not_found, and batch_sub_delete_filtered_out statistics. Displayed as of Database 6.0.

udf

Number of batch-index udf sub transactions (Complete,Error,Timed out,Filtered out). Corresponds to the batch_sub_udf_complete, batch_sub_udf_error, batch_sub_udf_timeout, and batch_sub_udf_filtered_out statistics. Displayed as of Database 6.0.

lang

Number of batch-index lang sub transactions (Delete Success,Error,Read Success,Write Success). Corresponds to the batch_sub_lang_delete_success, batch_sub_lang_error, batch_sub_lang_read_success, and batch_sub_lang_write_success statistics. Displayed as of Database 6.0.

`{ns_name} client: tsvc (0,0) proxy (0,0,0) read (126,0,1,3,1) write (2886,0,23,2) delete (197,0,1,19,3) udf (35,0,1,4) lang (26,7,0,3)`

INFO

message

{ns_name} client: tsvc (0,0) proxy (0,0,0) read (126,0,1,3,1) write (2886,0,23,2) delete (197,0,1,19,3) udf (35,0,1,4) lang (26,7,0,3)

description

context

info

introduced

3.9

removed

-

occurs

Basic client transaction statistics displayed periodically for each namespace, every 10 seconds by default, and only if client transactions hit this namespace on this node.

The following values define various actions displayed in the logs:

S - success
C - complete, but success or failure is indeterminate. Proxy diverts and UDFs can successfully send a “FAILURE” response bin.
E - error
T - timed out
N - not found, which for reads and deletes is a result that wants to be distinguished from success but is not an error
F - result filtered out or action skipped by predexp
R,W,D - successful UDF read, write, or delete operation.

parameters

tsvc

Failures in the transaction service, before attempting to handle the transaction (E,T). Also reported as the client_tsvc_error and client_tsvc_timeout statistics.

proxy

Client proxied transactions (C,E,T). This should only happen during migrations. Also reported as the client_proxy_complete, client_proxy_error, and client_proxy_timeout statistics.

read

Client read transactions (S,E,T,N,F). Also reported as the client_read_success, client_read_error, client_read_timeout, client_read_not_found, and client_read_filtered_out statistics.

write

Client write transactions (S,E,T,F). Also reported as the client_write_success, client_write_error, client_write_timeout, and client_write_filtered_out statistics.

delete

Client delete transactions (S,E,T,N,F). Also reported as the client_delete_success, client_delete_error, client_delete_timeout, client_delete_not_found, and client_delete_filtered_out statistics.

udf

Client UDF transactions (C,E,T,F). See the lang stat breakdown for the underlying operation statuses. Also reported as the client_udf_complete, client_udf_error, client_udf_timeout, and client_udf_filtered_out statistics.

lang

Statistics for UDF operation statuses (R,W,D,E). Also reported as the client_lang_read_success, client_lang_write_success, client_lang_delete_success, and client_lang_error statistics.

`{ns_name} data-usage: used-bytes 2054187648 avail-pct 92`

INFO

message

{ns_name} data-usage: used-bytes 2054187648 avail-pct 92

description

context

info

introduced

7.0

removed

-

occurs

If storage-engine pmem has been configured for the namespace, this is a break down of the pmem storage file usage. Displayed periodically for each namespace, every 10 seconds by default.

parameters

used-bytes

Number of bytes used on pmem storage files for {ns_name} on the local node.

avail-pct

Minimum percentage of contiguous pmem storage file space in {ns_name} on the local node across all pmem storage files. Corresponds to the pmem_available_pct statistic.

`{ns_name} device-usage: used-bytes 2054187648 avail-pct 92 cache-read-pct 12.35`

INFO

message

{ns_name} device-usage: used-bytes 2054187648 avail-pct 92 cache-read-pct 12.35

description

context

info

introduced

3.9

removed

7.0

occurs

If storage-engine device has been configured for the namespace, this is a break down of the device usage. Displayed periodically for each namespace, every 10 seconds by default.

parameters

used-bytes

Number of bytes used on disk for {ns_name} on the local node.

avail-pct

Minimum percentage of contiguous disk space in {ns_name} on the local node across all devices. Corresponds to the device_available_pct statistic.

cache-read-pct

Percentage of reads from the post-write cache instead of disk. Only applicable when {ns_name} is not configured for data in memory. Corresponds to the cache_read_pct statistic.

`{ns_name} dup-res: ask 1234 respond (10,4321)`

INFO

message

{ns_name} dup-res: ask 1234 respond (10,4321)

description

context

info

introduced

5.5

removed

-

occurs

Statistics for transactions that are asking or handling duplicate resolution. Displayed periodically for each namespace, every 10 seconds by default.

parameters

ask

Number of duplicate resolution requests made by the node to other individual nodes. Also reported as the dup_res_ask statistic.

respond

Number of duplicate resolution requests handled by the node, broken up between transactions where a read was required and transactions where a read was not required. Also reported as the dup_res_respond_read and dup_res_respond_no_read statistics.

`{ns_name} from-proxy: tsvc (0,0) read (105,0,1,7) write (2812,0,22,1) delete (188,0,1,16,2) udf (35,0,1,3) lang (26,7,0,3)`

INFO

message

{ns_name} from-proxy: tsvc (0,0) read (105,0,1,7) write (2812,0,22,1) delete (188,0,1,16,2) udf (35,0,1,3) lang (26,7,0,3)

description

context

info

introduced

4.5.1

removed

-

occurs

Basic proxied transaction statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after proxied transactions hit this namespace on this node.

The following values define various actions that are displayed in the logs

S - success
C - complete, but success/failure indeterminate. UDFs can successfully send a “FAILURE” response bin.
E - error
T - timed out
N - not found, which for reads and deletes is a result that wants to be distinguished from success but is not an error
F - result filtered out or action skipped by predexp
R,W,D - successful UDF read, write, or delete operation.

parameters

tsvc

Failures in the transaction service, before attempting to handle the transaction (E,T). Also reported as the following statistics:

read

Proxied read transactions (S,E,T,N,F). Also reported as the following statistics:

write

Proxied write transactions (S,E,T,F). Also reported as the following statistics:

delete

Proxied delete transactions (S,E,T,N,F). Also reported as the following statistics:

from_proxy_delete_success
from_proxy_delete_error
from_proxy_delete_timeout](/database/reference/metrics#namespace__from_proxy_delete_timeout)
from_proxy_delete_not_found
from_proxy_delete_filtered_out

udf

Proxied UDF transactions (C,E,T,F). See the lang stat breakdown for the underlying operation statuses. Also reported as the following statistics:

lang

Statistics for proxied UDF operation statuses (R,W,D,E). Also reported as thefollowing statistics:

from_proxy_lang_read_success
from_proxy_lang_write_success
from_proxy_lang_delete_success](/database/reference/metrics#namespace__from_proxy_lang_delete_success)
from_proxy_lang_error statistics.

`{ns_name} from-proxy-batch-sub: tsvc (0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)`

INFO

message

{ns_name} from-proxy-batch-sub: tsvc (0,0) read (959,0,0,51,1) write (0,0,0,0) delete (0,0,0,0,0) udf (0,0,0,0) lang (0,0,0,0)

description

context

info

introduced

4.5.1

removed

-

occurs

Proxied batch index transaction statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after proxied batch index transactions hit this namespace on this node. No correlation can be made between the status of a parent batch-index job and the statuses of its sub-transactions. See the batch-index log entry for details.

parameters

tsvc

Number of proxied batch-index read sub transactions that failed in the transaction service (Error,Timed out). Corresponds to the following statistics:

read

Number of proxied batch-index read sub transactions (Success,Error,Timed out,Not found). Corresponds to the followng statistics:

write

Number of proxied batch-index write sub transactions (Success,Error,Timed out,Filtered out). Corresponds to the following statistics

from_proxy_batch_sub_write_success
from_proxy_batch_sub_write_error
from_proxy_batch_sub_write_timeout
from_proxy_batch_sub_write_filtered_out

Displayed as of Database 6.0.

delete

Number of proxied batch-index delete sub transactions (Success,Error,Timed out,Not found,Filtered out). Corresponds to the following stastics:

Displayed as of Database 6.0.

udf

Number of proxied batch-index udf sub transactions (Complete,Error,Timed out,Filtered out). Corresponds to the following statistics:

Displayed as of Database 6.0.

lang

Number of proxied batch-index lang sub transactions (Delete Success,Error,Read Success,Write Success). Corresponds to the following statistics:

Displayed as of Database 6.0.

`{ns_name} index-usage: used-bytes 3121300 used-pct 0.05 alloc-bytes 16777216 alloc-pct 0.50`

INFO

message

{ns_name} index-usage: used-bytes 3121300 used-pct 0.05 alloc-bytes 16777216 alloc-pct 0.50

description

context

info

introduced

3.9

removed

7.0

occurs

Index usage statistics for {ns_name}, displayed periodically for each namespace, every 10 seconds by default. When ‘data-in-memory’ is false, ‘data-bytes’ is not included. ‘set-index-bytes’ is included as of Database 5.6.

parameters

used-bytes

Total number of bytes used in memory for {ns_name} on the local node.

used-pct

Percentage of bytes used in memory for {ns_name} on the local node.

alloc-pct

Number of bytes holding secondary indexes in process memory for {ns_name} on the local node.

alloc-bytes

Number of bytes holding data in process memory for {ns_name} on the local node. Displayed only when ‘data-in-memory’ is set to true for {ns_name}.

`{ns_name} memory-usage: total-bytes 3121300 index-bytes 140544 set-index-bytes 70272 sindex-bytes 221544 data-bytes 2688940 used-pct 0.05`

INFO

message

{ns_name} memory-usage: total-bytes 3121300 index-bytes 140544 set-index-bytes 70272 sindex-bytes 221544 data-bytes 2688940 used-pct 0.05

description

context

info

introduced

3.9

removed

7.0

occurs

Memory usage statistics for {ns_name}, displayed periodically for each namespace, every 10 seconds by default. When ‘data-in-memory’ is false, ‘data-bytes’ is not included. ‘set-index-bytes’ is included as of Database 5.6.

parameters

total-bytes

Total number of bytes used in memory for {ns_name} on the local node.

index-bytes

Number of bytes holding the primary index in system memory for {ns_name} on the local node. Displays 0 when index is not stored in RAM.

set-index-bytes

Number of bytes holding set indexes in process memory for {ns_name} on the local node. Displayed as of Database 5.6.

sindex-bytes

Number of bytes holding secondary indexes in process memory for {ns_name} on the local node.

data-bytes

Number of bytes holding data in process memory for {ns_name} on the local node. Displayed only when ‘data-in-memory’ is set to true for {ns_name}.

used-pct

Percentage of bytes used in memory for {ns_name} on the local node.

`{ns_name} migrations: remaining (654,289,254) active (1,1,0) complete-pct 88.49`

INFO

message

{ns_name} migrations: remaining (654,289,254) active (1,1,0) complete-pct 88.49

description

context

info

introduced

3.9

removed

-

occurs

Migration statistics for {ns_name}, displayed periodically for each namespace, every 10 seconds by default. When migrations have completed, this line is reduced to {ns_name} migrations - complete.

parameters

{ns_name}

“ns_name” is replaced by the name of a particular namespace.

remaining

Total number of transmit and receive partition migrations outstanding for this node as well as signals. Signals represents the number of signals to send to other nodes (non replica nodes) for partitions to drop. This log line changes to complete after migrations are completed on this node.

active

Number of transmit and receive partition migrations currently in progress, as well as active signals.

complete-pct

Percent of the total number of partition migrations scheduled for this rebalance that have already completed.

`{ns_name} objects: all 845922 master 281071 prole 564851`

INFO

message

{ns_name} objects: all 845922 master 281071 prole 564851

description

context

info

introduced

3.9

removed

-

occurs

Object statistics for {ns_name}, displayed periodically for each namespace, every 10 seconds by default. Number of objects for this namespace on this node along with the master and prole breakdown.

parameters

{ns_name}

“ns_name” is replaced by the name of a particular namespace.

all

Total number of objects for this namespace on this node (master and proles).

master

Number of master objects for this namespace on this node.

prole

Number of prole (replica) objects for this namespace on this node.

`{ns_name} ops-sub: tsvc (0,0) write (2651,0,0,1)`

INFO

message

{ns_name} ops-sub: tsvc (0,0) write (2651,0,0,1)

description

context

info

introduced

4.7.0.2

removed

-

occurs

Scan/query ops sub-transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after scan/query ops sub-transactions hit this namespace on this node.

parameters

tsvc

Number of ops sub-transactions of scan/query background ops jobs that failed in the transaction service (Error,Timed out). Corresponds to the ops_sub_tsvc_error and ops_sub_tsvc_timeout statistics.

write

Number of ops sub-transactions of scan/query background ops jobs (Success,Error,Timed out,Filtered out). Corresponds to the ops_sub_write_success, ops_sub_write_error, ops_sub_write_timeout, and ops_sub_write_filtered_out statistics.

`{ns_name} pi-query: short-basic (4,0,0) long-basic (36,0,0) aggr (0,0,0) udf-bg (7,0,0), ops-bg (3,0,0)`

INFO

message

{ns_name} pi-query: short-basic (4,0,0) long-basic (36,0,0) aggr (0,0,0) udf-bg (7,0,0), ops-bg (3,0,0)

description

context

info

introduced

6.0

removed

-

occurs

Primary index query (pi-query) transaction statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after pi-query transactions hit this namespace on this node.

parameters

short-basic

Number of short primary index (pi)-query jobs since the server started (Success,Error,Aborted). Short queries are declared by the client, are unmonitored and typically run for a second or less. Corresponds to the pi_query_short_basic_complete, pi_query_short_basic_error, and pi_query_short_basic_timeout statistics.

long-basic

Number of long pi-query jobs since the server started (Success,Error,Aborted). Long queries are monitored and not time bounded. Corresponds to the pi_query_long_basic_complete, pi_query_long_basic_error, and pi_query__long_basic_abort statistics.

aggr

Number of pi-query aggregation jobs since the server started (Success,Error,Aborted). Corresponds to the pi_query_aggr_complete, pi_query_aggr_error, and pi_query_aggr_abort statistics.

udf-bg

Number of pi-query background udf jobs since the server started (Success,Error,Aborted). Corresponds to the pi_query_udf_bg_complete, pi_query_udf_bg_error, and pi_query_udf_bg_abort statistics.

ops-bg

Number of pi-query background operations (ops) jobs since the server started (Success,Error,Aborted). Corresponds to the pi_query_ops_bg_complete, pi_query_ops_bg_error, and pi_query_ops_bg_abort statistics.

`{ns_name} pmem-usage: used-bytes 2054187648 avail-pct 92`

INFO

message

{ns_name} pmem-usage: used-bytes 2054187648 avail-pct 92

description

context

info

introduced

4.8

removed

7.0

occurs

Break down of the pmem storage file usage, displayed periodically for each namespace, every 10 seconds by default. Displayed only if storage-engine pmem has been configured for the namespace.

parameters

used-bytes

Number of bytes used on pmem storage files for {ns_name} on the local node.

avail-pct

Minimum percentage of contiguous pmem storage file space in {ns_name} on the local node across all pmem storage files. Corresponds to the pmem_available_pct statistic.

`{ns_name} query: basic (210,0,0) aggr (0,0,0) udf-bg (0,0,0) ops-bg (0,0,0)`

INFO

message

{ns_name} query: basic (210,0,0) aggr (0,0,0) udf-bg (0,0,0) ops-bg (0,0,0)

description

context

info

introduced

5.7

removed

6.0

occurs

Secondary index query transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after query transactions hit this namespace on this node.

parameters

basic

Number of secondary index queries since the server started (Completed,Error,Abort). Corresponds to the query_basic_complete, query_basic_error and query_basic_abort statistics.

aggr

Number of query aggregation jobs since the server started (Completed,Error,Abort). Corresponds to the query_aggr_complete, query_aggr_error and query_aggr_abort statistics.

udf-bg

Number of query background udf jobs since the server started (Completed,Error,Abort). Corresponds to the query_udf_bg_complete, query_udf_bg_error and query_udf_bg_abort statistics.

ops-bg

Number of query background operations (ops) jobs since the server started (Success,Failure). Corresponds to the query_ops_bg_success and query_ops_bg_failure statistics.

`{ns_name} query: basic (5,0) aggr (6,0) udf-bg (1,0) ops-bg (2,0)`

INFO

message

{ns_name} query: basic (5,0) aggr (6,0) udf-bg (1,0) ops-bg (2,0)

description

context

info

introduced

3.9

removed

5.7

occurs

Secondary index query transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after query transactions hit this namespace on this node.

parameters

basic

Number of secondary index queries since the server started (Success,Abort). Corresponds to the query_lookup_success and query_lookup_abort statistics.

aggr

Number of query aggregation jobs since the server started (Success,Abort). Corresponds to the query_agg_success and query_agg_abort statistics.

udf-bg

Number of query background udf jobs since the server started (Success,Failure). Corresponds to the query_udf_bg_success and query_udf_bg_failure statistics.

ops-bg

Number of query background operations (ops) jobs since the server started (Success,Failure). Corresponds to the query_ops_bg_success and query_ops_bg_failure statistics.

`{ns_name} read-touch: tsvc (0,0) all-triggers (12345,0,3,45) 14`

INFO

message

{ns_name} read-touch: tsvc (0,0) all-triggers (12345,0,3,45) 14

description

Statistics for the read-touch LRU behavior, described in configuring namespace data retention and controlled by default-read-touch-ttl-pct.

context

info

introduced

7.1

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default. The following values define various actions that are displayed in the logs:

S - success
E - error
T - timed out
K - skipped

parameters

tsvc

Read-touch early timeouts (E, T). Also reported as the read_touch_tsvc_error and read_touch_tsvc_timeout statistics.

all-triggers

Read-touch transactions (S, E, T, K). Also reported as the read_touch_success,read_touch_error, read_touch_timeout, and read_touch_skip statistics.

`{ns_name} re-repl: tsvc (0,34) all-triggers (525,0,32) unreplicated-records 14`

INFO

message

{ns_name} re-repl: tsvc (0,34) all-triggers (525,0,32) unreplicated-records 14

description

Statistics for transactions re-replicating, applicable to strong-consistency enabled namespaces. In strong consistency mode, write transactions failing replication are marked as unreplicated, and attempt to re-replicate one time immediately (despite returning an in doubt timeout failure to the client), as well as on any subsequent transaction attempted on the record (read or write).

context

info

introduced

4.0

removed

-

occurs

Displayed periodically for each namespace, every 10 seconds by default. The following values define various actions which are displayed in the logs:

S - success
E - error
T - timed out

parameters

tsvc

Re-replication early timeouts (E, T). Also reported as the re_repl_tsvc_error and re_repl_tsvc_timeout statistics.

all-triggers

Re-replication transactions (S, E, T). Also reported as the re_repl_success,re_repl_error and re_repl_timeout statistics. Starting with Database 6.3, re_repl_timeout only applies to timeouts during the actual replication.

unreplicated-records

Number of unreplicated records in the namespace. Displayed as of Database 5.7. Also reported as the unreplicated_records statistic.

`{ns_name} retransmits: migration 1 dup-res (1,2,3,4,5,6,7,8,9,10) repl-ping (1,2) repl-write (1,2,3,4,5,6,7,8)`

INFO

message

{ns_name} retransmits: migration 1 dup-res (1,2,3,4,5,6,7,8,9,10) repl-ping (1,2) repl-write (1,2,3,4,5,6,7,8)

description

context

info

introduced

4.5.1

removed

-

occurs

Retransmit statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only if any retransmit has taken place. Retransmission happens when the server resends a message to another cluster node, after a prior attempt timed out.

parameters

migration

Number of retransmits that occurred during migrations. Corresponds migrate_record_retransmits statistic.

dup-res

Groups the retransmission statistics related to duplicate-resolution.

The numbered dup-res metrics correspond to the following:

retransmit_all_read_dup_res
retransmit_all_write_dup_res
retransmit_all_delete_dup_res
retransmit_all_udf_dup_res
retransmit_all_batch_sub_read_dup_res
retransmit_all_batch_sub_write_dup_res
retransmit_all_batch_sub_delete_dup_res
retransmit_all_batch_sub_udf_dup_res
retransmit_udf_sub_dup_res
retransmit_ops_sub_dup_res

repl-ping

Groups the retransmission statistics related to Linearized SC reads.

The numbered repl-ping metrics correspond to the following:

retransmit_all_read_repl_ping
retransmit_all_batch_sub_read_repl_ping

repl-write

Groups the retransmission statistics related to replica writes.

The numbered repl-write metrics correspond to the following:

retransmit_all_write_repl_write
retransmit_all_delete_repl_write
retransmit_all_udf_repl_write
retransmit_all_batch_sub_write_repl_write
retransmit_all_batch_sub_delete_repl_write
retransmit_all_batch_sub_udf_repl_write
retransmit_udf_sub_repl_write
retransmit_ops_sub_repl_write

`{ns_name} scan: basic (11,0,0) aggr (0,0,0) udf-bg (5,0,0), ops-bg (10,0,0)`

INFO

message

{ns_name} scan: basic (11,0,0) aggr (0,0,0) udf-bg (5,0,0), ops-bg (10,0,0)

description

context

info

introduced

3.9

removed

6.0

occurs

Scan transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after scan transactions hit this namespace on this node.

parameters

basic

Number of scan jobs since the server started (Success,Error,Aborted). Corresponds to the scan_basic_complete, scan_basic_error, and scan_basic_abort statistics.

aggr

Number of scan aggregation jobs since the server started (Success,Error,Aborted). Corresponds to the scan_aggr_complete, scan_aggr_error, and scan_aggr_abort statistics.

udf-bg

Number of scan background udf jobs since the server started (Success,Error,Aborted). Corresponds to the scan_udf_bg_complete, scan_udf_bg_error, and scan_udf_bg_abort statistics.

ops-bg

Number of scan background operations (ops) jobs since the server started (Success,Error,Aborted). Corresponds to the scan_ops_bg_complete, scan_ops_bg_error, and scan_ops_bg_abort statistics.

`{ns_name} sindex-flash-usage: used-bytes 12345 used-pct 43`

INFO

message

{ns_name} sindex-flash-usage: used-bytes 12345 used-pct 43

description

context

info

introduced

6.4

removed

7.0

occurs

Secondary index flash usage (sindex-flash-usage) statistics, displayed periodically for each namespace configured with ‘sindex-type flash’, every 10 seconds by default.

parameters

{ns_name}

Name of the namespace the index and stats belongs to.

sindex-flash-usage

Name for which the following stats apply.

used-bytes

Total bytes in use on the mount for the secondary indexes used by this namespace on this node.

used-pct

Percentage of the mount in use for the secondary indexes used by this namespace on this node.

`{ns_name} si-query: short-basic (4,0,0) long-basic (26,0,0) aggr (0,0,0) udf-bg (7,0,0) ops-bg (3,0,0)`

INFO

message

{ns_name} si-query: short-basic (4,0,0) long-basic (26,0,0) aggr (0,0,0) udf-bg (7,0,0) ops-bg (3,0,0)

description

context

info

introduced

6.0

removed

-

occurs

Secondary index query (si-query) transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after si-query transactions hit this namespace on this node.

parameters

short-basic

Number of short secondary index queries since the server started (Completed,Error,Abort). Short queries are declared by the client, are unmonitored and typically run for a second or less. Corresponds to the si_query_short_basic_complete, si_query_short_basic_error and si_query_short_basic_timeout statistics.

long-basic

Number of long secondary index queries since the server started (Completed,Error,Abort). Long queries are monitored and not time bounded. Corresponds to the si_query_long_basic_complete, si_query_long_basic_error and si_query_long_basic_abort statistics.

aggr

Number of si-query aggregation jobs since the server started (Completed,Error,Abort). Corresponds to the si_query_aggr_complete, si_query_aggr_error and si_query_aggr_abort statistics.

udf-bg

Number of si-query background udf jobs since the server started (Completed,Error,Abort). Corresponds to the si_query_udf_bg_complete, si_query_udf_bg_error and si_query_udf_bg_abort statistics.

ops-bg

Number of si-query background operations (ops) jobs since the server started (Completed,Error,Abort). Corresponds to the si_query_ops_bg_complete, si_query_ops_bg_error, and si_query_ops_bg_abort statistics.

`{ns_name} special-errors: key-busy (1234, 40) record-too-big 5678 lost-conflict (256,32)`

INFO

message

{ns_name} special-errors: key-busy (1234, 40) record-too-big 5678 lost-conflict (256,32)

description

context

info

introduced

5.6

removed

-

occurs

Special errors statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only if any of those errors have occurred. Log ticket modified in Database 7.2, adding fail_xdr_key_busy count as second number in the first parentheses.

parameters

key-busy

Number of key busy errors. Corresponds to the fail_key_busy and fail_xdr_key_busy statistics. SeeHot Key Error code 14.

record-too-big

Number of record too big errors. Corresponds to the fail_record_too_big statistic.

lost-conflict

Composed of the following metrics:

fail_client_lost_conflict- fail_xdr_lost_conflict

`{ns_name} special-errors: key-busy 1234 record-too-big 5678`

INFO

message

{ns_name} special-errors: key-busy 1234 record-too-big 5678

description

context

info

introduced

3.16.0.1

removed

5.6

occurs

Special errors statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only if any of those errors have occurred.

parameters

key-busy

Number of key busy errors. Corresponds to the fail_key_busy statistic. See Hot Key Error code 14.

record-too-big

Number of record too big errors. Corresponds to the fail_record_too_big statistic.

`{ns_name} tombstones: all 11252 xdr (11223,0) master 5501 prole 5751 non-replica 0`

INFO

message

{ns_name} tombstones: all 11252 xdr (11223,0) master 5501 prole 5751 non-replica 0

description

context

info

introduced

5.5

removed

-

occurs

Number of tombstones for this namespace on this node along with the breakdown. Displayed periodically for each namespace, every 10 seconds by default.

parameters

{ns_name}

“ns_name” is replaced by the name of a particular namespace.

all

Total number of tombstones for this namespace on this node.

xdr

Number of xdr tombstones and bin cemeteries - (xdr_tombstones,xdr_bin_cemeteries).

master

Number of master tombstones for this namespace on this node.

prole

Number of prole (replica) tombstones for this namespace on this node.

non-replica

Number of non-replica tombstones for this namespace on this node.

`{ns_name} udf-sub: tsvc (0,0) udf (2651,0,0,1) lang (52,2498,101,0)`

INFO

message

{ns_name} udf-sub: tsvc (0,0) udf (2651,0,0,1) lang (52,2498,101,0)

description

context

info

introduced

3.9

removed

-

occurs

Scan/query UDF sub-transactions statistics, displayed periodically for each namespace, every 10 seconds by default. Displayed only after query UDF sub-transactions hit this namespace on this node.

parameters

tsvc

Number of udf sub transactions of scan/query background udf jobs that failed in the transaction service (Error,Timed out). Corresponds to the following statistics:

udf

Number of udf sub transactions of scan/query background udf jobs (Success,Error,Timed out,Filtered out). Corresponds to the following statistics:

lang

Different status counts for underlying udf operations for sub transactions of scan/query background udf jobs (Read,Write,Delete,Error). Corresponds to the following statistics:

`{ns_name} xdr-client: write (1543,0,15) delete (134,0,3,25)`

INFO

message

{ns_name} xdr-client: write (1543,0,15) delete (134,0,3,25)

description

context

info

introduced

3.16.0.1

removed

-

occurs

XDR client transactions statistics, displayed after an XDR client transactions hit this namespace on this node. Displayed periodically, every 10 seconds by default, for each namespace receiving write transactions from an XDR client. The values on this line are a subset of the values displayed in the client statistics line just above in the log file.

The following values define various actions which are displayed in the logs:

S - success
E - error
T - timed out
N - not found, which for reads and deletes is a result that wants to be distinguished from success but is not an error.

parameters

write

XDR client write transactions (S,E,T). Also reported as the following statistics:

delete

XDR client delete transactions (S,E,T,N). Also reported as the following statistics:

`{ns_name} xdr-dc dc2: lag 12 throughput 710 bytes-shipped 212456900 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655`

INFO

message

{ns_name} xdr-dc dc2: lag 12 throughput 710 bytes-shipped 212456900 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655

description

context

info

introduced

6.0

removed

-

occurs

Displayed periodically for each combination of namespace and XDR destination cluster (DC), every 10 seconds by default.

parameters

{ns_name}

“ns_name” is replaced by the name of a particular namespace.

xdr-dc

Name of the XDR destination cluster.

lag

See lag.

throughput

See throughput.

bytes-shipped

See bytes_shipped.

in-queue

See in_queue.

in-progress

See in_progress.

complete

Composed of the following metrics:

retries

Composed of the following metrics:

recoveries

Composed of the following metrics:

hot-keys

See hot_keys.

`{ns_name} xdr-from-proxy: write (743,0,11) delete (104,0,3,21)`

INFO

message

{ns_name} xdr-from-proxy: write (743,0,11) delete (104,0,3,21)

description

context

info

introduced

4.5.1

removed

-

occurs

Proxied XDR transactions statistics. Displayed periodically, every 10 seconds by default, for each namespace receiving proxied write transactions from an XDR client. Displayed only after a proxied XDR transaction hits this namespace on this node. The values on this line are a subset of the values displayed in the from-proxy statistics line just above in the log file.

The following values define various actions which are displayed in the logs

S - success
E - error
T - timed out
N - not found, which for reads and deletes is a result that wants to be distinguished from success but is not an error

parameters

write

XDR client write transactions (S,E,T). Also reported as the xdr_from_proxy_write_success, xdr_from_proxy_write_error and xdr_from_proxy_write_timeout statistics.

delete

XDR client delete transactions (S,E,T,N). Also reported as the xdr_from_proxy_delete_success, xdr_from_proxy_delete_error, xdr_from_proxy_delete_timeout, and xdr_from_proxy_delete_not_found statistics.

`process: cpu-pct 28 threads (8,67,46,46) heap-kbytes (71477,72292,118784) heap-efficiency-pct 60.2`

INFO

message

process: cpu-pct 28 threads (8,67,46,46) heap-kbytes (71477,72292,118784) heap-efficiency-pct 60.2

description

context

info

introduced

4.7.0.2

removed

-

parameters

cpu-pct

Percent CPU time Aerospike was scheduled since previously reported. Corresponds to process_cpu_pct.

threads

Thread statistics, in order: (threads_joinable, threads_detached, threads_pool_total, threads_pool_active). Introduced in Database 5.6.

heap-kbytes

Heap statistics, in order: (heap_allocated_kbytes, heap_active_kbytes, heap_mapped_kbytes).

heap-efficiency-pct

Indicates the jemalloc heap fragmentation. This represents the heap_allocated_kbytes / heap_mapped_kbytes ratio (prior to 5.7), or the heap_allocated_kbytes / heap_active_kbytes ratio (6.0 or later). A lower number indicates a higher fragmentation rate. Corresponds to the heap_efficiency_pct statistic.

`system: total-cpu-pct 76 user-cpu-pct 44 kernel-cpu-pct 32 free-mem-kbytes 7462956 free-mem-pct 52 thp-mem-kbytes 4096`

INFO

message

system: total-cpu-pct 76 user-cpu-pct 44 kernel-cpu-pct 32 free-mem-kbytes 7462956 free-mem-pct 52 thp-mem-kbytes 4096

description

context

info

introduced

4.7.0.2

removed

-

parameters

total-cpu-pct

Percent of time the CPU spent servicing user-space or kernel space tasks. Corresponds to system_total_cpu_pct.

user-cpu-pct

Percent of time the CPUs spent servicing user-space tasks. Corresponds to system_user_cpu_pct.

kernel-cpu-pct

Percent of time the CPUs spent servicing kernel-space tasks. Corresponds to system_kernel_cpu_pct.

free-mem-kbytes

Amount of free RAM in kilobytes for the host. Corresponds to system_free_mem_kbytes.

free-mem-pct

Percentage of all RAM free, rounded to nearest percent, for the host. Corresponds to the system_free_mem_pct.

thp-mem-kbytes

Amount of memory in use by the Transparent Huge Page mechanism, in kilobytes. Corresponds to system_thp_mem_kbytes. Displayed in 5.7 and later.

`xdr-dc dc2: nodes 8 latency-ms 19`

INFO

message

xdr-dc dc2: nodes 8 latency-ms 19

description

context

info

introduced

6.0

removed

-

occurs

Displayed periodically for each XDR destination cluster (DC), every 10 seconds by default.

parameters

xdr-dc

Name of the XDR destination cluster.

nodes

See nodes.

latency-ms

See latency_ms.

Key busy

`DETAIL (info-command): (<file>:<line>) client <ipaddress:port> command '<info-command-text>'`

DETAIL

message

DETAIL (info-command): (<file>:<line>) client <ipaddress:port> command '<info-command-text>'

description

context

info-command

introduced

7.1

removed

-

occurs

A detail logging context for logging all info protocol commands received by the server

`DETAIL (key-busy): (<file>:<line>) {<namespace>} {digest} <transaction-type> from <source>`

DETAIL

message

DETAIL (key-busy): (<file>:<line>) {<namespace>} {digest} <transaction-type> from <source>

description

context

key-busy

introduced

6.4

removed

-

occurs

Describes the digest, transaction type, and transaction source of a possibly hot key. This message is issued whenever a transaction fails with a KEY_BUSY error (14).

parameters

Possible transaction types:

read
write
delete
udf
batch-sub-read
batch-sub-write
batch-sub-delete
batch-sub-udf

Possible source types:

client ip address:port
proxy
bg-udf
bg-ops
re-repl

See Hot Key Error code 14.

Migrate

`handle insert: binless pickle, dropping {digest}:0x2b2c08f859bd4eb401982e038a2bdcae2b74c853`

WARNING

message

handle insert: binless pickle, dropping {digest}:0x2b2c08f859bd4eb401982e038a2bdcae2b74c853

description

Attempted to migrate a record that had no bins, so didn’t insert it into the receiving partition.

context

migrate

introduced

-

removed

-

parameters

digest

The 160-bit digest of the record, in hex.

`migrate: handle insert: got bad record`

WARNING

message

migrate: handle insert: got bad record

description

Inbound record for migration is corrupt. Appears with the WARNING message about record too small 0 and is documented in the following Why Do I See Stalled Migrations And "record too small" Errors In The Log?.

context

migrate

introduced

-

removed

-

`migrate: handle insert: got no record`

WARNING

message

migrate: handle insert: got no record

description

Appears on the destination (inbound) node with the warning message about handle insert: got bad record. Associated with the new unreadable digest warning for Database 6.0.

context

migrate

introduced

-

removed

-

`migrate: record flatten failed ce8a775a68c93d49`

WARNING

message

migrate: record flatten failed ce8a775a68c93d49

description

The storage is full for this namespace, during the time of migration. Resolve by allocating more storage for namespace, or reducing volume of data in namespace.

context

migrate

introduced

-

removed

-

parameters

record

Record identifier.

`migrate: unreadable digest`

WARNING

message

migrate: unreadable digest

description

Appears on the source node when the record could not be read locally. Introduced with Database 6.0. Associated with the handle insert: got no record warning.

context

migrate

introduced

-

removed

-

`missing acks from node BB9030011AC4202`

WARNING

message

missing acks from node BB9030011AC4202

description

This means that a particular migration thread has begun to throttle because it has 16MB or more of un-acknowledged records, and has remained in that state for at least 5 seconds. The message prints every 5 seconds until the outstanding acks drop below the threshold. This happens when migration has been pushed higher than your network or machine/disks can sustain. To mitigate, decrease migrate-threads down to a minimum of 1. This change does not take effect immediately; threads terminate as the partition they are handling completes migration. This warning could also be triggered if channel-bulk-recv-threads was reduced during ongoing migrations.

context

migrate

introduced

-

removed

-

parameters

node

The destination node of the migrate thread.

Mrt audit

`{namespace} monitor aborting mrt-id 123456789abcdef0`

INFO

message

{namespace} monitor aborting mrt-id 123456789abcdef0

description

Indicates when monitors kick in to abort transactions.

context

mrt-audit

introduced

8.0

removed

-

parameters

mrt-id

Transaction ID that was aborted.

`{namespace} monitor committing mrt-id 123456789abcdef0`

INFO

message

{namespace} monitor committing mrt-id 123456789abcdef0

description

Indicates when monitors kick in to commit transactions.

context

mrt-audit

introduced

8.0

removed

-

parameters

mrt-id

Transaction ID that was committed.

Msg

`malloc`

CRITICAL

message

malloc

description

Indicates a shortage of memory. Make sure nodes have enough memory.

context

msg

introduced

-

removed

-

Namespace

`at set names limit, can't add set`

WARNING

message

at set names limit, can't add set

description

This warning indicates that the maximum number of sets per namespace has been breached. A namespace can hold a maximum of 1023 sets prior to server 7 and 4095 sets since Database 7. Such an issue can cause migrations to get stuck and writes to fail. See How to clear up set names when they exceed the limit.

context

namespace

introduced

-

removed

-

`can't add SET (at sets limit)`

WARNING

message

can't add SET (at sets limit)

description

Indicates that the maximum number of sets per namespace has been breached. A namespace can hold a maximum of 1023 sets prior to Database 7 and 4095 sets since Database 7. Such an issue can cause migrations to get stuck and writes to fail. See How to clear up set names when they exceed the limit.

context

namespace

introduced

-

removed

-

parameters

set

Name of set that was being created in excess of the limit.

`fail persistent memory delete`

CRITICAL

message

fail persistent memory delete

description

Indicates an issue deleting PMem memory on startup. Can be resultant from incorrect permissions on the /mnt/pmem directory.

context

namespace

introduced

-

removed

-

`{namespace} mrt-finish: verify-read (100,0,0) roll-forward (50,0,0) roll-back (1,0,0)`

INFO

message

{namespace} mrt-finish: verify-read (100,0,0) roll-forward (50,0,0) roll-back (1,0,0)

description

Appears if any of the following stats are non-zero:

context

namespace

introduced

8.0

removed

-

`{namespace} mrt-monitor-finish: roll-forward (0,0,0) roll-back (1,0,0)`

INFO

message

{namespace} mrt-monitor-finish: roll-forward (0,0,0) roll-back (1,0,0)

description

Appears if any of the following stats are non-zero:

context

namespace

introduced

8.0

removed

-

`{namespace_name} appeals: remaining-tx 0 active (0,19)`

INFO

message

{namespace_name} appeals: remaining-tx 0 active (0,19)

description

Information about the number of partition appeals not yet sent and currently being sent. Partition appeals occur for namespaces operating understrong-consistency mode when a node needs to validate the records it has when joining the cluster.

context

namespace

introduced

-

removed

-

parameters

remaining-tx

Current value for appeals_tx_remaining.

active (active_tx, active_rx)

A tuple based on appeals_tx_active and appeals_rx_active.

`ns can’t attach persistent memory base block: block does not exist`

WARNING

message

ns can’t attach persistent memory base block: block does not exist

description

A missing shared memory block. This typically happens when a node is rebooted, but shared memory blocks can be deleted for other reasons. The node must perform a cold start.

context

namespace

introduced

-

removed

-

`{ns_name} found no valid persistent memory blocks, will cold start`

INFO

message

{ns_name} found no valid persistent memory blocks, will cold start

description

treex and base shared memory blocks are missing. This typically happens when a node is rebooted or after an ungraceful shutdown. The node must perform a cold start.

context

namespace

introduced

-

removed

-

`{ns_name} persisted arena stages`

INFO

message

{ns_name} persisted arena stages

description

This message is one of a sequence of messages logged during Aerospike server shutdown of storage-engine device namespaces. The message signifies that the arena stages for the namespace have been persisted to storage. An unusual delay in the appearance of this message during shutdown might be due to index-type configured to pmem.

context

namespace

introduced

4.6.0.2

removed

-

`{ns_name} persisted tree roots`

INFO

message

{ns_name} persisted tree roots

description

This message is one of a sequence of messages logged during Aerospike server shutdown of storage-engine device namespaces. The message signifies that the namespace’s common partition index tree information has been persisted to storage. An unusual delay in the appearance of this message during shutdown might be due to a high number of partition-tree-sprigs configured for the namespace.

context

namespace

introduced

4.6.0.2

removed

-

`{ns_name} persisted trusted base block`

INFO

message

{ns_name} persisted trusted base block

description

This message is one of a sequence of messages logged during Aerospike server shutdown of storage-engine device namespaces. The message signifies that the persistent memory base block for the namespace has been persisted to storage with “trusted” status. “trusted” status is a necessary condition for a subsequent fast restart of the namespace.

context

namespace

introduced

4.6.0.2

removed

-

Network

`fabric_connection_process_readable() recv_sz -1 msg_sz 0 errno 110 Connection timed out`

WARNING

message

fabric_connection_process_readable() recv_sz -1 msg_sz 0 errno 110 Connection timed out

description

The warning message indicates that a fabric connection timed out. In Database 5.6 and later, the log line has the node-id to identify which node is having the fabric connection time out.

context

network

introduced

-

removed

-

Nsup

`{NAMESPACE} failed to create evict-prep thread 5`

CRITICAL

message

{NAMESPACE} failed to create evict-prep thread 5

description

Indicates a shortage of memory. Make sure nodes have enough memory.

context

nsup

introduced

-

removed

-

parameters

NAMESPACE

The namespace nsup was looking at when memory ran low.

thread

Thread identifier.

`{bigdata} hwm breached but nothing to evict`

WARNING

message

{bigdata} hwm breached but nothing to evict

description

The amount of data in memory (high-water-memory-pct) or on disk (high-water-disk-pct, mounts-high-water-pct) has exceeded the limits set for the namespace, triggering evictions, but there is no data with a finite TTL to be evicted.

context

nsup

introduced

-

removed

-

parameters

ns

Namespace where the high water mark was breached.

`{<namespace>} breached eviction limit (<reasons>), sys-memory pct:41, indexes-memory sz:0 (0 + 0 + 0), index-device sz:1073741824000 used-pct 88, data used-pct:58`

WARNING

message

{<namespace>} breached eviction limit (<reasons>), sys-memory pct:41, indexes-memory sz:0 (0 + 0 + 0), index-device sz:1073741824000 used-pct 88, data used-pct:58

description

context

nsup

introduced

-

removed

-

occurs

Checking memory or disk usage at start of nsup cycle and finding that the high-water mark has been breached

parameters

breached eviction limit

memory or disk depending on which limit was breached

memory

Memory used in bytes (primary index + secondary indexes + data in memory), and the amount of the high-water mark (total available * high-water-memory-pct. For Database 5.6 and later the set index memory used is also added.

index-device

Used space on device for index-type flash in bytes and the amount of the high-water mark (total available * mounts-high-water-pct

disk

Disk space used in bytes and the amount of the high-water mark (total available * high-water-disk-pct

`{<namespace>} breached stop-writes limit (<reasons>), sys-memory pct:96, indexes-memory sz:107374182400 (107374182400 + 0 + 0), data avail-pct:20 used-pct:58`

WARNING

message

{<namespace>} breached stop-writes limit (<reasons>), sys-memory pct:96, indexes-memory sz:107374182400 (107374182400 + 0 + 0), data avail-pct:20 used-pct:58

description

context

nsup

introduced

-

removed

-

occurs

Upon checking memory at start of NSUP cycle and finding that a threshold triggering stop writes has been breached (either memory based on stop-writes-pct or stop-writes-sys-memory-pct, or disk based on min-avail-pct or max-used-pct).

parameters

breached stop-writes limit

sys-memory, memory, device-avail-pct, device-used-size or some combination depending on which threshold was breached

sys-memory

Total system memory used in bytes greater than the stop-writes-sys-memory-pct threshold, triggered when (100 - system_free_mem_pct >= stop-writes-sys-memory-pct)

indexes-memory

Memory used in bytes greater than the stop-writes-pct threshold (total available * stop-writes-pct). Prior to Database 5.6, tracked under memory_used_bytes (primary index + secondary indexes + data in memory). Database 5.6 and later, tracked under memory_used_bytes (primary index + set index bytes + secondary indexes + data in memory).

data avail-pct

The available contiguous free space, tracked under device_available_pct

data used-pct

The amount of used disk space, tracked as the ratio of the device_used_bytes and device_total_bytes metrics.

`{namespace} failed set evict-void-time 328967879`

WARNING

message

{namespace} failed set evict-void-time 328967879

description

When high-water-disk-pct, high-water-memory-pct, or mounts-high-water-pct is breached on any node and eviction starts, the timestamp before which records need to be evicted is propagated to all nodes through the System Meta Data (SMD) mechanism, to ensure that no orphaned replicas are left anywhere in the cluster. If not all nodes have acknowledged the message within 5 seconds, this message is logged. Occasional instances of this message do not indicate a serious problem, as the timestamp is picked up on the next namespace supervisor (nsup) cycle.

context

nsup

introduced

-

removed

-

parameters

ns

The namespace where evictions were triggered.

evict void time

The timestamp, in seconds since the Aerospike Epoch of 2010-01-01T00:00:00, before which records should be evicted.

`{namespace} would evict all 146897768 records eligible - not evicting!`

WARNING

message

{namespace} would evict all 146897768 records eligible - not evicting!

description

The namespace supervisor (nsup) configuration required it to evict all expirable records (records with a configured time-to-live or TTL). Most commonly seen when evict-tenths-pct is set to 1000 or greater, but could also happen when the distribution of TTLs is extremely skewed.

context

nsup

introduced

-

removed

-

parameters

namespace

Namespace where this happened.

record count

Total number of evictable records (that is, records with a TTL) in the namespace.

`{<namespace>} xdr-tomb-raid-start: threads <n>`

INFO

message

{<namespace>} xdr-tomb-raid-start: threads <n>

description

XDR tomb raider has started for the namespace. This log entry appeared under the NSUP context prior to Database 7.1.

context

xdr

introduced

-

removed

-

occurs

Expected to run every xdr-tomb-raider-period seconds.

parameters

namespace

The namespace for which the tombstones are being dropped

n

The number of threads configured for the XDR tomb raider

`{ns-name} no records below eviction void-time 329702784 - threshold bucket 9998, width 3154 sec, count 4000000 > target 20000 (0.5 pct)`

WARNING

message

{ns-name} no records below eviction void-time 329702784 - threshold bucket 9998, width 3154 sec, count 4000000 > target 20000 (0.5 pct)

description

context

nsup

introduced

-

removed

-

occurs

Checking the eviction buckets from the bottom up, but there are no records found until one bucket has enough to put the total over the limit for one eviction cycle

parameters

void-time

Timestamp at the top of the bucket that breached the eviction limit

threshold bucket

Which bucket out of the evict-hist-buckets breached the eviction limit

width

Width of the bucket in seconds

count

Total records found in this bucket

target

Number of records allowed for this eviction cycle and the evict-tenths-pct used to calculate it

`{ns-name} nsup deleted 1.56% of namespace - configure more nsup threads?`

WARNING

message

{ns-name} nsup deleted 1.56% of namespace - configure more nsup threads?

description

context

nsup

introduced

6.3

removed

-

occurs

When NSUP cycle takes more than 2 hours and deletes more than 1% of namespace.

`{ns-name} nsup-done: non-expirable 42162 expired (576066,922) evicted (24000935,259985) evict-ttl 134000 total-ms 155`

INFO

message

{ns-name} nsup-done: non-expirable 42162 expired (576066,922) evicted (24000935,259985) evict-ttl 134000 total-ms 155

description

context

nsup

introduced

4.5.1

removed

-

occurs

Logged when the namespace supervisor (nsup) completes expiration or eviction processing for a namespace. See the most recently logged nsup-start entry for the namespace to determine which type of processing has just completed. Expiration processing never evicts records, but eviction processing can expire records.

parameters

non-expirable

The number of records without a TTL. These records do not expire and are never eligible for eviction.

expired

Number of records removed due to expiration; total since the node started and total for the current nsup cycle. In this example, 922 records expired in the most recent nsup cycle, and 576066 records have expired since the node was last started.

evicted

Number of records evicted (early-expired); total since the node started and total for the current nsup cycle. In this example, nsup evicted 259985 records in the most recent cycle, and 24000935 records since the node was last started.

evict-ttl

The high-end expiration-time of evicted (early-expired) records (in seconds).

total-ms

Duration of the just completed nsup expiration or eviction processing cycle, in milliseconds. In this example, the processing cycle completed in 155 milliseconds

`{ns-name} nsup-start: evict-threads 1 evict-ttl 2745 evict-void-time (287665530,287665930)`

INFO

message

{ns-name} nsup-start: evict-threads 1 evict-ttl 2745 evict-void-time (287665530,287665930)

description

context

nsup

introduced

4.6.0

removed

-

occurs

Logged when the namespace supervisor (nsup) begins eviction processing for a namespace.

parameters

evict-threads

The number of threads to be used for the eviction processing cycle.

evict-ttl

The specified eviction depth for the namespace, expressed as a time to live threshold in seconds, below which any eligible records are evicted.

evict-void-time

The current effective eviction depth and the specified eviction depth for the namespace. Each is expressed as a void time, in seconds since 1 January 2010 UTC.

`{ns-name} nsup-start: expire-threads 1`

INFO

message

{ns-name} nsup-start: expire-threads 1

description

context

nsup

introduced

4.6.0

removed

-

occurs

Logged when the namespace supervisor (nsup) begins expiration processing for a namespace.

parameters

expire-threads

The number of threads to be used for the expiration processing cycle. Corresponds to the configured nsup-threads.

`{ns-name} sindex-gc: Processed: 3133360101, found:365961945, deleted: 365952323: Total time: 62667962 ms`

INFO

message

{ns-name} sindex-gc: Processed: 3133360101, found:365961945, deleted: 365952323: Total time: 62667962 ms

description

context

nsup

introduced

3.14.0

removed

4.6.0

occurs

Secondary index (sindex) garbage collection cycle summary. Replaced by “sindex-gc-done” message in 4.6.0.

parameters

Processed

Count of sindex entries that have been checked. Corresponds to the sindex_gc_objects_validated statistic.

found

Count of sindex entries found eligible for garbage collection. Corresponds to the sindex_gc_garbage_found statistic.

deleted

Count of sindex entries deleted through garbage collection (may be lower than above number if those entries got deleted while the garbage collector was running, for example through a competing truncate command). Corresponds to the sindex_gc_garbage_cleaned statistic.

Total time

Duration of a cycle of sindex garbage collection in milliseconds.

`{ns-name} sindex-gc start`

INFO

message

{ns-name} sindex-gc start

description

context

nsup

introduced

3.14.0

removed

4.6.0

occurs

Starting secondary index (sindex) garbage collection. Replaced by “sindex-gc-start” message in 4.6.0.

Os

`failed OS_CHECK check - MESSAGE`

WARNING

message

failed OS_CHECK check - MESSAGE

description

Indicates that a linux best-practices was violated at startup.

context

os

introduced

5.7

removed

-

parameters

OS_CHECK

Name of the check which was violated, could be one of the following:

sMessage

Description of how the best-practice was violated.

Particle

`parse_op - error 4 unable to build expression op`

WARNING

message

parse_op - error 4 unable to build expression op

description

This indicates that an attempt to execute an expression failed. Generic warning triggered by failure to parse the underlying expression encoding.

Example: WARNING (particle): (expop.c:127) parse_op - error 4 unable to build expression op

context

particle

introduced

5.6

removed

-

Partition

`{ns_name} 2 of 5 nodes are quiesced`

INFO

message

{ns_name} 2 of 5 nodes are quiesced

description

Number of nodes quiesced in the cluster (or sub cluster).

context

partition

introduced

4.3.1

removed

-

occurs

When the cluster changes (node addition, removal, or network splits) or when the cluster receives a ‘recluster’ info command.

parameters

nodes participating

The number of nodes quiesced in this sub-cluster out of the total number of nodes observed in the sub-cluster. For strong-consistency namespaces, it is full roster size instead of observed in the sub-cluster.

`{ns_name} 5 of 6 nodes participating - regime 221 -> 223`

INFO

message

{ns_name} 5 of 6 nodes participating - regime 221 -> 223

description

Number of nodes participating in the cluster (or sub cluster) as well as the regime change.

context

partition

introduced

4.0

removed

-

occurs

For strong-consistency namespaces, when the cluster changes, because of node(s) leaving or joining the cluster (or network splits).

parameters

nodes participating

The number of nodes participating in this cluster out of the total number of nodes for the full roster.

regime

This number increments every time there is a reclustering event. This is used in strong consistent namespace and is leveraged by the client libraries. For further details on regime, see Strong consistency.

`{ns_name} rebalanced: expected-migrations (1215,1224,1215) fresh-partitions 397`

INFO

message

{ns_name} rebalanced: expected-migrations (1215,1224,1215) fresh-partitions 397

description

Number of partitions expected to migrate (transmitted, received, and signals) as well as other migration related statistics and fresh partition number.

context

partition

introduced

3.13

removed

-

occurs

For non strong-consistency namespaces, when the cluster changes, because of node addition, removal or network splits.

parameters

expected-migrations

The number of partitions expected to migrate (Transmitted, Received, Signals) as part of this reclustering event. Those correspond to the migrate_tx_partitions_initial, migrate_rx_partitions_initial, and migrate_signals_remaining statistics respectively.

fresh-partitions

Number of partitions that are created fresh or empty because a number of nodes, greater than the replication factor, has left the cluster.

`{ns_name} rebalanced: regime 295 expected-migrations (826,826,826) expected-appeals 0 unavailable-partitions 425`

INFO

message

{ns_name} rebalanced: regime 295 expected-migrations (826,826,826) expected-appeals 0 unavailable-partitions 425

description

Number of partitions expected to migrate (transmitted, received, and signals) as well as other migration related statistics and partition availability details.

context

partition

introduced

4.0

removed

-

occurs

For strong-consistency namespaces, when the cluster changes, because of node(s) leaving or joining the cluster (or network splits).

parameters

regime

This number increments every time there is a reclustering event. This is used in strong consistency namespaces and is leveraged by the client libraries. For further details on regime, see Strong consistency.

expected-migrations

The number of partitions expected to migrate (Transmitted, Received, Signals) as part of this reclustering event. Those correspond to the migrate_tx_partitions_initial, migrate_rx_partitions_initial, and migrate_signals_remaining statistics respectively.

expected-appeals

The number of appeals expected as part of this reclustering event. Appeals occur after a node has been cold-started. The replication state of each record is lost on cold-start and all records must assume an unreplicated state. An appeal resolves replication state from the partition’s acting master. These are important for performance; an unreplicated record must re-replicate to be read, which adds latency. During a rolling cold-restart, an operator may want to wait for the appeal phase to complete after each restart to minimize the performance impact of the procedure. Corresponds to the appeals_tx_remaining statistic but only at the initial time of the reclustering event.

unavailable-partitions

The number of partitions that are unavailable as the roster is not complete and all writes that have occurred to those partitions are not present. Partitions remaining unavailable after the cluster is formed by the full roster become dead and require the use of the revive command to make them available again, which could lead to inconsistencies, depending on what lead to those partition being dead. Revived nodes restore availability only when all nodes are trusted. Corresponds to the unavailable_partitions statistic.

`{ns_name} rebalanced: regime 295 expected-migrations (826,826,826) fresh-partitions 397 expected-appeals 0 dead-partitions 425`

WARNING

message

{ns_name} rebalanced: regime 295 expected-migrations (826,826,826) fresh-partitions 397 expected-appeals 0 dead-partitions 425

description

Number of partitions expected to migrate (transmitted, received, and signals) as well as other migration related statistics and partition availability details.

context

partition

introduced

4.0

removed

-

occurs

For strong-consistency namespaces, when the cluster reforms with all roster members but resulting in dead partitions present.

parameters

regime

This number increments every time there is a reclustering event. This is used in strong consistent namespace and is leveraged by the client libraries. For further details on regime, see Strong consistency.

expected-migrations

The number of partitions expected to migrate (Transmitted, Received, Signals) as part of this reclustering event. Those correspond to the migrate_tx_partitions_initial, migrate_rx_partitions_initial, and migrate_signals_remaining statistics respectively.

fresh-partitions

Number of partitions that are created fresh or empty because a number of nodes, greater than the replication factor, has left the cluster.

expected-appeals

The number of appeals expected as part of this reclustering event. Appeals occur after a node has been cold-started. The replication state of each record is lost on cold-start and all records must assume an unreplicated state. An appeal resolves replication state from the partition’s acting master. These are important for performance; an unreplicated record must re-replicate to be read, which adds latency. During a rolling cold-restart, an operator may want to wait for the appeal phase to complete after each restart to minimize the performance impact of the procedure. Corresponds to the appeals_tx_remaining statistic but only at the initial time of the reclustering event.

unavailable-partitions

The number of partitions that are dead. Corresponds to the dead_partitions statistic. Requires the use of the revive command to make such partitions available again, which could lead to inconsistencies, depending on what lead to those partition being dead.

Proto

`protocol write fail: fd 123 sz 30 errno 32`

DEBUG

message

protocol write fail: fd 123 sz 30 errno 32

description

Client has closed the socket before the server could respond. Client may have timed out, has too many open connections, or there may be network issues.

context

proto

introduced

-

removed

-

parameters

fd

File descriptor of the socket to the client.

sz

Size of the response that couldn’t be sent.

errno

Error number returned by OS. 32, EPIPE, is common.

Query

`Could not find nbtr iterator`

WARNING

message

Could not find nbtr iterator

description

Indicates that there is a secondary index key with no corresponding digests (a bin value for which there are no records). Possibly a primary index and corresponding record got deleted, but the secondary index still exists. Subsequent round of garbage collection may correct the tree structure. May mean that partial results were returned during the query. If seen frequently, may mean that there is a structural issue with the secondary index. In this case, drop and re-create the index.

context

query

introduced

-

removed

5.7

`starting basic query job 562360700545657251 {test:sb_set5:<pi-query>} n-pids-requested (4096,0) rps 0 sample-max 0 socket-timeout 30000 from 127.0.0.1:51124`

DEBUG

message

starting basic query job 562360700545657251 {test:sb_set5:<pi-query>} n-pids-requested (4096,0) rps 0 sample-max 0 socket-timeout 30000 from 127.0.0.1:51124

description

A basic query job is initiated on the specified namespace and set.

context

query

introduced

6.0

removed

-

parameters

query job

ID of the query.

{namespace : set : index-name}

The index name is either the name of the secondary index or pi-query.

n-pids-requested

(n-pids-requested, n-keyds-requested) - The first number is the number of partition IDs requested in this partitioned query.

The second number is the number of those partitions that start at a specified digest. This hints at whether this query is paginated, using a cursor.

rps

Records per second rate requested by the query policy. If not specified, the value of background-query-max-rps is in effect.

sample-max

The maximum number of records to return from the query, as specified by the query policy.

socket-timeout

The socket timeout requested by the client policy, in milliseconds.

from

Client IP address and port.

Rbuffer

`Ring buffer file /opt/aerospike/xdr/digestlog should be at least 18057 bytes Boot strap failed for digest log file (null)`

WARNING

message

Ring buffer file /opt/aerospike/xdr/digestlog should be at least 18057 bytes Boot strap failed for digest log file (null)

description

The minimum possible size of the XDR digestlog, as specified in xdr-digestlog-path, is 18057 bytes. This is not a limitation that should ever come up in practice, because realistic sizes are in the tens or hundreds of GB.

context

rbuffer

introduced

-

removed

-

`unable to create digest log file /opt/aerospike/xdr/digestlog: No such file or directory`

WARNING

message

unable to create digest log file /opt/aerospike/xdr/digestlog: No such file or directory

description

asd tries to create the digest log at start time if it does not already exist, but may fail due to permissions issues, a bad directory, path, etc.

context

rbuffer

introduced

-

removed

-

parameters

log path

Path of the file that asd tried to create from xdr-digestlog-path

system error

The message from the OS giving the reason for failing to create the file

Record

`{AS_PARTICLE} AS_CDT_OP_LIST_APPEND: failed`

WARNING

message

{AS_PARTICLE} AS_CDT_OP_LIST_APPEND: failed

description

This error may occur when a collection data type exists for the record and the write policy was set to CREATE_ONLY.

context

record

introduced

5.2

removed

-

`{AS_PARTICLE} cdt_process_state_context_eval() bin is empty and op has no create flags`

WARNING

message

{AS_PARTICLE} cdt_process_state_context_eval() bin is empty and op has no create flags

description

The map context does not exist.

context

record

introduced

-

removed

-

`{AS_PARTICLE} map_subcontext_by_key() cannot create key with non-storage elements`

WARNING

message

{AS_PARTICLE} map_subcontext_by_key() cannot create key with non-storage elements

description

A map context cannot be created with elements that are not stored, such as a wildcard.

context

record

introduced

-

removed

-

`{AS_PARTICLE} packed_list_get_remove_by_index_range() index 155 out of bounds for ele_count 155`

WARNING

message

{AS_PARTICLE} packed_list_get_remove_by_index_range() index 155 out of bounds for ele_count 155

description

There was an attempt to remove an element at index 156 in a list of 155 elements. Since index starts from 0, index 155 means element 156. This pattern applies for any populated index. If the index is unpopulated, the error reads index 0 out of bounds for ele_count 0.

context

record

introduced

-

removed

-

`{bigdata} record replace: drives full`

WARNING

message

{bigdata} record replace: drives full

description

No more space left in storage.

context

record

introduced

-

removed

-

parameters

ns

Namespace being written to at the time storage filled up.

`{namespace_name} record replace: failed write 1142f0217ababf9fda5b1a4de66e6e8d4e51765e`

DETAIL

message

{namespace_name} record replace: failed write 1142f0217ababf9fda5b1a4de66e6e8d4e51765e

description

Most likely appearing as a result of exceeding the write-block-size. The record’s digest is the last item in the log entry. To determine what set is being written to, see How to return the set name of a record using its digest.

context

record

introduced

5.2

removed

-

parameters

{namespace_name}

Namespace being written to.

Roster

`(roster_ee.c:151) illegal node id 0`

WARNING

message

(roster_ee.c:151) illegal node id 0

description

This error may occur during a downgrade from Database 7.2 if you are using SC mode and have an active-rack configured in the roster.

context

roster

introduced

4.0

removed

-

`(roster_ee.c:237) {test} invalid node list M100|bb...01|`

WARNING

message

(roster_ee.c:237) {test} invalid node list M100|bb...01|

description

This error may occur during a downgrade from Database 7.2 if you are using SC mode and have an active-rack configured in the roster.

context

roster

introduced

4.0

removed

-

Rw client

`{ns_name} client 10.0.3.182:51160 write {digest}:0x8df238affec6f8e3a2c22d6c54c91c5bc4f3ff81`

DETAIL

message

{ns_name} client 10.0.3.182:51160 write {digest}:0x8df238affec6f8e3a2c22d6c54c91c5bc4f3ff81

description

Provides details on the originating client’s IP address, the transaction type and the digest of the record being accessed.

context

rw-client

introduced

3.16.0.1

removed

-

occurs

Single line per client transactions that get as far as successfully reserving a partition. Requires log level to be set to detail for the rw-context: asinfo -v "log-set:id=0;rw-client=detail"

parameters

client

The originating client’s IP address.

transaction {digest}

The transaction type (read/write/delete/udf) as well as the digest.

Rw

`WARNING (rw): (write.c:926) write_master: null/empty set name not allowed for namespace {namespace}`

WARNING

message

WARNING (rw): (write.c:926) write_master: null/empty set name not allowed for namespace {namespace}

description

The write fails because the set name is null or empty and disallow-null-setname is true.

context

rw

introduced

-

removed

-

`{bigdata}: write_master: drives full`

WARNING

message

{bigdata}: write_master: drives full

description

No more space left in storage.

context

rw

introduced

-

removed

-

parameters

ns

Namespace being written to at the time storage filled up.

`dup-res ack: no digest`

WARNING

message

dup-res ack: no digest

description

During a downgrade from Database 4.5.3+ to a prior version, the following protocol-related warning may be seen temporarily on a node with the prior version. This warning is harmless, the result of an older node briefly receiving 4.5.3+ protocol fabric messages from newer nodes. This message ceases after the next rebalance.

context

rw

introduced

-

removed

-

`got rw msg with unrecognized op 8`

WARNING

message

got rw msg with unrecognized op 8

description

During a downgrade from a Database 4.5.3+ server to a prior version, the following protocol-related warning may be experienced temporarily on a node with the earlier version. This warning is harmless. This warning is resultant of an older node briefly receiving 4.5.3+ protocol fabric messages from newer nodes. This message ceases after the next rebalance.

context

rw

introduced

-

removed

-

`key mismatch - end of universe?`

WARNING

message

key mismatch - end of universe?

description

KEY_MISMATCH error. Any update, delete or read request for a record which has key stored, the incoming key does not match with the existing stored key. Indicates a RIPEMD-160 key collision which is not likely. See Collision Resistance of RIPEMD-160 for details. This message occurs in case of key / hash mismatch on the application side or some message level corruption.

context

rw

introduced

-

removed

-

`{namespace} can't get stored key {digest}:0x5230acd92762fa6f827e902d58199dd7b928479c`

WARNING

message

{namespace} can't get stored key {digest}:0x5230acd92762fa6f827e902d58199dd7b928479c

description

This message indicates that the index has the stored-key flag set for a record, but the record in storage does not containthe key. Normally this would never happen, but there was a bug in some older database Databases whereby this mismatch could occur when a node containing old data was cold-started and brought back into the cluster with records that had been deleted but not durably deleted. This was corrected in Database 4.4.0.8 and later. Enterprise Edition Licensees can contact Aerospike Support for further guidance when encountering this error.

context

rw

introduced

-

removed

-

`{namespace} drop while replicating`

WARNING

message

{namespace} drop while replicating

description

This message indicates that a record was dropped while replicating. For example, if a truncation is run and records are still being replicated while the truncation hits the master record. This can also happen when non durable deletes are allowed (strong-consistency-allow-expunge) on a strong-consistency enabled namespace. This increments the client_write_error metric.

context

rw

introduced

-

removed

-

`{<namespace>} write_master: bin name too long (<length>) <digest>`

WARNING

message

{<namespace>} write_master: bin name too long (<length>) <digest>

description

Expected when a client attempts to create a bin with a name that exceeds the 15 character limit. The error is returned to the client and the operation does not succeed. See System limits

context

rw

introduced

-

removed

-

parameters

{namespace}

Namespace being written to.

length

Length of the bin name sent by the client.

digest

The digest of the record being updated.

`{namespace} write_master: failed as_bin_cdt_alloc_modify_from_client()`

WARNING

message

{namespace} write_master: failed as_bin_cdt_alloc_modify_from_client()

description

This error is expected if a list_clear() is called for a bin that isn’t a list or doesn’t exist. list_clear() returns an error when the bin isn’t a list or doesn’t exist.

context

rw

introduced

5.2

removed

-

parameters

{namespace}

Namespace being written to.

`{namespace} write_master: failed as_storage_record_write() {digest}:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13`

WARNING

message

{namespace} write_master: failed as_storage_record_write() {digest}:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13

description

Result of exceeding the write-block-size. Setting the rw and drv_ssd contexts to detail logging provides the accompanying explanatory log messages. See log-set and Changing Log Levels for how to dynamically change log levels. To determine what set is being written to, see How to return the set name of a record using its digest.

context

rw

introduced

3.16

removed

5.2

parameters

{namespace}

Namespace being written to

{digest}

Digest of the record that was rejected

`{namespace} write_master: record too big {digest}:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13`

DETAIL

message

{namespace} write_master: record too big {digest}:0xd751c6d7eea87c82b3d6332467e8bc9a3c630e13

description

Appears with the WARNING message about failed as_storage_record_write() for exceeding the write-block-size. To determine what set is being written to, see How to return the set name of a record using its digest.

context

rw

introduced

3.16

removed

-

parameters

{namespace}

Namespace being written to

{digest}

Digest of the record that was rejected

`{namespace_name} write_master: failed as_storage_record_write() 1142f0217ababf9fda5b1a4de66e6e8d4e51765e`

DETAIL

message

{namespace_name} write_master: failed as_storage_record_write() 1142f0217ababf9fda5b1a4de66e6e8d4e51765e

description

Most likely appearing as a result of exceeding the write-block-size. The record’s digest is the last item in the log entry. To determine what set is being written to, see How to return the set name of a record using its digest.

context

rw

introduced

5.2

removed

-

parameters

{namespace_name}

Namespace being written to.

`write_master: disallowed ttl with nsup-period 0`

WARNING

message

write_master: disallowed ttl with nsup-period 0

description

When expirations in a namespace are disabled because nsup-period is set to the default of 0, records with a TTL other than 0 may not be written to that namespace. This is to avoid confusion as to whether the records should be subject to expiration (or eviction). Simply set the nsup-period to a value other than 0 to allow records with a TTL other than 0 to be written. If there is a need to allow records with a non 0 TTL without having the nsup thread running, it is possible to set ‘allow-ttl-without-nsup` to true, but this is not recommended prevents those records from being properly deleted upon expiration.

context

rw

introduced

-

removed

-

Scan

`basic scan job 283603086331273222 failed to start (4)`

WARNING

message

basic scan job 283603086331273222 failed to start (4)

description

A scan job was submitted with maxRetries greater than 0 and failed or timed out on one or more nodes, causing a retry to conflict with the remaining scan that has the same transaction ID and exit with error code 4 (parameter error).

context

scan

introduced

-

removed

-

parameters

trid

trid of the scan job being retried

`error sending to 10.0.0.1:45678 - fd 646 sz 1049036 Connection timed out`

WARNING

message

error sending to 10.0.0.1:45678 - fd 646 sz 1049036 Connection timed out

description

A scan is trying to send data back to the client, but it cannot send the full data set. This is likely due to an aborted client-side scan. The scan must be re-run.

context

scan

introduced

-

removed

-

parameters

client

Client address:port.

fd

File descriptor of the socket to the client.

sz

Size of the response attempted.

error

Error message returned by OS.

`not starting scan 261941093 because rchash_put() failed with error -4`

WARNING

message

not starting scan 261941093 because rchash_put() failed with error -4

description

Server rejects scan transaction because it is already processing. Wait for the scan to complete or abort it before issuing the scan transaction again.

context

scan

introduced

-

removed

-

parameters

scan

ID of the scan.

errno

Error code.

`scan msg from 10.11.12.13 has unrecognized set name_of_the_set`

WARNING

message

scan msg from 10.11.12.13 has unrecognized set name_of_the_set

description

A scan specifying an unknown set was received. Typically a user error such as a mistyped set name, but also happens if the set does not exist on all the nodes in the cluster. For example, a set with fewer records than the number of nodes in the cluster, where some nodes don’t hold any record (master or replica) for that set. Having something configured for the set, such as access control that is set specific or a secondary index definition, causes this log warning message to disappear as the set would then be ‘known’ on the node even without any record in it.

context

scan

introduced

-

removed

-

parameters

address

Node that the scan message came from.

setname

Set name that doesn’t exist on this node.

`send error - fd 646 sz 1049036 rv 521107`

WARNING

message

send error - fd 646 sz 1049036 rv 521107

description

A scan is trying to send data back to the client, but it cannot send the full data set. This is likely due to an aborted client-side scan. The scan must be re-run.

context

scan

introduced

-

removed

-

parameters

fd

File descriptor of the socket to the client.

sz

Size of the response attempted.

rv

Size that was successfully sent.

`starting basic scan job 8104671463142312256 {namespace:set} rps 0 sample-max 100 socket-timeout 30000 from 172.22.xx.yy:52842`

DEBUG

message

starting basic scan job 8104671463142312256 {namespace:set} rps 0 sample-max 100 socket-timeout 30000 from 172.22.xx.yy:52842

description

A basic scan job is initiated on the specified namespace and set.

context

scan

introduced

4.9

removed

-

parameters

scan job

ID of the scan.

rps

Configured records per second of the scan. Configured in the scan policy. If this is not specified, the cluster maximum of the background-scan-max-rps parameter is in effect.

sample-max

The maximum number of records to return from the scan, if specified. Configured in the scan policy.

metadata-only

Only if the scan is configured to initiate a metadata scan.

fail-on-cluster-change

If this is enabled in the scan policy, or using AQL, scans are halted if there is a cluster change, such as nodes leaving or joining the cluster. See Why do I get AEROSPIKE_ERR_CLUSTER_CHANGE when querying or scanning a namespace? for details.

socket-timeout

The configured socket timeout. Configured in the client policy. The default is 30 seconds.

client

Client IP address and port.

`starting basic scan job 8104671463142312256 {namespace:set} rps 0 sample-pct 100 socket-timeout 30000 from 172.22.xx.yy:52842`

INFO

message

starting basic scan job 8104671463142312256 {namespace:set} rps 0 sample-pct 100 socket-timeout 30000 from 172.22.xx.yy:52842

description

A basic scan job is initiated on the specified namespace and set.

context

scan

introduced

4.7.0.2

removed

4.9

parameters

scan job

ID of the scan.

rps

Configured records per second of the scan. Configured in the scan policy. If this is not specified, the cluster maximum of the background-scan-max-rps parameter applies.

sample-pct

The percentage of records to return from the scan, if specified. The default is 100 percent. Configured in the scan policy.

metadata-only

Only if the scan is configured to initiate a metadata scan.

fail-on-cluster-change

If this is enabled in the scan policy, or using AQL, scans are halted if there is a cluster change, such as nodes leaving or joining the cluster. See Why do I get AEROSPIKE_ERR_CLUSTER_CHANGE when querying or scanning a namespace? for details.

socket-timeout

The configured socket timeout. Configured in the client policy. The default is 30 seconds.

client

Client IP address and port

Security

`FAILED ASSERTION (namespace): (namespace_ee.c:550) can't remove persistent memory base block: Operation not permitted`

WARNING

message

FAILED ASSERTION (namespace): (namespace_ee.c:550) can't remove persistent memory base block: Operation not permitted

description

When running under systemd, this error occurs if shared memory has been allocated by a previous user, and a second user who does not have privileges to access those shared memory segments starts the Aerospike server. Aerospike reads the user from the aerospike.conf file; however, if the shared memory segments are still allocated by the previous user, it may be necessary to force a cold start.

context

security

introduced

-

removed

-

`WARNING (security): (security.c:857) unknown security message command 20`

WARNING

message

WARNING (security): (security.c:857) unknown security message command 20

description

Clients that enable the LDAP protocol server-side, and run on Databases of Aerospike that did not support server-side LDAP, may see this warning when they attempt to log in. This warning is transitory as the server falls back to use the non-LDAP protocol by default. Check the release notes of each client to validate the Database that supports LDAP.

context

security

introduced

-

removed

-

`authentication failed (user) | client: 11.22.33.44:57754 | authenticated user: <none> | action: login | detail: user=baduser`

INFO

message

authentication failed (user) | client:  11.22.33.44:57754 | authenticated user: <none> | action: login | detail: user=baduser

description

Provides details on the internal user not found, including failure, client with IP address and port, authenticated user, action, and user specified.

context

security

introduced

-

removed

-

occurs

Occurs when a failed login is attempted with an internal user that is not found in the access control list, if audit logging is configured to report those. See Security Configuration for details. To enable this log message output, you must set report-authentication and report-violation to true.

parameters

authenticated failed (user)

The user authentication failed.

client

The client IP Address and port.

authenticated user

The authenticated user, in this example: none.

action

The action being performed, in this case login.

detail

The username involved in the failed authentication.

`fd 361 send failed, errno 113`

WARNING

message

fd 361 send failed, errno 113

description

Tried to send the client a message indicating security is not supported on this CE server, but failed due to a socket issue.

context

security

introduced

-

removed

-

parameters

fd

ID of the file descriptor the message was sent on.

errno

Linux error code for the problem, usually something like 113 (“No route to host”) or 110 (“Connection timed out”).

`login - internal user credential mismatch`

WARNING

message

login - internal user credential mismatch

description

Provides a warning that a login failed using a valid internal user that is found in the access control list with an incorrect password.

context

security

introduced

4.1.0.1

removed

-

occurs

Occurs when a failed login is attempted using an internal user that exists in the access control list but where an incorrect password has been supplied.

`login - internal user not found`

WARNING

message

login - internal user not found

description

Provides a warning that a login failed with an internal user that is not found in the access control list.

context

security

introduced

4.1.0.1

removed

-

occurs

Occurs when a failed login is attempted with an internal user that is not found in the access control list. To log the details of the internal user that is not found, report-authentication and report-violation must be set to true. See Security Configuration for details.

`login - internal user using ldap`

WARNING

message

login - internal user using ldap

description

A login failed with an internal user that is not found in the access control list.

context

security

introduced

4.6.0.2

removed

-

occurs

Occurs when a login attempt fails with an internal user that is not found in the access control list. Some conditions when the warning message is presented to the user while incorporating the use of an ACL.

When an encrypted ‘external’ (clear password encrypted) is used for LDAP but a stored hashed ‘internal’ password also exists for that user.
When an incompatible Aerospike client is attempting to connect to Database 4.6 or newer. See Overview of Access Control with LDAP and PKI for details to ensure you are using a compatible Aerospike client version.
When clusters using Cross-Datacenter Replication (XDR) are running with conflicting Aerospike Database versions. When running XDR and an ACL, Database versions 4.1.0.1 to 4.3.0.6 are incompatible with Database 4.6 or later. The incompatible Database versions 4.1.0.1 to 4.3.0.6 cannot ship to 4.6.0.2 or later. The simplest workaround is to avoid using versions 4.1.0.1 to 4.3.0.6. See Internal user warning returned for details.

This warning message typically appears in the logs when upgrading a cluster.

`permitted | authenticated user: admin | action: create user | detail: user=bruce;roles=read-write`

INFO

message

permitted | authenticated user: admin | action: create user | detail: user=bruce;roles=read-write

description

Provides details on the operation performed, including transaction type, namespace, set and relevant record’s digest.

context

security

introduced

3.7.0.1

removed

-

occurs

Occurs when permitted transaction happen under an authenticated user (in this case a user-admin related operation). See Security Configuration for details.

parameters

authenticated user

The authenticated user performing the operation.

action

The transaction type (read/write/delete/udf) or user or data related operation.

detail

The details of the operation, either for a user or admin related operation or namespace, set and digest of the record involved if a single record transaction.

`role violation | authenticated user: sally | action: delete | detail: {test|setB} [D|ee50d7c1d0f427ed5c41ef8a18efd85412b973ff]`

INFO

message

role violation | authenticated user: sally | action: delete | detail: {test|setB} [D|ee50d7c1d0f427ed5c41ef8a18efd85412b973ff]

description

Details on the role violation, including transaction type, namespace, set and relevant record’s digest.

context

security

introduced

3.7.0.1

removed

-

occurs

Occurs when role violation happens, if audit logging configured to report those. See Security Configuration for details.

parameters

authenticated user

The authenticated user violating the role’s permissions.

action

The transaction type (read/write/delete/udf) or user or data related operation.

detail

The namespace, set and digest of the record involved in the role violation.

Service

`refusing client connection - proto-fd-max 50000`

WARNING

message

refusing client connection - proto-fd-max 50000

description

The server has reached the configured maximum for incoming file descriptors (proto-fd-max). This corresponds directly to incoming client connections. Connections greater than the maximum are refused.

context

service

introduced

-

removed

-

parameters

proto-fd-max

Currently configured value for proto-fd-max

`unsupported proto version 0 from 12.0.0.1:5598`

WARNING

message

unsupported proto version 0 from 12.0.0.1:5598

description

Indicates that messages of an unexpected type are being sent to the main server port (usually port 3000) by other nodes in the cluster. This can happen if the nodes have been misconfigured to use the main server port instead of the correct port. For example, the configuration for Mesh heartbeats should specify port 3002, not port 3000.

context

service

introduced

-

removed

-

Sindex

`QTR Put in hash failed with error -4.`

WARNING

message

QTR Put in hash failed with error -4.

description

context

sindex

introduced

4.6.0

removed

-

occurs

WA duplicate query being executed on the Aerospike server. On the client side, you should see error code 215 (AS_PROTO_RESULT_FAIL_QUERY_DUPLICATE). There are two possible causes.

1. This error could happen when there are multiple incoming queries with same Task ID. To avoid this error, set a unique Task ID with the client when running the same query on multiple threads. The Task ID is set by using the following APIs: setTaskID for Java, and SetTaskID for C#.
1. If there are retries enabled for secondary index query jobs. We do not recommend having retries for secondary index queries, to avoid them running for the same Job ID.

`Queuing namespace {ns-name} for sindex population by device scan`

INFO

message

Queuing namespace {ns-name} for sindex population by device scan

description

context

sindex

introduced

5.3.0

removed

-

occurs

During startup, when the namespace’s devices are about to be scanned in order to populate secondary indexes. This happens when sindex-startup-device-scan is true.

`Queuing namespace {ns-name} for sindex population by index scan`

INFO

message

Queuing namespace {ns-name} for sindex population by index scan

description

context

sindex

introduced

5.3.0

removed

-

occurs

During startup, when the namespace’s primary index is about to be scanned in order to populate secondary indexes. This happens when sindex-startup-device-scan is false.

`Sindex-ticker: ns=ns-name si=<all> obj-scanned=500000 si-mem-used=47913 progress= 2% est-time=2336995 ms`

INFO

message

Sindex-ticker: ns=ns-name si=<all> obj-scanned=500000 si-mem-used=47913 progress= 2% est-time=2336995 ms

description

context

sindex

introduced

-

removed

-

occurs

Information logged at startup when secondary indices are being rebuilt.

parameters

ns

Namespace name.

si

Secondary indexes being rebuilt.

obj-scanned

Number of objects scanned.

si-mem-used

Memory used by the secondary indices.

progress

Progress in percent.

est-time

Estimated remaining time in milliseconds for the secondary indices to be fully rebuilt.

`{ns-name} sindex-gc-done: cleaned (40000,40000) total-ms 23`

INFO

message

{ns-name} sindex-gc-done: cleaned (40000,40000) total-ms 23

description

context

sindex

introduced

5.7.0

removed

-

occurs

Secondary index (sindex) garbage collection cycle summary.

parameters

cleaned

Count of sindex entries that have been cleaned (cumulative, current). First value corresponds to the sindex_gc_cleaned statistic. Second value corresponds to the number of sindex entries cleaned in the current round.

total-ms

Duration of the sindex garbage collection cycle in milliseconds.

`{ns-name} sindex-gc-done: processed 3133360101 found 365961945 deleted 365952323 total-ms 62667962`

INFO

message

{ns-name} sindex-gc-done: processed 3133360101 found 365961945 deleted 365952323 total-ms 62667962

description

context

sindex

introduced

4.6.0

removed

5.7.0

occurs

Secondary index (sindex) garbage collection cycle summary. Replaced by a modified version of the sindex-gc-done message in Database 5.7.

parameters

processed

Count of sindex entries that have been checked. Corresponds to the sindex_gc_objects_validated statistic.

found

Count of sindex entries found eligible for garbage collection. Corresponds to the sindex_gc_garbage_found statistic.

deleted

Count of sindex entries deleted through garbage collection. Count may be lower than above number if those entries were deleted while the garbage collector was running, for example through a competing truncate command. Corresponds to the sindex_gc_garbage_cleaned statistic.

total-ms

Duration of the sindex garbage collection cycle in milliseconds.

`{ns-name} sindex-gc-start`

INFO

message

{ns-name} sindex-gc-start

description

context

sindex

introduced

4.6.0

removed

-

occurs

Starting secondary index (sindex) garbage collection.

Smd

`failed to allocate a System Metadata cmd event`

CRITICAL

message

failed to allocate a System Metadata cmd event

description

Indicates a shortage of memory. Make sure nodes have enough memory.

context

smd

introduced

-

removed

-

Socket

`Error while adding FD <FD> to epoll instance <Poll-FD>: <error-code> (<error-string>)`

CRITICAL

message

Error while adding FD <FD> to epoll instance <Poll-FD>: <error-code> (<error-string>)

description

A double-close of file descriptors can lead to this assertion under certain conditions. If a file descriptor (FD) is closed twice in rapid succession, it usually fails silently after the first successful close. However, if the FD is reused between the first and second close due to a race condition, the second close could mistakenly target an active connection. This can result in errors like “Bad file descriptor” when the reused FD is assigned to a service thread’s epoll instance.

context

socket

introduced

-

removed

-

occurs

Example: Error while adding FD 2817 to epoll instance 1105: 9 (Bad file descriptor)

See this Bad File Descriptor knowledge base article for details on one known cause of this assertion when using multiple LDAP servers in the Aerospike configuration and at least one of them is failing to connect against.

parameters

FD : File descriptor

Poll-FD : epoll instance file descriptor

error-code : OS Error Code

error-string : OS Error code description

`Error while connecting: 113 (No route to host)`

WARNING

message

Error while connecting: 113 (No route to host)

description

Failed to make the connection to peer node for sending heartbeat messages.

context

socket

introduced

-

removed

-

parameters

errno

Linux error code returned by the OS.

error message

Message corresponding to errno.

`Error while connecting socket to 10.100.0.101:3002`

WARNING

message

Error while connecting socket to 10.100.0.101:3002

description

The node is unable to connect to a configured peer node. The peer may not be up, or there may be connectivity issues. If the node had been permanently removed, see Removing a Node.

context

socket

introduced

-

removed

-

parameters

address:port

Address and port that could not be connected to.

`Error while creating socket for 10.168.10.1:3002: 24 (Too many open files)`

WARNING

message

Error while creating socket for 10.168.10.1:3002: 24 (Too many open files)

description

Too many file descriptors are in use. This can lead to an assertion that would cause the node to abort. See System running out of file descriptors

context

socket

introduced

-

removed

-

`bind: socket in use, waiting (port:3001)`

WARNING

message

bind: socket in use, waiting (port:3001)

description

Some other process is already using that port and asd cannot listen on it. Use losf -i:3001 or netstat -plant | grep :3001 to find out what process is causing the conflict.

context

socket

introduced

-

removed

-

parameters

port

Port number that asd needs to listen on.

`epoll_create() failed: 24 (Too many open files)`

CRITICAL

message

epoll_create() failed: 24 (Too many open files)

description

Occurs when the server has hit the system configured file descriptor limit, which leads to an assertion causing the node to abort.<br /See System running out of file descriptors.

context

socket

introduced

-

removed

-

`too many addresses for interface <interface> - truncating <IP address>`

WARNING

message

too many addresses for interface <interface> - truncating <IP address>

description

Too many IP addresses associated with a network interface. Limit is 20 addresses per interface.

context

socket

introduced

-

removed

-

Storage

`compression is not available for storage-engine memory`

WARNING

message

compression is not available for storage-engine memory

description

This message means that an attempt was made to add compression on a memory storage device. This is not supported.

context

storage

introduced

4.9.0

removed

-

`{ns_name} partitions shut down`

INFO

message

{ns_name} partitions shut down

description

This message is one of a sequence of messages logged during Aerospike server shutdown of storage-engine device namespaces. The message signifies that all of the namespace’s partitions and index trees have been locked, so that no records are accessible.

context

storage

introduced

4.6.0.2

removed

-

`{ns_name} storage devices flushed`

INFO

message

{ns_name} storage devices flushed

description

This message is one of a sequence of messages logged during Aerospike server shutdown of storage-engine device namespaces. The message signifies that the data in write buffers for the namespace’s devices has been successfully flushed to those devices.

context

storage

introduced

4.6.0.2

removed

-

`{ns_name} storage-engine memory - nothing to do`

INFO

message

{ns_name} storage-engine memory - nothing to do

description

This message is logged during Aerospike server shutdown of storage-engine memory namespaces. The message simply notes that there are no storage shutdown tasks needed for the namespace.

context

storage

introduced

4.6.0.2

removed

-

Tls

`SSL_accept I/O unexpected EOF with 127.0.0.1:46630`

WARNING

message

SSL_accept I/O unexpected EOF with 127.0.0.1:46630

description

The server is attempting to use a connection that has been closed by the client. This can happen if a client times out a transaction, for example if the server is too slow to respond or if there are network disruptions.

context

tls

introduced

-

removed

-

parameters

IP:port

Source IP address and port of the client connection.

`SSL_accept with 127.0.0.1:34086 failed: error:140890C7:SSL routines:ssl3_get_client_certificate:peer did not return a certificate`

WARNING

message

SSL_accept with 127.0.0.1:34086 failed: error:140890C7:SSL routines:ssl3_get_client_certificate:peer did not return a certificate

description

The client sent a proper CA certificate for mutual authentication that matches the server CA certificate, but didn’t send a client certificate.

context

tls

introduced

-

removed

-

parameters

IP:port

Source IP address and port of the client connection.

`SSL_accept with 127.0.0.1:34094 failed: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed`

WARNING

message

SSL_accept with 127.0.0.1:34094 failed: error:14089086:SSL routines:ssl3_get_client_certificate:certificate verify failed

description

The client sent a proper client certificate and CA certificate for mutual authentication, but sent a TLS name that doesn’t match the server’s TLS name as set in tls-authenticate-client and the server’s certificate.

context

tls

introduced

-

removed

-

parameters

IP:port

Source IP address and port of the client connection.

`SSL_accept with 127.0.0.1:34100 failed: error:14094418:SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca`

WARNING

message

SSL_accept with 127.0.0.1:34100 failed: error:14094418:SSL routines:ssl3_read_bytes:tlsv1 alert unknown ca

description

The client sent a proper client certificate for mutual authentication, but sent a CA certificate that either doesn’t match the client certificate, or doesn’t match the server CA certificate.

context

tls

introduced

-

removed

-

parameters

IP:port

Source IP address and port of the client connection.

`SSL_read I/O unexpected EOF with 127.0.0.1:46491`

WARNING

message

SSL_read I/O unexpected EOF with 127.0.0.1:46491

description

The server is attempting to use a connection that has been closed by the client. This can happen if a client times out a transaction, for example if the server is too slow to respond or if there are network disruptions.

context

tls

introduced

-

removed

-

parameters

IP:port

Source IP address and port of the client connection.

`TLS verify result: unable to get local issuer certificate`

WARNING

message

TLS verify result: unable to get local issuer certificate

description

A CA or intermediate CA is not trusted by the server. This may be caused by a self-signed certificate or CA that is not yet added to the server truststore.

context

tls

introduced

-

removed

-

`WARNING (tls): (tls_ee.c:318) SSL_shutdown I/Oerror with (unknown): Transport endpoint is not connected`

WARNING

message

WARNING (tls): (tls_ee.c:318) SSL_shutdown I/Oerror with (unknown): Transport endpoint is not connected

description

Indicates that a socket is not connected. Adjacent messages show that the socket is being used by heartbeats, although similar messages could appear concerning any socket using TLS. Possible causes:

Interruption in network layer
Abrupt node restart if observed on heartbeat
Abrupt client restart if observed in client traffic
Server unable to send results to a client due to client timing out the socket

context

tls

introduced

-

removed

-

Truncate

`{ns-name} done truncate to T deleted N`

INFO

message

{ns-name} done truncate to T deleted N

description

This message indicates a request to truncate up to timestamp T has finished, after deleting N objects.

context

truncate

introduced

6.3

removed

-

occurs

Truncate process completed.

`{ns-name} done truncate`

INFO

message

{ns-name} done truncate

description

context

truncate

introduced

3.12

removed

6.3

occurs

Truncate command completed.

`{ns-name} flagging truncate to restart`

INFO

message

{ns-name} flagging truncate to restart

description

context

truncate

introduced

3.12

removed

-

occurs

Namespace needs a second truncation pass through after the current pass completes. Truncation can act on more than one set at a time so if a truncate command is received for a set in a namespace that is currently already going through truncation (for a different set for example), a subsequent iteration is required. For example, a truncate against set s2 is ongoing when a truncate command against set s4 is triggered. In this case, from the moment the log message appears (when the command to truncate against set4 is issued), both set2 and set4 are truncated together until the end of the current pass. Then truncation restarts another pass through the namespace, so that the rest of set4 gets truncated.

`{ns-name} starting truncate`

INFO

message

{ns-name} starting truncate

description

context

truncate

introduced

3.12

removed

-

occurs

Truncate command being processed for the namespace.

`{ns-name} truncated records (10,50)`

INFO

message

{ns-name} truncated records (10,50)

description

context

truncate

introduced

3.12

removed

-

occurs

Truncate command being processed for the namespace.

parameters

(current,total)

Current truncation count (10 in this example) followed by the total number of records have been deleted by truncation since the server started (50 in this example). Those counts are only kept at the namespace level.

`{ns-name|set-name} got command to truncate to now (226886718769)`

INFO

message

{ns-name|set-name} got command to truncate to now (226886718769)

description

context

truncate

introduced

3.12

removed

-

occurs

Truncate command received. Appears on the node where the info command was issued. The command is distributed to other nodes using system metadata (SMD), and only the truncating/starting/restarting/truncated/done log entries appear on those nodes.

parameters

(timestamp)

The LUT time in milliseconds since the Citrusleaf epoch (00:00:00 UTC on 1 Jan 2010).

`{ns-name|set-name} tombstone covers no records on this node`

INFO

message

{ns-name|set-name} tombstone covers no records on this node

description

context

truncate

introduced

3.12

removed

-

occurs

Occurs at cold start-up and is a listing of all set truncations found in the truncation SMD file (System Meta Data) that are not covering any records. The internal mechanism is that all set truncation tombstones are marked as cenotaph before records are read from drive to build the index. The term tombstone refers here to such a truncation related entry, not to be confused with record level tombstones. We only cycle through those tombstones one time for a namespace during a cold-start. If the drives are all fresh, there is a cenotaph log message for every set truncation in SMD. After the drives are read, which could potentially restore records from truncated sets and strip a set-covering tombstone of its cenotaph status, all the set-truncation tombstones that are still cenotaphs (not covering any tombstones) are listed.

`{ns-name|set-name} truncating to 226886718769`

INFO

message

{ns-name|set-name} truncating to 226886718769

description

context

truncate

introduced

3.12

removed

-

occurs

Truncate command received. Appears on all the nodes after a truncate command is issued.

parameters

timestamp

The LUT time in milliseconds since the Citrusleaf epoch (00:00:00 UTC on 1 Jan 2010).

`{ns-name|set-name} undoing truncate - was to 226886718769`

INFO

message

{ns-name|set-name} undoing truncate - was to 226886718769

description

context

truncate

introduced

4.3.1.11, 4.4.0.11, 4.5.0.6, 4.5.1.5

removed

-

occurs

Truncate command undone.

Tsvc

`rejecting client transaction - initial partition balance unresolved`

WARNING

message

rejecting client transaction - initial partition balance unresolved

description

When using some older Aerospike Client Libraries, there is a very small window where a node joins a cluster and other nodes begin to advertise the node but the node hasn’t finished creating its partition table and a client picked up that advertised service and made a request. This message happens only during that window and should resolve on its own very quickly.

context

tsvc

introduced

-

removed

-

`transaction is neither read nor write - unexpected`

WARNING

message

transaction is neither read nor write - unexpected

description

An invalid request has been received, either a read/write request where the corresponding bit is not set, or an operate() command with no operations. The error code -4 (FAIL_PARAMETER) is returned to the client if there is one, but this message can also be caused by non-Aerospike traffic reaching the service port.

context

tsvc

introduced

-

removed

-

Udf

`UDF bin limit (512) exceeded (bin activity_map)`

WARNING

message

UDF bin limit (512) exceeded (bin activity_map)

description

As per the UDF known limitations page, records with more than 512 bins cannot be read or written by a UDF.

context

udf

introduced

-

removed

5.1

parameters

bin

Name of the 513th bin.

`UDF timed out 734 ms ago`

WARNING

message

UDF timed out 734 ms ago

description

An individual UDF transaction exceeded an internal timeout interval. Time gets checked at regular interval (of lua instruction sets).

context

udf

introduced

-

removed

-

parameters

elapsed time

Milliseconds since the timeout period expired. See transaction-max-ms for details on when transactions could timeout on the server.

`WARNING (udf): (udf_aerospike.c:226) update found urecord not open`

WARNING

message

WARNING (udf): (udf_aerospike.c:226) update found urecord not open

description

Indicates that aerospike:update() was called to update a record that the UDF system did not or could not open. Most likely the record expired or was truncated at the time the UDF ran.

context

udf

introduced

-

removed

`bin limit of 512 for UDF exceeded: 511 bins in use, 1 bins free, 3 new bins needed`

WARNING

message

bin limit of 512 for UDF exceeded: 511 bins in use, 1 bins free, 3 new bins needed

description

A UDF operation that adds bins to a record would result in more than 512 bins in the record. As per the UDF known limitations page, records with more than 512 bins cannot be read or written by a UDF.

context

udf

introduced

-

removed

5.1

parameters

in-use bins

Number of bins already in the record.

free bins

512 - (# of in-use bins)

new bins

Number of new bins that would be need to be added for this operation to succeed.

`bin limit of 512 for UDF exceeded: 512 bins in use, 0 bins free, >4 new bins needed`

WARNING

message

bin limit of 512 for UDF exceeded: 512 bins in use, 0 bins free, >4 new bins needed

description

A UDF operation is trying to add bins to a record that already has 512 bins. As per the UDF known limitations page, records with more than 512 bins cannot be read or written by a UDF.

context

udf

introduced

-

removed

5.1

parameters

new bins

Number of new bins that would be need to be added for this operation to succeed.

`drives full, record is not updated`

WARNING

message

drives full, record is not updated

description

No more space left in storage (happened while executing a UDF).

context

udf

introduced

-

removed

5.1.0

`exceeded UDF max bins 512`

WARNING

message

exceeded UDF max bins 512

description

UDF is attempting to set a bin value, which would result in writing a record with more than 512 bins. As per the UDF known limitations page, records with more than 512 bins cannot be written by a UDF.

context

udf

introduced

5.1

removed

-

`large number of lua function arguments (22)`

WARNING

message

large number of lua function arguments (22)

description

As per the Known Limitations page, calling a Lua function with a large number of arguments can cause instability in the Lua runtime engine. Although this problem is only known to become acute at around 50 arguments, lower values may still result in issues with execution of the UDF.

context

udf

introduced

-

removed

-

parameters

arg count

Number of arguments in the UDF call

`{namespace_name} failed write 1142f0217ababf9fda5b1a4de66e6e8d4e51765e`

DETAIL

message

{namespace_name} failed write 1142f0217ababf9fda5b1a4de66e6e8d4e51765e

description

Most likely appearing as a result of exceeding the write-block-size with a UDF write. The record’s digest is the last item in the log entry. To determine what set is being written to, see How to return the set name of a record using its digest.

context

udf

introduced

5.2

removed

-

parameters

{namespace}

Namespace being written to.

`record has too many bins (513) for UDF processing`

WARNING

message

record has too many bins (513) for UDF processing

description

As per the UDF known limitations page, records with more than 512 bins cannot be read or written by a UDF. Such records can exist in Aerospike, however.

context

udf

introduced

-

removed

5.1

parameters

bins

How many bins the record has.

`too many bins (513) for UDF`

WARNING

message

too many bins (513) for UDF

description

UDF is attempting to use a record (with Database 5.1 or older) or access the bins of a record (with Database 5.2+) with more than 512 bins. As per the UDF known limitations page, records with more than 512 bins cannot be accessed by a UDF, unless the UDF is read-only and accesses only the record’s metadata, with Database 5.2 and later.

context

udf

introduced

5.1

removed

-

parameters

bins

Number of bins in the record.

`too many bins for UDF`

WARNING

message

too many bins for UDF

description

UDF is attempting to set a bin value, but has already set 512 bin values. As per the UDF known limitations page, records with more than 512 bins cannot be written by a UDF.

context

udf

introduced

5.1

removed

-

`udf applied with forbidden policy`

WARNING

message

udf applied with forbidden policy

description

A result of the client submitting a record UDF request with one or more forbidden policies indicated. A parameter error (code 4) is returned to the client in this situation. The forbidden policies are:

Generation
Generation gt
Create only
Update only
Create or replace
Replace only
Any of these policies can be enforced within the UDF itself using Aerospike’s LUA API.

Prior to Database 5.2, the policy flags are ignored.

context

udf

introduced

5.2

removed

-

`udf_aerospike_rec_update: failure executing record updates (-3)`

WARNING

message

udf_aerospike_rec_update: failure executing record updates (-3)

description

UDF could not update a record, so the update was rolled back. See earlier messages in the log for further details.

context

udf

introduced

-

removed

5.1.0

parameters

error code

Undocumented error code.

Xdr client

`DC <DCNAME> receive error [2] on <xx.xx.xx.xx:3000>`

WARNING

message

DC <DCNAME> receive error [2] on <xx.xx.xx.xx:3000>

description

context

xdr-client

introduced

5.0

removed

-

occurs

On its own (i.e if no other associated warning like bad protocol or similar), this warning is actually benign. This warning can also happen if there is a connection reset while trying to read from the socket. A potential cause could be the remote side having a node being restarted.

`connected to <DCNAME> <xx.xx.xx.xx>:<xxxx>`

INFO

message

connected to <DCNAME> <xx.xx.xx.xx>:<xxxx>

description

context

xdr-client

introduced

6.1.0.21, 6.2.0.16, 6.3.0.9, 6.4.0.3

removed

-

occurs

This is logged on a source node whenever the xdr-client tend thread establishes a successful connection to the destination node. The line is logged under the following conditions:

The first time the tend thread successfully connects to the destination node. If security is enabled on the destination, the login only happens if security is enabled in the configuration.
The tend thread has to re-establish the connection again as the login attempt failed when security was enabled on the destination node.
The tend thread has to re-establish the connection in cases such as source or destination restart, or any network interruption. The login flow can be found at XDR login flow.

parameters

DCNAME

The destination dc name, followed by the IP and port of the destination node, e.g.,connected to destdc 172.17.0.5:3116

`connected to connector <IP>:<Port>`

INFO

message

connected to connector <IP>:<Port>

description

This message is logged every 5 seconds and is a background health check connection used for connector destinations.

context

xdr-client

introduced

-

removed

-

`<dc> <node-address-port> failed check <x> times`

WARNING

message

<dc> <node-address-port> failed check <x> times

description

WARNING (xdr-client): (cluster.c:1010) aerospike-kafka-source 10.242.145.46:8080 failed check 130 times

context

xdr-client

introduced

removed

-

occurs

When connector DC is configured under XDR context, xdr-client tends every 5 seconds. After 5 consecutive fails, this warning is logged on the source node.

`logged in to node <xx.xx.xx.xx>:<xxxx> - session-ttl 86400`

INFO

message

logged in to node <xx.xx.xx.xx>:<xxxx> - session-ttl 86400

description

context

xdr-client

introduced

6.1.0.21, 6.2.0.16, 6.3.0.9, 6.4.0.3

removed

-

occurs

Logged on a source node when there is a successful login from the xdr-client tend thread to a destination node. It contains the destination IP, port and the session-ttl configured on the destination node during the login. The session is valid for this TTL period even if the TTL value changes on the destination node later, unless the session is interrupted by a node restart (source or destination) or auth failure which forces the tend thread to do a re-login. The login flow can be found at XDR login flow.

`login to node xx.xx.xx.xx:xxxx failed: <ERROR-CODE>`

WARNING

message

login to node xx.xx.xx.xx:xxxx failed: <ERROR-CODE>

description

context

xdr-client

introduced

6.1.0.21, 6.2.0.16, 6.3.0.9, 6.4.0.3

removed

-

occurs

Logged on a source node when there is a login failure to a destination node. It contains the destination node IP and port. The login flow can be found at XDR login flow.

parameters

ERROR-CODE

The error code returned by the destination server, e.g., failed: 65

`refreshing session token for <DCNAME> <xx.xx.xx.xx>:<xxxx>`

INFO

message

refreshing session token for <DCNAME> <xx.xx.xx.xx>:<xxxx>

description

context

xdr-client

introduced

6.1.0.21, 6.2.0.16, 6.3.0.9, 6.4.0.3

removed

-

occurs

Logged on a source DC node one minute before the session-ttl configured on the destination node during the login. The server (destination node) sets the expiry of the access token, provided to the xdr-client, a minute shorter than the actual expiration time (renewal margin). The xdr-client tend thread refreshes the token one minute before its actual set expiry. It means the source node is issuing a LOGIN command to refresh its token for the specified destination DC node (IP:port). The login flow can be found at XDR login flow.

parameters

DCNAME

The destination DC name, followed by the IP and port of the destination node, e.g.,refreshing session token for destdc 172.17.0.5:3116

`security not configured on node <xx.xx.xx.xx>:<xxxx>`

INFO

message

security not configured on node  <xx.xx.xx.xx>:<xxxx>

description

context

xdr-client

introduced

6.1.0.21, 6.2.0.16, 6.3.0.9, 6.4.0.3

removed

-

occurs

Logged on a source node when there is auth configured on the source node under XDR context, but security is not enabled on the destination cluster node. It contains destination IP and port details. Logged only once when the connection is established by the tend thread. Does not affect the XDR working properly after the connection is established. The login flow can be found at XDR login flow.

Xdr

`[DC_NAME]: dc-state CLUSTER_UP timelag-sec 2 lst 1468006386894 mlst 1468006389647 (2016-07-08 19:33:09.647 GMT) fnlst 0 (-) wslst 0 (-) shlat-ms 0 rsas-ms 0.004 rsas-pct 0.0 con 384 errcl 0 errsrv 0 sz 6`

INFO

message

[DC_NAME]: dc-state CLUSTER_UP timelag-sec 2 lst 1468006386894 mlst 1468006389647 (2016-07-08 19:33:09.647 GMT) fnlst 0 (-) wslst 0 (-) shlat-ms 0 rsas-ms 0.004 rsas-pct 0.0 con 384 errcl 0 errsrv 0 sz 6

description

context

xdr

introduced

3.9

removed

-

occurs

Every 1 minute, for each configured destination cluster (or DC).

parameters

[DC_NAME]

Name and status of the DC. Here are the different statuses: CLUSTER_INACTIVE, CLUSTER_UP, CLUSTER_DOWN, CLUSTER_WINDOW_SHIP. Corresponds to the dc_state statistic.

timelag-sec

The lag in seconds. This is computed as the difference between the current time and the time-stamp of the record that was last successfully shipped. This provides a sense of how ‘far behind’ the destination cluster lags behind the source cluster. This does not correspond to the time it takes the source cluster to ‘catch up’, nor does it necessarily relate to the number of outstanding digests to be processed. Corresponds to the dc_timelag statistic.

lst

The overall last ship time for the node (the minimum of all last ship times on this node).

mlst

The main last ship time (the last ship time of the dlogreader).

fnlst

The failed node last ship time (the minimum of the last ship times of all failed node shippers running on this node).

wslst

The window shipper last ship time (the minimum of the last ship times of all window shippers running on this node).

shlat-ms

Corresponds to the xdr_ship_latency_avg statistic.

rsas-ms

Average sleep time for each write to the DC for the purpose of throttling. Corresponds to the dc_ship_idle_avg statistic. (Stands for remote ship average sleep ms).

rsas-pct

Percentage of throttled writes to the DC. Corresponds to the dc_ship_idle_avg_pct statistic. (Stands for remote ship average sleep pct).

con

Number of open connection to the DC. If the DC accepts pipeline writes, there are 64 connections per destination node. Corresponds to the dc_as_open_conn statistic.

errcl

Number of client layer errors while shipping records for this DC. Errors include timeout, bad network fd, etc. Corresponds to the dc_ship_source_error statistic.

errsrv

Number of errors from the remote cluster(s) while shipping records for this DC. Errors include out-of-space, key-busy, etc. Corresponds to the dc_ship_destination_error statistic.

sz

The cluster size of the destination DC.. Corresponds to the dc_as_size statistic.

`Digest Log Write Failed !!! ... Critical error`

WARNING

message

Digest Log Write Failed !!! ... Critical error

description

context

xdr

introduced

-

removed

-

occurs

XDR digestlog has grown larger than its partition. See Solution: Digestlog Partition Out Of Space for more information.

`Failed to seek during reclaim. Leaving sptr as is`

INFO

message

Failed to seek during reclaim. Leaving sptr as is

description

Benign message if the node on which this is logged did not receive new writes into the digest log. A background process running once a minute tries to reclaim the digest log based on a timestamp. It starts sampling the log by looking at the timestamp of the last record written in the digestlog. If it doesn’t find a single record, it stops early and prints this message.

context

xdr

introduced

-

removed

-

`INFO (info): (dc.c:1024) xdr-dc DC1: lag 0 throughput 0 latency-ms 0 in-queue 0 outstanding 0 complete (3000,0,0,0) retries (0,0) recoveries (4096,0) hot-keys 0`

INFO

message

INFO (info): (dc.c:1024) xdr-dc DC1: lag 0 throughput 0 latency-ms 0 in-queue 0 outstanding 0 complete (3000,0,0,0) retries (0,0) recoveries (4096,0) hot-keys 0

description

context

xdr

introduced

5.0.0

removed

5.1.0

parameters

lag

See lag.

throughput

See throughput.

latency-ms

See latency_ms metric.

in-queue

See in_queue.

in-progress

See in_progress.

complete

Composed of the following metrics:

retries

Composed of the following metrics:

recoveries

Composed of the following metrics:

hot-keys

See hot_keys.

`INFO (info): (dc.c:1353) xdr-dc dc2: lag 12 throughput 710 latency-ms 19 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655`

INFO

message

INFO (info): (dc.c:1353) xdr-dc dc2: lag 12 throughput 710 latency-ms 19 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655

description

context

xdr

introduced

5.1.0

removed

5.3.0

parameters

lag

See lag.

throughput

See throughput.

latency-ms

See latency_ms.

in-queue

See in_queue.

in-progress

See in_progress.

complete

Composed of the following metrics:

retries

Composed of the following metrics:

recoveries

Composed of the following metrics:

hot-keys

See hot_keys.

`INFO (info): (dc.c:1353) xdr-dc dc2: nodes 8 lag 12 throughput 710 latency-ms 19 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655`

INFO

message

INFO (info): (dc.c:1353) xdr-dc dc2: nodes 8 lag 12 throughput 710 latency-ms 19 in-queue 250563 in-progress 81150 complete (1002215,0,0,0) retries (0,0,23) recoveries (2048,0) hot-keys 4655

description

In the XDR log line we use the following values when multiple namespaces are shipped to a DC:

nodes refers to DC-level alone
lag is the max of all namespaces
latency is the average across namespaces
everything else is the sum across namespaces

context

xdr

introduced

5.3.0

removed

6.0

parameters

nodes

See nodes.

lag

See lag.

throughput

See throughput.

latency-ms

See latency_ms.

in-queue

See in_queue.

in-progress

See in_progress.

complete

Composed of the following metrics:

retries

Composed of the following metrics:

recoveries

Composed of the following metrics:

hot-keys

See hot_keys.

`{NAMESPACE} failed to create set`

CRITICAL

message

{NAMESPACE} failed to create set

description

A record belonging to a previously-unknown set was received using XDR, but the limit on set names has already been reached locally, so the new set could not be created. If it happens while Aerospike is starting up, this message may be followed by an abort with SIGUSR1.

context

xdr

introduced

-

removed

-

parameters

namespace

The namespace where the set could not be created.

`XDR digestlog cannot keep up with writes. Dropping record.`

WARNING

message

XDR digestlog cannot keep up with writes. Dropping record.

description

Aerospike is writing data faster to the XDR digestlog than the underlying disk can handle. If using raw-disk backed xdr storage, consider switching over to file-backed xdr storage for the xdr-digestlog-path parameter. This takes advantage of filesystem caching for reads and writes. Otherwise, you may need to use a faster disk or join multiple disks using RAID-0 to allow for faster read/write to the XDR digestlog. Also ensure that dmesg is properly checked for disk failures and a SMART disk test is performed.

context

xdr

introduced

-

removed

5.0

`'XXXX' cluster does not support pipelining`

WARNING

message

'XXXX' cluster does not support pipelining

description

This message usually indicates that one of the destinations is running an older version of XDR (pre-3.8) or there exists a misconfiguration of the kernel configs /proc/sys/net/core/wmem_max and /proc/sys/net/core/rmem_max between source and destination clusters.

context

xdr

introduced

-

removed

-

`detail: sh 5588691 ul 12 lg 11162298 rlg 54 rlgi 0 rlgo 54 lproc 11162198 rproc 45 lkdproc 0 errcl 54 errsrv 0 hkskip 6303 hkf 6299 flat 0`

INFO

message

detail: sh 5588691 ul 12 lg 11162298 rlg 54 rlgi 0 rlgo 54 lproc 11162198 rproc 45 lkdproc 0 errcl 54 errsrv 0 hkskip 6303 hkf 6299 flat 0

description

context

xdr

introduced

3.9

removed

-

parameters

sh

The cumulative number of records that have been attempted to be shipped since this node started, across all datacenters. If a record is shipped to 3 different datacenters, then this number increments by 3. Corresponds to the sum of the xdr_ship_success, xdr_ship_source_error and xdr_ship_destination_error statistics.

ul

The number of record’s digests that have been written to the node but not logged yet to the digestlog (unlogged).

lg

The number of record’s digests that have been logged. Includes both master and replica records but a node only ships records when it owns the master partition and processes records belonging to its replica partitions only when a neighboring source node goes down.

rlog

Relogged digests. The number of record’s digests that have been relogged on this node due to temporary failures when attempting to ship. Corresponds to the dlog_relogged statistic.

rlgi

Relogged incoming digests. The number of record’s digest that another node sent to this node (typically prole side relog or partition ownership change). Corresponds to the relogged_incoming statistic.

rlgo

Relogged outgoing digests. The number of record’s digest log entries that were sent to another node (typically prole side relog or partition ownership change). Corresponds to the relogged_outgoing statistic.

lproc

The number of record’s digests that have been processed locally. A processed digest does not necessarily imply a shipped record (for example, replica digests don’t get shipped unless a source node is down, and hotkeys also don’t have all their updates necessarily shipped). Corresponds to the dlog_processed_main statistics.

rproc

The number of replica record’s digests that have been processed by this node. A node processes records belonging to its replica partitions only when a neighboring source node goes down. Corresponds to the dlog_processed_replica statistic.

lkdproc

The number of record’s digests that have been processed as part of a linked down session. A link down session is spawned when a full destination cluster is down or not reachable. Corresponds to the dlog_processed_link_down statistic.

errcl

The number of errors encountered when attempting to ship due to the embedded client. For example, if the local XDR embedded client is having issues or delays in establishing connections. Corresponds to the xdr_ship_source_error statistic.

errsrv

The number of errors encountered when attempting to ship due to the destination cluster. For example if the destination cluster is temporarily overloaded. Corresponds to the xdr_ship_destination_error statistic.

hkskip

Hotkey skipped. Represents the number of record’s digests that are skipped due to an already existing entry in the reader’s thread cache (meaning a version of this record was just shipped). Corresponds to the xdr_hotkey_skip statistic.

hkf

Hotkey fetched. Represents the number of record’s digest that are actually fetched and shipped because their cache entries expired and were dirty. Corresponds to the xdr_hotkey_fetch statistics.

flat

The average time in milliseconds to fetch records locally (this is an exponential moving average - 95/5). Corresponds to the xdr_read_latency_avg statistic.

`dlog: free-pct 93 reclaimed 2456 glst 1490623097271 (2017-03-27 13:58:17 GMT)`

INFO

message

dlog: free-pct 93 reclaimed 2456 glst 1490623097271 (2017-03-27 13:58:17 GMT)

description

Provides digest log (dlog) related information.

context

xdr

introduced

3.12.1

removed

-

occurs

Every 1 minute.

parameters

free-pct

Percentage of the digest log free and available for use. Corresponds to the dlog_free_pct statistic.

reclaimed

Indicates how many digests were safely ‘removed’ from the digestlog. As shipping successfully proceeds, and records are shipped, digests which are no longer necessary can have their space reclaimed in the digest log. A linked down (destination cluster down or unreachable) is an example where digest log space cannot be reclaimed.

glst

The minimum last ship time across all nodes in the cluster. Corresponds to the xdr_global_lastshiptime statistics. This specifies up to what point slots in the digest log can be reclaimed, by keeping track of the oldest last ship time across all nodes in the cluster.

`dlog-q: capacity 64 used-elements 1 read-offset 0 write-offset 1`

DEBUG

message

dlog-q: capacity 64 used-elements 1 read-offset 0 write-offset 1

description

Status information on the dlog-q. The dlog-q queue is the in-memory digest log queue. Digests of records that have been written get put on this in-memory queue. The dlogwriter picks them from there and puts them in the on-disk digest log. See below for the details on the 4 numbers.

context

xdr

introduced

3.9

removed

-

occurs

Every 1 minute.

parameters

capacity

The size of the queue.

used-elements

The number of elements in the queue.

read-offset

The read pointer of the queue. In general, the number of elements in the queue is the difference between the read pointer and the write pointer (that is, the number of elements that have been written to the queue but haven’t yet been read).

write-offset

The write pointer of the queue. In general, the number of elements in the queue is the difference between the read pointer and the write pointer (that is, the number of elements that have been written to the queue but haven’t yet been read).

`{namespace} xdr-tomb-raid-done: dropped <n> total-ms <t>`

INFO

message

{namespace} xdr-tomb-raid-done: dropped <n> total-ms <t>

description

INFO (xdr): (xdr_ee.c:206) {users_eu} xdr-tomb-raid-done: dropped 0 total-ms 308923

context

xdr

introduced

removed

-

occurs

XDR tomb raider has finished its cycle that is expected to run every xdr-tomb-raider-period seconds having dropped n records and took t ms to finish.

parameters

n

Number of records dropped by the current cycle of XDR tomb raider.

t

Number of ms XDR tomb raider cycle took to finish.

`{namespaceName} DC DC1 abandon result -4`

WARNING

message

{namespaceName} DC DC1 abandon result -4

description

context

xdr

introduced

5.0.0

removed

-

occurs

In strong-consistency enabled namespaces, if XDR finds a record which has not been replicated, re-replication from master to replica is triggered. XDR prints this warning message when this happens. XDR attempts to ship that record again. See XDR transaction delays for details on how XDR handles record in strong-consistency enabled namespaces. See XDR 5.0 Error codes.

`summary: throughput 3722 inflight 164 dlog-outstanding 100 dlog-delta-per-sec -10.0`

INFO

message

summary: throughput 3722 inflight 164 dlog-outstanding 100 dlog-delta-per-sec -10.0

description

context

xdr

introduced

3.9

removed

-

parameters

throughput

The current throughput, shipping to destination cluster(s). When shipping to multiple clusters, the throughput represents the combined throughput to all destination clusters. Corresponds to the xdr_throughput statistic.

inflight

The number of records that are inflight, meaning that have been sent to the destination cluster(s) but for which a response has not been received yet. Corresponds to the xdr_ship_inflight_objects statistic.

dlog-outstanding

The number of record’s digests yet to be processed in the digest log. In parenthesis, the average change normalized to digests per second over the 10 seconds interval separating those log lines. Corresponds to the xdr_ship_outstanding_objects statistic.

dlog-delta-per-sec

The variation of the dlog-outstanding normalized on a per second basis. Gives an idea whether the number of entries in the digestlog is increasing or decreasing over time and at what pace.