Skip to main content
Loading

Stream UDF

Use Aerospike Stream UDFs to filter, transform, and aggregate query results in a distributed fashion. Stream UDFs can process a stream of data by defining a sequence of operations to perform. Stream UDF aggregations run in a distributed fashion to exploit the power of multiple machines.

Stream UDFs perform read-only operations on a collection of records. The Stream UDF system allows a MapReduce style of programming, which is used for common, highly parallel jobs such as word countโ€”each row is accessed and a list of words and their counts are emitted. The top results are calculated in a reduce phase. For simple aggregations where counters are simply incremented in a context instead of continually creating and destroying objects, Aerospike provides optimal implementation.

Operatorsโ€‹

Aerospike Stream UDFs support these operators:

  • filterโ€”Filter data in the stream that satisfies a predicate.
  • mapโ€”Transform a piece of data.
  • aggregateโ€”Reduce partitions of data to a single value.
  • reduceโ€”Allow parallel processing of each group of output data.

These operators are considered primitives and can build complex operations (see Developing Stream UDFs).

filterโ€‹

Filter values in the stream using the predicate p. p tests each value in the stream and returns true if the value should be passed through or false to drop the value.

Example

filter(p: (a: Value) -> Boolean) -> Stream

Parameters

  • pโ€”The predicate to apply to each value in the stream.

Returns

A stream of filtered values.


mapโ€‹

Transforms a value to a different value using the identify function f.

map(f: (a: Value) -> Value) -> Stream

Parameters

  • fโ€”The identify function to apply to each value in the stream.

Returns

A stream of the transformed values.


reduceโ€‹

Reduce the values in the stream to a single value and apply the associative binary operator op to each value. The op return value is the parameter a for subsequent calls to op.

reduce(op: (a: Value, b: Value) -> Value) -> Stream

Parameters

  • opโ€”The associative binary operation to apply to each value in the stream.

Returns

A stream containing a single value.


aggregateโ€‹

Takes the current value (parameter x) and the next value in the input stream, and returns the aggregate value (parameter a).

aggregate(x: Value, op: (a: Value, b: Value) -> Value) -> Stream

Parameters

  • xโ€”The initial (neutral) value passed to the operator op.
  • opโ€”The identify function to apply to each value in the stream.

Returns

A stream containing a single value.


Examplesโ€‹

Referencesโ€‹