
Efficient streaming data architectures with Aerospike and Redpanda

Aerospike and Redpanda have partnered to modernize the world of real-time data management and streaming. Learn more about this partnership and how it’s shaping the future of modern data architectures.

October 30, 2023 | 4 min read
Steve Tuohy
Director of Product Marketing

Organizations need help navigating the extensive AI/ML and data management ecosystem. For most of Aerospike’s enterprise customers, an architecture diagram can include dozens of technologies or more. The proliferation of distributed streaming data platforms has helped fuel demand for real-time use cases and a real-time database like Aerospike by facilitating data movement between microservices and applications. Many of our customers use Apache Kafka, Apache Pulsar, and, more recently, Redpanda.

Introducing Redpanda and Aerospike

We’re excited to formally partner with Redpanda, the vendor behind the eponymous API-compatible alternative to Kafka. Check out our recent article on how Ad Tech companies benefit from Redpanda and Aerospike to move, process, and act on data for real-time bidding.

What is Redpanda?

Redpanda is a ground-up reimplementation of Kafka. Like Kafka, it ensures seamless communication among the various parts of the online ecosystem. Redpanda uses the Kafka API to stay compatible with the existing ecosystem, but it is written from scratch in C++ to maximize modern hardware utilization. One of Redpanda’s selling points is its ability to reliably handle large spikes in volume while sustaining throughput of multiple gigabytes (GB) per second.

What is Aerospike?

Aerospike is a real-time database that ingests, stores, and retrieves data, handling millions of transactions per second (TPS) with sub-millisecond latency. Event streams are one of the frequent sources of data in an Aerospike database.

Streaming and real-time operational databases

Aerospike has extensive integrations with different streaming protocols, including connectors for ActiveMQ, RabbitMQ (through Java Message Service), any HTTP-based system, Pulsar, and Kafka. Aerospike can ingest events from these platforms and publish events to them, either directly or through its connectors. It’s common to ingest data into Aerospike from a streaming topic, run several evaluations of that data, and publish the results to another topic, all in milliseconds.
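The ingest, evaluate, and publish loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Aerospike or Redpanda client code: InMemoryStore, process_events, and flag_active_users are hypothetical names, the store is a plain dictionary standing in for a database, and the input and output "topics" are an ordinary list and a callback. In a real deployment, a Kafka-API consumer and producer and an Aerospike client would take their place.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a database client (not the Aerospike API)."""
    records: Dict[str, dict] = field(default_factory=dict)

    def get(self, key: str) -> dict:
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self.records.get(key, {}))

    def put(self, key: str, bins: dict) -> None:
        self.records[key] = dict(bins)


def process_events(
    events: Iterable[dict],
    store: InMemoryStore,
    evaluate: Callable[[dict], Optional[str]],
    publish: Callable[[dict], None],
) -> int:
    """Ingest each event, update stored state, evaluate it, publish decisions."""
    published = 0
    for event in events:
        key = event["user_id"]
        # 1. Ingest: fold the event into the persisted profile.
        profile = store.get(key)
        profile["event_count"] = profile.get("event_count", 0) + 1
        profile["last_event"] = event["type"]
        store.put(key, profile)
        # 2. Evaluate: decide based on accumulated state, not just this event.
        decision = evaluate(profile)
        # 3. Publish: emit a message for downstream consumers.
        if decision is not None:
            publish({"user_id": key, "decision": decision})
            published += 1
    return published


def flag_active_users(profile: dict) -> Optional[str]:
    """Example rule (an assumption): two or more events marks a user active."""
    return "active" if profile["event_count"] >= 2 else None


# Usage: a list plays the input topic, and another list plays the output topic.
incoming = [
    {"user_id": "u1", "type": "view"},
    {"user_id": "u2", "type": "view"},
    {"user_id": "u1", "type": "click"},
]
outbox: list = []
store = InMemoryStore()
published_count = process_events(incoming, store, flag_active_users, outbox.append)
```

Note that the rule evaluates the stored profile rather than the single incoming event, which is the point of pairing a stream with a database: the decision depends on history that no longer sits on the topic.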

Streaming tools and messaging queues have existed for ages, and more than a few Aerospike customers use streaming connectors between their legacy mainframe systems and Aerospike. Modern streaming platforms are increasingly instrumental in new real-time use cases. To highlight a few:

  • Data integration, including ETL (extract-transform-load) and ELT (extract-load-transform) workloads from operational to analytical data systems

  • Event-driven architectures, including event sourcing and log aggregation, simplifying communication between microservices

  • Telemetry and IoT, where high-throughput, low-latency data streams feed use cases such as predictive maintenance

In each case, distributed streaming tools help decouple event generation and data producers from data consumers. Some data can be consumed before it is persisted to a database. Most real-time applications, though, ultimately need to scan and aggregate more data than can practically sit on a streaming topic. This is where we see many real-time architectures with Aerospike alongside Kafka and other streaming platforms like Redpanda.

Figure 1 illustrates a common pattern among these architectures and where Aerospike and streams complement each other.


Figure 1: Aerospike and streaming in the modern data stack

Aerospike and Redpanda: Architected for customer scale and efficiency

Aerospike attracts customers that require robust performance at scale. Our hybrid memory architecture (HMA™) is instrumental in delivering this with an efficient use of resources, typically delivering superior performance with 80% lower infrastructure costs.

Redpanda brings a similar approach to the streaming space, focused on making the best, most efficient use of modern hardware. In its own words, Redpanda “is frugal with compute, storage, and bandwidth consumption while limiting administrative overheads. Redpanda is a greener data platform… and is 6x more cost-effective than Kafka for the same workload.” Much of this comes from its ability to use hardware efficiently, thanks in part to its C++ codebase.

This lends itself to a highly complementary environment that delivers real-time performance at scale at a fraction of the cost of alternative approaches. The Ad Tech demand-side platform architecture in Figure 2, detailed in this post, is just one use case where this comes to life.


Figure 2: Aerospike and Redpanda in Ad Tech

Try Aerospike Connect with Redpanda today

It’s no surprise, then, that many of the same innovators are interested in running Redpanda and Aerospike together in their environments. This is possible today with Aerospike Connect. Because Redpanda is completely API-compatible with Kafka, Aerospike Connect for Kafka works with Redpanda deployments. We invite you to explore Aerospike’s full array of streaming connectors, including integration with Redpanda, and expand the potential of your real-time data.
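To make the API-compatibility point concrete, here is a small sketch of what switching a generic Kafka client from a Kafka cluster to a Redpanda cluster might look like. It is not the Aerospike Connect configuration format. The helper kafka_client_config, the hostnames, and the group name are hypothetical. The property names bootstrap.servers, group.id, and auto.offset.reset are standard Kafka consumer settings, and 9092 is the conventional Kafka API port.

```python
from typing import Dict, List


def kafka_client_config(bootstrap_servers: List[str], group_id: str) -> Dict[str, str]:
    """Build a minimal Kafka consumer configuration (illustrative helper)."""
    return {
        "bootstrap.servers": ",".join(bootstrap_servers),
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    }


# Hypothetical broker addresses, one per cluster.
kafka_cfg = kafka_client_config(["kafka-1.example.internal:9092"], "aerospike-ingest")
redpanda_cfg = kafka_client_config(["redpanda-1.example.internal:9092"], "aerospike-ingest")

# Which settings differ between the two deployments?
changed_keys = {key for key in kafka_cfg if kafka_cfg[key] != redpanda_cfg[key]}
```

Under these assumptions, only the broker address changes. The rest of the client configuration, and the client code that uses it, stays the same.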