Kafka, Cassandra and Kubernetes: Real-time Anomaly Detection at Scale

Presented by

Paul Brebner

About this talk

Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies that enable applications and business operations to scale massively and rapidly. Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, while Kubernetes is a container orchestration technology that automates application deployment and the scaling of application clusters. In this webinar, we will discuss how we architected a massive-scale deployment of a streaming data pipeline with Kafka and Cassandra to serve an example anomaly detection application that runs on a Kubernetes cluster and generates and processes a massive number of events. Anomaly detection is a method used to detect unusual events in an event stream. It is widely used in applications such as financial fraud detection, security, threat detection, website user analytics, sensors, IoT and system health monitoring. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. We will demonstrate the scalability, performance and cost effectiveness of Kafka and Cassandra, with results from our experiments that allowed the anomaly detection application to scale to 19 billion anomaly checks per day.
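
To put the headline figure in per-second terms, here is a rough back-of-the-envelope sketch. It assumes the 19 billion checks are spread evenly over a 24-hour day, which the abstract does not state; the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def checks_per_second(checks_per_day: float) -> float:
    """Average sustained rate, assuming load is uniform across the day."""
    return checks_per_day / SECONDS_PER_DAY


rate = checks_per_second(19e9)
print(f"{rate:,.0f} checks per second")  # prints "219,907 checks per second"
```

Under that uniform-load assumption, the pipeline would need to sustain roughly 220,000 anomaly checks every second.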

Instaclustr delivers reliability at scale through our integrated data platform of open source technologies such as Apache Cassandra, Apache Kafka, Apache Spark and Elasticsearch. Our expertise stems from delivering more than 25 million node hours under management, allowing us to run the world’s most powerful data technologies effortlessly. We provide a range of managed, consulting and support services to help our customers develop and deploy solutions around open source technologies. Our integrated data platform, built on open source technologies, powers mission critical, highly available applications for our customers and helps them achieve scalability, reliability and performance for their applications.