Apache Kafka Guide 101 Part 2

Instaclustr is back with part 2 of its Apache Kafka 101 webinar series, aimed at an audience coming from traditional ESB (Enterprise Service Bus), middleware and ETL-based systems who want to learn about a modern approach to message processing, stream processing, real-time analytics and event sourcing, or about using Kafka as an enterprise-wide central data bus.

In part 1 of this series, we looked at the basics of Apache Kafka, the Kafka ecosystem and an overview of its architecture, and explored concepts like brokers, topics, partitions, logs, producers, consumers and consumer groups.

In part 2, we will talk about topic design and partitioning. We will look at good practices to keep in mind when designing for the throughput your applications require. We will also look at the role the message key plays in guaranteeing message order. This will move you forward another step in your journey of learning and making the best use of Apache Kafka.
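As a rough illustration of why the message key matters for ordering, here is a minimal sketch in plain Python with no Kafka client involved. Kafka guarantees order only within a partition, and keyed records are assigned to a partition by hashing the key. The function names `partition_for` and `route` are hypothetical, and `zlib.crc32` is only a stand-in for the murmur2 hash used by Kafka's default partitioner, so the partition numbers here will not match a real cluster.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index.

    Assumption: crc32 is a stand-in hash; Kafka's default partitioner
    uses murmur2, but the hash-then-modulo idea is the same.
    """
    return zlib.crc32(key) % num_partitions


def route(messages, num_partitions):
    """Append each (key, value) pair to the log of the partition chosen by its key."""
    logs = {p: [] for p in range(num_partitions)}
    for key, value in messages:
        partition = partition_for(key.encode("utf-8"), num_partitions)
        logs[partition].append((key, value))
    return logs


# Hypothetical order events: the key is the order ID.
events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]

logs = route(events, num_partitions=3)

for partition, log in logs.items():
    print(partition, log)
```

Because every event for a given order ID hashes to the same partition, a consumer reading that partition sees the events for that order in the sequence they were produced. Events with different keys may land on different partitions, and no ordering is implied between them.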
Recorded Sep 5 2019 53 mins
Presented by
Alok Dwivedi, Senior Consultant at Instaclustr

  • The Future of Open Source 2020 + Beyond Dec 10 2019 6:00 pm UTC 60 mins
    Ben Slater, CPO, Instaclustr
    This webinar gives an overview of the key trends that Instaclustr is seeing in the adoption of open source solutions for large-scale, high-reliability applications. What are the predominant use cases and technologies, and what approaches are driving the most successful adoption? Looking forward, we will survey the trends shaping the development and proliferation of open source technology and how enterprises can shape their technology strategy to take advantage of these developments.
  • Instaclustr's Open Source Tools For Cassandra - LDAP/Kerberos, Prometheus Export Nov 19 2019 6:00 pm UTC 60 mins
    Adam Zegelin, Co-Founder of Instaclustr
    This session walks devs through Instaclustr's Cassandra tools and how they add key functionality and ease-of-use to their deployments.
    - An LDAP authenticator plug-in for Cassandra. The open source LDAP authenticator plug-in works closely with the existing CassandraAuthorizer implementation. The plug-in enables developers to quickly reap the benefits of secure LDAP authentication without the need to write their own solutions, and to transition to using the authenticator with zero downtime.
    - A Kerberos authenticator plug-in for Cassandra. The open source Kerberos authenticator plug-in enables Cassandra users to leverage Kerberos’ industry-leading secure authentication and true single sign-on capabilities. The open source project also includes a Kerberos authenticator plugin for the Cassandra Java driver.
    - Cassandra Prometheus Exporter. The cassandra-exporter is a high-performance metrics collection agent that allows for easy integration with the Prometheus monitoring solution. It has been designed to collect detailed metrics on production-sized clusters with complex schemata with minimal performance impact while at the same time following Prometheus's best practices for exporting metrics.
    - Additional utilities and debugging tools, including a utility to assist with backup and restore to various cloud providers, and tools to provide debug-level information about SSTables.
    - A Cassandra operator for running and operating Cassandra within Kubernetes. The open source Cassandra operator functions as a Cassandra-as-a-Service on Kubernetes, fully handling deployment and operations duties so that developers don’t have to. It also offers a consistent environment and set of operations founded on best practices, which is reproducible across production clusters and development, staging, and QA environments.
    The audience for this presentation will learn the specifics of how to implement – and get the most out of – these open source solutions.
  • Storing and Using Metrics for 3000 nodes - How Instaclustr use a Time Series Cas Oct 3 2019 5:00 pm UTC 60 mins
    Jordan Braiuka, Senior Software Engineer
    At Instaclustr, we use one of our own Cassandra clusters and a custom monitoring application to store, process, and retrieve metrics on every single node in our fleet. The talk will introduce how we collect, process, store, and roll up all the metrics that pass through our monitoring system every second. We will discuss why Cassandra, time buckets, and Spark are well suited to storing and querying all that monitoring data quickly and efficiently.
  • Apache Kafka 101 Recorded: Sep 5 2019 47 mins
    Zeke Dean, Senior Consultant, Instaclustr
    Are you responsible for building data pipelines for your organization or for your customers? Are you coming from a background where you worked on traditional ESB (Enterprise Service Bus), middleware and ETL-based systems but are no longer comfortable with modern trends in these domains? Or are you just interested in learning a technology that provides a framework for message processing, stream processing, real-time analytics and event sourcing, or that can act as a central data bus for your organization?

    Apache Kafka is the leading distributed streaming and queuing technology for large-scale, always-on applications. Kafka has built-in horizontal scalability, high throughput and low latency. It is highly reliable and highly available, and it allows geographically distributed data streams and stream processing applications. It’s used by pretty much all the big tech giants, such as LinkedIn, Netflix, Uber, Twitter and Airbnb, and by traditional names like Adidas and Goldman Sachs; the list goes on.

    Instaclustr is organizing a multi-part webinar series on Apache Kafka with an aim to help you in your journey of learning and making the best use of this technology.

    In part 1 of this series, we will start with the basics of Apache Kafka, the Kafka ecosystem and an overview of its architecture, and explore concepts like brokers, topics, partitions, logs, producers, consumers and consumer groups.
  • Apache Kafka Guide 101 Part 2 Recorded: Sep 5 2019 53 mins
    Alok Dwivedi, Senior Consultant at Instaclustr
    Instaclustr is back with part 2 of its Apache Kafka 101 webinar series, aimed at an audience coming from traditional ESB (Enterprise Service Bus), middleware and ETL-based systems who want to learn about a modern approach to message processing, stream processing, real-time analytics and event sourcing, or about using Kafka as an enterprise-wide central data bus.

    In part 1 of this series, we looked at the basics of Apache Kafka, the Kafka ecosystem and an overview of its architecture, and explored concepts like brokers, topics, partitions, logs, producers, consumers and consumer groups.

    In part 2, we will talk about topic design and partitioning. We will look at good practices to keep in mind when designing for the throughput your applications require. We will also look at the role the message key plays in guaranteeing message order. This will move you forward another step in your journey of learning and making the best use of Apache Kafka.
  • Apache Cassandra Guide 101 Part 2 Recorded: Sep 3 2019 51 mins
    Alok Dwivedi, Senior Consultant at Instaclustr
    Instaclustr is back with part 2 of its Apache Cassandra 101 webinar series, which aims to help software engineers, database developers, DBAs, DevOps engineers, and enterprise or solution architects take the next step in their journey of learning and making the best use of Apache Cassandra.

    In part 1 of this series, we looked at Cassandra basics, compared it with RDBMS for good and bad, and then moved on to explore concepts like replication, partitioning, token ranges and consistency.

    In part 2, we will explore in detail how Cassandra stores data (the write path) and how data is retrieved (the read path). In the process, you will learn about key concepts like memtables, SSTables, the coordinator, compaction, Bloom filters, and row and key caches.
  • Apache Cassandra 101 Recorded: Aug 20 2019 52 mins
    Alok Dwivedi, Senior Consultant, Instaclustr
    If you are a software engineer, database developer, DBA, DevOps engineer, or enterprise or solution architect who has heard of Big Data and NoSQL but doesn't know exactly what the fuss is all about and wants to learn more, then this webinar is for you.

    Apache Cassandra is a highly available, linearly scalable, fault-tolerant database that can offer extremely high throughput and very low latency. It's one of the most popular and most successful NoSQL databases. It has been widely adopted by the biggest tech names, major world-renowned brands and research institutes alike. This list includes companies like Apple, Twitter, Facebook, Netflix, Rackspace and Reddit, and also research institutes like CERN.

    Instaclustr is organizing a multi-part webinar series on Apache Cassandra with an aim to help you in your journey of learning and making the best use of Apache Cassandra.

    In part-1 of this series, we will start with demystifying NoSQL, look at Cassandra basics, compare it with RDBMS for good and bad, and then move on to explore concepts like replication, partitioning, token ranges, consistency etc.
  • Apache Kafka 101 Guide Part 1 Recorded: Aug 15 2019 57 mins
    Alok Dwivedi, Senior Consultant, Instaclustr
    Are you responsible for building data pipelines for your organization or for your customers? Are you coming from a background where you worked on traditional ESB (Enterprise Service Bus), middleware and ETL-based systems but are no longer comfortable with modern trends in these domains? Or are you just interested in learning a technology that provides a framework for message processing, stream processing, real-time analytics and event sourcing, or that can act as a central data bus for your organization?

    Apache Kafka is the leading distributed streaming and queuing technology for large-scale, always-on applications. Kafka has built-in horizontal scalability, high throughput and low latency. It is highly reliable and highly available, and it allows geographically distributed data streams and stream processing applications. It’s used by pretty much all the big tech giants, such as LinkedIn, Netflix, Uber, Twitter and Airbnb, and by traditional names like Adidas and Goldman Sachs; the list goes on.

    Instaclustr is organising a multi-part webinar series on Apache Kafka with an aim to help you in your journey of learning and making the best use of this technology.

    In part 1 of this series, we will start with the basics of Apache Kafka, the Kafka ecosystem and an overview of its architecture, and explore concepts like brokers, topics, partitions, logs, producers, consumers and consumer groups.
  • Apache Cassandra 101 Guide Part 1 Recorded: Aug 13 2019 59 mins
    Alok Dwivedi, Senior Consultant, Instaclustr
    If you are a software engineer, database developer, DBA, DevOps engineer, or enterprise or solution architect who has heard of Big Data and NoSQL but doesn't know exactly what the fuss is all about and wants to learn more, then this webinar is for you.

    Apache Cassandra is a highly available, linearly scalable, fault-tolerant database that can offer extremely high throughput and very low latency. It's one of the most popular and most successful NoSQL databases. It has been widely adopted by the biggest tech names, major world-renowned brands and research institutes alike. This list includes companies like Apple, Twitter, Facebook, Netflix, Rackspace and Reddit, and also research institutes like CERN.

    Instaclustr is organising a multi-part webinar series on Apache Cassandra with an aim to help you in your journey of learning and making the best use of Apache Cassandra.

    In part-1 of this series, we will start with demystifying NoSQL, look at Cassandra basics, compare it with RDBMS for good and bad, and then move on to explore concepts like replication, partitioning, token ranges, consistency etc.
  • Strategies to Ensure Data Integrity with Apache Kafka Recorded: Jul 30 2019 26 mins
    Zeke Dean, Senior Consultant, Instaclustr
    Ensuring reliable data storage and delivery with Apache Kafka can be a concern. Those implementing Apache Kafka have to deal with critical questions such as: How can data order be put at risk? How could data be lost? How could records be accidentally duplicated?


    Join this webinar as we explore the essential components of building an effective strategy for data integrity with Apache Kafka. This webinar will also dive into how to resolve the above issues using a combination of:
    - Effective topic and partitioning strategies
    - Effective data keying strategies
    - Exactly-once semantics with producers and consumers
  • Kafka, Cassandra and Kubernetes: Real-time Anomaly Detection at Scale Recorded: Jun 25 2019 45 mins
    Paul Brebner
    Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies enabling applications and business operations to scale massively and rapidly. While Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, Kubernetes is a container orchestration technology that helps automate application deployment and the scaling of application clusters. In this webinar, we will discuss how we architected a massive-scale deployment of a streaming data pipeline with Kafka and Cassandra to serve an example anomaly detection application running on a Kubernetes cluster and generating and processing a massive number of events. Anomaly detection is a method used to detect unusual events in an event stream. It is widely used in a range of applications such as financial fraud detection, security, threat detection, website user analytics, sensors, IoT, system health monitoring, etc. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. We will demonstrate the scalability, performance and cost effectiveness of Kafka and Cassandra, with results from our experiments allowing the anomaly detection application to scale to 19 billion anomaly checks per day.
  • Introducing Certified Apache Cassandra Recorded: Jun 11 2019 28 mins
    Ben Slater, Chief Product Officer, Instaclustr
    Gain confidence to build on open source Apache Cassandra

    Introducing the Instaclustr Open Source Certification Framework and Certified Apache Cassandra. Learn about Instaclustr's new Open Source Certification Framework and first certified product, Apache Cassandra. The certification framework aims to build on Instaclustr’s extensive open source experience to provide additional assurance to companies that the open source software they are building on is robust and well supported. This webinar will explain the details of the framework and the certification process for Apache Cassandra.

    Bio: Ben Slater, Chief Product Officer at Instaclustr. As Chief Product Officer, Ben is charged with steering Instaclustr’s development roadmap and overseeing the product engineering, production support, open source and consulting teams. Ben has over 20 years' experience in systems development, including previously as lead architect for the product that is now Oracle Policy Automation and over 10 years as a solution architect and project manager for Accenture. He has extensive experience in managing development teams and implementing quality-controlled engineering practices.
  • Open Source Database Adoption: Purism Meets Practicality Recorded: May 30 2019 49 mins
    Jim Curtis (Senior Analyst, 451 Research) and Ben Bromhead (CTO, Instaclustr)
    In this webinar Jim Curtis (Senior Analyst, 451 Research) and Ben Bromhead (CTO, Instaclustr) will look at the broad trends we are currently seeing in the open source database world, both on-premise and in the cloud. Jim will share how companies are adopting different open source projects and how they are engaging vendors with different service and license models. Ben will then explore how Instaclustr fits into this ecosystem and how you can leverage Instaclustr to deploy open source database capabilities.

    At the end of this webinar, you will leave with guidance on how to be successful in picking the right tool or service for the job as well as a solid understanding of where Instaclustr's offerings around Cassandra, Kafka, Spark and Elasticsearch fit in the ecosystem.
  • How to Achieve Reliability at Scale with Cassandra Recorded: May 16 2019 56 mins
    Christophe Schmitz
    In this session, Christophe Schmitz, SVP Consulting at Instaclustr, will focus on the key mechanism that makes Cassandra so reliable and the different levels of availability you might want to reach. Christophe will also discuss how to achieve them and the pros and cons of each solution.

    The presentation will cover:

    • The different levels of availability you might want to reach.
    • How to achieve them and the pros and cons of each solution.

    At the end of this talk, you will have a better understanding of Cassandra's built-in multi-data center feature and be able to make the right Cassandra architecture decisions to balance cost against availability.
  • Improving the observability of Cassandra/Kafka apps w/ Prometheus & OpenTracing Recorded: Mar 26 2019 44 mins
    Paul Brebner, Technology Evangelist, Instaclustr
    As distributed applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical. Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works. In this webinar we’ll explore two complementary Open Source technologies: Prometheus for monitoring application metrics; and OpenTracing and Jaeger for distributed tracing. We’ll discover how they improve the observability of an Anomaly Detection system - an application which is built around Instaclustr managed Apache Cassandra and Apache Kafka clusters, and dynamically deployed and scaled on Kubernetes (on AWS EKS).
  • Architects Guide to Scalable Technologies: Cassandra, Kafka and Elasticsearch Recorded: Mar 5 2019 60 mins
    Ben Bromhead
    In this session, Ben Bromhead, CTO of Instaclustr, will provide a base-level introduction to some of the technologies and design patterns used to build scalable and resilient applications.

    This webinar will cover:
    • When you should use these technologies and key considerations.
    • Different architecture options when combining these leading open source technologies.
    • How to use the technologies in a resilient and scalable way and real-world application patterns such as IoT, social apps and consumer services.
  • Architects and CTO Guide to Scalable Technologies Part 2 Recorded: Feb 19 2019 52 mins
    Ben Bromhead, CTO and Co-Founder, Instaclustr
    In this session, Ben Bromhead, CTO of Instaclustr, will provide a base-level introduction to some of the technologies and design patterns used to build scalable and resilient applications. This webinar is aimed at an introductory level. The technologies we will cover include:
    - Cassandra
    - Kafka
    - Spark
    - Elasticsearch

    We will cover when you should use these technologies and key considerations when choosing these technologies for your application architecture. In addition, we will touch on different architecture options when combining these leading open source technologies into an overall solution. We will also discuss how to use the technologies in a resilient and scalable way and real-world application patterns such as IoT, social apps and consumer services.
  • Architects Guide to Scalable Technologies: Cassandra, Kafka, and Elasticsearch Recorded: Jan 29 2019 49 mins
    Ben Bromhead
    In this session, Ben Bromhead, CTO of Instaclustr, will provide a base-level introduction to some of the technologies and design patterns used to build scalable and resilient applications. This webinar is aimed at an introductory level. The technologies we will cover include:
    - Cassandra
    - Kafka
    - Spark
    - Elasticsearch

    We will cover when you should use these technologies and key considerations when choosing these technologies for your application architecture. In addition, we will touch on different architecture options when combining these leading open source technologies into an overall solution. We will also discuss how to use the technologies in a resilient and scalable way and real-world application patterns such as IoT, social apps and consumer services.
  • Automating Apache Kafka Recorded: Dec 12 2018 39 mins
    Ben Slater, Chief Product Officer, Instaclustr
    Instaclustr recently extended its managed service for Apache Cassandra to include support for Apache Kafka. This presentation will walk through the key challenges and lessons learned in extending Instaclustr’s provisioning and management system for Kafka.

    Specific details will include: benchmarking, choosing appropriate cloud provider configurations, and security configuration.
  • Scaling applications with Kubernetes, Apache Kafka and Apache Cassandra Recorded: Nov 8 2018 31 mins
    Paul Brebner, Technology Evangelist
    Instaclustr provides Apache Cassandra, Apache Kafka and Apache Spark as a managed service. This presentation will work through integrating these technologies with a Kubernetes-deployed business logic layer to produce a massively scalable application. Using anomaly detection as an illustrative use case, we will work through topics including the benefits of using Kafka as a buffer, the design of Kafka and Cassandra client applications for scalability, deployment automation, and benchmarking results.
Managed Platform for Open Source Technologies
Instaclustr delivers reliability at scale through our integrated data platform of open source technologies such as Apache Cassandra, Apache Kafka, Apache Spark and Elasticsearch. Our expertise stems from more than 25 million node hours under management, allowing us to run the world’s most powerful data technologies effortlessly.

We provide a range of managed, consulting and support services to help our customers develop and deploy solutions around open source technologies. Our integrated data platform, built on open source technologies, powers mission-critical, highly available applications for our customers and helps them achieve scalability, reliability and performance.

  • Title: Apache Kafka Guide 101 Part 2
  • Live at: Sep 5 2019 9:00 am
  • Presented by: Alok Dwivedi, Senior Consultant at Instaclustr