Getting Real with NodeJS and Kafka: An Event Stream Tale

In this lightning talk we will review some basic principles of event-based systems using NodeJS and Kafka and gain insight into real mission-critical use cases where the default configurations are not enough. We will cover quick tips, tricks, and good practices for Kafka with NodeJS using the KafkaJS library, and see some real code at lightning speed.

This talk was presented at JSNation 2022.

FAQ

The speaker has a background in distributed computing, working with scaling companies, big data, and streaming platforms for the past seven years.

The speaker currently lives in the Netherlands and has been living there for seven years.

The speaker works for Bitvavo, the biggest crypto exchange in the Netherlands.

The speaker writes about Kafka, Kubernetes, and mostly backend topics in their technical blog posts.

A common pattern for integrating multiple databases and data sources is using Kafka to share events between systems.

A 'poison pill' in Kafka is a message that causes the consumer to break, stopping the system. It can be avoided by using a dead letter queue and defining strong types with schemas.

A standard Kafka producer and consumer provide 'at least once' guarantees, which means there might be duplicate messages.

To ensure exactly-once processing in Kafka, you need to enable idempotence on the producer side, disable offset auto-commit on the consumer side, and use transactional boundaries.

Setting a transaction ID in Kafka is important to ensure that consuming, processing, and sending messages happen as part of a single atomic transaction.

For exactly-once semantics, Kafka clusters should have at least three partitions and at least two in-sync replicas for the topics.
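The exactly-once settings described in the answers above can be sketched in TypeScript roughly as follows. This is a minimal sketch under stated assumptions, not the speaker's code: the `TxLike` and `TxProducerLike` interfaces are a hand-written subset of what a KafkaJS transactional producer exposes (so the snippet runs without a broker), and the transactional ID, the function name, and the topic names are hypothetical. `maxInFlightRequests: 1` is an assumption taken from the KafkaJS transaction documentation, not from the talk.

```typescript
// Hand-written subset of the KafkaJS transactional producer shape (assumption:
// a real service would get these objects from the "kafkajs" package instead).
interface OutMessage { value: string }
interface TxLike {
  send(record: { topic: string; messages: OutMessage[] }): Promise<unknown>;
  sendOffsets(args: {
    consumerGroupId: string;
    topics: { topic: string; partitions: { partition: number; offset: string }[] }[];
  }): Promise<void>;
  commit(): Promise<void>;
  abort(): Promise<void>;
}
interface TxProducerLike { transaction(): Promise<TxLike> }
interface InMessage { topic: string; partition: number; offset: string; value: string }

// Producer side: idempotence on, plus a transactional ID (hypothetical value).
const producerConfig = {
  idempotent: true,
  maxInFlightRequests: 1, // assumption, see lead-in
  transactionalId: "orders-enricher-tx",
};

// Consumer side: offsets are committed by the transaction, not automatically.
const consumerRunConfig = { autoCommit: false };

// Consume, process, and produce inside one transaction: the output message and
// the consumed offset are committed together, or neither is.
async function handleExactlyOnce(
  producer: TxProducerLike,
  groupId: string,
  outputTopic: string,
  msg: InMessage,
  transform: (value: string) => string,
): Promise<void> {
  const tx = await producer.transaction();
  try {
    await tx.send({ topic: outputTopic, messages: [{ value: transform(msg.value) }] });
    // The committed offset is the next one to read, hence +1.
    const nextOffset = String(Number(msg.offset) + 1);
    await tx.sendOffsets({
      consumerGroupId: groupId,
      topics: [{ topic: msg.topic, partitions: [{ partition: msg.partition, offset: nextOffset }] }],
    });
    await tx.commit();
  } catch (err) {
    // Roll back both the produced message and the offset, then surface the error.
    await tx.abort();
    throw err;
  }
}
```

In a real service the producer would presumably come from `kafka.producer(producerConfig)`, and `handleExactlyOnce` would be called from the `eachMessage` handler passed to `consumer.run({ ...consumerRunConfig, eachMessage })`.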

Marcos Maia
8 min
16 Jun, 2022

Video Summary and Transcription
This lightning talk introduces distributed computing and discusses the challenges, patterns, and solutions related to using Kafka for event sharing. It emphasizes the importance of separating services and using strong typing to avoid broken messages. The talk also covers Kafka's transaction configuration and guarantees, highlighting the need for proper configuration and the use of transaction IDs. Overall, it provides valuable insights into scaling companies, big data, and streaming platforms.

1. Introduction to Distributed Computing and Bitvavo

Short description:

Hello, everyone. This is a lightning talk where I'll be discussing distributed computing, scaling companies, big data, and streaming platforms. I live in the Netherlands and write technical blog posts. I work for Bitvavo, the biggest crypto exchange in the Netherlands, whose mission is to bring the opportunity to trade crypto to everyone.

Hello, everyone. I hope this fits. We had some technical challenges, but let's go. This is a lightning talk, so I'll be very fast.

I spent a lot of time, much more than the talk itself, thinking about what I could say in this short time that would let you walk out of here feeling like, OK, I learned something, or maybe he made me think about something. My background is distributed computing. I usually work with scaling companies, helping systems really scale. I've worked a lot with big data, and streaming platforms have been my bread and butter for the past seven years or so. I'm from Brazil and have lived in the Netherlands for seven years. I also write technical blog posts. Currently I write on this one; I used to write in three or four other places. But you can find my most recent articles there, usually about Kafka and Kubernetes, mostly back end in my case. I work for Bitvavo, the biggest crypto exchange in the Netherlands. So if you are into crypto or want to be into crypto, it's as quick as clicking a button like you see there. And that's the company's goal: to bring the opportunity to trade crypto to everyone. That's our mission.

2. Challenges and Solutions with Kafka Event Sharing

Short description:

In this section, I will discuss the challenges, common mistakes, patterns, and solutions related to using Kafka to share events between systems. It is important to separate services in a global platform to avoid reliance on a single database. Sending events as JSON can be convenient, but without a contract, broken events can disrupt the system. Because Kafka works as a queue of events, a message that cannot be processed, a so-called poison pill, blocks everything behind it and can halt the whole system.

So in this short time, I'm going to talk a bit about the world we live in. Many of us, I'm sure a lot of you, use Kafka to share events between systems. And this is a requirement, of course, because in today's world, where we go global with our platforms and applications, we cannot be reliant on the database. So we really need to separate our services, right?

And I'm going to talk about a few challenges, common mistakes and patterns that we use, and solutions for that. So this is a normal services architecture. You can call it microservices. It really depends where you are, how you do it. It doesn't matter. The important thing here is that you have multiple databases, multiple data sources. You are integrating things through Kafka. And that's a common pattern, more and more. I bet many of you have this.

And a common way to do it, and I've seen this especially in the TypeScript and JavaScript world, is that you send events as JSON. That's very easy because everything is JSON, but the problem is that you don't really have a contract, right? If you send events as JSON, the producer, the sending side, might send something that's actually broken, or other producers might send it, and the consumer starts processing that and breaks. And the way Kafka works is as a queue of events: if you cannot process a message, the consumer doesn't move forward to the messages behind it. And then suddenly you are stuck, and your whole system stops because you have what we call a poison pill.
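To make the poison-pill problem concrete, here is a minimal TypeScript sketch of one way a consumer could check incoming JSON against a contract and set broken messages aside instead of crashing. It is an illustration under stated assumptions, not code from the talk: the `OrderPlaced` event, its fields, and the `classifyMessage` helper are all hypothetical, and a hand-written type guard stands in for a real schema.

```typescript
// Hypothetical event contract; the field names are illustrative only.
interface OrderPlaced {
  orderId: string;
  amount: number;
}

type Outcome =
  | { kind: "ok"; event: OrderPlaced }
  | { kind: "dead-letter"; raw: string; reason: string };

// Runtime type guard: checks that an unknown value matches the contract.
function isOrderPlaced(value: unknown): value is OrderPlaced {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const orderId = record["orderId"];
  const amount = record["amount"];
  return typeof orderId === "string" && typeof amount === "number";
}

// Never throws: a malformed message becomes a "dead-letter" outcome instead of
// an exception that would leave the consumer stuck on the same message.
function classifyMessage(raw: string): Outcome {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { kind: "dead-letter", raw, reason: "invalid JSON" };
  }
  if (!isOrderPlaced(parsed)) {
    return { kind: "dead-letter", raw, reason: "does not match the OrderPlaced contract" };
  }
  return { kind: "ok", event: parsed };
}

console.log(classifyMessage('{"orderId":"o-1","amount":10}').kind); // "ok"
console.log(classifyMessage('{"orderId":42}').kind); // "dead-letter"
console.log(classifyMessage("not json").kind); // "dead-letter"
```

A consumer built this way would forward every `dead-letter` outcome to a separate topic for later inspection and then keep going, so one broken message no longer blocks the ones behind it.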
