Video Summary and Transcription
This lightning talk introduces distributed computing and discusses the challenges, patterns, and solutions related to using Kafka for event sharing. It emphasizes the importance of separating services and using strong typing to avoid broken messages. The talk also covers Kafka's transaction configuration and guarantees, highlighting the need for proper configuration and the use of transaction IDs. Overall, it provides valuable insights into scaling companies, big data, and streaming platforms.
1. Introduction to Distributed Computing and Bitvavo
Hello, everyone. This is a lightning talk where I'll be discussing distributed computing, scaling companies, big data, and streaming platforms. I live in the Netherlands and write technical blog posts. I work for Bitvavo, the biggest crypto exchange in the Netherlands, with a mission to bring the opportunity to trade crypto for everyone.
Hello, everyone. I hope this fits. We had some technical challenge, but let's go. This is a lightning talk. So I'll be very fast.
I spent a lot of time, much more than the talk, thinking about what can I say in this short time that will help you at least to go out from here that feels like, OK, I learned something, or maybe he made me think about something. So my background is distributed computing. So I work with scaling companies usually, and helping systems to really scale. Big data, I worked a lot. Streaming platforms, it's my bread and broth for the past like seven years. And currently, I live in the Netherlands for seven years, from Brazil. And I write a technical blog post. Currently this one, I had another three or four different places where I used to write. But you can find my most recent articles there, usually talking about Kafka, about Kubernetes, mostly back end, in my case. I work for Bitvavo. It's the biggest crypto exchange in the Netherlands. So if you are into crypto or want to be into crypto, it's as quick as clicking a button like you see there. And that's the goal from the company. It's really to bring the opportunity to trade crypto for everyone. And that's what we're doing. That's our mission.
2. Challenges and Solutions with Kafka Event Sharing
In this section, I will discuss the challenges, common mistakes, patterns, and solutions related to using Kafka to share events between systems. It is important to separate services in a global platform to avoid reliance on databases. Sending events as JSON can be convenient, but without a contract, broken events can disrupt the system. Kafka's event queue can lead to a system halt when a message cannot be processed, resulting in a poison pill.
So what I'm going to try to talk in this short time, I'm going to talk a bit about this world where we live. Many of us, I'm sure a lot of you, use Kafka to share events between systems. And this is a requirement, of course, because in the current world that we go global with our platforms and your applications, we cannot be reliant on the database. So we really need to separate our services, right?
And I'm going to talk about a few challenges, common mistakes and patterns that we use, and solutions for that. So this is a normal services architecture. You can call it microservices. It really depends where you are, how you do it. It doesn't matter. The important thing here is that you have multiple databases, multiple data sources. You are integrating things through Kafka. And that's a common pattern, more and more. I bet many of you have this.
And a common way to do it, and I've seen this especially on the TypeScript, JavaScript world, is that you send events using JSON. That's very easy because everything is JSON, but the problem is you don't have really a contract, right? If you send events to a JSON, with the producers, the sending side might send something that's actually broken, or other producers might send it, and the consumer starts processing that and breaks up. And the way Kafka works is a queue of events. If you cannot process a message, it doesn't go forward in processing those messages. And then suddenly you are stuck, and your whole system stops because you have what we call a poison pill.
3. Avoiding Broken Messages with Dead Letter Queue
To avoid broken messages, apply a pattern called dead letter queue and use strong typing. Guarantee that the message type is correct on the producer side to prevent consumer breakage. Implement a dead letter queue approach to handle broken messages and manually use offset committing.
So how do you avoid that? It's quite actually simple. You apply a pattern called dead letter queue and use other schemas to actually define a schema. So it's a strong type.
So now your messages have to comply on the producer side with a specific typing system. So you guarantee that the type is not going to go wrong for that specific topic, and then your consumers will not break for that case. And you can use a dead letter queue approach that if you start consuming a message and it's broken, instead of getting that cycle forever, you try a couple of times, if it doesn't work, you push that to a different queue, and you go to the next message. And for that you might need to use offset committing manually, and I'm gonna go through that and show some because there is a timer here. It's really scary. I have to go fast.
4. Kafka Transaction Configuration and Guarantees
Kafka offers exactly one semantics and transaction boundaries for processing messages in a single transaction. To ensure message integrity, configure Kafka to set idepotence to true on the producer side and disable auto-commit offset on the consumer side. Use a transaction ID to establish boundaries and guarantee atomicity. Keep in mind that Kafka transactions are not distributed transactions like XA transactions. Proper configuration of cluster partitions and in-sync replicas is crucial. Start, send, and commit transactions to ensure message processing or rollback. For more information, refer to the blog post and try the provided docker compose with multiple Kafka nodes for local experimentation.
So think about this standard producer and consumer, if you just use Kafka.js or any client that you decide to use. The normal producer and normal consumers, they don't have strong guarantees. What I mean by that, it doesn't guarantee to you that you're going to send a single message on the producer's side, it might have duplicates. It's what we call at least once. That's the guarantee, which means you can have duplicates. And in some systems, you cannot do that. If you're depositing money, you should not do that. Especially with drawing money from a customer, you should not do that. And on the client's side, you can also have duplicate processing. So the defaults, if you use, that's what you can get. There are solutions for that.
So Kafka offers exactly one semantics. That's the EOS there. And transaction boundaries. And it's very common that you have a pattern like that, that you have a processor, a message that initiates as a process, you want to consume the message, do some processing and produce the message to another topic in a single transaction. So you want to make sure when you consume, do processing, something goes wrong. If you reprocess or restart your system, that you process that message again, because you didn't finalize those three steps, which is sending that message to the next step. And for these, you have to do some configuration in Kafka.
And from the producer's side, you have to set idepotence to true. That guarantees that it's not going to be duplicated and sent. And on the consumer's side, you have to disable the auto-commit offset, which means your client reads the message, but now you have the control. When you want to tell Kafka, I really processed this message, and you commit back the offset. And you want to actually set one property called max in-flight request to one, so you don't have parallel processing and you can keep the ordering guarantees. And on the producer side, on the last step, remember, it's consuming and then producing, and you want all this to be in the same transaction. What you want to set is a transaction ID, so the broker can set the transaction boundaries and say the point is true as well. And what this guarantees is that when you consume, process, and send, it's part of a single atomic transaction. Kafka client, this is not a distributed transaction, like XA transaction. It's not guaranteed if you do a database call on the side that will be rolled back. You have to take care of that. It's on the boundaries of Kafka. It's the same as your normal database transaction between two tables that you are used to, so just keep that in mind.
Some configurations of the cluster, and I'm also done because my clock is blinking already. You should have at least three partitions and at least two in sync replicas for your topics for this to work. So if you try locally and you don't have it, it will not work, and you also have to start the transaction like you can see that, start a transaction, send a message, send the offsets, and then commit the transactions, which means everything is going to happen or it will be rolled back. And that's basically what you want to do, and it's really important to take care of that. That's it. If you want to know more about this, I write in this specific blog post about this, and I also have docker compose where I have multiple Kafka nodes where you can play with this type of more advanced work in your local machine.
Comments