GraphQL Galaxy 2022

GraphQL Subscriptions with Debezium and Kafka

Nils Hartmann

Freelancer, Germany

Reacting to data changes and publishing those changes as GraphQL events with subscriptions can be hard, especially in a multi-service environment with multiple databases or when scaling your GraphQL server to multiple instances. GraphQL clients shouldn't miss events or receive them twice, no matter what your backend architecture looks like or what trouble (a service goes down, a database connection is lost, ...) it might run into while serving a subscription request.

In this talk, I will show you how Debezium and Apache Kafka can help you build reliable subscriptions from changes in your database. Debezium is a change data capture (CDC) tool that can forward changes from a database's transaction log to the Kafka message broker.

In my talk I will use a GraphQL backend implemented in Java with "Spring for GraphQL", but as Debezium and Kafka are not tied to Java, the idea also works with other GraphQL frameworks and programming languages. You do not need to know Java or "Spring for GraphQL" to understand the talk.

Video Summary and Transcription

The video explores the integration of GraphQL subscriptions with Apache Kafka and Debezium, highlighting how this setup addresses challenges in maintaining data consistency across multiple service instances. When a client performs a mutation, Debezium captures the change from the database and sends it to Kafka, ensuring all instances are updated. This process enhances real-time updates for GraphQL subscriptions, allowing clients to receive data changes immediately. By using a message broker like Kafka, the architecture ensures robust delivery guarantees, making it ideal for scenarios where multiple services need to stay synchronized. The talk also touches on the potential of using this setup for building a dedicated read model database, optimizing data retrieval for GraphQL APIs. This technology stack not only supports subscriptions but can also be leveraged for efficient queries. Additionally, the source code for the sample application is available on GitHub, providing a practical example of this architecture in action.

FAQ

The main topic of the talk is using GraphQL subscriptions with Kafka and Debezium.

The speaker of the talk is Nils, a freelance software developer from Hamburg, Germany.

GraphQL subscriptions inform clients about new data by sending events to the clients when a mutation occurs that adds new data.

Apache Kafka solves the problem by acting as a message broker. When a mutation occurs, a message is sent to Kafka, which is then listened to by all service instances. They can then send the necessary updates to the connected clients.

The additional complexity introduced is that service instances must write to the same database and ensure that messages are sent to Kafka even if there are issues in committing data or if other applications directly write to the database.

Debezium acts as a change data capture tool that reads changes (inserts, updates, deletes) from the database and writes events for these actions to Kafka, ensuring all service instances are informed of database changes.

Debezium and Kafka provide delivery guarantees that any change in the database (update, insert, delete) will be published to Kafka and received by the service instances, ensuring that subscription data can be sent to clients.

Yes, the technology stack can be used for queries as well. By building a dedicated read model database for the GraphQL API, it can optimize data retrieval without querying all microservices.

The source code for the sample application built with GraphQL Java and Spring for GraphQL is available on GitHub. The URL is mentioned in the talk.

The problem with multiple service instances in a GraphQL setup is that one instance may not know about changes (mutations) that occur in another instance, leading to some clients not receiving updates.

1. GraphQL Subscriptions with Kafka and Debezium

Short description:

Welcome to my lightning talk about GraphQL subscriptions with Kafka and Debezium. We have three clients and a service that provides a GraphQL API. When client one adds a new customer, the service can send events to clients two and three. However, there can be issues when multiple service instances are involved, or when writing data to a database. To solve these problems, we can add a message broker like Apache Kafka and a change data capture tool like Debezium to our deployment.

Hello and welcome to my lightning talk about GraphQL subscriptions with Kafka and Debezium. My name is Nils and I'm a freelance software developer from Hamburg in Germany.

Let's have a look at this image here. We have three clients and we have a service that provides a GraphQL API. Client number two and client number three send subscriptions to the service to get informed about new customers. When client number one sends a mutation to add a new customer, our service and our GraphQL API can send events to client number two and three informing them about new customers.

In real life, this setup might be a little bit more complex because we might have more than one instance of the same service, like in this case. Here, client number two sends the subscription request to service instance number one, while client number three sends its request to service instance number two. Now when client number one executes the mutation on service instance number one, service instance number one can inform client number two about the new customer. But unfortunately, client number three does not receive an event, because service instance number two does not know anything about the executed mutation and the newly added customer.

To solve this problem, service instance number one must inform service instance number two about things that happen, like the mutation. We can do this by adding a message broker like Apache Kafka to our deployment. In this case, client one still sends a mutation to service instance number one. But service instance one, instead of sending the subscription event directly to client two, sends a message to the message broker. The message contains the information about the new customer, and both service instance one and two are listening for this message from the message broker. When they receive the message, they can send out the subscription data to their connected clients two and three. Both clients are happy now.
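The fan-out described above can be sketched in a few lines. This is a minimal illustration, not the talk's sample application: `InMemoryBroker` is a hypothetical stand-in for a Kafka topic, and the class names, the topic name `customers`, and the customer fields are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Hypothetical stand-in for a message broker: every listener gets every message."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, listener: Callable[[dict], None]) -> None:
        self._listeners[topic].append(listener)

    def publish(self, topic: str, message: dict) -> None:
        for listener in self._listeners[topic]:
            listener(message)


class ServiceInstance:
    """One instance of the GraphQL service with its own connected clients."""

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        # client id -> events delivered over that client's subscription
        self.clients: Dict[str, List[dict]] = {}
        broker.subscribe("customers", self._on_customer_message)

    def open_subscription(self, client_id: str) -> None:
        self.clients[client_id] = []

    def add_customer(self, customer: dict) -> None:
        # The mutation publishes to the broker instead of notifying
        # this instance's own clients directly.
        self.broker.publish("customers", customer)

    def _on_customer_message(self, customer: dict) -> None:
        # Every instance forwards the message to its own subscribed clients.
        for events in self.clients.values():
            events.append(customer)


broker = InMemoryBroker()
instance_one = ServiceInstance("instance-1", broker)
instance_two = ServiceInstance("instance-2", broker)
instance_one.open_subscription("client-2")
instance_two.open_subscription("client-3")

# Client one sends its mutation to instance one only.
instance_one.add_customer({"id": 1, "name": "Ada"})
```

With a real Kafka deployment, each service instance would typically need its own consumer group so that every instance receives every message; consumers that share a group split the partitions between them.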

In real life, things are a little bit more complex because we are writing data to a database. In this case, service instance one and two should write to the same database, and when service instance one writes something to the database, the message is still sent to Apache Kafka and both clients two and three get informed about the new customer. But in real life, things can go wrong. For example, after committing the new customer, service instance number one is not able to send a message to Kafka for whatever reason. In that case, none of the clients will receive an event. It can also happen that another application writes directly to the database, so that service instance number one does not know about these changes and thus cannot send a message through the message broker. And again, clients two and three are not informed about the change to our data.
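The first failure mode (commit succeeds, publish fails) can be reproduced with a small sketch. Everything here is hypothetical and simplified: a Python list plays the database, and `FlakyBroker` simulates a broker that cannot be reached.

```python
from typing import List


class BrokerUnavailable(Exception):
    """Raised by the simulated broker when it cannot accept messages."""


class FlakyBroker:
    """Hypothetical broker stand-in that can be switched to 'unreachable'."""

    def __init__(self, available: bool) -> None:
        self.available = available
        self.messages: List[dict] = []

    def publish(self, message: dict) -> None:
        if not self.available:
            raise BrokerUnavailable("could not reach the broker")
        self.messages.append(message)


def add_customer(database: List[dict], broker: FlakyBroker, customer: dict) -> bool:
    """Write to the database first, then try to publish. Returns True if published."""
    database.append(customer)  # step 1: the commit succeeds
    try:
        broker.publish(customer)  # step 2: may fail after the commit
        return True
    except BrokerUnavailable:
        # The row is already committed, but no subscriber will hear about it.
        return False


database: List[dict] = []
broker = FlakyBroker(available=False)
published = add_customer(database, broker, {"id": 1, "name": "Ada"})
```

After this run the customer is in the database but no message was published, which is exactly the gap between the two writes that the talk describes.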

To solve these kinds of problems, we can add a change data capture tool like Debezium to our tool stack. A change data capture tool reads everything that happens in your database, like inserts, updates, and deletes, and writes events for these actions to a message broker. In the case of Debezium, the change events are published to Apache Kafka. A Debezium change event might look like this. It has a source attribute where, for example, the table is set. It has an operation like update, delete, or insert that describes what has happened in the database, and it has the before and after data.
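The slide itself is not part of this transcript, so the following is only an illustrative sketch of such an event, written as a Python dictionary. The field names `before`, `after`, `source`, `op`, and `ts_ms` and the operation codes follow Debezium's change event envelope; the database name, table name, and row values are made up for the example. Depending on the converter configuration, a real message may additionally be wrapped in a `schema`/`payload` structure.

```python
# Debezium encodes the operation as a one-letter code.
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}

# Illustrative update event; all values are assumptions for this example.
update_event = {
    "before": {"id": 1, "name": "Ada"},
    "after": {"id": 1, "name": "Ada Lovelace"},
    "source": {"connector": "postgresql", "db": "shop", "table": "customers"},
    "op": "u",
    "ts_ms": 1670500000000,
}

# For a delete there is no "after" state, only the row as it was before.
delete_event = {
    "before": {"id": 1, "name": "Ada Lovelace"},
    "after": None,
    "source": {"connector": "postgresql", "db": "shop", "table": "customers"},
    "op": "d",
    "ts_ms": 1670500001000,
}


def describe(event: dict) -> str:
    """Summarize a change event as '<operation> on <table> (id=<id>)'."""
    operation = OPERATIONS[event["op"]]
    table = event["source"]["table"]
    # Use the new row state if there is one, else the old state (deletes).
    row = event["after"] if event["after"] is not None else event["before"]
    return f"{operation} on {table} (id={row['id']})"
```

Calling `describe(update_event)` yields a short summary of the update, and `describe(delete_event)` falls back to the `before` data because a deleted row has no `after` state.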

2. Architecture with Debezium and Kafka

Short description:

In this case, Debezium picks up changes directly from the database and sends CDC event messages to the connected message broker. The service instances receive these events, interpret them, and send subscription data to the clients. Thanks to Debezium and Apache Kafka, we can be sure that any change in the database will be published to Kafka and received by our service instance. We can also use this technology stack for queries by building a dedicated read model database for our GraphQL API.

In this case, the event carries the before and after data of an update operation. Our architecture with Debezium would look like this. Client one still sends the mutation directly to service instance one. Service instance number one writes the new customer to the database, or another application writes something to the database.

And in both cases, Debezium picks up the changes directly from your database and sends a CDC event message to the connected message broker. Both service instance number one and number two receive these CDC (change data capture) events, can interpret them, and send subscription data via their GraphQL API to client number two and client number three. And both clients are happy now.
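The "interpret and forward" step inside a service instance might look roughly like the sketch below. It is not taken from the talk's sample application: the subscription field name `onNewCustomer`, the table name `customers`, and the `SubscriptionRegistry` class are hypothetical, and plain callables stand in for the connections to subscribed clients.

```python
from typing import Callable, List, Optional


def to_subscription_event(change_event: dict) -> Optional[dict]:
    """Map a CDC event to a subscription payload, or None if it is not relevant."""
    if change_event["source"]["table"] != "customers":
        return None
    if change_event["op"] != "c":  # "c" is Debezium's code for an insert
        return None
    return {"onNewCustomer": change_event["after"]}


class SubscriptionRegistry:
    """Keeps track of the clients subscribed at this service instance."""

    def __init__(self) -> None:
        self._sinks: List[Callable[[dict], None]] = []

    def register(self, sink: Callable[[dict], None]) -> None:
        self._sinks.append(sink)

    def handle_change_event(self, change_event: dict) -> int:
        """Forward a relevant event to all clients; returns how many were notified."""
        payload = to_subscription_event(change_event)
        if payload is None:
            return 0
        for sink in self._sinks:
            sink(payload)
        return len(self._sinks)


received: List[dict] = []
registry = SubscriptionRegistry()
registry.register(received.append)

insert_event = {
    "before": None,
    "after": {"id": 2, "name": "Grace"},
    "source": {"table": "customers"},
    "op": "c",
}
rename_event = {
    "before": {"id": 2, "name": "Grace"},
    "after": {"id": 2, "name": "Grace Hopper"},
    "source": {"table": "customers"},
    "op": "u",
}

notified_on_insert = registry.handle_change_event(insert_event)
notified_on_update = registry.handle_change_event(rename_event)
```

Because the event comes from the database rather than from the mutation code, the same handler fires no matter which service instance or external application wrote the row.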

Thanks to the delivery guarantees that Debezium and Apache Kafka give us, we can be sure that any change in the database, any update, insert, or delete, will be published to Kafka and will be received by our service instances. So we can send a subscription event for any change in the database, for whatever reason the database has been changed.

If you want to try this out yourself, I built a small sample application with GraphQL Java and Spring for GraphQL. You can find the source code in the GitHub repository at the URL below.

By the way, we can use this technology stack not only for subscriptions, but I think also for queries. We could build a dedicated read model database for our GraphQL API. Imagine we have a number of microservices, each connected to its own database. Using Debezium and Apache Kafka, we can pick up all changes to all databases and build a dedicated, optimized database only for our GraphQL API. The GraphQL API can then read the data from this specific database and does not need to query all the microservices to get the data that is requested in a GraphQL query.
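As a rough sketch of the read model idea, the function below applies change events from several source tables to one in-memory store. It assumes, purely for illustration, that every table has a primary key column called `id` and that a nested dictionary plays the role of the dedicated read database; table names and rows are made up.

```python
from typing import Dict, List

# table name -> primary key -> current row
ReadModel = Dict[str, Dict[int, dict]]


def apply_change(read_model: ReadModel, event: dict) -> None:
    """Apply one CDC event to the read model (upsert or delete by `id`)."""
    rows = read_model.setdefault(event["source"]["table"], {})
    if event["op"] == "d":
        # Deletes only carry the old row state.
        rows.pop(event["before"]["id"], None)
    else:
        # Inserts, updates, and snapshot reads carry the new row state.
        row = event["after"]
        rows[row["id"]] = row


# Events as they might arrive from two different microservice databases.
events: List[dict] = [
    {"op": "c", "source": {"table": "customers"},
     "before": None, "after": {"id": 1, "name": "Ada"}},
    {"op": "c", "source": {"table": "orders"},
     "before": None, "after": {"id": 10, "customer_id": 1}},
    {"op": "u", "source": {"table": "customers"},
     "before": {"id": 1, "name": "Ada"}, "after": {"id": 1, "name": "Ada Lovelace"}},
    {"op": "d", "source": {"table": "orders"},
     "before": {"id": 10, "customer_id": 1}, "after": None},
]

read_model: ReadModel = {}
for event in events:
    apply_change(read_model, event)
```

A GraphQL query could then be answered from `read_model` alone, without calling the services that own the original tables.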

Table Of Contents

1. GraphQL Subscriptions with Kafka and Debezium
2. Architecture with Debezium and Kafka

Nils Hartmann

7 min

08 Dec, 2022
