How to Edge Cache GraphQL APIs


For years, not being able to cache GraphQL was considered one of its main downsides compared to RESTful APIs. Not anymore. GraphCDN makes it possible to cache almost any GraphQL API at the edge, and not only that but our cache is even smarter than any RESTful cache could ever be. Let's dive deep into the inner workings of GraphCDN to figure out how exactly we make this happen.

This talk was presented at GraphQL Galaxy 2021.

FAQ

GraphCDN is a content delivery network specifically designed for GraphQL APIs. It helps in caching GraphQL queries at the edge, aiming to provide faster response times and reduce server load by caching data closer to users globally.

Max Stoiber chose RethinkDB because it advertised itself as a real-time database, which seemed suitable for the real-time features needed in Spectrum's community forum model.

The development of GraphCDN was motivated by the scaling challenges faced at Spectrum due to the limitations of RethinkDB in handling real-time updates efficiently, combined with the lack of prebuilt solutions for caching GraphQL at the edge.

GraphQL is highly effective for caching due to its introspectability and strict schema, which allow for precise data management and efficient cache invalidation, ensuring that only relevant data is cached and stale data is quickly purged.

Fastly's invalidation logic allows GraphCDN to purge stale data globally within 150 milliseconds, ensuring that all users access the most current data almost instantaneously, which significantly enhances the caching efficiency.

Caching at the edge involves storing data in multiple global locations to serve users from the nearest point, reducing load times and server strain. In contrast, browser caching stores data locally on the user's device for instant access but does not benefit other users.

Max Stoiber
23 min
09 Dec, 2021

Video Summary and Transcription
Max Stoiber, co-founder of GraphCDN, discusses the challenges faced with RethinkDB and the need for caching in a read-heavy API. He explores how GraphQL clients handle caching and the potential of running a GraphQL client at the edge for faster response times. Authorization and cache key management at the edge are also discussed, along with the benefits of edge-caching and the importance of caching in GraphQL APIs. The audience response reveals that a significant percentage are already caching their APIs, while different use cases for caching and the concept of edge computing are explained.

1. Introduction to Edge-Caching GraphQL APIs

Short description:

I am Max Stoiber, co-founder of GraphCDN, a GraphQL CDN. I have worked on open source projects like Styled Components and React Boilerplate. In 2018, I was the CTO of Spectrum, a modern community forum combining real-time chat with public posts. We experienced significant user growth but faced issues with the database choice.

Hello, everyone. I am super excited to be here today and to talk to you about edge-caching GraphQL APIs. My name is Max Stoiber. I am in beautiful Vienna, Austria. Unfortunately, I can't be there in person this time, but I am really excited to be here. And if you want to follow me practically anywhere on the internet, I am @mxstbr basically everywhere.

I am the co-founder of GraphCDN, which is the GraphQL CDN. If you are in the React community, or in the JavaScript community more generally, you might have used some of the open source projects that I helped build, like Styled Components or React Boilerplate or micro-analytics or a whole bunch of others. I am really active in that scene. And so if you're there, you might have used some of those projects as well.

The story of GraphCDN and how we got there started in 2018. At the time I was the CTO of another startup called Spectrum. And at Spectrum we were building a modern take on the classic community forum. So essentially we were trying to combine the best of what phpBB gave us 20 years ago with the best of what Discord and Slack give us nowadays. That was essentially the idea. It was a public forum, but all of the comments on any post were real-time chat. So we tried to take these two worlds that are currently very separate, where communities in Slack and Discord write lots of messages but none of them are findable, and make those messages public and a little bit more organized so that you could find them afterwards on Google or elsewhere. We tried to combine those two worlds together.

Now that actually worked out surprisingly well, which led to quite a bit of user growth. As you can imagine, with all of this user-generated content, lots of people found us on Google and elsewhere and started visiting Spectrum quite regularly. Now, unfortunately, I had chosen a database that wasn't very well supported. I chose RethinkDB, which nowadays doesn't even exist anymore. The company behind it shut down after a while. And I chose that database originally because they advertised themselves as the real-time database. And their key feature, or the thing they praised externally, was that you could put this changes key at the end of any database query and it would stream real-time updates for that query to you. And so you could listen to practically any data change, which felt like a fantastic fit for what we were trying to do. Because obviously, almost anything in Spectrum was real time, right? The posts popped in in real time, the chat was real time, and of course we had direct messages, which had to be real time. So this felt like a great fit for what we were trying to do. Lesson learned, in hindsight: rely on the databases that everybody uses.
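To make the change-listener idea concrete, here is a minimal in-memory sketch in TypeScript. It is not the RethinkDB driver API: `ChangeFeedTable`, `changes`, and `set` are hypothetical names chosen for illustration, and the sketch only shows the general pattern of subscribing to writes and receiving each change as it happens.

```typescript
// In-memory stand-in for a change feed. This is NOT the RethinkDB driver API;
// ChangeFeedTable and its method names are hypothetical.
type Change<T> = { oldValue: T | undefined; newValue: T };
type Listener<T> = (change: Change<T>) => void;

class ChangeFeedTable<T> {
  private rows = new Map<string, T>();
  private listeners = new Set<Listener<T>>();

  // Register a listener; returns a function that stops listening.
  changes(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Write a row and push the change to every active listener.
  set(id: string, value: T): void {
    const oldValue = this.rows.get(id);
    this.rows.set(id, value);
    this.listeners.forEach((listener) => listener({ oldValue, newValue: value }));
  }

  // Number of listeners currently subscribed.
  get listenerCount(): number {
    return this.listeners.size;
  }
}

const chatMessages = new ChangeFeedTable<string>();
const receivedMessages: string[] = [];

// Subscribe, write once, unsubscribe, then write again.
const stopListening = chatMessages.changes((change) => {
  receivedMessages.push(change.newValue);
});
chatMessages.set("m1", "hello");
stopListening();
chatMessages.set("m2", "not delivered");
```

Only the first write reaches the listener, because the second one happens after it unsubscribed. Every open listener is state the database has to keep track of, which is where the scaling trouble described next comes from.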

2. Challenges with RethinkDB and the Need for Caching

Short description:

There's a reason everybody uses Postgres, MySQL and now Mongo. RethinkDB's real-time nature didn't scale at all: we had hundreds of thousands of users every month, but it couldn't even handle a hundred concurrent change listeners. Switching to a better-supported database was a lot of work, so we had to work around the limitation. Our API was really read-heavy, which made it an ideal use case for caching. We had chosen GraphQL because we had a lot of relational data, but there weren't any prebuilt solutions for caching GraphQL at the edge. We wanted to run code in many data centers around the world, route users to the nearest one and cache their data close to them, both for fast response times and to reduce the load on our servers. So the question I wanted to answer was: can't I just run a GraphQL client at the edge? To answer it, I want to dive into how GraphQL clients cache.

There's a reason everybody uses Postgres and MySQL and now Mongo. There's a reason those databases are as prevalent as they are and it's because they work. I didn't know, I'm a lot wiser now, I wasn't that wise back then. And so it very quickly turned out that RethinkDB, the real-time nature of it didn't scale at all. We had hundreds of thousands of users every single month, but RethinkDB couldn't even handle a hundred concurrent change listeners.

Now, as you can imagine, every person that visits the website starts many different change listeners, right? We're listening to changes of the specific post that they're looking at. We're listening to changes of the community that the post is posted in. We're listening to new notifications. We had a bunch of listeners per user and essentially our database servers were on fire, literally on fire. Well, thankfully not literally, but they were crashing quite frequently. I Googled servers on fire and found this amazing stock photo of servers on fire, which if your data center looks like this, you have some really serious problems. Ours weren't quite as bad, but they were still pretty bad. So we had this database that didn't scale and essentially we had to work around that limitation. We wanted to switch to a more well-supported database. However, that's a lot of work. Rewriting the hundreds of database queries we'd written and optimized up to that point, migrating all of that data without any downtime, that was just a whole project and we wanted to get there eventually, but we needed a solution for us crashing literally every day, right at this moment.
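A rough back-of-the-envelope calculation shows why this breaks down so quickly. The numbers below are assumptions for illustration only: three listeners per visitor (the post, the community, and notifications named above, although the talk says there were "a bunch") and a ceiling of about a hundred concurrent listeners.

```typescript
// Assumed numbers for illustration; the talk gives no exact figures.
const listenersPerVisitor = 3; // post, community, notifications
const maxConcurrentListeners = 100; // rough ceiling mentioned in the talk

// How many simultaneous visitors fit under the listener ceiling?
function maxConcurrentVisitors(limit: number, perVisitor: number): number {
  return Math.floor(limit / perVisitor);
}

const visitorBudget = maxConcurrentVisitors(maxConcurrentListeners, listenersPerVisitor);
// visitorBudget is 33
```

Under these assumptions, only about 33 people could be on the site at the same time before the database hit its limit, which is far below what hundreds of thousands of monthly users produce.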

As I was thinking about this, of course, I realized that we had an ideal use case for caching because our API was really read-heavy. Of course, it's public data, lots of people read it, but not as many people write to it. We'd originally chosen GraphQL for our API because we had a lot of relational data. We were fetching a community, all the posts within that community, the authors of every post, the number of comments, a bunch of relational data, and GraphQL was a fantastic fit for that use case. It worked out extremely well for us and we really enjoyed our experience of building our API with GraphQL. The one big downside that we ran into was that there weren't any prebuilt solutions for caching GraphQL at the edge, which is what we wanted to do.
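The effect of caching on a read-heavy workload can be sketched with a small read-through cache. This is a simplified illustration, not how Spectrum or GraphCDN is implemented: `ReadThroughCache` is a hypothetical class, and the ratio of 100 reads per write is an assumed figure.

```typescript
// A minimal read-through cache: reads are served from memory when possible,
// and a write invalidates the cached entry. All names are hypothetical.
class ReadThroughCache {
  private entries = new Map<string, string>();
  private loadFromOrigin: (key: string) => string;
  originReads = 0;

  constructor(loadFromOrigin: (key: string) => string) {
    this.loadFromOrigin = loadFromOrigin;
  }

  // Return the cached value, or load it from the origin on a miss.
  read(key: string): string {
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;
    this.originReads += 1;
    const fresh = this.loadFromOrigin(key);
    this.entries.set(key, fresh);
    return fresh;
  }

  // Drop the entry so the next read goes back to the origin.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Simulate a read-heavy workload: an assumed 100 reads for every write.
let postVersion = 0;
const postCache = new ReadThroughCache((key) => `${key}@v${postVersion}`);
const totalReads = 1000;
const readsPerWrite = 100;
let lastSeen = "";

for (let i = 1; i <= totalReads; i++) {
  lastSeen = postCache.read("post:1");
  if (i % readsPerWrite === 0) {
    postVersion += 1; // a write changes the data...
    postCache.invalidate("post:1"); // ...and purges the stale entry
  }
}

const hitRatio = (totalReads - postCache.originReads) / totalReads;
```

With these assumed numbers, the origin serves 10 of the 1000 reads and the cache serves the rest, and readers never see data older than the latest write because every write purges the entry.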

Now we wanted to essentially run code in many, many data centers all around the world, and we wanted to route our users to the nearest data center and cache their data very close to them for a very fast response time, but also so that we could reduce the load on our servers. Now, if you've ever used GraphQL, then you know that that is essentially what GraphQL clients do in the browser. If you've heard of Apollo Client, Relay, urql, all of these GraphQL clients, what they are is essentially a fetching mechanism for GraphQL queries that very intelligently caches them in the browser for a better user experience. So in my head, basically, the question I wanted to answer was, can't I just run a GraphQL client at the edge? GraphQL clients do this in the browser. Why can't I just take this GraphQL client that's running in my local browser, put it on a server somewhere and have that same caching logic, but at the edge? To answer the question, I want to dive a little bit into how GraphQL clients cache. Let's look at this example of a GraphQL query, which fetches a blog post by its slug and selects its ID, title, and author.
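A query of the kind described could look like the sketch below, together with one simple way to cache its response at an edge location. The field names, `requestCacheKey`, and `EdgeQueryCache` are assumptions for illustration. Keying on the query text plus its variables is a deliberately naive strategy and not necessarily how GraphQL clients or GraphCDN cache.

```typescript
// Hypothetical query matching the description: a blog post looked up by slug,
// selecting its id, title, and author. Field names are assumptions.
const BLOG_POST_QUERY = `
  query BlogPost($slug: String!) {
    post(slug: $slug) {
      id
      title
      author {
        id
        name
      }
    }
  }
`;

type Variables = { [name: string]: string | number | boolean | null };

// Build a cache key from the query text and its variables. Whitespace is
// collapsed and variable names are sorted so equivalent requests share a key.
// Caveat: collapsing whitespace would also alter string literals inside the
// query; a real implementation would parse the query instead.
function requestCacheKey(query: string, variables: Variables): string {
  const normalizedQuery = query.replace(/\s+/g, " ").trim();
  const sortedVariables = Object.keys(variables)
    .sort()
    .map((name) => `${name}=${JSON.stringify(variables[name])}`)
    .join("&");
  return `${normalizedQuery}|${sortedVariables}`;
}

// A tiny stand-in for the cache at one edge location.
class EdgeQueryCache {
  private responses = new Map<string, string>();
  upstreamCalls = 0;

  // Serve from the cache if possible, otherwise call the upstream API once.
  resolve(query: string, variables: Variables, fetchUpstream: () => string): string {
    const key = requestCacheKey(query, variables);
    const cached = this.responses.get(key);
    if (cached !== undefined) return cached;
    this.upstreamCalls += 1;
    const body = fetchUpstream();
    this.responses.set(key, body);
    return body;
  }
}

// Fake upstream API returning a fixed response body.
const fakeUpstream = () =>
  JSON.stringify({
    data: { post: { id: "1", title: "Hello", author: { id: "7", name: "Ada" } } },
  });

const viennaEdge = new EdgeQueryCache();
const firstBody = viennaEdge.resolve(BLOG_POST_QUERY, { slug: "hello-world" }, fakeUpstream);
const secondBody = viennaEdge.resolve(BLOG_POST_QUERY, { slug: "hello-world" }, fakeUpstream);
viennaEdge.resolve(BLOG_POST_QUERY, { slug: "another-post" }, fakeUpstream);
```

The repeated request for the same slug is answered from the cache, while the different slug goes upstream, so the upstream API is called twice for three requests. What this naive key cannot do is tell which cached responses contain a given post or author once that data changes.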
