Kurt Kemple, Marc-André Giroux, Mandi Wise, Tejas Shikhare
React Summit 2022
The panel discussion focuses on the durability of GraphQL APIs, particularly how to maintain their scalability and reliability under various conditions, and handling GraphQL at scale.
Testing GraphQL APIs is challenging due to the infinite possibilities of queries and the contextual information used by resolvers. The recommendation is to focus on integration testing and testing middleware layers like rate limiting and error handling.
At GitHub, GraphQL API scaling is managed by using a custom implementation of data loaders to avoid N+1 queries and focusing on server-side application level caching.
Netflix handles spikes by scaling GraphQL servers horizontally, implementing service side throttling, client retries, and using an L7 proxy application gateway to reject excessive requests.
Effective caching strategies for GraphQL APIs include using persisted queries to reduce request size and enable HTTP caching, employing data loaders to batch and cache requests, and implementing application-level caching at the resolver level.
Panelists suggest handling errors in GraphQL by translating GraphQL-specific errors to standard HTTP status codes, using the 'errors as data' approach to enrich client applications with detailed error information, and ensuring robust observability and error tracking systems.
Panelists recommend resources like Marc-André's book 'Production Ready GraphQL', and general materials on distributed systems, site reliability engineering, and domain-driven design to understand and build reliable APIs.
Today, we have an exciting panel discussing the durability of GraphQL APIs and how to ensure they can scale without issues. Marc-André from GitHub, Mandi from Apollo, and Tejas from Netflix will share their experiences. Let's start with the importance of testing in maintaining a reliable GraphQL API.
Thank you for joining us today. We've got a really exciting panel. I'm very excited to be joined (there's a lot of excitement, in case you can't tell) by these wonderful folks who have been using GraphQL for quite a long time in a lot of different environments and have really pushed GraphQL to the edges of what it's capable of. When we talk about dealing with GraphQL at scale, one of the things that we don't talk about too often is the durability of GraphQL APIs. And what does durability really mean? Well, it's kind of the SRE-type focus on graphs: how do we keep graphs up, how do we make sure that they're able to scale, and how do we avoid running into issues? So that's going to be a lot of what today's panel is about.
I'm going to turn over the floor for a quick sec in case anyone has follow-ups on the introductions, if there's anything you want to add about what you're currently doing and how you're working in the GraphQL space today. I'm just going to go in the order that I see you. So Marc, you're up first.
Yes, sure. Thank you for having me on. As the introduction said, I work at GitHub, on the API team. We've got a setup that's not the most common for GraphQL, where we use it as a public API for third parties. So yeah, I'm just excited to be chatting about this with that context in mind.
Awesome, cool, and thank you so much again for joining us. And then Mandi, I've got you in the next window over.
Hello. I'm a Solutions Architect at Apollo, which means I work with a lot of our enterprise customers and see the kinds of interesting challenges they bump up against using GraphQL at scale every day. So I'm very excited to be a part of this panel too.
Awesome, cool. Thank you so much. And Tejas, that brings us to you.
Yeah, sure. I'm a Software Engineer at Netflix, working on the API team, and we are currently building GraphQL for our studio ecosystem.
Awesome, very cool. So as you can see, we've got a wide range of focus here from some of the top GraphQL-consuming companies in the industry, and Apollo's own Mandi. All right, thank you so much for joining us. Without further ado, let's get into some of these questions. I'm going to start with the first one: it's actually pretty hard to know that a service is reliable without testing. So what types of testing do you find to be the most important when it comes to keeping a GraphQL API up and running smoothly? And I guess, yeah, I'll lead off with Marc.
Testing a GraphQL API can be challenging due to the infinite possibilities of queries and the use of contextual information in resolvers. It's important to extract logic outside of the GraphQL layer for better testing. Integration testing, observability, and unit testing are beneficial in maintaining smooth-running APIs. At Netflix, unit testing, integration testing, and end-to-end testing with and without mocking are used. Error handling and additional production ideas are also prioritized.
We'll just go through again, and we'll switch up the order later.
Sure. Yeah, so this is an excellent question. I think it's good to acknowledge right off that testing a GraphQL API is kind of difficult by the nature of GraphQL, right? There's an almost infinite number of ways people could query your graph, which is a feature, but also something that makes it hard to test every possibility. There's also the fact that the resolvers backing your graph often use contextual information from that kind of infamous context argument. So if you only focus on testing the graph itself, it's very hard to be confident in the logic behind the graph.
So my main thing, and I think I've said this a lot before, is that extracting your logic outside of the GraphQL layer makes it so much easier to make sure your domain logic is well tested. And then you can focus on testing the GraphQL parts separately. So we like to focus more on integration testing for GraphQL: test as many different queries as we can that represent client use cases, and test all our middleware layers, such as rate limiting (we'll talk about this more, I'm sure) and everything to do with error handling on the GraphQL side. So that would be my advice.
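As an illustration of the separation described above, here is a minimal sketch in Python. It is not GitHub's code: `Viewer`, `can_view_repository`, and `resolve_repository` are hypothetical names, and the context is assumed to be a plain dictionary. The point is only that the domain rule can be unit tested without building a GraphQL context at all.

```python
from dataclasses import dataclass


@dataclass
class Viewer:
    """Hypothetical stand-in for the user carried in the GraphQL context."""
    id: int
    is_admin: bool = False


def can_view_repository(viewer: Viewer, owner_id: int, is_private: bool) -> bool:
    """Domain rule with no knowledge of GraphQL or its context object."""
    return (not is_private) or viewer.is_admin or viewer.id == owner_id


def resolve_repository(context: dict, repo: dict):
    """Thin resolver: unpack the context, delegate to the domain rule."""
    viewer = context["viewer"]
    if can_view_repository(viewer, repo["owner_id"], repo["is_private"]):
        return repo
    return None


# The domain rule is tested directly, with plain values.
private_repo = {"owner_id": 1, "is_private": True}
owner_result = resolve_repository({"viewer": Viewer(id=1)}, private_repo)
stranger_result = resolve_repository({"viewer": Viewer(id=2)}, private_repo)
```

With this shape, integration tests only need to cover the thin resolver and the middleware around it, while the rule itself gets ordinary unit tests.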
Yeah, awesome. Thank you. Mandi, do you have a follow-up to that? What areas of testing do you find most beneficial in helping APIs stay running smoothly?
So in terms of testing, there's another way you can think about it with GraphQL APIs. Because they're evolutionary in nature, you want to make sure that as you release new features on your graph, you do so in a way that doesn't cause breaking changes for existing clients, so your observability tooling is a really important part of that story. In practice, that means you'd probably be collecting operation traces and making sure that your clients identify themselves when they use your graph. Then, when you push changes, you can check them against those operation traces and make sure you're not going to break queries currently being made by existing consumers of your graph.
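The check against operation traces can be sketched roughly as follows. This is a simplified illustration, not how Apollo's tooling works: the trace format (a client name plus a list of `Type.field` strings) and the function `find_breaking_removals` are assumptions made for the example.

```python
def find_breaking_removals(old_fields, new_fields, recorded_operations):
    """Map each removed field to the set of clients still querying it."""
    removed = set(old_fields) - set(new_fields)
    report = {}
    for operation in recorded_operations:
        for field in operation["fields"]:
            if field in removed:
                report.setdefault(field, set()).add(operation["client"])
    return report


# Hypothetical schema fields before and after a proposed change.
old_schema = {"User.name", "User.email", "User.login"}
new_schema = {"User.name", "User.login"}

# Hypothetical traces collected from clients that identify themselves.
traces = [
    {"client": "ios-app", "fields": ["User.name", "User.email"]},
    {"client": "web", "fields": ["User.name"]},
]

report = find_breaking_removals(old_schema, new_schema, traces)
```

Here the report would flag `User.email` as still used by `ios-app`, so the removal would break an existing consumer.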
I love it. Yeah, that's a whole other avenue to think of: aside from the actual nuts-and-bolts testing, there's observability, seeing what's happening in an existing system and using that as a baseline for testing as well. It's really cool. Tejas, do you have a follow-up? How do you all handle testing your APIs at Netflix?
Sure. I'm going to reiterate some of the same things Marc and Mandi mentioned. I really like unit testing for the logic inside the resolvers and data loaders. It's nice to separate that out. Integration testing is really great for testing context passing between parent and child data fetchers, data loaders, and so on. And then smoke tests: we use them a lot, actually, and we find them extremely useful for end-to-end testing of GraphQL queries. We do these both with and without mocking, and we find that super useful. And then error handling is another one. You want to trigger error scenarios using mocks, because you're not always going to be able to test that behavior well end-to-end. And then for production, we have two other ideas I can share that we like to use.
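Triggering an error scenario with a mock might look like the sketch below. It uses Python's standard `unittest.mock`; the resolver `resolve_title`, the `catalog_client` dependency, and the `UPSTREAM_TIMEOUT` code are hypothetical and are not taken from Netflix's stack.

```python
from unittest.mock import Mock


def resolve_title(movie_id, catalog_client):
    """Hypothetical resolver that turns a backend timeout into an error entry."""
    try:
        return {"data": catalog_client.fetch_title(movie_id), "errors": []}
    except TimeoutError as exc:
        error = {"message": str(exc), "code": "UPSTREAM_TIMEOUT"}
        return {"data": None, "errors": [error]}


# Force the failure path, which is hard to reproduce reliably end-to-end.
failing_client = Mock()
failing_client.fetch_title.side_effect = TimeoutError("catalog timed out")

result = resolve_title(42, failing_client)
```

The mock makes the failure deterministic, so the test can assert on the exact error shape the client would receive.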