1. Is GraphQL Still Relevant in 2023?
Today, we're going to talk about whether GraphQL is still relevant in 2023. We'll cover reasons to use GraphQL and discuss the growth of the ecosystem. The biggest selling point of GraphQL used to be reduced boilerplate, but that no longer sets it apart. Another selling point is its language-agnostic contract.
Hey there. Today, we're going to talk about the question of whether GraphQL is still relevant in 2023.
First, a few words about us. That's me. My name is Lenz Weber-Tronic. I'm a Senior Staff Software Engineer at Apollo GraphQL. That means I'm a maintainer of Apollo Client. I'm the author of RTK Query, I'm a Redux Toolkit co-maintainer, and I'm one of those people with ADHD and a million weird hobbies. I'm just saying that because I think people with ADHD in our industry need representation, and that's what I'm here for. You can find me on GitHub as phryneas, on Twitter as phry, or on Mastodon as phryneas at chaos.social.
Today I'm joined by my colleague, Gerald, who will be introducing himself.
Thanks, Lenz. My name is Gerald Miller, and I'm a Principal Software Engineer at Apollo. I work alongside Lenz as a maintainer of the Apollo Client library. You can find me both on Twitter and GitHub under my handle geraldmiller. And with that, let's get into the talk.
First, let's talk about why we wanted to do this talk. A lot of times, people get into GraphQL for reasons that might not be unique to GraphQL nowadays. There are very good reasons to still use GraphQL, and we're also going to cover those. But first, we're going to cover where the ecosystem has grown up, and where maybe you don't need to go for GraphQL to get those benefits. Because honestly, while I work at Apollo, and I'm happy for everyone out there who is using GraphQL, I also don't want people to use GraphQL for the wrong reasons and get frustrated with it.
So the biggest selling point of GraphQL back in the day was boilerplate, because what you're seeing right now is the amount of code that was necessary to fetch data from an API in Redux. And if you compare that to the amount of code that was necessary in Apollo Client to do the same thing, there's a striking difference. But this selling point isn't really true anymore. Because if you compare this to how it's done in Redux today, where this is all the code and you get a hook for it, or how it would be done in TanStack Query, which is also known as React Query, there's not so much of a difference. So this wouldn't be one of those reasons where I say, yeah, you have to use GraphQL just for this. Another very big selling point of GraphQL was that you had this language-agnostic contract. So you could use it in the front end and in the back end, and everything was type safe, and you could use it with TypeScript or JavaScript or Go or Java.
2. Exploring GraphQL Benefits
With GraphQL, you can explore the schema, add fields, and see the results. Tools like Swagger UI and tRPC make exploration and type safety easier. Code generation in GraphQL provides types for queries. OpenAPI specifications can be used with REST APIs. Alternatives to GraphQL should be considered, but it still has unique benefits.
And yeah, that's true, but it never was really unique. There was stuff like SOAP, which the older among us might remember, or the Swagger specification, which over time has been renamed to the OpenAPI specification and is something that nowadays is very common, and most services can just auto-generate it for you.
With such a schema or specification in place, we can look at exploring it. This was magic for me when I started using GraphQL: just clicking through the whole schema, having documentation everywhere, just being able to add fields and send off requests and seeing all of that come back. For example, there's Swagger UI, which lets you do the same. It might not let you combine different results, but you can explore everything. You can read the documentation that's embedded. You already see what will be coming back. This is an amazing tool.
Also, if you're just using TypeScript in your backend and your frontend, and you happen to use a monorepo, you could use something like tRPC to make RPC calls between your frontend and your backend that are perfectly type safe and super easy to explore because you have autocomplete in your browser. This stuff will become even more common once we start adopting React Server Components. Another big benefit would be code generation. In GraphQL, you just write your query out, and from that, the GraphQL code generator in the past would have generated a hook, or nowadays it just annotates your query with types, and everywhere you use that query, you will have all the types at your fingertips. But of course, that's possible with a REST API as well. If you have that OpenAPI specification, you can use the RTK Query OpenAPI codegen to get the RTK Query endpoints generated for you, or you use Orval, which will generate all the code you need, for example, for React Query or also for SWR. So going from here, if these are all the benefits that you would personally see from using something like GraphQL, you might want to consider alternatives. But of course, GraphQL still has very unique benefits, and that's what we're going to talk about in the next part.
3. Exploring GraphQL Benefits Continued
Let's talk about the obvious benefits of GraphQL, such as fetching only the data you need and fetching data from a single endpoint. We'll compare data fetching in REST and GraphQL using a music album page as an example. In REST, each resource has its own endpoint, and embedding too much in one endpoint can result in overfetching and wasted bytes. With GraphQL, you can request specific data and avoid unnecessary requests. Fetching related data in REST, however, can be challenging, as you need to wait for all the required IDs. It can lead to multiple fetches and increased complexity.
Thanks, Lenz. Now let's shift gears to talk about GraphQL and some of its benefits. I'm going to call these the obvious benefits because while they were once groundbreaking, these have become pretty standard talking points of GraphQL. I wanted to focus on them though, because it's useful to remind ourselves of what GraphQL brings us.
First up, let's talk about data fetching, specifically fetching only the data you need. While this point has been made over and over, and is likely one of the first things you've heard about GraphQL, it's helpful to remind ourselves that this is actually a pretty big deal. As front end developers, writing queries to request only the data you need, and knowing exactly the payload you're going to get, is extremely useful. And let's not forget that this can also be hugely beneficial for apps that are running on a mobile connection. You're only transferring the data from your server that you need. So if you need a mobile-optimized payload, you can write a query that is optimized as such.
Another benefit to using GraphQL is that you can fetch data from a single endpoint, which makes your request predictable. We don't have to stitch together data from multiple endpoints, nor compromise our API design by having to create front end specific endpoints. To remind ourselves how painful this can be, let's pretend we were building a music album page, such as the one you'd find on Spotify, Apple Music, or some other related music service. We'll look at how you might query this data both from REST and from GraphQL. Let's start by looking at data we want to fetch. We'll fetch some album details, including the name, the album art and the release date, the artists that perform on the album, including their names, the tracks that are part of the album, including some information about them and the total number of tracks in case we want to paginate that information, and we'll get the artists that perform on each track, in case the track features an artist that differs from the album.
Let's first look at this as if we were going to request this from a REST endpoint. As a backend developer, I might have built this REST API to be as pure as possible to ensure it can scale to any front end, not just the one our frontend team is building. And by pure, I just mean that each resource has its own endpoint and does not contain embedded information for a related resource. It's certainly a design decision and perhaps a compromise an API development team would need to make. Too much information in a single endpoint could result in overfetching and wasted bytes over the network. So first up, let's look at requesting the album information, which we might find at a URL like this one. It's a pretty typical REST endpoint. We'll get the artists at an endpoint like this, and the tracks at an embedded endpoint like this one. But what about the artists for each track? Here's where it gets a little bit tricky, because we only want to fetch artists from tracks that are on the album, not just any track. So how do we structure our endpoint in a way that makes sense? To stick to our concept of purity, we could provide an endpoint like this. With this endpoint, we've introduced a couple of problems. I need the IDs here, and I won't know them until we've loaded all of the tracks. So we have to wait until we've loaded the tracks before I can execute this query. On top of that, this only takes a single ID, which means I need to fetch against this endpoint once for each track, which means I need to fetch again and again and again, and you get the picture.
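The endpoints shown on the slides are not part of the transcript, so here is a minimal sketch of the waterfall just described. Every path (`/albums/:id`, `/albums/:id/tracks`, `/tracks/:id/artists`) and all data are assumptions, and the "server" is an in-memory lookup so the request count can be observed without a network.

```typescript
// Hypothetical "pure" REST API: one endpoint per resource, nothing embedded.
interface Artist { id: string; name: string }
interface Track { id: string; name: string }
interface Album { id: string; name: string; albumArt: string; releaseDate: string }

// In-memory stand-in for the server. All paths and data are made up.
const fakeServer: Record<string, unknown> = {
  "/albums/1": { id: "1", name: "Example Album", albumArt: "/art/1.png", releaseDate: "2023-01-01" },
  "/albums/1/artists": [{ id: "a1", name: "Main Artist" }],
  "/albums/1/tracks": [
    { id: "t1", name: "Track One" },
    { id: "t2", name: "Track Two" },
    { id: "t3", name: "Track Three" },
  ],
  "/tracks/t1/artists": [{ id: "a1", name: "Main Artist" }],
  "/tracks/t2/artists": [{ id: "a1", name: "Main Artist" }, { id: "a2", name: "Guest Artist" }],
  "/tracks/t3/artists": [{ id: "a1", name: "Main Artist" }],
};

// Every call is recorded so we can count round trips afterwards.
const requestLog: string[] = [];

// Stand-in for an HTTP GET against the fake server.
function get<T>(path: string): T {
  requestLog.push(path);
  const body = fakeServer[path];
  if (body === undefined) throw new Error(`404: ${path}`);
  return body as T;
}

function loadAlbumPage(albumId: string) {
  // These three requests are independent of each other.
  const album = get<Album>(`/albums/${albumId}`);
  const artists = get<Artist[]>(`/albums/${albumId}/artists`);
  const tracks = get<Track[]>(`/albums/${albumId}/tracks`);

  // The track IDs are only known now, so these requests have to wait,
  // and the endpoint takes a single ID: one extra request per track.
  const tracksWithArtists = tracks.map((track) => ({
    ...track,
    artists: get<Artist[]>(`/tracks/${track.id}/artists`),
  }));

  return { album, artists, tracks: tracksWithArtists };
}

const albumPage = loadAlbumPage("1");
console.log(`${requestLog.length} requests for ${albumPage.tracks.length} tracks`);
```

For an album with n tracks this sketch issues 3 + n requests, and the last n cannot start until the track list has arrived.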
4. GraphQL and Fragment Co-location
Another idea is to embed artist information in the track itself and provide a batching endpoint for specific track IDs. Creating a nested relationship endpoint raises design considerations. GraphQL allows for natural expression of relationships through queries. Fragment co-location enables reusable components with specific data needs. Parent components can safely render child components without knowing their data requirements.
Another idea is that we could break our contract of purity, and we could embed the artist information in the track itself and provide a batching endpoint where I could request specific track IDs. We haven't fully solved for the fact that we have to wait for the tracks to load before I can execute against this endpoint, but at least now we're doing a single request instead of n number of requests.
We could also consider creating an endpoint that has some kind of nested relationship. But my relationship really isn't between the album and the artist. It's between the track and the artist. So you could argue that the album's ID might be a valid scoping mechanism here, but it does open us up to some other design considerations, such as how deeply nested our endpoints should be.
The point here isn't to argue about which is the correct way to structure this endpoint, nor is it to provide an exhaustive set of solutions, but mainly it's to provide some of the trade-offs and considerations you have to make when designing an API like this.
Let's look at this through the lens of GraphQL. So first I'll execute the query to get the album with its name. We'll get the album art and the release date, and we'll get the related artists and each of their names. We'll get the tracks. We'll get the total. We'll get the track information. And now here's where it gets easy. We finally just load the artists for each track, and notice how the relationship is expressed naturally through the query, because it's just another field on the type.
Now let's take a look at fragment co-location. As React developers, we're accustomed to writing components everywhere. Many of these components require data from the server. Fragment co-location allows us to express a component's data needs without the responsibility of needing to fetch that data itself. This makes it uniquely reusable, since I can plug this component into any other parent component that issues queries, without that parent component needing to know specific details about the data requirements of its children.
So let's take a look at an album tile component. Let's say I'm building this reusable component that accepts the album as props. I can express its data needs via a GraphQL fragment, which describes exactly what this component needs to render. Here, I'm using Apollo's recommended approach to fragment declaration, but the concept applies to any GraphQL client that provides a way to embed fragments as part of queries. To use this fragment in my query, I simply add that fragment definition to my query. We'll take a look at this through the lens of the parent component. Notice here how my parent component can now safely render the album tile, knowing it's fulfilled its data requirements, but without needing to know the specific details about them. If, for example, I wanted to add some additional information to my album tile, such as the label or the copyright, I can update my fragment without having to go find the query that fetches this information.
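As a rough sketch of the query being built up here: the schema is not shown in the transcript, so every field name below (`albumArt`, `tracks.total`, `tracks.items`, `durationMs`) is an assumption. The query is kept as a plain string and wrapped in a standard GraphQL-over-HTTP request body, so the example does not depend on any client library.

```typescript
// Hypothetical schema: every field name in this query is an assumption.
const ALBUM_PAGE_QUERY = /* GraphQL */ `
  query AlbumPage($albumId: ID!) {
    album(id: $albumId) {
      name
      albumArt
      releaseDate
      artists {
        name
      }
      tracks {
        total
        items {
          name
          durationMs
          # The track-to-artist relationship is just another field.
          artists {
            name
          }
        }
      }
    }
  }
`;

interface GraphQLRequest {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// One POST to a single endpoint carries the whole page's data requirements.
function buildAlbumPageRequest(albumId: string): GraphQLRequest {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: ALBUM_PAGE_QUERY, variables: { albumId } }),
  };
}

const albumPageRequest = buildAlbumPageRequest("1");
console.log(albumPageRequest.body.length > 0);
```

The client neither waits for track IDs nor sends a request per track. Resolving the nested `artists` field is the server's job.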
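A minimal sketch of the idea, using plain strings and a render function in place of a real React component and GraphQL client. The fragment name, its fields and the `renderAlbumTile` helper are all hypothetical, and this is not Apollo's exact fragment API.

```typescript
// "AlbumTile" side: the component declares what it needs but never fetches.
const ALBUM_TILE_FRAGMENT = /* GraphQL */ `
  fragment AlbumTileFragment on Album {
    name
    albumArt
    releaseDate
  }
`;

// Shape of the data the fragment selects (hypothetical fields).
interface AlbumTileData {
  name: string;
  albumArt: string;
  releaseDate: string;
}

// Stand-in for a React component: returns the text it would render.
function renderAlbumTile(album: AlbumTileData): string {
  return `${album.name} (${album.releaseDate})`;
}

// Parent side: spread the fragment without listing the tile's fields.
const ALBUM_PAGE_WITH_TILE_QUERY = /* GraphQL */ `
  query AlbumPage($albumId: ID!) {
    album(id: $albumId) {
      id
      ...AlbumTileFragment
    }
  }
  ${ALBUM_TILE_FRAGMENT}
`;

// A response matching the query; the parent hands `album` straight to the tile.
const response = {
  album: { id: "1", name: "Example Album", albumArt: "/art/1.png", releaseDate: "2023-01-01" },
};
const tileText = renderAlbumTile(response.album);
console.log(tileText);
```

Adding a field such as `label` to `AlbumTileFragment` changes what every parent query fetches, without editing any of those queries.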
5. Caching GraphQL Data on the Client
Let's talk about caching GraphQL data on the client using a normalized cache. Apollo's in-memory cache is one example of a normalized cache implementation. Normalized caches reduce data redundancy and improve data integrity. They automatically update data wherever it is used.
This also means that anywhere this album tile is used, it will automatically be updated to load this data.
Let's shift gears and talk a little bit about caching GraphQL data on the client. Many of the data fetching libraries out there ship some kind of caching functionality, and Apollo is no different. So for those of you that have used Apollo, you likely have used Apollo's in-memory cache.
In-memory cache is known as a normalized cache. Apollo isn't the only library that provides a normalized cache implementation, so what I'm about to talk about generally applies to all normalized caches. But what is a normalized cache and how might it help us in our applications?
6. Data Normalization and Reducing Redundancy
Data normalization reduces redundancy and improves data integrity. In a blog application, there is data redundancy between the author and the logged-in user. Two queries are used to fetch current user information and blog data. Normalizing the data involves extracting the user into its own object and creating references to it in the queries. This allows updates to the user to be reflected in both queries.
Data normalization gives us the ability to reduce data redundancy and improve data integrity by ensuring our cache data is structured in a way that adheres to some principles. By the way, if you go to this Wikipedia article, you'll notice this is expressed in terms of a database, but I think the concept applies well here.
To illustrate what normalized data looks like, let's take a look at a blog application. It's pretty bare bones, but you'll notice something in particular with this screen. If you look closely, you'll notice that we have some data redundancy here. In this case, I'm both the author of the blog and the logged-in user. So my username is synced between these two areas of the UI.
Let's take a look at what our GraphQL queries might be and what the resulting data might look like to build this UI. I'm going to use two queries, one for each of these areas of the UI, to best illustrate this concept of data normalization. So for this, let's start with the current user information. Here I have a query that looks like this, where I might fetch the current user along with my avatar and the username. When we execute this, the resulting payload might look something like this.
Now let's take a look at fetching the data for the blog itself. Here we have a query that looks like this, where I fetch data for the title, the content, the banner, as well as the author information. And if we execute this, we might have a payload that looks like this.
Let's bring back up the data from the current user query and compare it to the blog. You'll notice that we have some data redundancy here between the two queries. We know these records are the same record because the type names and the IDs both match. So how might we go about normalizing this data to reduce redundancy?
I'm going to illustrate this the way Apollo Client does it, but this concept should apply to normalized caches in other libraries. So first of all, let's extract the user into its own object, independent of the data from the other queries. We're going to identify this blob of data using a form of type name colon ID. So this blob of data can be identified as user colon one. Next, we need to update each of our queries' data to now point to this new object. To do so, we're going to create a reference from each of these fields to the new data. We'll remove the data from the current user and replace it with a reference. Next, we'll do the same thing to the author in the blog, which also references the user. And now each of these queries' data points to the same user record. The advantage here is that updates to the user can now be reflected in both queries, since we have normalized the data between them.
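The steps above can be sketched in a few lines. This is a toy normalizer, not Apollo Client's implementation: it assumes every entity carries `__typename` and a string `id`, and it borrows the `{ __ref: "User:1" }` reference shape that Apollo's cache uses. The payloads are made-up stand-ins for the two query results on the slides.

```typescript
type Cache = Record<string, Record<string, unknown>>;

// Recursively move every entity into the cache under "Typename:id"
// and leave a reference behind in its place.
function normalize(value: unknown, cache: Cache): unknown {
  if (Array.isArray(value)) return value.map((item) => normalize(item, cache));
  if (typeof value !== "object" || value === null) return value;

  const source = value as Record<string, unknown>;
  const fields: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    fields[key] = normalize(source[key], cache);
  }

  // Objects without a type name and ID are not entities; keep them inline.
  const { __typename, id } = source;
  if (typeof __typename !== "string" || typeof id !== "string") return fields;

  const cacheId = `${__typename}:${id}`;
  // Merge, so fields fetched by different queries accumulate on one record.
  cache[cacheId] = { ...cache[cacheId], ...fields };
  return { __ref: cacheId };
}

// Hypothetical payloads for the two queries.
const currentUserResult = {
  currentUser: { __typename: "User", id: "1", username: "Gerald Miller", avatar: "/avatar.png" },
};
const blogResult = {
  blog: {
    __typename: "Blog",
    id: "42",
    title: "Hello",
    author: { __typename: "User", id: "1", username: "Gerald Miller" },
  },
};

const cache: Cache = {};
const root: Record<string, unknown> = {
  ...(normalize(currentUserResult, cache) as Record<string, unknown>),
  ...(normalize(blogResult, cache) as Record<string, unknown>),
};

console.log(JSON.stringify({ root, cache }, null, 2));
```

After both results are written, the cache holds exactly one `User:1` record, and both `currentUser` and the blog's `author` point at it.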
7. Updating Username with GraphQL Mutations
Let's take an example of actually updating my username. We do this in GraphQL through mutations. We'll submit a mutation to change my username and see the updates reflected in both the blog and the current user dropdown.
To show why normalized data can be useful, let's take an example of actually updating my username. So again, my username is used both in the blog and in the current user dropdown, but I'm kind of feeling like my username is pretty plain and could be a little bit more exciting. To change data in GraphQL, we use mutations. So here we have a mutation that allows me to change my username. We'll submit this mutation along with a payload that updates my username. We might get back a payload that looks like this. Notice here how the mutation's payload contains the same type and ID as the blog's author and the current user. So if we take a look at that cache again, we can see that issuing this mutation will now update the username in user colon one, going from Gerald Miller to my new emoji-string username. And just to look at it from the perspective of our UI, let's go ahead and select the option that allows me to change my username. I'll go and input my brand new emoji-string username and execute this mutation. You'll see that once this mutation finishes, voila, we've got the updates in both places. We didn't have to invalidate either query, nor did we have to refetch data for each. We were simply able to update that shared data record and see that change reflected in both places.
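To make that concrete, here is a small sketch of a mutation result flowing into a normalized cache. It is a toy model, not Apollo Client's code: the cache contents, the `changeUsername` field and the new username are hypothetical, while `ROOT_QUERY` and `__ref` mirror the names Apollo's cache uses.

```typescript
type Cache = Record<string, Record<string, unknown>>;

// Cache as it might look after both queries were normalized (made-up data).
const cache: Cache = {
  ROOT_QUERY: { currentUser: { __ref: "User:1" }, blog: { __ref: "Blog:42" } },
  "User:1": { __typename: "User", id: "1", username: "Gerald Miller" },
  "Blog:42": { __typename: "Blog", id: "42", title: "Hello", author: { __ref: "User:1" } },
};

// Follow references to rebuild the tree a query would read.
function read(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => read(item));
  if (typeof value !== "object" || value === null) return value;

  const source = value as Record<string, unknown>;
  const ref = source["__ref"];
  if (typeof ref === "string") return read(cache[ref]);

  const result: Record<string, unknown> = {};
  for (const key of Object.keys(source)) result[key] = read(source[key]);
  return result;
}

// Merge an entity returned by a mutation into its existing cache record.
function writeEntity(entity: { __typename: string; id: string } & Record<string, unknown>): void {
  const cacheId = `${entity.__typename}:${entity.id}`;
  cache[cacheId] = { ...cache[cacheId], ...entity };
}

interface PageView {
  currentUser: { username: string };
  blog: { author: { username: string } };
}

const before = read(cache["ROOT_QUERY"]) as PageView;

// Hypothetical mutation payload: same type name and ID as the cached user.
const mutationPayload = {
  changeUsername: { __typename: "User", id: "1", username: "gerald-the-great" },
};
writeEntity(mutationPayload.changeUsername);

const after = read(cache["ROOT_QUERY"]) as PageView;
console.log(before.currentUser.username, "->", after.currentUser.username, "/", after.blog.author.username);
```

Only the `User:1` record was written. Both the dropdown and the blog author pick up the change because both read through the same reference.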
8. GraphQL's Upcoming Feature: The Defer Directive
GraphQL's upcoming feature, the defer directive, allows receiving data incrementally, improving performance for fields with longer resolution times. By accessing faster-resolving fields first, front-end developers can render information quickly and stream in the remaining data over time.
Now let's talk about an upcoming GraphQL feature. So remember that GraphQL is a spec, which means that it evolves and we get some new capabilities over time. One of these I'd like to highlight is the new defer directive. Defer allows you to receive data incrementally rather than all at once, which can be useful if you have some fields in your schema that take much longer to resolve than others. You can access data for the fields that resolve much faster without having to wait for those deferred fields to load.
Just to show you an example, here we have a GraphQL schema that serves up products with their reviews. When we execute this query, we notice it's taking over a second to load, which is a bit longer than makes us comfortable. After reviewing some metrics from our schema, we've come to notice that the reviews field is the holdup here. We're okay if the reviews don't load right away, because we'd like to show information about the product as quickly as possible. So we can use defer to help us. Let's highlight the reviews field, right click, and wrap it with an inline defer. And let's execute this query again. If we take a look at the timeline down at the bottom right, we notice that the product information loads much more quickly, while the reviews data comes in much later. This can be useful because, as a front end developer, I can start rendering that product information without having to wait for the reviews, and I can have the reviews stream in over time. I hope that some of this information wasn't surprising to you, but I hope it reminds us why GraphQL can be so powerful.
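As a sketch of what the deferred query and its incremental response might look like: the schema fields below are hypothetical, and the chunk shape (`data` first, then `incremental` patches addressed by `path`, with `hasNext` marking the end) follows one draft of the incremental delivery proposal. That proposal is still evolving, so treat the exact wire format as an assumption.

```typescript
// Hypothetical query: the reviews selection is wrapped in an inline @defer.
const PRODUCT_QUERY = /* GraphQL */ `
  query Product($id: ID!) {
    product(id: $id) {
      name
      price
      ... @defer {
        reviews {
          rating
          body
        }
      }
    }
  }
`;

// Assumed chunk shape, based on one draft of the incremental delivery proposal.
interface Chunk {
  data?: Record<string, unknown>;
  incremental?: Array<{ path: Array<string | number>; data: Record<string, unknown> }>;
  hasNext: boolean;
}

// Copy all fields of `source` onto `target`.
function mergeInto(target: Record<string, unknown>, source: Record<string, unknown>): void {
  for (const key of Object.keys(source)) target[key] = source[key];
}

// Apply one chunk to the result accumulated so far.
function applyChunk(result: Record<string, unknown>, chunk: Chunk): void {
  if (chunk.data) mergeInto(result, chunk.data);
  for (const patch of chunk.incremental ?? []) {
    // Walk the path to the object that the deferred fields belong to.
    let target = result;
    for (const key of patch.path) target = target[String(key)] as Record<string, unknown>;
    mergeInto(target, patch.data);
  }
}

// Simulated response stream: fast fields first, slow reviews later.
const chunks: Chunk[] = [
  { data: { product: { name: "Espresso Machine", price: 199 } }, hasNext: true },
  {
    incremental: [{ path: ["product"], data: { reviews: [{ rating: 5, body: "Great" }] } }],
    hasNext: false,
  },
];

const result: Record<string, unknown> = {};
const snapshots: string[] = [];
for (const chunk of chunks) {
  applyChunk(result, chunk);
  // A UI would re-render here; we record what it would see.
  snapshots.push(JSON.stringify(result));
}
console.log(snapshots.join("\n"));
```

The first snapshot is enough to render the product details. The reviews only appear in the second one.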
9. Exploring the Power of GraphQL Federation
Let's explore the powerful benefits of GraphQL beyond the obvious ones. With GraphQL Federation, you can combine multiple GraphQL graphs into one super graph, reducing the need for coordination between teams. By adding additional fields and types, you can create a unified schema that provides all the necessary data. As a front-end developer, you can leverage this super graph without having to coordinate extensively with other teams. The super graph also allows for easier schema evolution and can be treated like Git with a schema registry.
Let's now take a look beyond some of the obvious benefits and look at some of the other ways that GraphQL can be extremely powerful. So this was the stuff we just directly have in mind as front end developers when we think about GraphQL.
But there's another layer too: there are other teams that have to create that API for you and that you have to interact with. Oftentimes when you have an app, you don't directly interact with your internal APIs, but you have a back end for the front end. And every time you need a little change, you will need to have that BFF team change something for you. That can be frustrating and needs a lot of communication.
So why not take a look at the GraphQL benefits that can make that less painful for you? Let's assume you have all of those teams, and if you really want them to work independently, each of those teams essentially needs to maintain their own API, or there will be a lot of communication between them. Of course, that places the burden on your client. You now have to communicate with all those APIs, or you again need that back end for the front end, which is a lot of pretty useless work.
So with GraphQL, you can use GraphQL Federation to combine multiple GraphQL graphs into one big super graph. And we're not talking super graph in the sense of, oh, it's a super marketing term. No, we're talking the mathematical sense: a graph of graphs is a super graph. So how would that look if we have teams that have overlapping data? Imagine we have a user API that's maintained by one team, and they might have this type user that has an ID and that has a name, and probably also a lot of other fields that we are not going to think about here. And then you have another team that has a product API. And at some point there's an overlap: you don't just want to have the products, but you also want to know, for a user, which products that user has ordered.
So now your product sub-graph team can go ahead. They can add their product type, and they can add an additional field to that user type. And the router will go ahead and, using GraphQL Federation, automatically combine those into one graph. And you end up with this user type that has a name and the purchases on it. You, as a front-end developer, just query that. And while internally this will end up in multiple calls to multiple services, you don't have to care, and you don't need a team to touch any of that if you make a change on your front end.
So in the end, you end up with a lot of teams that don't have to coordinate too much. Of course, they still have to have a general plan of how everything will look, but they can just add new fields to their entities. And they can also annotate an entity that's really owned by another team without too much coordination, and that frees up time that would otherwise be spent in annoying meetings. And for you as a front-end developer, you can take all of it: you can use everything in there without having to coordinate too much with those teams. Once you end up with a super graph, you also have everything in one place. You have that big schema, and you can follow how it's evolving. And you can kind of treat it like you would with Git. For that, we have a schema registry.
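The two subgraph schemas might look roughly like the SDL below, written in Federation 2 style with the `@key` directive. The type and field names are assumptions based on the description above. The `composeToy` function only unions fields per type to show the resulting shape. Real composition also validates keys and field ownership, and the router plans the cross-service calls, none of which is modeled here.

```typescript
// Users subgraph, owned by the user team (hypothetical fields).
const USERS_SDL = /* GraphQL */ `
  type User @key(fields: "id") {
    id: ID!
    name: String!
  }
`;

// Products subgraph: defines Product and contributes a field to User.
const PRODUCTS_SDL = /* GraphQL */ `
  type Product @key(fields: "id") {
    id: ID!
    title: String!
  }

  type User @key(fields: "id") {
    id: ID!
    purchases: [Product!]!
  }
`;

// type name -> field name -> field type
type TypeMap = Record<string, Record<string, string>>;

// Very small SDL reader: handles only flat `type X { field: Type }` blocks.
function parseTypes(sdl: string): TypeMap {
  const types: TypeMap = {};
  const typePattern = /type\s+(\w+)[^{]*\{([^}]*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = typePattern.exec(sdl)) !== null) {
    const fields: Record<string, string> = {};
    for (const line of match[2].split("\n")) {
      const [name, type] = line.split(":").map((part) => part.trim());
      if (name && type) fields[name] = type;
    }
    types[match[1]] = fields;
  }
  return types;
}

// Toy "composition": union the fields each subgraph contributes to a type.
function composeToy(...subgraphs: TypeMap[]): TypeMap {
  const supergraph: TypeMap = {};
  for (const subgraph of subgraphs) {
    for (const typeName of Object.keys(subgraph)) {
      supergraph[typeName] = { ...supergraph[typeName], ...subgraph[typeName] };
    }
  }
  return supergraph;
}

const supergraph = composeToy(parseTypes(USERS_SDL), parseTypes(PRODUCTS_SDL));
console.log(JSON.stringify(supergraph, null, 2));
```

From the client's point of view there is just one `User` type with `id`, `name` and `purchases`, even though two teams own different parts of it.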
10. Observing Changes and Making Informed Decisions
GraphQL's super graph allows you to observe changes, have multiple versions, and add checks and boundaries. It helps make informed decisions about removing fields by providing usage statistics and querying information. You can decide whether to remove a field, wait for adoption to slow down, or communicate with the front-end team.
It essentially allows you to observe all the changes that are made to that super graph, and it also allows you to have multiple different versions of that super graph out at once. Think of a staging environment that already has fields that you don't have elsewhere. And of course you can also use that to add some checks and boundaries. So in this case, we have one developer who wants to remove this title field, and that might be safe, or it might not. We don't really know. And that developer also hasn't been talking to all the application teams too much, so he doesn't really know if it's in use. So he just opens a pull request, CI kicks in, and CI tells him that, yes, this is still in use. We just click on that, we get into our CI, and we get all of this information. We see that recently there have been three different operations still accessing that one field, and we can click on each of those operations to see when it was queried and how often it was queried. And this now allows us to make an informed decision: whether we can really just remove it now because we don't care about the few clients that are making these requests, whether we want to wait a little longer until adoption has slowed down even more, or whether we need to talk to a front end team to stop using this field.
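The kind of check described here can be pictured with a small sketch. Everything in it is hypothetical: the usage records, the `Post.title` field coordinate, the operation names and the request threshold are invented for illustration and do not reflect how any particular schema registry stores or evaluates usage.

```typescript
// Hypothetical usage record, as a registry might aggregate it per operation.
interface OperationUsage {
  operationName: string;
  fields: string[]; // field coordinates the operation selects
  requestsLast7Days: number;
  lastSeen: string; // ISO date
}

interface RemovalCheck {
  field: string;
  safeToRemove: boolean;
  totalRequests: number;
  affectedOperations: string[];
}

// Report which operations still select `field` and whether their traffic
// is at or below the threshold the team is willing to break.
function checkFieldRemoval(field: string, usage: OperationUsage[], maxRequests: number): RemovalCheck {
  const affected = usage.filter((op) => op.fields.indexOf(field) !== -1);
  const totalRequests = affected.reduce((sum, op) => sum + op.requestsLast7Days, 0);
  return {
    field,
    safeToRemove: totalRequests <= maxRequests,
    totalRequests,
    affectedOperations: affected.map((op) => op.operationName),
  };
}

// Made-up usage data: three operations still read the field.
const recentUsage: OperationUsage[] = [
  { operationName: "PostPage", fields: ["Post.title", "Post.content"], requestsLast7Days: 120, lastSeen: "2023-10-01" },
  { operationName: "PostPreview", fields: ["Post.title"], requestsLast7Days: 4, lastSeen: "2023-09-28" },
  { operationName: "LegacyFeed", fields: ["Post.title", "Post.banner"], requestsLast7Days: 1, lastSeen: "2023-09-20" },
  { operationName: "CurrentUser", fields: ["User.username"], requestsLast7Days: 900, lastSeen: "2023-10-01" },
];

const check = checkFieldRemoval("Post.title", recentUsage, 0);
console.log(check);
```

Raising `maxRequests` models the decision to accept breaking a few remaining clients. Leaving it at zero blocks the pull request until nobody queries the field.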