Video Summary and Transcription
This talk discusses various aspects of the Apollo cache in GraphQL and Apollo Client 3. It covers topics such as cache fetch policies, normalization, updates, and garbage collection. The importance of proper data storage and management in the cache is emphasized. The talk also explores the challenges of managing lists and the need for custom update functions. Overall, it provides insights into optimizing the performance and efficiency of the Apollo cache in software development.
1. Introduction to Apollo Cache
I'm Raman Lally from Shopify, giving a talk on befriending the Apollo cache. We've been using GraphQL and moving to Apollo client 3. Understanding the cache and fetch policies is crucial. The cache is stored in memory and rebuilt with the application. It's a representation of your data, not the actual data. Fetch policies determine data retrieval from the cache or network.
So I'm Raman Lally. I'm here from Shopify and I'm giving a talk on befriending the Apollo cache. That was my only meme. I only had space for one, so that's more of me. So really the reason why this talk came about is because we've been using GraphQL forever and I only started recently and we're moving over to Apollo client 3 and people had run into these weird bugs and I'm going to talk about one. But I wanted to talk about how we can avoid those and getting to know how the cache works is the best way.
So someone had created this query that was pulling out this product metadata and they had like this query. It looked like that, that's not it exactly. But there was something wrong in this query and that second piece of data just wasn't coming in. Right? They were querying it, nothing's there, and we're going to come back to this in a minute and see how we could fix it.
So what's happening in the cache? What exactly is in there? And where is it? Like, is it a data object we're keeping somewhere? These are things I didn't know, and now you might. So it's in memory, as the name might tell you, and that's exactly where it's stored. So every time you rebuild your application, it gets rebuilt. Every time you refresh the page, it starts fresh. It's not persisted anywhere, unless you've actually persisted it yourself. And what's inside of it? It's not actually your data, it's a representation of your data. It takes whatever data you got back from your query, and we store a version of it. But before I talk about any of that, I want to talk about how we get that data, and that is the fetch policies. These essentially define when to get your data from the cache and when to get it from the network. There are six of them, and I'm going to go through them really quickly, mainly because this is one of the main things that would cause a bug in your application. Let's say you're expecting fresh data from the network right away, but your query is reading from the cache instead: you'd probably want to swap your fetch policy around. So here is our first one. Cache-first, it's our first.
2. Apollo Cache Fetch Policies
The cache has six fetch policies: cache-first (the default), cache-only, cache-and-network, network-only, no-cache, and standby. Cache-and-network retrieves data from the cache first, then updates it from the network. Network-only fetches data from the network and updates the cache. Cache-only retrieves data only from the cache. The right fetch policy depends on how consistent and fresh the data you need has to be.
And it's the default one, and it's really simple. Is all of your data in the cache? Golden. If it's not, we're going to go to the network. And the keyword there is all. So if you have an identical query, but you're asking for one extra field, it's going to go to the network regardless, because all of that data is not in the cache. And then very similar to this is cache-only. The same thing is true here: if all that data isn't in the cache, it's going to give you an error, and nothing comes back. And we have a few others, like cache-and-network. This one is interesting, because it's going to get it from your cache, and then refill the cache from the network. So if you had a lot of data that's changing often, and you want it to be incredibly consistent, this would be the way to go. It'll go to the network and refill your cache, but you'll always have the cache first. And then network-only is very similar, except it's only going to the network and then updating your cache. So if you needed to get just the updated data first and you're going to wait for it, it'll be saved in the cache so a subsequent query can grab it from there. And then finally, no-cache: just the network, nothing else there.
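The decisions above can be summed up in a toy decision table: where each policy reads from, and whether the network result gets written back into the cache. This is only a sketch of the behavior just described, not how Apollo Client actually implements it, and the `standby` line reflects its documented "like cache-first, but doesn't react to cache updates" behavior:

```typescript
type FetchPolicy =
  | "cache-first"
  | "cache-and-network"
  | "network-only"
  | "cache-only"
  | "no-cache"
  | "standby";

type Plan = { read: ("cache" | "network")[]; writeCache: boolean };

// Toy model: which source(s) a query reads from, and whether the
// network response refills the cache afterwards.
function plan(policy: FetchPolicy, cacheHasAllFields: boolean): Plan {
  switch (policy) {
    case "cache-first":
    case "standby": // reads like cache-first, but won't re-render on cache updates
      // ALL requested fields cached? Serve the cache; otherwise network.
      return cacheHasAllFields
        ? { read: ["cache"], writeCache: false }
        : { read: ["network"], writeCache: true };
    case "cache-only":
      // Never touches the network; incomplete data is an error.
      if (!cacheHasAllFields) throw new Error("missing cached fields");
      return { read: ["cache"], writeCache: false };
    case "cache-and-network":
      // Serve the cache immediately, then refresh from the network.
      return { read: ["cache", "network"], writeCache: true };
    case "network-only":
      return { read: ["network"], writeCache: true };
    case "no-cache":
      // Network result is never saved into the cache.
      return { read: ["network"], writeCache: false };
  }
}
```

In a real app you choose this per query, e.g. `useQuery(QUERY, { fetchPolicy: "cache-and-network" })`.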
3. Apollo Cache Normalization
So we have our data and we know when we're getting it from the cache. Normalization is how the data is stored in the cache. Objects are split and given unique identifiers. The default is using type name and ID, but custom identifiers can be used. The cache is a flattened data structure, making it easy to access. The normalized cache holds references to objects, allowing multiple queries to use the same types of objects.
So we have our data, and we know when we're getting it, and we know when we're getting it from the cache, but what's in the cache? How did that data get stored? I said it's not exactly your data, it's just a representation of it. And normalization is how it got stored. So it's normalized in steps, and there are essentially three you can break it down into.
Our first step is, whatever data object comes in, we're going to split it. And we're going to split it into all the object entities that it could be. At one point I believed that every data object that could exist in your application would get split and normalized, and that would be really cool. But that's not quite the case.
So imagine we had this really cool query. You'll see that this is from a demo app I wrote that I won't actually get to show you, but I'll link it at the end. And it's really ugly. So essentially here, we can see that the three objects that get broken up all have one thing in common: they're all uniquely identifiable. And that rolls us into the next point, where the second step is, after we break up all these objects, we want to give them unique identifiers. The default way of doing that is just by using the type name and ID. So if your object has an ID field, the cache will try to normalize it. But that only usually works, because you might not have an identifier field that's exactly called ID. In that situation, you would use the key fields API. You can define your own identifiers. Let's say you had UID as an identifier for an object, you could use that directly. You could also use multiple nested fields to create one. But this is the same concept as just using ID and type name: you can generate your own, and this way those objects would also get normalized. Otherwise they get stored under their parent object. And at the very root, you'll have just the query.
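Here's a toy sketch of both identifier schemes, the default `__typename:id` one and a keyFields-style one. The string format is roughly the shape Apollo Client generates, but treat the exact formatting as an assumption, and note that in a real app you'd configure this declaratively via `typePolicies: { Thing: { keyFields: ["uid"] } }` rather than writing it yourself:

```typescript
type Entity = { __typename: string; [field: string]: unknown };

// Default identifier: __typename plus the id field, e.g. "Product:42".
// Returns null when there's no id, meaning the object can't be
// normalized on its own and gets stored under its parent instead.
function defaultCacheId(obj: Entity): string | null {
  return obj.id == null ? null : `${obj.__typename}:${obj.id}`;
}

// keyFields-style identifier: build the key from fields you choose,
// so objects without a plain `id` can still be normalized.
function keyFieldsCacheId(obj: Entity, keyFields: string[]): string {
  const key: Record<string, unknown> = {};
  for (const f of keyFields) key[f] = obj[f];
  return `${obj.__typename}:${JSON.stringify(key)}`;
}
```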
So we have them all broken up, we have these individual objects that are identifiable, and we're going to save them in this flattened data structure. We want a flattened data structure because it's easy to access, and we can keep it as small as possible. And that's essentially the idea: we took that other query, and this is exactly what the normalized cache looks like. If you were to extract the cache and take a look at it, which I'm going to show you in a minute, this is exactly what it looks like. And you can see that all these objects that had identifiers are not actually nested inside the other objects anymore. We're just holding a reference to them. Any time this query needs one of these objects, it gets it from that reference. But the magic comes in when you have multiple queries that all use these same types of objects.
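The third step, the flattening itself, can be sketched as a toy normalizer. This assumes the default `__typename:id` identifiers and a much-simplified store; the real `InMemoryCache` does considerably more, but the split-and-leave-a-reference idea is the same:

```typescript
type Store = Record<string, Record<string, unknown>>;

// Walks a query result, pulls every object that has __typename and id
// out into a flat store, and leaves a { __ref } pointer behind.
// Objects without an id stay embedded in their parent.
function normalize(value: unknown, store: Store): unknown {
  if (Array.isArray(value)) return value.map((v) => normalize(v, store));
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const fields: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(obj)) fields[k] = normalize(v, store);
    if (typeof obj.__typename === "string" && obj.id != null) {
      const id = `${obj.__typename}:${obj.id}`;
      store[id] = { ...store[id], ...fields }; // merge by identifier
      return { __ref: id };
    }
    return fields; // no id: stays nested under its parent
  }
  return value;
}
```

Because entities are merged by identifier, two queries that return `Product:1` end up sharing the same flat entry, which is the "magic" described above.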
4. Apollo Cache Updates
The power of normalization is seen when multiple queries use the same types of objects. Automatic updates are great, but sometimes they don't work as expected. Not-so-automatic updates can occur, and we'll discuss them. When data is normalized and cached, it can either get merged or added if there's existing data. Automatic merging happens when updating a single entity with its identifier and updated fields.
We use types everywhere, and those queries will just be referencing the same objects in the cache. And that's really the power of that normalization.
These are washing machines. So this was my riff on something about automation, something automatic. I couldn't think of anything better than this. I thought maybe a car transmission would work, too. But essentially the idea is we have our data, and we're gonna get new data after. Right?
And I'm sure you might have seen at some point where if you requery, data gets automatically updated. And that's really cool. And we love that. And automatic things are really cool. But sometimes automatic things don't work so well. So I'll throw this shirt into the wash, and it's got a stain on it. And usually, all my stains get washed out. But this one didn't. And I think this actually happened to this shirt before. And it was blueberry. And it was really ugly. So I was like, okay, yeah. The washing machine sucks. I probably should have done it by hand. But the washing machine doesn't suck, because I probably should have just done something to it before it went in there to make the automatic work better. So that's what we're gonna talk about. Sometimes not-so-automatic updates will happen.
So whenever data comes into your application, and it gets normalized, and it's being cached, one of two things will happen if there's already data there. It'll either get merged, or it'll get added. There's only two options. So when is it getting merged automatically? What are those scenarios where it happens, and it works, and we're really happy about it? The first one is if you're just updating a single entity, and you're returning that entity with its identifier and its updated fields.
5. Updating Apollo Cache
Objects can be easily merged by their identifier. If you have a list of entities, you need to return the whole collection to update each of them. Sometimes, the response data isn't related to the desired update, requiring a custom update function. Lists are challenging to manage, especially when not returning the entire list or when the order changes. Adding or removing objects from a list also requires a custom update function. The cache doesn't assume how data should be stored or look inside it.
It's really simple to do. Again, as we saw, all those objects are just in this hash. We can easily grab them by their identifier, and we can merge these new fields in. And this is the one that will happen the most often.
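That single-entity merge boils down to a lookup plus a shallow merge. Here's a toy version with a made-up `Product` entity and fields (the favoriting example comes up later in the talk):

```typescript
type Fields = Record<string, unknown>;

// What the automatic merge amounts to: find the entity by its
// identifier and shallow-merge the returned fields over what's there.
function mergeEntity(store: Map<string, Fields>, id: string, fields: Fields): void {
  store.set(id, { ...store.get(id), ...fields });
}

const store = new Map<string, Fields>();
store.set("Product:1", { name: "Mug", favorited: false });

// A mutation response came back with the identifier plus the changed field:
// { __typename: "Product", id: 1, favorited: true }
mergeEntity(store, "Product:1", { favorited: true });
```

Untouched fields like `name` survive the merge, which is why returning just the identifier and the updated fields is enough.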
But the second one is if you had a list or a collection of entities, and you returned all of them, with all of their identifiers and all the fields that need to be updated. So this doesn't work if you return, let's say, some of them, or just one of them. You have to return the whole collection back in order to update each of them. So let's talk about when it doesn't work, because those are the cases we run into, and it's a really ugly situation.
So first, let's say your response data that's coming back isn't related to the update that you want to happen. I know there were some situations where we had an object being favorited, like a product. You favorite it, the mutation goes out, it comes back, you return the ID and the favorite status, and that updates the product wherever it's being used in the UI. The thing it's not going to update is how many products are favorited. Let's say you had a UI showing the number of favorited objects. It might be related, but it's not the same data. So in this scenario, you'd have to write your own update function and update that data yourself, even though it might seem related to you.
So then the rest of these are about lists, because that's really the hardest thing to manage. Again, if you don't return the whole list of updated objects, you're not going to get that automatic update. The same is true if the order changes. So if you were to send out objects in order 1, 2, 3, 4, and you change it to 1, 2, 4, 3, when it comes back the objects are still the same; the only thing that changed is the order. That's not going to be reflected in your UI. That's something you'd have to write an update for yourself. And it's mainly because the cache makes no assumptions about how you want to store your data or what your data should look like inside the cache. Those objects are identical, and all it holds is the references to those objects. And then finally, adding or removing things, which also really sucks, because if I was going to unfavorite something, I can update that object's favorite status, but I can't remove it from a list of favorite objects. That's something you'd have to write an update function for, because you can't return something from a mutation and say, OK, now remove this from the list for me. You can only return data. So the update functions exist for that, but in these scenarios it can become a bug if you're expecting things to update automatically. And we've definitely run into those.
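Here's a toy version of such an update function, using the unfavorite example. The `favorites` list, `favoriteCount` field, and `Product` refs are all made up for illustration; in Apollo Client you'd put this logic in the mutation's `update` callback, typically via `cache.modify`:

```typescript
type Ref = { __ref: string };
type Store = Record<string, Record<string, unknown>>;

// The automatic part already flipped Product:2's favorited flag.
// The not-so-automatic part: nothing removes it from the favorites
// list or fixes the count, so we do that ourselves.
function onUnfavorite(store: Store, productRef: string): void {
  const root = store["ROOT_QUERY"];
  root["favorites"] = (root["favorites"] as Ref[]).filter(
    (r) => r.__ref !== productRef
  );
  root["favoriteCount"] = (root["favoriteCount"] as number) - 1;
}

const store: Store = {
  ROOT_QUERY: {
    favorites: [{ __ref: "Product:1" }, { __ref: "Product:2" }],
    favoriteCount: 2,
  },
  "Product:1": { favorited: true },
  "Product:2": { favorited: false }, // merged automatically by identifier
};
onUnfavorite(store, "Product:2");
```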
6. Apollo Cache Issue and Solution
The issue occurred because the identifiers for the product metas and the metadata were the same. While those objects got normalized inside the cache, a nested value without an ID couldn't be normalized, so one value overwrote the other. Adding an ID to the value solved the problem and allowed proper normalization.
So we'll come back to this, because now we've talked about a couple of the things that come into play if we want to solve this issue. Essentially the idea here was that the product metadata is the same type as this metadata type down here. And this person was querying this, and they were like, oh, slug is undefined, I don't know why slug is undefined. And it's mainly because the identifier for the product metas and the identifier for the metadata were the same. So those objects got normalized inside the cache, and you would expect that their children would be too. The issue is one values object has an ID, and this value does not have an ID. So the cache tried to normalize this value, but it couldn't, and the other value object was the one that got saved inside the fully normalized object. The way you would solve this: boom, just add an ID to it. Now the cache can find it and update it, and they can be normalized properly.
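In query form, the fix was roughly this. The field and type names here are illustrative, since the actual slide isn't reproduced in the transcript; the point is only the added `id` selection:

```graphql
query ProductMetas {
  productMetas {
    id
    values {
      id    # the fix: without this, values can't be normalized on their own
      slug  # and slug was coming back undefined
    }
  }
}
```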
7. Garbage Collection and Eviction in Apollo
In this part, we'll discuss garbage collection and how it works in Apollo. Unlike JavaScript, where garbage collection happens behind the scenes, in Apollo, we manually interact with it. Garbage collection cleans up unreferenced objects in the cache and returns the IDs of collected items. We'll use a contrived app example to demonstrate the impact of mutations on cache size and identifiers.
So now we're on to, like, the last part, and the part I was the most excited about, which was garbage collecting and how the eviction works. So I'm sure everyone's run into garbage collection at some point. I realized after the fact that it's recycling, and it's not garbage. I was thinking about putting something over it, but I didn't.
So generally, with garbage collection, we're trying to recover memory from our app. In JavaScript, that usually happens behind the scenes; we don't have to interact with it manually. In Apollo, we do have to interact with it manually, and this is how: you call cache.gc(). I tried to make the slide as tiny as possible while still being readable, mainly because the call really is that small. You would call it to clean up any unreferenced objects in your cache, and we'll see what that looks like in a second.
But the other thing is, this returns the IDs of anything that got collected. So here's that contrived app I was talking about, using the same queries as before. There's a GraphQL server, and we're querying it for all those pixels, including all the whitespace, and they're each individually identifiable. You can see I just printed out the cache size. That's not the actual cache size, just the number of keys in the cache, but it represents how big the cache might be at that point: it's that many items big. Since all of these have IDs, they're all addressable, and they're all being normalized. Let's say we go ahead and change Pikachu's color to orange. You might notice that the cache is twice as big now, or more. The issue is that we made this mutation, and it's a very contrived example, but essentially all of those identifiers changed, or the vast majority of them.
8. Garbage Collection and Object Retention
When the mutated objects came back, they couldn't be merged, so they had to be added. The old objects are still in the cache, just no longer reachable from the root object. The UI doesn't need them, but they're still taking up space, and that's the main reason to get rid of them.
And they came back and they weren't able to be merged. So what happened? They had to be added. Right? All of those other objects are still there. They're just not accessible via the actual root object. We don't need them. Our UI doesn't need them. But they're still there and they're taking up space. That's really the main reason we want to get rid of these things.
So let's say we do run it. And this is, again, a cruder version of all that other data. How do we collect all these items that aren't being referenced? The idea is that the garbage collector takes a look at the normalized cache and recursively goes through each node inside it until it reaches all of your leaf nodes. Anything that wasn't visited gets removed.
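That walk is a classic mark-and-sweep, and it can be sketched over a toy normalized store where references look like `{ __ref: "Pixel:1" }`. The `Pixel` entities echo the demo's pixels, but the store layout here is a simplification, not Apollo's internal format:

```typescript
type Store = Record<string, Record<string, unknown>>;

// Mark-and-sweep: walk from ROOT_QUERY, mark every entity reachable
// through { __ref } pointers, then sweep (delete) the rest.
// Like cache.gc(), it returns the IDs it removed.
function gc(store: Store, roots: string[] = ["ROOT_QUERY"]): string[] {
  const seen = new Set<string>();
  const visit = (value: unknown): void => {
    if (Array.isArray(value)) {
      value.forEach(visit);
      return;
    }
    if (value !== null && typeof value === "object") {
      const ref = (value as { __ref?: unknown }).__ref;
      if (typeof ref === "string") {
        if (!seen.has(ref)) {
          seen.add(ref); // mark, then follow the reference
          visit(store[ref]);
        }
      } else {
        Object.values(value).forEach(visit);
      }
    }
  };
  for (const root of roots) {
    if (root in store) {
      seen.add(root);
      visit(store[root]);
    }
  }
  const removed = Object.keys(store).filter((id) => !seen.has(id));
  for (const id of removed) delete store[id];
  return removed;
}
```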
So this is our new query, the one with the all-orange color and all those new identifiers. And this is our original query with the original colors. You can see it's not being referenced by the root anymore, because we're not using that data; it's not our main query. We would go through and reach all these nodes, boom, they're all good. And the ones that weren't visited get removed by the garbage collector.
So let's say there was a specific object we wanted to keep, so that even if it's not accessible, it won't be removed. We would use the retain API for this. What really happens inside the cache when you retain something is that its ID gets added to an extra set of root IDs. So inside the normalized cache there's a separate field for keeping track of all these retained identifiers. And the odd thing I noticed when I was working with this was, if you ever use writeFragment or writeQuery, the objects you write to directly also get retained automatically. So if you were writing to a whole bunch of random objects in your cache, and you ran the garbage collector and they weren't getting collected, it's because they were all being retained by default, because you altered them directly. If you want to get rid of those, you can release them after the fact.
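The bookkeeping behind retain and release can be modeled as a reference-counted set of extra garbage-collection roots. This is a toy sketch, assuming the reference-counted behavior of `cache.retain` / `cache.release` (retain twice, release twice), not Apollo's actual implementation:

```typescript
// Toy model of cache.retain / cache.release: retained IDs act as
// extra GC roots, with a reference count per ID.
class Retainer {
  private counts = new Map<string, number>();

  retain(id: string): number {
    const n = (this.counts.get(id) ?? 0) + 1;
    this.counts.set(id, n);
    return n;
  }

  release(id: string): number {
    const n = (this.counts.get(id) ?? 0) - 1;
    if (n > 0) this.counts.set(id, n);
    else this.counts.delete(id);
    return Math.max(n, 0);
  }

  // A garbage collector would treat these as roots alongside ROOT_QUERY.
  roots(): string[] {
    return ["ROOT_QUERY", ...this.counts.keys()];
  }
}
```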
So this is more or less what it'll look like. Let's say we wanted to retain the cheek, I think that's what I called it over here, right? Yeah. What will happen is that the garbage collector will go through and visit all the nodes it knows about, then it'll go ahead and visit all the nodes you've referenced in your retention, and boom, it won't get rid of those bad boys. So the last part: the one thing the garbage collector is never going to get rid of is an object that's still accessible. Those you might want to get rid of manually.
9. Eviction and Garbage Collection
Adding and removing things has to be done manually. The eviction API removes objects from the cache directly; afterwards you run the garbage collector to clean up anything left unreferenced.
And we talked about before how adding and removing things is something you have to do manually instead of it happening automatically when you requery. That's where the eviction API comes in. You can evict a whole object from the cache directly if you want to. The only issue is, let's say you got rid of that root object that was referencing all of those other nodes we saw before. Nothing has a reference to those objects anymore. So if you're ever going to use this, you have to run the garbage collector after the fact to get rid of all the objects that can no longer be reached.
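A toy version of eviction, covering both the whole-entity case and the single-field case that comes up next. The store shape is the same simplified flat map as before, not Apollo's internal format; `cache.evict` likewise takes an `id` and an optional `fieldName` and returns whether anything was removed:

```typescript
type Store = Record<string, Record<string, unknown>>;

// Remove a whole entity, or just one of its fields. Evicting an
// entity orphans anything only it referenced, so a garbage-collection
// pass should follow.
function evict(store: Store, id: string, fieldName?: string): boolean {
  if (!(id in store)) return false;
  if (fieldName === undefined) {
    delete store[id];
    return true;
  }
  if (!(fieldName in store[id])) return false;
  delete store[id][fieldName];
  return true;
}
```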
10. Apollo Cache Object Cycle
You can evict specific fields to have more control over query updates. We went through the cycle of fetching, normalizing, updating, and collecting/evicting objects. Demos are available for reference.
And then you can go a step further and just evict specific fields if you'd like, which gives you a lot more power to update queries after a change comes in. And there you really have it. We went through the cycle of an object, from fetching it, to normalizing it inside the cache, to how we would update it or how it would get updated automatically, and finally to collecting and evicting it. I couldn't draw the last line of the cycle because it doesn't loop back into fetching, so I felt really awkward about that. I was going to show the demo, but I don't have time for that. I did link to all of the really ugly demos that I wrote, and they have examples of everything I was talking about here, so if anyone wants to go check those out, you can.
Cache Strategy and Evolution
Cache strategy doesn't have to be decided completely before you start; it can and should evolve over time, because things change quickly.
And then shameless plug. Thank you. Thank you, Raman.
Let's ask some questions. I call my Uber pickups garbage collection. What fetch policy do you recommend using for a standard single-page app? The default one, cache-first, honestly. I would say it covers the vast majority of the scenarios you'd want to cover. I don't think you could run into an issue using that one compared to the others. But you'd want to use a mix. You can set it on a per-query basis. If you have 10 queries, some of them might only ever want to hit the cache, or error.
Speaking of per query, is cache strategy something you should decide before starting or can it evolve over time? It did evolve over time for us, at least. I would say definitely evolve it over time. You don't want to commit to something completely before you start your application. Things could change very quickly, right? They can. They can and they will.
So be ready. Have you had any cache horror stories? I have one, and that's the whole reason why I wrote this talk in the first place. Drama. We had this home page with all of these carousels in it. It was already a bad idea, because there were three carousels on this home page, and we didn't have any input on that. But they were all sharing similar product data, and you could favorite products inside the carousels. We didn't realize this issue until later, but because of the way our mutations were written, they were only returning whether something actually got favorited or not, not its ID and the favorite status. So you would favorite a product in the first carousel, and in carousels two and three it would not be favorited. Someone brought this up, and we were like, this is such a huge failing of our mutation design strategy. So people took a dive, and we found out, oh, you could have everything update automatically.
Refetching Queries and Manual Garbage Collection
We considered refetching queries to force updates but realized it's not a good approach. When to run garbage collection manually depends on the application; if orphaned objects are expected, it can be run after any query, since it's not very time-intensive. Raman will be available for further discussion.
Someone was like, we could just refetch all of these queries to make sure everything gets updated every time. But you know, that's a horrible way to do that. So we didn't want to go there.
Would you say that favorite was what brought you here? And look at you now? Yeah, look at me now.
We have one more question, which is, when should we run garbage collection manually rather than automatically? I guess it depends on your application. There are going to be lots of applications where you don't really need to run it manually, where you're not going to end up with a ton of orphaned objects before someone changes the page or moves somewhere else. But if you knew you were going to throw away objects like that, you could run it after any query. It's not very time-intensive. Cool.
As a reminder, Raman will be in the speaker booths right outside after this if you want to chat to him or ask any more questions, another big round of applause.