1. Memory Usage in Web Applications
Memory usage is often overlooked in web applications. We optimized a complex React app, but our memory usage became very high. Browsers limit memory usage, and excessive memory usage impacts performance and user experience.
Memory usage is a commonly overlooked part of modern web applications, with frameworks like React trading it for performance and ease of development. The reason why we do this is that normally web applications are simple enough. And memory in clients is plentiful. But is this always the case? And what do we do when it's not?
Hi, I'm Giulio Zauza, I'm a software engineer, and I wanted to bring you some lessons I learned the hard way while optimizing a complex React application. I'm working on a complex web application called Flux. It's a browser-based electronics CAD tool which enables quick and collaborative electrical circuit and PCB design. Under the hood, it's a big TypeScript application that uses WebGL to render complex documents, and it's built using React, Three.js and React Three Fiber.
We're working hard on performance, as we want the application to be snappy and responsive but also scalable. We need to be able to support giant documents with thousands of components, each one made of dozens of complex shapes. Originally, we thought that frozen time during interactions and FPS were the things to optimize for, but we soon realized that that was just a part of the picture. When doing those kinds of optimizations, sometimes you trade memory usage for performance, which in our case backfired. In fact, we built a rendering system using lots of memoization wherever it was possible, we used caches to prevent re-renders, and we used a lot of pre-computations. This initially made performance better, but at a certain point, it made things way worse. The memory usage was very, very high when opening larger projects and went over the gigabyte mark very easily. We did those optimizations because we followed what is still considered a best practice for performance with React, that is, using memoization as much as possible. In fact, there is even a famous article that advocates for using React memoization for every non-primitive value. What we found, though, is that this strategy can become harmful, as it will affect your memory usage in ways that are unexpected.
But you might ask, why do we care about memory usage? Especially now that clients have more RAM than ever. Well, there are three ways in which memory usage impacts your application negatively. The first is that browsers heavily limit your memory usage. Desktop Chrome, for example, has a hard memory limit of around four gigabytes per tab. The moment you go over that limit, the browser will just kill your tab without any way for you to recover from it. And this limit gets even lower on mobile devices, such as Safari on iOS. The second reason is that more memory usage weighs on the garbage collector. Even if you're trying to optimize for speed alone, you will probably see a lot of entries in your time profiles related to garbage collection activities, such as major and minor GC. That happens when too many memory allocations are made in a short amount of time and the browser is forced to pause your application to take care of them. And the third reason is that using too much memory worsens the user experience. Many users are using their device for multiple things at once. This means that if your web app is holding entire gigabytes of memory, you will make the user experience of everything else running on their client significantly worse.
2. Analyzing Memory Usage and Tooling
It's important to keep an eye on memory consumption and optimize it. We'll focus on identifying memory usage and making distinctions. Transient memory usage can be hard to catch and may cause app crashes. Count versus size and shallow versus retained memory are important concepts. Different allocation types in the JavaScript VM and JS code also take up space. Let's explore the available tooling for analyzing memory usage.
And because of those reasons, I would say that probably regardless of the type of application that you're building, it's always a good idea to keep an eye on memory consumption of your app and optimize it when needed, especially now that Chrome is starting to tell users the memory consumption of single tabs when you hover on the title of the tab. Imagine running a simple to-do list app and seeing half a gigabyte occupied by it. That's not a good impression, I would say.
So suppose that you are in one of those situations and you want to make things better for your application. How do you go about it? Well, the approach I commonly follow runs on three points. The first is that you need to identify what is taking up so much memory in your app. And then once you've found what is taking up too much RAM, you can use some strategies to optimize it. And lastly, you want to try to prevent those things from happening again in the future. And you can use automated memory testing in your CI pipeline for this. In this talk, we'll focus on the first point only, as the other two points are also very big and implementation-dependent.
When it comes to analyzing memory usage, I think it's useful to introduce some terms and make some distinctions first. The first one is about transient versus static memory usage. We call static memory usage the set of memory allocations that stays somewhat stable throughout the execution of the app, and it's the one that you would expect to find when taking a heap snapshot while your app is at a steady state. Transient memory usage, instead, is when your app allocates a lot of memory at once and releases it shortly after, creating a peak in memory usage. This can be very hard to catch with heap snapshots alone, and a very big peak could crash your app.
Another important distinction is count versus size. You can have single units of allocation that occupy a lot of memory, but you can also have many smaller allocations, which alone are small enough but together fill up your RAM. In the second case, it can become more difficult to find them and optimize them. We then have two terms that come up often in memory profiles, which are shallow and retained memory. As JavaScript relies on nested data structures and pointers, there is a distinction between the size of an object itself and the size of what that object is pointing to. For example, we can have an array of 10 elements, each one being a different string with a million characters. Because of their size, the strings will occupy in total around 10 megabytes, whereas the array itself will occupy just a few dozen bytes. Since the array is containing and retaining those strings, though, we say that the retained size of that array is 10 megabytes, as it's the reason why the strings are being kept in memory.
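The array example can be worked through with a small model. This is only an illustrative sketch: the `HeapObject` type and the byte sizes (one byte per character, a 16-byte array header, 4-byte slots) are assumptions made for the arithmetic, not values measured in any engine.

```typescript
// Illustrative model only: the byte sizes below are assumptions, not measurements.
interface HeapObject {
  name: string;
  shallowSize: number;   // bytes occupied by the object itself
  retains: HeapObject[]; // objects kept alive only through this one
}

// Retained size = own shallow size plus everything that would be freed with it.
function retainedSize(obj: HeapObject): number {
  return obj.retains.reduce((sum, child) => sum + retainedSize(child), obj.shallowSize);
}

const BYTES_PER_CHAR = 1; // assumption: one-byte characters
const strings: HeapObject[] = Array.from({ length: 10 }, (_, i) => ({
  name: `string#${i}`,
  shallowSize: 1_000_000 * BYTES_PER_CHAR,
  retains: [],
}));

// Assumption: 16-byte header plus ten 4-byte slots.
const array: HeapObject = { name: "array", shallowSize: 16 + 10 * 4, retains: strings };

console.log(array.shallowSize);   // 56
console.log(retainedSize(array)); // 10000056
```

Sorting a profile by shallow size would bury this array near the bottom, while sorting by retained size would put it at the top, which is why it helps to check both columns.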
The last thing that I think is important to introduce is the different allocation types inside the JavaScript virtual machine. Chrome, for example, makes the distinction between objects, arrays, strings, and typed arrays, and it's also important to notice how even your JS code takes up space in memory. By browsing the output of a memory snapshot, you can learn some very interesting things, like the fact that JS functions on their own take up space as if they were objects, since they need to keep track of their closures. So it's better to avoid creating functions in loops for this reason.
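A minimal sketch of the point about functions in loops, using hypothetical names (`Handler`, `makeHandlersInLoop`, `makeHandlersShared`) that are not from the talk. It only counts distinct function objects; the actual bytes per function depend on the engine.

```typescript
type Handler = (value: number) => number;

// Allocates one new function object (and its closure context) per iteration.
function makeHandlersInLoop(count: number): Handler[] {
  const handlers: Handler[] = [];
  for (let i = 0; i < count; i++) {
    handlers.push((value) => value * 2);
  }
  return handlers;
}

// Allocates a single function object once and reuses it for every slot.
const double: Handler = (value) => value * 2;
function makeHandlersShared(count: number): Handler[] {
  return new Array<Handler>(count).fill(double);
}

const perIteration = makeHandlersInLoop(1000);
const shared = makeHandlersShared(1000);

// A Set deduplicates by identity, so its size is the number of distinct function objects.
console.log(new Set(perIteration).size); // 1000
console.log(new Set(shared).size);       // 1
```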
Okay, now that we've established some terms, let's look at the tooling that is available for analyzing memory usage.
3. Analyzing Memory Usage and Tools
The Chrome Memory Profiler offers two useful analysis options: heap snapshot and allocation sampling. Heap snapshots provide a breakdown of object types and memory retention. Allocation sampling helps identify memory usage peaks. By analyzing the snapshot, we discovered a large map object causing high memory usage. It turned out to be a global variable used for event subscriptions. Removing the useState hook and replacing the map with a set reduced memory usage by up to 50%. However, most memory footprint is dominated by smaller objects, which are challenging to optimize individually. The Chrome memory profiler doesn't group objects by type, but exporting the heap snapshot to a JSON file allows for statistical analysis using tools like Memlab.
The starting point is, of course, the Chrome Memory Profiler, which has three analysis options available. In my experience, the heap snapshot and the allocation sampling are the most useful ones, but for two different purposes. The heap snapshot takes a full image of the current memory usage at a given point in time, giving you a breakdown of the types of objects in memory and why they are being retained. Using a heap snapshot, you can look at different object types, their shallow memory usage, and their retained memory usage, as we said before.
With any snapshot, though, you are missing information about what's changing over time, and you're not able to look at transient peaks of memory usage. If you're interested in memory usage peaks instead, you can use the allocation sampling method. You start it, stop it later, and after it has run for some time, it gives you a graph describing how much was allocated during that time and by which functions. This is a really powerful tool for debugging memory usage spikes, but it's less suited for analyzing the total memory footprint of your app at a steady state. And just by using those two tools, you can make really powerful memory usage optimizations.
For example, I can show you how we used them to optimize the app I work on. After taking a memory snapshot of Flux at a steady state, we can click on the shallow and retained size columns to sort the allocation types by memory usage and hopefully find the biggest offenders and remove them. We can check both retained and shallow size, so that we can find both large single allocations and smaller allocations that are retaining a large amount of memory. And while looking at that list, there was a thing that stood out immediately.
We had a lot of Map objects, around 10,000, most of them pretty small, apart from the first one, which was taking up around 84 megabytes of RAM. That immediately seemed like a big red flag, since I didn't know that we had such a large map for something in the app. In the lower part of the profiler UI, we can see why and by whom that large Map object was being retained, and it turned out to be a global variable that we were using for subscribing to events. In fact, every time a subscription happened in our app, we were using a useState hook to generate and store a string UID, which was later used to remove the subscription on cleanup. This turned out to be extremely inefficient for memory consumption, as it needed to instantiate the useState hook, store the string, and also keep the closure around that lambda function. By removing that useState hook call and replacing the map that kept the strings with just a Set data structure, we were able to reduce the memory footprint of subscriptions, which saved up to 50% of RAM in some cases. And this is an example of an obviously memory-inefficient data structure, which was easy to find and optimize.
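The shape of that change can be sketched outside of React. This is not Flux's actual code: the class names `UidSubscriptions` and `SetSubscriptions` and the `sub-N` UID format are hypothetical, and the sketch only shows why a Set keyed by the listener itself needs no per-subscription string.

```typescript
type Listener = (event: string) => void;

// Before (as described in the talk): every subscription gets a string UID
// that must be stored by the caller and is used as the Map key.
class UidSubscriptions {
  private listeners = new Map<string, Listener>();
  private nextId = 0;

  subscribe(listener: Listener): string {
    const uid = `sub-${this.nextId++}`; // one extra string allocated per subscription
    this.listeners.set(uid, listener);
    return uid;
  }

  unsubscribe(uid: string): void {
    this.listeners.delete(uid);
  }

  get size(): number {
    return this.listeners.size;
  }
}

// After: the listener function is its own identity, so no UID is needed.
class SetSubscriptions {
  private listeners = new Set<Listener>();

  subscribe(listener: Listener): void {
    this.listeners.add(listener);
  }

  unsubscribe(listener: Listener): void {
    this.listeners.delete(listener);
  }

  emit(event: string): void {
    this.listeners.forEach((listener) => listener(event));
  }

  get size(): number {
    return this.listeners.size;
  }
}

const received: string[] = [];
const onEvent: Listener = (event) => received.push(event);

const subscriptions = new SetSubscriptions();
subscriptions.subscribe(onEvent);
subscriptions.emit("moved");
subscriptions.unsubscribe(onEvent);
subscriptions.emit("ignored");
console.log(received); // ["moved"]
```

In a component, the Set-based version lets the effect cleanup call `unsubscribe` with the same function reference it subscribed with, so no state hook is needed just to remember an identifier.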
Unfortunately, though, that's not always the case. In fact, it was one of the very few easy things that we could optimize in our app. If you look at this profile, you can see how the remaining majority of the memory footprint is dominated by objects. And unfortunately, we don't have a single object to optimize anymore, but rather two million instances of smaller objects that are being kept in memory. The Chrome memory profiler doesn't make this easy to understand, as it doesn't group objects by type, and there is no way that you can scroll through all of them manually and find out what is going on. Thankfully, though, the Chrome memory profiler can export your heap snapshot to a JSON file, so there is hope for us to perform some statistical analysis on it. And without having to write a lot of code, an importer, and a parser ourselves, there is a fantastic tool to analyze those files called Memlab.
4. Analyzing Memory Usage with Memlab
Memlab is a complete package for memory analysis. It can detect memory leaks, run in CI pipelines, and analyze heap snapshots. By clustering objects by type and properties, we can make sense of memory usage. React hooks, like useState and useMemo, use a data structure called fiber node. Simplify and optimize hooks to reduce memory usage. Another Memlab analyzer helps identify the heaviest React components in terms of memory usage.
Memlab is maintained by Meta Open Source, and it's a complete package for memory analysis. For example, it can be used for detecting memory leaks automatically in your app, or it can even run in CI pipelines to periodically check your memory usage, which is really cool.
Another cool feature of Memlab is that you can use it as a framework for doing analysis on heap snapshots. You can, in fact, write your own analysis plugin using JavaScript. I've developed some that helped me understand the memory footprint of my app better, and I'll be sharing the code of those so that other people can use them as well.
The first analyzer that I wrote was meant to answer the question: which types of objects are taking up the most space out of the two million that we found in a snapshot? To answer this question, I wrote an analyzer plugin that loads the heap snapshot, clusters the objects by type, that is, by the names of the properties that they have inside, and accumulates their sizes together, sorting them by biggest size first. This way I was hoping to make some sense out of the hundreds of megabytes that were being occupied by smaller objects.
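The clustering step can be sketched as a plain function. The `ObjectNode` type below is a simplified stand-in for a heap snapshot node and is an assumption of this sketch, not Memlab's actual node type; in a real analyzer the property names and sizes would come from the nodes of the loaded snapshot.

```typescript
// Simplified stand-in for a heap snapshot node; not Memlab's actual types.
interface ObjectNode {
  propertyNames: string[];
  selfSize: number; // shallow size in bytes
}

interface Cluster {
  shape: string;     // sorted property names, used as the "type" of the object
  count: number;
  totalSize: number; // accumulated shallow size in bytes
}

function clusterByShape(nodes: ObjectNode[]): Cluster[] {
  const clusters = new Map<string, Cluster>();
  for (const node of nodes) {
    // Sort a copy so that objects with the same properties in a different
    // order end up in the same cluster.
    const shape = [...node.propertyNames].sort().join(",");
    let cluster = clusters.get(shape);
    if (cluster === undefined) {
      cluster = { shape, count: 0, totalSize: 0 };
      clusters.set(shape, cluster);
    }
    cluster.count += 1;
    cluster.totalSize += node.selfSize;
  }
  // Biggest clusters first.
  return Array.from(clusters.values()).sort((a, b) => b.totalSize - a.totalSize);
}

const sample: ObjectNode[] = [
  { propertyNames: ["id"], selfSize: 16 },
  { propertyNames: ["x", "y"], selfSize: 32 },
  { propertyNames: ["y", "x"], selfSize: 32 },
];

console.log(clusterByShape(sample));
// [ { shape: "x,y", count: 2, totalSize: 64 }, { shape: "id", count: 1, totalSize: 16 } ]
```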
I ran the analysis on a snapshot from my app, and the results I got from it were really interesting. The type of object that occupied the most space in the app had this form, with the property names baseQueue, baseState, memoizedState, next, and queue. It was not an object that was part of my app, so I wondered where it came from. And actually it came from the internals of the React framework. This is how React hooks work internally. Every React component that is rendered keeps its state information in a data structure called a fiber node. Every fiber node object contains a memoizedState property, which is a pointer to a linked list that contains all the information about your hooks. Every time you perform a hook call in your React component, a new object of that type is allocated and appended through the next property in that linked list. The value of that hook is then stored in the memoizedState property of the linked list node. This data structure is then used by React when it needs to render that component, as it can walk the linked list with every hook call, like useState, useMemo, or useCallback, and get the value that it needs to return. And this is why the order of hook calls matters.
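The linked list described above can be modeled in a few lines. This is a deliberately reduced sketch: React's real internal objects carry more fields than the two shown here, and the helper names `appendHook` and `readHooks` are hypothetical, not React APIs.

```typescript
// Reduced model for illustration; React's real internal objects have more fields.
interface HookNode {
  memoizedState: unknown; // the value held by this hook call
  next: HookNode | null;  // the hook called after this one
}

interface FiberNode {
  type: string;                   // stand-in for the component this fiber belongs to
  memoizedState: HookNode | null; // head of the hooks linked list
}

// One list node is allocated per hook call and appended at the tail.
function appendHook(fiber: FiberNode, value: unknown): void {
  const hook: HookNode = { memoizedState: value, next: null };
  if (fiber.memoizedState === null) {
    fiber.memoizedState = hook;
    return;
  }
  let tail: HookNode = fiber.memoizedState;
  while (tail.next !== null) {
    tail = tail.next;
  }
  tail.next = hook;
}

// Values are recovered purely by position, which is why call order must be stable.
function readHooks(fiber: FiberNode): unknown[] {
  const values: unknown[] = [];
  for (let hook = fiber.memoizedState; hook !== null; hook = hook.next) {
    values.push(hook.memoizedState);
  }
  return values;
}

const fiber: FiberNode = { type: "Counter", memoizedState: null };
appendHook(fiber, 0);       // e.g. a state value
appendHook(fiber, "label"); // e.g. a memoized string
appendHook(fiber, [1, 2]);  // e.g. a memoized array

console.log(readHooks(fiber)); // [ 0, "label", [ 1, 2 ] ]
```

Even in this reduced model, three hook calls cost three extra objects per component instance, independent of how small the stored values are.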
By reading these results from the heap analysis, we gained an interesting insight. useMemo and hooks in general are expensive, but not for the reason you might think. It's not really the content of the memoized value itself that is taking up a lot of space, but rather all the supporting data structures that are needed for the hook to work. So from this, we got a strategy: simplify, merge, or kill all the hooks that we have in the hot path, and reduce them as much as possible. The big question, though, was which React components are the heaviest, and which ones we need to optimize first to gain the biggest memory benefit. To answer this question, we wrote another Memlab analyzer, which did the following things. First, find all the fiber node data structures in the memory snapshot. For each fiber node, determine which React component it belongs to by looking at its type property. Then compute statistics about that fiber node, like which hooks it was using and how much memory it was occupying, and then accumulate all the computed statistics, grouping them by React component type. By doing this, we were finally able to understand which React components were the heaviest ones in terms of memory usage.
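The aggregation steps can be sketched as follows. The `FiberLike` and `HookLike` types are simplified stand-ins assumed for this sketch (a real analyzer would read the equivalent information from snapshot nodes), and the byte sizes in the example are made up.

```typescript
// Simplified stand-ins for snapshot data; not Memlab's or React's actual types.
interface HookLike {
  selfSize: number; // bytes occupied by the hook list node
  next: HookLike | null;
}

interface FiberLike {
  typeName: string; // component name resolved from the fiber's type property
  selfSize: number; // bytes occupied by the fiber node itself
  memoizedState: HookLike | null;
}

interface ComponentStats {
  component: string;
  instances: number;
  hooks: number;
  bytes: number;
}

function statsByComponent(fibers: FiberLike[]): ComponentStats[] {
  const byType = new Map<string, ComponentStats>();
  for (const fiber of fibers) {
    let stats = byType.get(fiber.typeName);
    if (stats === undefined) {
      stats = { component: fiber.typeName, instances: 0, hooks: 0, bytes: 0 };
      byType.set(fiber.typeName, stats);
    }
    stats.instances += 1;
    stats.bytes += fiber.selfSize;
    // Walk the hooks list to count hooks and add their memory.
    for (let hook = fiber.memoizedState; hook !== null; hook = hook.next) {
      stats.hooks += 1;
      stats.bytes += hook.selfSize;
    }
  }
  // Heaviest component types first.
  return Array.from(byType.values()).sort((a, b) => b.bytes - a.bytes);
}

const twoHooks = (): HookLike => ({ selfSize: 48, next: { selfSize: 48, next: null } });
const fibers: FiberLike[] = [
  { typeName: "Board", selfSize: 100, memoizedState: null },
  { typeName: "Pad", selfSize: 100, memoizedState: twoHooks() },
  { typeName: "Pad", selfSize: 100, memoizedState: twoHooks() },
];

console.log(statsByComponent(fibers));
// [ { component: "Pad", instances: 2, hooks: 4, bytes: 392 },
//   { component: "Board", instances: 1, hooks: 0, bytes: 100 } ]
```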
5. Optimizing React Components and Memory Analysis
We optimized the heaviest React components and reduced memory usage by 50%. Use our code to solve memory problems. Memlab helped answer questions about string memory allocations. Transitioning to number UIDs would only save a couple of megabytes. Memory analysis is difficult, and optimizations can backfire. Chrome profiler is useful, but tools like Memlab provide further analysis. React hooks are expensive for large-scale projects. Attendees were thanked for their participation.
We used it to decide what to optimize first. And after we optimized even just the top one in the list, the heaviest React component we had, we were able to cut 50% of the memory usage in our app. And if you're having memory problems, you can try to do the same with your code, using the code that we're going to publish.
We then used Memlab to answer other questions that we had about our memory usage. One, for example, was about string memory allocations. Out of all the strings that we had, how many of those were UIDs? Should we transition from string UIDs to number UIDs? Well, the Memlab analyzer that we wrote allowed us to get an answer to that question. And actually we found out that we would have saved just a couple of megabytes, so it was not worth it for now.
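A rough sketch of that kind of estimate is shown below. It rests on several assumptions that are not from the talk: that UIDs look like UUIDs (`UUID_PATTERN`), that a number replacement costs 8 bytes, and that string nodes can be reduced to the hypothetical `StringNode` shape.

```typescript
// Simplified stand-in for a string node in a heap snapshot.
interface StringNode {
  value: string;
  selfSize: number; // bytes
}

// Assumption: UIDs are formatted like UUIDs. Adjust to the real UID format.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

interface UidEstimate {
  uidCount: number;
  uidBytes: number;
  estimatedSavings: number; // bytes saved if each UID became a number
}

function estimateUidSavings(strings: StringNode[], bytesPerNumber = 8): UidEstimate {
  let uidCount = 0;
  let uidBytes = 0;
  for (const node of strings) {
    if (UUID_PATTERN.test(node.value)) {
      uidCount += 1;
      uidBytes += node.selfSize;
    }
  }
  return { uidCount, uidBytes, estimatedSavings: uidBytes - uidCount * bytesPerNumber };
}

const stringNodes: StringNode[] = [
  { value: "123e4567-e89b-12d3-a456-426614174000", selfSize: 56 },
  { value: "123e4567-e89b-12d3-a456-426614174001", selfSize: 56 },
  { value: "resistor", selfSize: 24 },
];

console.log(estimateUidSavings(stringNodes));
// { uidCount: 2, uidBytes: 112, estimatedSavings: 96 }
```

Running an estimate like this before refactoring is what makes it possible to decide that a change is not worth its cost.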
So, to summarize what we learned by analyzing memory usage: the first thing is that memory analysis is difficult, and sometimes the optimization you think will make things better can backfire. So performing analyses of this type helps a lot. The second is that the Chrome profiler is cool and very useful, but sometimes you have to analyze things further by using tools like Memlab, which is even customizable, so you can use it to answer your own questions. And the last thing is that React hooks are not cheap, especially when you have thousands of instances of the same React component. So if you're building something big with React, keep that in mind. And thank you for attending this talk. I hope that this will help you find out why your app is taking up so much RAM.