My name is Chris. I work for Axiom, which is the application you just saw, and in the open source world, I'm known for some contributions to Create T3 App, tRPC, and some other projects.
So, how do you handle 10 million items? Well, my recommendation is simple: just don't do it. Now, of course, that's not very helpful advice on its own, but I am serious here: try to avoid it. This kind of work takes a lot of time that you could spend fixing bugs, writing new features, and so on, and it also makes your codebase much harder for the next person to work on.
I think many of us love this idea of solving very interesting technological problems, but what you need to think about is, is this the best way you can spend your time to make your users' lives better? The way to know whether you need this or not is to listen to your customers and see what their frustrations are. The other thing that can help you figure out if you need this or not is to consistently develop against a real server with real data of a similar scale as your biggest users have.
So, if you should avoid it, how can you do that? It depends on your situation, but there are many options here. You can use pagination, fetching 10, 20, or 100 results at a time. You can use streaming, loading only the specific results that are currently needed. Or maybe you don't even need the individual items at all, just some aggregation over them; in that case, you can aggregate server-side, or in many cases, even in the database. And the final thing I want to point out is that if your product owner asks for this, I would really suggest negotiating the requirements and figuring out why it is that they want it. Maybe they're actually presenting you with an XY problem, and there's really a much better solution, such as one of the three approaches above.
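To make the first three alternatives concrete, here's a minimal TypeScript sketch. All names here are illustrative, not Axiom's actual API: `allSpans` stands in for a server-side dataset, and the functions show the shape of pagination, streaming, and server-side aggregation.

```typescript
// Illustrative dataset standing in for millions of rows on the server.
type Span = { id: number; durationMs: number };

const allSpans: Span[] = Array.from({ length: 10_000 }, (_, i) => ({
  id: i,
  durationMs: (i % 50) + 1,
}));

// 1. Pagination: hand back a fixed-size page plus the offset of the next one.
function getPage(
  offset: number,
  limit: number
): { items: Span[]; nextOffset: number | null } {
  const items = allSpans.slice(offset, offset + limit);
  const nextOffset = offset + limit < allSpans.length ? offset + limit : null;
  return { items, nextOffset };
}

// 2. Streaming, sketched as a generator: consumers pull items only as needed
//    instead of materializing the whole list at once.
function* streamSpans(): Generator<Span> {
  for (const span of allSpans) yield span;
}

// 3. Server-side aggregation: ship one number instead of 10,000 rows.
function p95Duration(): number {
  const sorted = allSpans.map((s) => s.durationMs).sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length * 0.95)];
}

const firstPage = getPage(0, 100);
console.log(firstPage.items.length, firstPage.nextOffset); // 100 100
console.log(p95Duration());
```

The client-facing payload in each case is a page, a trickle, or a single number, never the full 10 million items.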
But let's say you have negotiated, you have thought about this, you have considered the alternatives, and you've come to the conclusion that the only way to make your app good is to show millions of items at a time. What do you do now? The first thing, and this is very important, is to measure before you start optimizing. There's a good chance that you're completely wrong about where your bottlenecks are. There are three main things to measure: compute, memory, and network. We'll look at each of them later on.
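As a tiny sketch of the "measure first" habit on the compute side: before changing anything, put a timer around the hot path. The function and data here are made up for illustration; `performance.now()` is available in both browsers and Node.

```typescript
// Hypothetical hot path: summing a large array of span durations.
function sumDurations(durations: number[]): number {
  let total = 0;
  for (const d of durations) total += d;
  return total;
}

const data = Array.from({ length: 1_000_000 }, (_, i) => i % 100);

// Measure the compute cost before guessing at optimizations.
const start = performance.now();
const total = sumDurations(data);
const elapsedMs = performance.now() - start;

console.log(`summed ${data.length} items -> ${total} in ${elapsedMs.toFixed(1)}ms`);
```

Browser devtools profilers and memory/network panels cover the other two axes; the point is simply to get a number before and after every change.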
So now that we're through the introduction, let's talk about the specific things that helped us out in our situation. I'm going to omit some implementation details, so a few things I show won't map one-to-one to how Axiom behaves in production. But those details are very specific to our situation, and I think we'll end up with a more useful talk this way.
Here's the starting point. It's late 2023, and tracing in Axiom works great, up until about 5,000 spans. But we didn't know that last part at the time, because why would anybody ever want to create traces that big? Then we launched a new pricing plan, and people realized they could use our tracing as, effectively, a profiling tool without incurring huge costs. So they did, and we couldn't handle it. This is what happened if you tried to open a trace with even just 10,000 spans. I think you can see from the error message that we hadn't even considered this possibility, or this way of failing.
After some quick investigation, we realized that we really had to rethink the entire architecture of the trace viewer. So let's look at what we did step by step to improve our capability by several orders of magnitude. Now this initial set of errors didn't actually originate in the front end.