Handling Data at Scale for React Developers

It is very difficult to scale modern web applications to millions of concurrent users. Oftentimes, we've got to provision and consider in-memory Key/Value stores, search engines, analytics engines, and databases, all while preserving traceability through the layers. This talk expands on the technical details of web apps at this scale, and offers a simpler way to achieve the same effect without the technical hassle.

This talk was presented at React Summit 2022.

FAQ

The main topic of the talk given by Tejas is handling data at scale for React developers.

The presentation slides for Tejas's talk were created by Sara Vieira.

Tejas used an amazing tool called Excalidraw to illustrate data at scale in his talk.

The three ways to fetch data in React discussed by Tejas are: 1. Render then fetch, 2. Fetch then render, 3. Render as you fetch.

'Render as you fetch' in React 18 means that React starts rendering a component and, when it reaches a component that is not yet ready because it doesn't have data, React pauses rendering that component and continues rendering the rest of the tree. Once the data is ready, React goes back and renders the paused component.

The purpose of using the useTransition hook in React 18 is to differentiate between urgent and non-urgent updates. It allows React to prioritize urgent updates (like user interactions) over non-urgent ones (like data fetching), thereby improving user experience by reducing jank.

The common steps to scale a database as described by Tejas include: 1. Distributing the API to avoid a single point of failure, 2. Scaling the database vertically by adding memory and disk space, 3. Scaling the database horizontally by adding a primary instance and replicas, 4. Using an in-memory database to speed up data reading, 5. Adding a search engine to handle large data volumes and replicating data in real-time using tools like Kafka.

The recommended way to implement 'render as you fetch' in production, according to Tejas, is to use a framework or library that is battle-tested and handles edge cases, such as Next.js or Remix, rather than implementing it manually.

The role of suspense in React 18 for data fetching is to allow React to pause rendering at specific points in the component tree if the data required for a component is not yet available. This enables better control over loading states and improves the overall user experience by showing fallbacks or placeholders until the data is ready.

The key takeaway from Tejas's talk about handling data at scale in React is to understand the different strategies for data fetching (render then fetch, fetch then render, render as you fetch) and the importance of using frameworks and libraries to handle complex data fetching scenarios efficiently. Additionally, he emphasizes the use of React 18's concurrent features to improve performance and user experience.
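As a rough illustration of the 'render as you fetch' idea described in the answers above, the sketch below wraps a promise in a resource whose `read()` throws the pending promise until the data arrives. This mirrors the convention used in early Suspense demos and is not a stable public React API. The names `createResource`, `loadGreeting`, and `greeting` are hypothetical and do not come from the talk, which recommends a battle-tested framework for production use.

```typescript
// A resource exposes a synchronous read() that a component could call while rendering.
type Resource<T> = { read(): T };

// Wraps a promise so that read() throws the promise while it is pending.
// A Suspense-style renderer would catch that promise, show a fallback,
// and retry rendering once the promise settles.
function createResource<T>(promise: Promise<T>): Resource<T> {
  let status: "pending" | "success" | "error" = "pending";
  let result: T | undefined;
  let error: unknown;

  const suspender = promise.then(
    (value) => {
      status = "success";
      result = value;
    },
    (reason) => {
      status = "error";
      error = reason;
    },
  );

  return {
    read(): T {
      if (status === "pending") throw suspender; // not ready: signal "suspend"
      if (status === "error") throw error; // failed: surface the error
      return result as T; // ready: return the data
    },
  };
}

// Hypothetical loader standing in for a real network request.
function loadGreeting(): Promise<string> {
  return Promise.resolve("hello");
}

// Start fetching before anything tries to render; rendering code would
// later call greeting.read() and either get the data or suspend.
const greeting = createResource(loadGreeting());
```

The point of the pattern is ordering: the request starts as early as possible, and rendering proceeds in parallel instead of waiting for an effect to kick off the fetch.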

Tejas Kumar
23 min
17 Jun, 2022

Video Summary and Transcription

This Talk discusses handling data at scale for React developers, including scaling databases and the need for search. It explores different ways to fetch data in React, such as using useEffect, fetch, and setState. The Talk also introduces Suspense for data fetching and how it improves user experience. It covers controlling React Suspense, handling search, and using render-as-you-fetch. The Talk concludes with a discussion on the RFC status and fetching in event handlers.

1. Handling Data at Scale for React Developers

Short description:

We're here to talk about handling data at scale for React developers. Let's get more specific with that and actually look at a diagram of what we mean by data at scale. Usually, you have a React app or React UI that talks to an API that then talks to a database. At some point, you're going to experience growth, and performance becomes important. So you distribute your API, run multiple API instances, and load balance between them. But if you're successful, your database may become the bottleneck, so you need to scale it.

I was totally playing that guitar. Hi! How are you? Full? Full from lunch? Satisfied? A little more knowledge and information and fun react things? It's like three of you, three of you are awake. Four? Again, how are you feeling? Are you ready to take in some stuff? Off by one errors, you know?

Anyway, hi! Nice to see you! I'm Tejas. I used to tell people it's like something, but now I say like advantageous. Anyway, I'm the director of developer relations at Xata. Look at this beautiful thing. That is my favorite slide and also one of the five slides I have. We're going to be writing a lot of code in this talk and learning properly. This, by the way, was by Sara Vieira. She's here, she's doing the last talk today, so catch that if you want to learn how to do 3D stuff. But that's not what we're here to talk about today.

We're here to talk about handling data at scale for React developers. Handling data at scale for React developers. What does that mean? This sounds like a very abstract marketing talk. It's not a marketing talk, but it is abstract on purpose so that I can change it at the last minute, as I always do, okay? But let's get more specific with that and actually look at a diagram of what we mean by data at scale. To do that, we're going to use an amazing tool called Excalidraw. How many of you have heard of Excalidraw? Yeah, if you want to applaud Excalidraw, yeah, for sure.

Data at scale. This is what it looks like. Usually, you have a React app or React UI, let's say, right? Is the text okay? Can you all see? Good. I knew; I just asked. You usually have a React UI that talks to an API that, let's zoom out a little bit, then talks to a database, and these connections usually look a little bit like this. This may be oversimplified, but that is most applications. Is this at scale? Probably not. This is a single-host database and so on.

At some point, you're going to be like, "We're growing, and performance is important." So what do you do? You'll probably distribute your API. Having a single point of failure is usually a no-no, so you'll have multiple APIs that can fetch multiple times, and you can load balance between them. And then you're like, okay, this is cool. But if you're successful, what do successful things do? They grow. So if you grow, you're going to be like, "Oh no, our database is now the bottleneck. Let's scale it."
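To make the "multiple APIs with load balancing" step concrete, here is a minimal round-robin sketch. The talk does not say which balancing strategy is used, so round robin is an assumption, and the class name and instance URLs are hypothetical.

```typescript
// Picks API instances in rotation so that no single instance takes every request.
class RoundRobinBalancer {
  private readonly targets: readonly string[];
  private next = 0;

  constructor(targets: readonly string[]) {
    if (targets.length === 0) {
      throw new Error("at least one target is required");
    }
    this.targets = targets;
  }

  // Returns the next target and advances the rotation.
  pick(): string {
    const target = this.targets[this.next];
    this.next = (this.next + 1) % this.targets.length;
    return target;
  }
}

// Hypothetical API instances sitting behind the balancer.
const balancer = new RoundRobinBalancer([
  "http://api-1.internal",
  "http://api-2.internal",
  "http://api-3.internal",
]);
```

In practice this job is usually done by a dedicated load balancer or a managed service rather than application code; the sketch only shows the rotation idea.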

2. Scaling Databases and the Need for Search

Short description:

So you'll scale your database vertically or horizontally. Scaling vertically means adding memory and disk space, while scaling horizontally involves having a primary instance and replicas. As your data grows, you may notice slow reading times from the database due to disk limitations. To address this, you can add an in-memory database for faster reading, with the option to fall back to the disk if there's a cache miss. Eventually, as your data volume increases, search becomes a common feature needed for platforms like GitHub, TikTok, and Instagram.

So you'll scale it vertically. And this vertical scale usually means adding memory, adding disk space, adding stuff. And it gets quite expensive. You eventually build a supercomputer. Or, if you want to scale your database the other way, you scale horizontally, meaning you have maybe a primary instance and some replicas. So when you get data, it spreads out across the replicas and so on and so forth.
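The primary-and-replicas layout can be sketched as follows. This is a simplified in-memory model, assuming that writes go to the primary, are copied to every replica synchronously, and that reads rotate across replicas. Real databases typically replicate asynchronously, and all names here (`InMemoryNode`, `ReplicatedStore`, `Row`) are hypothetical.

```typescript
type Row = { id: string; value: string };

// Stand-in for a single database instance.
class InMemoryNode {
  private rows = new Map<string, Row>();

  write(row: Row): void {
    this.rows.set(row.id, row);
  }

  read(id: string): Row | undefined {
    return this.rows.get(id);
  }
}

// Routes writes to the primary and spreads reads across the replicas.
class ReplicatedStore {
  private readonly primary: InMemoryNode;
  private readonly replicas: InMemoryNode[];
  private nextReplica = 0;

  constructor(primary: InMemoryNode, replicas: InMemoryNode[]) {
    this.primary = primary;
    this.replicas = replicas;
  }

  write(row: Row): void {
    this.primary.write(row);
    // Simplification: copy to replicas immediately instead of asynchronously.
    for (const replica of this.replicas) {
      replica.write(row);
    }
  }

  read(id: string): Row | undefined {
    if (this.replicas.length === 0) {
      return this.primary.read(id); // no replicas: fall back to the primary
    }
    const replica = this.replicas[this.nextReplica];
    this.nextReplica = (this.nextReplica + 1) % this.replicas.length;
    return replica.read(id);
  }
}

const store = new ReplicatedStore(new InMemoryNode(), [
  new InMemoryNode(),
  new InMemoryNode(),
]);
store.write({ id: "user:1", value: "Ada" });
```

Because every replica holds a copy of the data, read traffic can be shared among them while the primary remains the single place where writes land.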

But then you're going to grow more. And we're talking about data at scale, so it's important to establish this context. You're going to grow a little bit more. At some point, you're going to be like, wait, reading from our database is still slow. And that's because databases usually read from disk. Disk by design is not as fast as what? Memory. Memory. I've had this conversation, like, at least 50 times in the past week. That's a lie because I'm a public speaker. I really haven't. But you know how it is.

So, you'll add some type of in-memory database just to read from it faster. This will probably be distributed as well. And so now your app will talk to that thing to get data fast. And if it's a cache miss, then you read from the database. Okay? This is close to what things look like at scale. I think Kent C. Dodds's website has something like this in the back. But as you now accumulate data volume, what is the one common feature across things with a ton of data volume? Search. So, GitHub, TikTok, Instagram. Eventually, when you get enough data, you're going to need search. And so now it gets complicated, right? Because your app can read from the search engine, but it's gonna be empty. You're just like... Okay.
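The read path through the in-memory database described above (try memory first, fall back to the database on a cache miss) can be sketched like this. Plain `Map` objects stand in for both the in-memory store and the disk-backed database, and the `CachedReader` name and `databaseReads` counter are hypothetical additions for illustration.

```typescript
// Reads from a fast in-memory cache first and falls back to the database on a miss.
class CachedReader {
  private readonly cache: Map<string, string>;
  private readonly database: Map<string, string>;
  databaseReads = 0; // counts how often the slower store had to be consulted

  constructor(cache: Map<string, string>, database: Map<string, string>) {
    this.cache = cache;
    this.database = database;
  }

  read(key: string): string | undefined {
    const cached = this.cache.get(key);
    if (cached !== undefined) {
      return cached; // cache hit: served from memory
    }

    // Cache miss: read from the database and remember the value for next time.
    this.databaseReads += 1;
    const value = this.database.get(key);
    if (value !== undefined) {
      this.cache.set(key, value);
    }
    return value;
  }
}

const database = new Map<string, string>([["user:1", "Ada"]]);
const reader = new CachedReader(new Map<string, string>(), database);
```

Reading the same key twice through `reader` touches the database only on the first call. The sketch leaves out expiry and invalidation, which a real cache needs once the underlying data can change.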

QnA