Rendering Data That Disagree

The data returned from one endpoint to a frontend app should reflect the state of the world at a (recent!) point in time in the past. But it may not! And the data returned from multiple endpoints often reflects the state of the world at multiple points in time. What can be done, and what should be done? What's the worst that could happen, and what's the worst that's reasonably, probably going to happen?

Let's talk through the tradeoffs, look at how some teams approach it, and hear how much they actually care.

The main focus of Tom's talk is how to handle rendering data that may be inconsistent when fetching from databases for web applications, and strategies to ensure data consistency.

Data consistency is important because inconsistencies can lead to visual errors or logical errors in applications, such as missing users or messages, which can confuse or frustrate users.

Running separate SQL queries can lead to inconsistencies, such as a message appearing without a corresponding user if the queries are not executed in a single transaction.

Inconsistent data can cause visual inconsistencies on the user interface, leading to confusion or concerns, such as thinking a purchase was made incorrectly or missing important information.

Tom suggests using transactions to ensure data consistency and leveraging back-end systems like Convex to maintain consistency from the database to the client.

React can compound data consistency issues due to its asynchronous nature and the way it handles state updates, potentially leading to flashes or inconsistencies if not managed carefully.

Batching updates can help by ensuring that multiple state updates occur simultaneously, reducing the chance of rendering inconsistent data between updates.

Convex is a system that allows subscribing to multiple queries and receiving consistent updates at the same logical timestamp, ensuring a consistent client view and extending database guarantees to the front end.
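One way to picture "the same logical timestamp" is a client that only combines query results when they were computed at the same point in time. The sketch below is not Convex's API; the `Versioned` shape, the `ts` field, and `combineIfConsistent` are hypothetical names used only for illustration.

```typescript
// Hypothetical shape: a query result tagged with the logical timestamp it was computed at.
type Versioned<T> = { ts: number; value: T };
type ChannelView = { ts: number; users: string[]; messages: string[] };

// Combine two results only if they describe the same moment; otherwise refuse to mix them.
function combineIfConsistent(
  users: Versioned<string[]>,
  messages: Versioned<string[]>,
): ChannelView | null {
  if (users.ts !== messages.ts) return null;
  return { ts: users.ts, users: users.value, messages: messages.value };
}

// Users from timestamp 7 and messages from timestamp 8 must not be rendered together.
const staleMix = combineIfConsistent(
  { ts: 7, value: ["ada"] },
  { ts: 8, value: ["hello", "joined"] },
);
// Both results from timestamp 8 can be rendered as one view.
const sameTick = combineIfConsistent(
  { ts: 8, value: ["ada", "grace"] },
  { ts: 8, value: ["hello", "joined"] },
);
console.log(staleMix, sameTick?.users);
```

A client built this way would presumably keep showing the last consistent view until a matching pair arrives, rather than flashing a mixed one.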

Developers can account for inconsistencies by clearly documenting which data sources are consistent with each other, and by using systems that ensure consistency, like Convex.

Separate API endpoints might lead to data inconsistency because they may involve different transactions or databases, making it challenging to synchronize data updates effectively.

Thomas Ballinger
11 min
28 Oct, 2024


Video Summary and Transcription
Hi, I'm Tom and I want to talk about rendering data that may disagree when using SQL queries. It's important to consider whether these queries are in the same transaction or separate transactions, as this affects data rendering. Implementing transactions can ensure atomic data queries and avoid inconsistencies. Managing data consistency in React can be challenging, especially with rich clients and live updates. React Query offers ways to handle data invalidation logic. Asynchronous data fetching in React can lead to inconsistent data between requests.

Hi, I'm Tom, and I want to talk about rendering data that disagree. So let's say you are rendering a web page. And maybe you're using server components, maybe it's a client component that's SSRing, maybe you're back in PHP or Python or Perl or Ruby land, and you need to grab some data.

You're probably going to get it from a database. And so you fire off your first SQL query, which is a select for let's get all the users that are in a given chat channel, and then you fire off another SQL query to get all the messages in that channel. And now you have the data you need. You can just render this thing.

This is Discord or my favorite, Zulip. And in the center, we have a bunch of messages in a chat channel and over on the right, we have the users that are in that chat channel. But hopefully with just that problem description, your mind went to, ah, but are those two SQL queries in the same transaction? Are they like here, two queries that run in separate transactions, or are they in the same transaction? That's important information for us because it tells us what we can do with that data.

What if you first read your users, and then a new user is added to the channel and they also post a message, and then you read your messages? Adding a user and adding a message might be an atomic action, things that happen together. It could be a join message or something. And if we read them like this, we read users and then we read messages, and we assume that every message has a corresponding user, we're going to be disappointed. This is one of those cases where, well, let's see. What could happen? Maybe it's just a visual inconsistency. Maybe all that happens is that you see this page and you say, oh, that's funny. There's a person here, but there's nobody corresponding in that users list. Often it doesn't matter. And often what you do, what our users are trained to do, is refresh the page.
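The interleaving described here can be sketched with a toy in-memory store. This is a simulation under stated assumptions, not a real SQL client: `FakeDb`, `joinChannel`, and `findOrphans` are hypothetical names, and each read stands in for a query running in its own transaction.

```typescript
type ChatUser = { id: number; name: string };
type ChatMessage = { id: number; userId: number; text: string };

// Hypothetical stand-in for a database; each read sees whatever is committed right now.
class FakeDb {
  private users: ChatUser[] = [{ id: 1, name: "ada" }];
  private messages: ChatMessage[] = [{ id: 1, userId: 1, text: "hello" }];

  readUsers(): ChatUser[] {
    return [...this.users];
  }

  readMessages(): ChatMessage[] {
    return [...this.messages];
  }

  // Adding the user and their join message happens as one atomic write.
  joinChannel(user: ChatUser, text: string): void {
    this.users.push(user);
    this.messages.push({ id: this.messages.length + 1, userId: user.id, text });
  }
}

// Messages whose author is not in the user list we read.
function findOrphans(users: ChatUser[], messages: ChatMessage[]): ChatMessage[] {
  const known = new Set(users.map((u) => u.id));
  return messages.filter((m) => !known.has(m.userId));
}

const raceDb = new FakeDb();
const usersRead = raceDb.readUsers(); // first query
raceDb.joinChannel({ id: 2, name: "grace" }, "joined"); // a write lands in between
const messagesRead = raceDb.readMessages(); // second query
const orphanMessages = findOrphans(usersRead, messagesRead);
console.log(orphanMessages.map((m) => m.text)); // the join message has no matching user
```

Each read on its own returned correct data; only the combination is wrong, which is why the bug is easy to miss.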

For better or worse, that's how the web has always worked. Like, uh-oh. Maybe it's eventually consistent. You refresh and it's all better. But maybe that visual inconsistency is a big deal. Maybe that's the difference between thinking I have my Taylor Tomlinson tickets and I don't have my Taylor Tomlinson tickets. Maybe I see something, it's a purchase, and something confuses me there and I really get worried as a user. But maybe it's not two separate UIs. Maybe it is one UI and I have code that expects both of these pieces of data. I'm assuming every message has a corresponding user, and what this talk is about is that it is your job to know if that data could be inconsistent.

And you know this TypeScript error we see a lot, where you have a map and you say, oh, I think it's a map of every user. And you look in there to get a user by ID and it says, uh-oh, what you went in there for might not be there, and you always just slap an exclamation mark at the end because, of course it's there. This was supposed to be a full list. Well, in this case, we don't get to ignore that TypeScript message. What if we are missing users? And we don't get to ignore it because this is what happens, right?
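A minimal sketch of taking that TypeScript complaint seriously instead of silencing it with `!`. The `renderRows` helper and the placeholder text are assumptions for illustration; the point is only that `Map.get` returns `ChatUser | undefined` and the code handles both cases.

```typescript
type ChatUser = { id: number; name: string };
type ChatMessage = { id: number; userId: number; text: string };

// Builds display rows, falling back to a placeholder when the author is missing
// rather than asserting with `!` that the lookup always succeeds.
function renderRows(users: ChatUser[], messages: ChatMessage[]): string[] {
  const usersById = new Map<number, ChatUser>(
    users.map((u): [number, ChatUser] => [u.id, u]),
  );
  return messages.map((m) => {
    const author = usersById.get(m.userId); // ChatUser | undefined
    const name = author?.name ?? "(unknown user)";
    return `${name}: ${m.text}`;
  });
}

// The user list was read before user 2 joined, so their message has no author.
const renderedRows = renderRows(
  [{ id: 1, name: "ada" }],
  [
    { id: 1, userId: 1, text: "hello" },
    { id: 2, userId: 2, text: "joined" },
  ],
);
console.log(renderedRows);
```

Whether a placeholder, a loading state, or a refetch is the right fallback depends on how much the inconsistency matters for that piece of UI.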

This is a pretty bad thing. Whether it's better or worse than a visual inconsistency, I guess it's up for debate. But in general, I want to say it's worse. So there's a neat idea, called a transaction. And if these two queries had been in the same transaction, we would not be worried here. And then if we did get an error, it would be a very actionable error. It would be, hey, something is fundamentally wrong about your data model. If your database returned these two lists, and you ended up with a message with no user, there's something really wrong here, right? The alternative is that users just complain sometimes that, ah, it seems like data is loading a lot. And when that's how a fundamental inconsistency in your data manifests, because you've coded around it with "some messages just don't have users, and that's fine," it takes longer to find problems in your system. Consistent data can mean that those client-side errors are meaningful. But you can't always have consistent data. Maybe you're getting this data from multiple databases or several APIs. So your job as a front-end or React developer here is to know whether these are consistent.
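To make the "actionable error" point concrete, here is a rough sketch in which both lists come from a single snapshot, a stand-in for running both queries in one transaction. `SnapshotDb`, `readChannel`, and `checkEveryMessageHasUser` are hypothetical names, and the in-memory store only imitates what a real database transaction would guarantee.

```typescript
type ChatUser = { id: number; name: string };
type ChatMessage = { id: number; userId: number; text: string };
type ChannelSnapshot = {
  users: readonly ChatUser[];
  messages: readonly ChatMessage[];
};

// Hypothetical store: every write replaces the committed state as a whole,
// so a reader always gets users and messages from the same moment.
class SnapshotDb {
  private committed: ChannelSnapshot = {
    users: [{ id: 1, name: "ada" }],
    messages: [{ id: 1, userId: 1, text: "hello" }],
  };

  // Stand-in for "both queries in one transaction".
  readChannel(): ChannelSnapshot {
    return this.committed;
  }

  joinChannel(user: ChatUser, text: string): void {
    const id = this.committed.messages.length + 1;
    this.committed = {
      users: [...this.committed.users, user],
      messages: [...this.committed.messages, { id, userId: user.id, text }],
    };
  }
}

// With consistent reads, a missing user means the data model itself is broken.
function checkEveryMessageHasUser(snapshot: ChannelSnapshot): void {
  const known = new Set(snapshot.users.map((u) => u.id));
  for (const m of snapshot.messages) {
    if (!known.has(m.userId)) {
      throw new Error(`data model violation: message ${m.id} has no user ${m.userId}`);
    }
  }
}

const snapDb = new SnapshotDb();
const beforeJoin = snapDb.readChannel();
snapDb.joinChannel({ id: 2, name: "grace" }, "joined");
const afterJoin = snapDb.readChannel();
checkEveryMessageHasUser(beforeJoin); // passes: one user, one message
checkEveryMessageHasUser(afterJoin); // passes: two users, two messages
console.log(beforeJoin.messages.length, afterJoin.messages.length); // prints: 1 2
```

Because neither snapshot can contain a message without its user, the thrown error would only ever fire for a genuine bug, which is what makes it worth alerting on.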

React Query is not a data fetching library, but an async state manager. It helps keep data up to date and manage async life cycles efficiently. React Query provides fine-grained subscriptions and allows for adjusting stale time to control data fetching behavior. Defining stale time and managing dependencies are important aspects of working with React Query. Using the URL as a state manager and Zustand for managing filters in React Query can be powerful.

