Video Summary and Transcription
This talk discusses the journey of data fetching libraries in React's new streaming SSR, focusing on the use of Suspense for data fetching. It covers the backstory of Suspense and data fetching, the plan and green light for its implementation, challenges with the Next.js app router and SSR, data transport and flushing timing, the importance of timing and data transport, delayed rehydration and stream closure, the remaining problems and required functionality, challenges faced by vanilla React users, and audience questions about React server components.
1. Introduction
Today I'm going to talk about the rocky journey of data fetching libraries in React's new streaming SSR. It's a frustrating topic, but also very interesting. I'm Lenz Weber-Tronic, a senior staff engineer at Apollo GraphQL, working on the TypeScript Apollo Client and co-maintaining Redux Toolkit. Find me on Twitter as phry or on GitHub as phryneas.
Today I'm going to talk about, and I have to read this up, the rocky journey of data fetching libraries in React's new streaming SSR. I'm terribly sorry for the title, and I'm terribly sorry for the talk. I wish I wouldn't have to do it, but here we are. It's a frustrating topic, but it's also very interesting. I was already introduced: I'm Lenz Weber-Tronic. I'm a senior staff engineer at Apollo GraphQL, and I'm working full time on the TypeScript Apollo Client. I'm also co-maintainer of Redux Toolkit and do a lot of other open source. I have ADHD, and I have more hobbies than you can count. You can find me on Twitter as phry, or on GitHub and the internet generally as phryneas.
2. The Backstory of Suspense and Data Fetching
Let's dive into the backstory of how Suspense for data fetching in Apollo Client started. In October 2018, React.lazy was introduced, allowing lazy importing of files and bundle splitting. A few months later, the hooks APIs were released. In March 2022, React 18 brought concurrent mode and more features, including the renaming of Suspense. However, ad hoc use of Suspense for data fetching was still not recommended. On the Apollo Client timeline, a community library called React Apollo Hooks was released with Suspense support in October 2018. In September 2019, the issue about merging the hooks into the Apollo Client package contained the first official mention of Suspense for data fetching.
So I said this is a frustrating topic to me. Those always have a backstory, so let's look into a villain backstory here. How did this all start? It started when we wanted to add Suspense for data fetching to Apollo Client. And coincidentally, in the last talk you already saw that it's working. So at this point I could leave the stage and everything's fine. But it didn't always work.
So let's go back in history and talk about Suspense first, because this is a thing that has actually been in React forever. Why are we talking about this, this year? Shouldn't this have been a thing we stopped thinking or talking about? So let's talk about the history of Suspense first. We go back to October 2018, when the first Suspense-y thing was introduced in React, and that was React.lazy, which gave you a way of lazily importing files, doing bundle splitting, loading them later, and having React handle the loading state for that behind the scenes without you having to do it. On the same timeline, a few months later, the hooks APIs came out, in February 2019. And in March 2022, and this was a big gap, React 18 came out, and essentially the whole thing was: we have concurrent mode now. Suspense was renamed to concurrent mode and got a lot more features at one point, and we were really happy and we were like, yeah, we can start doing this now. But then we scrolled down through the release notes, and somewhere in that blog article was a footnote: "In React 18, you can start using Suspense for data fetching in opinionated frameworks like Relay, Next.js, Hydrogen, or Remix." And this was the depressing part: "Ad hoc data fetching with Suspense is technically possible, but still not recommended as a general strategy." And yeah, every data fetching library was kind of experimenting with that, but we are a good community and we are listening to our React overlords. So we didn't really release anything, especially not on purpose. I think some libraries had something for a while, then saw that sentence afterwards and pulled it out again.
So that's the official React timeline. Of course, there's a second timeline, and that's the Apollo Client timeline, and I want you to keep a focus on this. October 2018: we are in the pre-hooks times. Hooks are not out yet, but there was a conference talk on hooks just around then. In October 2018, someone actually released a library called React Apollo Hooks, and that library had Suspense support using that React.lazy workaround. Honestly, I was flabbergasted when I looked that up, because I wasn't aware of that before preparing this. And I just want to say, I don't know who did that, but kudos to that person. You are amazing. That was a really, really cool experiment. And in September 2019, so about half a year after the React hooks came out, and a little while after the Apollo hooks had been in beta, there was an issue where we merged the hooks into the real Apollo Client package. And that was the first official mention of Suspense that I could find in our issues. That was issue 5357. And it says: "when react suspense and data fetching approach finalizes".
3. Highlighting the Plan and Green Light
"In the next couple of months", the issue said in 2019, and then we waited for the plan to mature. We got too many people asking for it, so Jerel outlined a general RFC strategy for the useSuspenseQuery hook. React 18 introduced streaming SSR, and Apollo Client 3.8.0-alpha.0 was released with that hook. We had a meeting with the React team, along with TanStack Query and the R2K team, and we got the green light to proceed with Suspense for data fetching. It worked really nicely.
And now I want to highlight the next part: "hopefully in the next couple of months", 2019. And then we waited and we waited and we waited. And it's good that it took some time. It had to mature. Essentially, we would still be waiting if we were adhering to that footnote, but at some point we got too many people asking for that, and we had to at least start getting a plan into place for what we wanted to do. So Jerel, who was on stage here just a moment ago, put out issue 10231, so about 5,000 issues later, that outlined a general RFC strategy for the useSuspenseQuery hook. What was also clear was that React 18 got this streaming SSR thing. So that was probably also something we should add support for, but we weren't really sure how to. So we put that to the side for a while. Then Apollo Client 3.8.0-alpha.0 was released in December of last year with that hook for the first time. And I said we are a good community, so we started trying to get an audience with the React team. We had that in February of this year, and we essentially got the green light. So this is not something that we didn't talk about or anything. That was actually a meeting where TanStack Query also joined in, and the R2K team. So we as data fetching libraries, all the community, kind of got the green light: you can go ahead with Suspense for data fetching now. So we did, and it worked really nicely. So useSuspenseQuery worked fine.
4. Next.js App Router and SSR
At this point, I hadn't played with the Next.js app router, but I decided to try it out. The slides I'm using are not a video, but the actual application. I encountered an issue where the same request was happening on both the server and the browser, and the component was rendering in both places. This caused the data to not be transported from the server to the browser, resulting in wasted requests. We had to address this issue and decided to use server-side rendering (SSR) as we've always done.
But at the time we also got a lot of requests about this weird Next.js app router, and honestly, at this point I hadn't played with it. And I thought, yeah, okay, let's try it out.
And here we come to the point where I have to say something about my slides. These slides are a Next.js app router app. Everything we will see here is not a video, but it is actually the application and me doing super dirty tricks to show you what's happening on the server and what's happening on the client at the same time.
So we go into our next slide. That slide is going to use useSuspenseQuery, and hopefully the conference network is there and that request will finish at some time. Of course it doesn't... no, no. Yeah, perfect. Perfect. So we had one fetch in the browser. That's fine, right? And I was really happy. Like, we're done here. We don't have to do anything extra to support the app router. And then I refreshed the page.
And after we refreshed the page, let me go back to full screen, suddenly that same request happens on the server and in the browser. And this is also something that is really irritating me, especially since I drafted these slides: they run at the same time. In all the experiments I did in the past, they would run on the server, the server would finish, and then they would run on the client. But apparently I found the edge case reproduction here to have a component rendering in both environments at once. I'm going to be debugging that for weeks from now.
But the main problem is the same. The component runs in both places and it fetches data in both places. And actually that data isn't even transported from the server to the browser, so the server makes a request and throws it away and nothing good comes out of it. So obviously that's not a good thing. So suddenly we are in SSR territory. We wanted to do that at a later point in time. We don't have a chance to do that at a later point in time. We have to do it now. So the first thought, of course, is: let's do SSR as we've always been doing SSR. Let me refresh the page so that the loading state at the bottom goes away, because that's my hack to make that data transport over.
So SSR in the old world worked like this: before the React tree is actually rendered, so in Next.js it would be in getServerSideProps or something like that, we hook in and execute our own code. That means we create an Apollo Client instance and we execute getDataFromTree, which renders that component with an ApolloProvider outside of it and passes that client instance in. That one would render the whole tree, trigger all data fetching inside that tree, and give us a promise that we can await until all the loading in there is finished. And then we repeat that; that actually happens internally in getDataFromTree.
5. Data Transport and Flushing Timing
We need a way to get the data over the wire into the running application to add more stuff to the cache. There was an RFC called injectToStream, but it's not there. The next option was useServerInsertedHTML, but it raises questions about repeated rendering and flushing timing. The Next.js documentation recommends keeping your own global queue that you flush and clear independently. The server-inserted HTML context allows calling the hook conditionally, but the flushing timing is still unclear.
So we do the waterfall: we render that thing again and again until nothing more starts loading, and then we take all the data from the client, we transport that over the wire in some way, and then we render the HTML, and then all that transported data gets rehydrated on the other side. That's traditional SSR with Apollo Client or pretty much every other data fetching library, and I was like, okay, yep. Let's do that.
So rendering starts on the server. We start collecting the data to send to the browser, but the browser has already started running and will not take any more data. So that traditional approach was absolutely not feasible, because suddenly our application is running on the client and on the server at the same time. And that goes for every client component. If you refresh the page, so if it's the first page load, all those components will run in both environments at the same time or shortly after each other, hopefully. So we need something else. We need a way to get that data over the wire into the running application to add more stuff to the cache.
And there was an RFC that was called injectToStream, and it looks super promising, but it's not there. There was a discussion about whether they even want to include that in React or whether the framework should handle it, but that whole situation wasn't really clear, so that's a dead end. Then the next option was useServerInsertedHTML, which is a Next.js-specific API targeting CSS-in-JS frameworks, to get the generated CSS over to rehydrate there and everything like that. Everything is passed from a component into some kind of global queue, and at some point in time that will get flushed over. I have questions, because I'm suspending my components, so they will be rendering multiple times. A component will always start from the start again, and it will call useServerInsertedHTML multiple times. So do I send the same data over multiple times, or what's happening here? That's one question to keep in mind. Another question is: when exactly does this flushing happen? Because as we're going to see later, this is all very timing sensitive. So these two questions are the questions we want to answer.
Question number one: if a component suspends twice, do we transport the same data three times, because suspend, suspend, and then the real render? Yes. So the Next.js documentation by now shows that you should have some kind of global queue of your own with the stuff that should get flushed, and you should clear that independently. I think it wasn't in the docs back then, but it is now. So that's great for everyone doing it today. And there's also this server-inserted HTML context, which is an implementation detail of the hook, which allows us to call it conditionally. We can't, or we shouldn't, call hooks conditionally, but here it works. So we call that conditionally, but we still have to keep that global queue because of reasons. The other question was: when does the flushing happen?
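As an illustration of the "render until nothing more starts loading" waterfall described here, a minimal sketch might look like the following. It is not Apollo Client's getDataFromTree: `renderUntilSettled`, `fakeBackend`, and the string-returning `Render` function are hypothetical stand-ins, and the "network" resolves synchronously to keep the sketch small.

```typescript
// Hypothetical, simplified model of a traditional SSR prepass.
// None of these names come from Apollo Client or React.
type Cache = Map<string, string>;
type Render = (cache: Cache, request: (key: string) => void) => string;

// Stand-in for a network layer; answers immediately for illustration only.
const fakeBackend: Record<string, string> = {
  user: "Ada",
  "posts:Ada": "3 posts",
};

function renderUntilSettled(render: Render): { html: string; cache: Cache; passes: number } {
  const cache: Cache = new Map();
  let passes = 0;
  for (;;) {
    // Collect every key this pass asked for that is not cached yet.
    const requested: string[] = [];
    const html = render(cache, (key) => {
      if (!cache.has(key)) requested.push(key);
    });
    passes++;
    // Nothing new started loading: this HTML and this cache are final.
    if (requested.length === 0) return { html, cache, passes };
    // "Await" the requests, then render the whole tree again.
    for (const key of requested) cache.set(key, fakeBackend[key] ?? "");
  }
}

// A tree with a waterfall: the posts query depends on the user query.
const app: Render = (cache, request) => {
  request("user");
  const user = cache.get("user");
  if (user === undefined) return "loading user";
  request(`posts:${user}`);
  const posts = cache.get(`posts:${user}`);
  if (posts === undefined) return `${user}: loading posts`;
  return `${user}: ${posts}`;
};

const prepass = renderUntilSettled(app);
console.log(prepass.html, prepass.passes);
```

Only after the last pass would the collected cache be serialized and sent along with the HTML, which is exactly the step that no longer fits once the browser starts running before the server is done.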
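The global queue idea can be sketched roughly as follows. This is an assumption-laden illustration, not the Next.js or Apollo implementation: `TransportQueue`, the `id` field, and the `window.__TRANSPORT__` global are hypothetical names. The point is only that repeated render attempts of a suspending component enqueue the same entry once, and that flushing clears the queue.

```typescript
// Hypothetical transport queue; names are illustrative only.
type TransportEntry = { id: string; payload: unknown };

class TransportQueue {
  private pending: TransportEntry[] = [];
  private seen = new Set<string>();

  // Called on every render attempt. A component that suspends twice and then
  // renders for real calls this three times, but the entry is queued once.
  enqueue(entry: TransportEntry): void {
    if (this.seen.has(entry.id)) return;
    this.seen.add(entry.id);
    this.pending.push(entry);
  }

  // Called whenever the framework decides to flush. Returns the body of an
  // inline script and clears the queue so nothing is sent twice.
  flush(): string {
    if (this.pending.length === 0) return "";
    // Escape "<" so a payload can never close the surrounding script tag.
    const json = JSON.stringify(this.pending).replace(/</g, "\\u003c");
    this.pending = [];
    return `(window.__TRANSPORT__ ??= []).push(...${json})`;
  }
}

const queue = new TransportQueue();
for (let attempt = 0; attempt < 3; attempt++) {
  queue.enqueue({ id: "query:user", payload: { name: "Ada" } });
}
console.log(queue.flush());
```

In a Next.js app, the string returned by `flush()` would be what a callback registered with useServerInsertedHTML emits inside a script tag; when that callback actually runs is the open timing question.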
And this is where you search the code for hours, or days, or weeks in my case. And at some point you stumble upon this createInsertedHTMLStream thing, which has also changed its name three times by now. Oh no. I wanted to show you the GitHub source code, but the conference wifi... I hope I can get back to the last page. Okay. Does someone know the key combination to get the browser back to the last page?
6. The Importance of Timing and Data Transport
The timing is crucial because the app is already running on the server, but the client can mutate data before it's rendered on the server. This can result in mismatched and confusing data. We aim to transport data over the wire immediately rather than waiting for the next suspense boundary to finish. To achieve this, we transport not only the result but also the information that the query has started on the server. The browser then simulates the request, and the server ensures that the data is sent before the browser retrieves it. This approach helps prevent the client from receiving incorrect data.
You can... Command-Left, Command-Left. Back. We're not going to go into that too much. Apparently, back. Yeah. We're just going to stay here, and since you said it, here's a sticker for you. I have more stickers later, come up to me.
So why is the timing so important? The problem is that the app is already running in the browser while it renders on the server. We are a normalized cache, and that means we have an interactive application that's not really filled with all the data yet. Data is coming in from the server, but the client can already mutate that data on the client. So the client could have newer data than what is actually rendered on the server. All of that is very confusing. For that, I have this wonderful, confusing, and totally not made-for-a-beamer generated diagram, but it's impossible to get it smaller. At the end of my talk, there will be QR codes to a very, very long RFC that explains everything here in detail. The important thing here: this is on the server, this is in the browser. And we are assuming that components render first on the server, then in the browser, not at the same time. So we start the render, we do a query, we get a result, and then something else suspends too. But the browser is already active, and the user does something that changes the cache in the browser. And then we send over the result and we override the newer data, because we send the result over much too late, and we get weird mashed-together data. So for us, it would be important to get data over the wire immediately and not whenever the next suspense boundary finishes, which is the point I actually wanted to talk about when the page didn't load. The point is that data is only transported shortly before the next suspense boundary has finished, and that can take forever or it can be immediate. We have no control over that.
So what can we do when the query starts on the server? We try to not only transport over the result. We also transport over the information that the query has started on the server, because we have more of a chance that this information at least gets over the wire fast, and then the browser already starts to simulate that request. And because Apollo Client has query deduplication, if the browser in the meantime tried to retrieve that same data, it would at least wait for the server to actually send that data over. So we don't get completely bogus mashed-together data on the client. And when that finishes, we resolve that, and that simulated query has done its job. This is all we can do, because as I said, we have no control over when data will actually be transported over.
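The "query started" trick can be modeled in a few lines. This is a hypothetical sketch of the idea, not Apollo Client's internals: `BrowserQueryManager` and its method names are made up for illustration. A "started" message creates a simulated in-flight request, and any browser-side fetch for the same key is deduplicated onto it until the server result arrives.

```typescript
// Hypothetical model of simulated in-flight queries with deduplication.
type Deferred<T> = { promise: Promise<T>; resolve: (value: T) => void };

function deferred<T>(): Deferred<T> {
  let resolve!: (value: T) => void;
  const promise = new Promise<T>((res) => {
    resolve = res;
  });
  return { promise, resolve };
}

class BrowserQueryManager {
  readonly cache = new Map<string, unknown>();
  networkCalls = 0;
  private inFlight = new Map<string, Deferred<unknown>>();

  // The stream told us the server started this query: simulate the request.
  onServerQueryStarted(key: string): void {
    if (!this.inFlight.has(key)) this.inFlight.set(key, deferred<unknown>());
  }

  // The stream delivered the result: fill the cache and settle all waiters.
  onServerQueryResult(key: string, data: unknown): void {
    this.settle(key, data);
  }

  // A component in the browser wants this data.
  fetchQuery(key: string, fetcher: () => Promise<unknown>): Promise<unknown> {
    const existing = this.inFlight.get(key);
    if (existing) return existing.promise; // deduplicated: wait for the server
    this.networkCalls++;
    const entry = deferred<unknown>();
    this.inFlight.set(key, entry);
    fetcher().then((data) => this.settle(key, data));
    return entry.promise;
  }

  private settle(key: string, data: unknown): void {
    this.cache.set(key, data);
    this.inFlight.get(key)?.resolve(data);
    this.inFlight.delete(key);
  }
}

const manager = new BrowserQueryManager();
manager.onServerQueryStarted("user:1");
manager.fetchQuery("user:1", () => Promise.resolve("fetched in browser"));
manager.onServerQueryResult("user:1", "sent by server");
console.log(manager.networkCalls, manager.cache.get("user:1"));
```

The sketch deliberately ignores errors and timeouts; it only shows why getting the small "started" message over the wire early keeps the browser from issuing its own competing request.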
7. Delayed Rehydration and Stream Closure
This delayed rehydration can lead to hydration mismatches. We snapshot the result from the server and transport it individually. If the stream closes too soon, we detect the scenario and restart the queries on the client. The biggest problem is the need for platform-specific or framework-specific APIs. We need a different package for each framework. Extra data is transported over the wire to prevent hydration mismatches.
The other thing is that this delayed rehydration can lead to hydration mismatches. If everything goes well here and we get the data over soon-ish, but then the server keeps rendering for some reason, because Suspense, and cache updates happen here, then the server renders HTML that's outdated, but the browser already has newer HTML. So we get the dreaded rehydration mismatch. We have to get around that too, because those warnings are very irritating to developers, even if it's kind of correct and you want everything to re-render then. So what we are doing is, we snapshot the result from the server. We transport the result of the hook over individually, we render once with that hook result, and then we immediately re-render with the values that are actually correct in the client. And that way we don't get a hydration mismatch, but we just send a ton of data over. But it makes React happy. And that's what being a library maintainer is about: making those trade-offs.
Then there's that last scenario, when the stream closes too soon. We already had those sibling hooks teased, useBackgroundQuery and useReadQuery. useBackgroundQuery is essentially prefetching in a parent component, for useReadQuery to use that ongoing request in a child component, potentially multiple suspense boundaries down. That's very nice. But if that child component renders conditionally, then maybe nothing will suspend. The data will never get transported over, and our stream is already closed. The server has data that it can no longer send over to the client. So we have to detect those scenarios and then restart those queries on the client. So essentially we can work around everything here. This is fine. It works, it's cool. I would prefer not to work around all of those things. I would like real tools we could use, provided by React.
So the current problems, just to sum this up really quickly: the biggest problem is that we need platform-specific or framework-specific APIs. I'm running out of time, so I'm going to be very fast here. This is working in Next.js. We will need another package for RedwoodJS that only changes one line in our package, because the export has a different name. And for every framework after that, we will need another package, because it's not a React internal, it's a framework internal that we're using. The timing is suboptimal, as I've shown you, and for that extra hydration mismatch thing, we have to transport a lot of extra data over the wire. And the stream we can't delay. We can't tell React:
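The "stream closed too soon" detection can be illustrated with a small sketch. The event shape and the `queriesToRestart` helper are hypothetical, not the actual package's protocol: the browser tracks which queries were announced as started, and when the stream closes, whatever never received a result has to be restarted on the client.

```typescript
// Hypothetical event protocol for the data that arrives with the HTML stream.
type StreamEvent =
  | { type: "started"; id: string }
  | { type: "result"; id: string; data: unknown }
  | { type: "closed" };

// Returns the ids of queries that the server started but never delivered
// before the stream closed. An empty list means nothing needs restarting,
// or the stream has not closed yet.
function queriesToRestart(events: StreamEvent[]): string[] {
  const unresolved = new Set<string>();
  for (const event of events) {
    if (event.type === "started") {
      unresolved.add(event.id);
    } else if (event.type === "result") {
      unresolved.delete(event.id);
    } else {
      // Stream closed: everything still unresolved will never arrive.
      return Array.from(unresolved);
    }
  }
  return [];
}

const restart = queriesToRestart([
  { type: "started", id: "user" },
  { type: "started", id: "comments" },
  { type: "result", id: "user", data: { name: "Ada" } },
  { type: "closed" },
]);
console.log(restart);
```

Here `comments` plays the role of a useBackgroundQuery request whose conditionally rendered child never suspended, so its result never made it into the stream.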
8. Remaining Data and Required Functionalities
We need an inject-to-stream method from React and a register-cleanup-handler method to manage ongoing processes and prevent wasted server resources. These should be provided by React to avoid the need for a separate package per framework. Without them, it would be challenging for client libraries to function effectively.
"Hey, we still have data incoming, we will send that in a bit," because Next.js has finished rendering. And we're not going to talk about the bundling story; that's two more hours. So what do we need? We need that inject-to-stream method from React and not from every single framework. We need something like a register-cleanup-handler method where we can say: hey, there's something still going on, please wait for this, or at least stop the request that's not going to be transported to the client anymore, so we don't waste server resources. Both of these need to be provided by React, because we can't keep 200 packages around for every framework. That's not feasible, and that would be the end of the world for every client library.
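To make the wish for a cleanup hook concrete, here is a purely hypothetical sketch. React has no `registerCleanupHandler` API today; `RenderLifecycle`, `ServerRequest`, and `startServerRequest` are invented for illustration, and the sketch covers only the "stop the request" half of the wish, not the "please wait for this" half.

```typescript
// Purely hypothetical: models what a renderer-provided cleanup hook could do.
type CleanupHandler = () => void;

class RenderLifecycle {
  private handlers: CleanupHandler[] = [];
  private finished = false;

  // A library registers work that should stop once the stream is closed.
  registerCleanupHandler(handler: CleanupHandler): void {
    if (this.finished) {
      handler(); // already closed: clean up right away
      return;
    }
    this.handlers.push(handler);
  }

  // The renderer would call this when it closes the response stream.
  finish(): void {
    if (this.finished) return;
    this.finished = true;
    for (const handler of this.handlers.splice(0)) handler();
  }
}

interface ServerRequest {
  key: string;
  cancelled: boolean;
}

// A data library ties each server-side request to the render's lifecycle, so
// results that can no longer be transported do not waste server resources.
function startServerRequest(lifecycle: RenderLifecycle, key: string): ServerRequest {
  const request: ServerRequest = { key, cancelled: false };
  lifecycle.registerCleanupHandler(() => {
    request.cancelled = true;
  });
  return request;
}

const lifecycle = new RenderLifecycle();
const slowRequest = startServerRequest(lifecycle, "comments");
lifecycle.finish();
console.log(slowRequest.cancelled);
```

If something like this lived in React rather than in each framework, one library package could target every framework, which is the argument being made here.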
9. Challenges and Promised Links
Right now, vanilla React users who want to use streaming SSR face challenges in implementing their own framework and building a package around it. Framework authors outside of Next.js have no official guidance and must rely on reading the Next.js source code. This lack of communication and documentation hinders the progress of the entire ecosystem. Despite the experimental nature of the package, it works well and can be used today. QR codes are provided for accessing the source code, the inject into stream RFC, and additional resources on React server components.
And right now we also cannot support vanilla React users that use streaming SSR, because they would have to implement their own framework and then build their own package around their self-made framework for all of this to work. Also, framework authors right now, and a shout-out here, I've been talking to the Redwood people recently, they are doing React server components right now: framework authors, if they're not working for Next.js, have no real guidance. All they can do is go through a few communication channels, where it always feels like you're pushing the other team. There's no official documentation on this. All you can do, essentially, is read the Next.js source code and do something similar. And the Next.js source code relies on very specific timing implementations, on knowledge about React internals that most other people don't have, because they don't have the React team sitting next door. So this just needs a lot more communication and documentation for the whole ecosystem to move forward, and not only those few who managed to get that call with Dan Abramov. Without that call with Dan Abramov that I had in February, I would be thinking about this next year and would still not have figured it out. Despite everything, this works really well. You can use it today. The package still has "experimental" in the name because I'm not happy with the timing, and nobody should be, but it works really well. And if we ignore the blazing fire around us, then we're just going to be happy. And I promised you links. I know that links are not always nice, so these are QR codes. You can check out the source code for this talk. You can check out the inject into stream RFC.
If you want to see those super weird graphs that I showed off in between, there's an endless write-up with all the details and things that I learned about React server components. Also, there was a lot of discussion with Dan in there. So even just reading that discussion, how he explains things to me, might help you understand that more. And there's also just a super ranty blog post about React server components in general. I love them, but I can rant for hours. So yeah, I hope you took something away from all of this. And I hope we have some questions, and if you have questions, I have stickers.
10. Audience Questions and React Server Components
Let's go to the audience questions. Matt asks whether the approach was due to the existing client design and whether it could be simplified by completely changing the client. React server components are praised as amazing, but the lack of education and documentation is criticized. Despite this, React server components are considered worth learning and will be an amazing tool once fully documented.
No, I'm serious. Let's go to the audience questions. And the first question I have is from Matt. Matt is asking: did you end up with this approach due to existing client/hook design, and could this be simplified if you completely changed how the client works? I could rewrite Apollo Client, which is a library that has existed for eight years now, completely change everything, and make the complete user base very angry. I could add some things internally, like for every piece of data I put in a timestamp when it was received, so I can match up these timestamps between client and server and see that data is outdated, and then maybe throw this away and not that. But that also means I need to synchronize time between the server and the browser, which is a problem that has probably not really been solved in this world. So it's probably also useless to try that. But I think it's the best solution we can have for right now. And maybe other libraries will come up with something completely new, but I don't think that existing libraries will come up with something really different. I love how you say "for now", and that's basically true for a lot of things that we're doing in our industry. It's always the best we can do now, and there are not a lot of things that I was super proud of six years ago that I would still do today. So yeah, good addition to your answer.
Next question is from our valued audience member Anonymous. Do you think React is going in the wrong direction with React server components? Seems like a lot of added footguns. React server components are amazing. I absolutely love them, and using them feels like that magical thing I wished I had 10 years ago when I was still doing ASP.NET, and you could write that class that had the click handler, and if you clicked in the browser, it would execute that code on the server. That totally didn't work, but it was a nice promise. And now all of that suddenly starts working, and it's really, really cool. What really isn't cool is the way the education around this happened, because it didn't happen. At some point it was just there, after being experimental for probably around five years or something. There was a demo from a conference you could try, and then you had a production thing. There was nothing in between, and all you can read up on is the Next.js documentation, and that's also still changing around as React server components are changing around. There is no course you can read. The React documentation just came out with hooks, but not with server components. The community would have needed a slow introduction to all of this, and it just got smashed in the face. That's the part I'm not happy about. But React server components, once they are fully documented, including the internals for framework developers and library developers, will be an amazing tool in a toolkit, and it's totally worth learning them. And it's totally worth that additional mental overhead.
All right. Well agreed. Agreed. Um, we're out of time for a Q and a session.