Video Summary and Transcription
This talk discusses HTTP clients, servers, microservices, and maximizing performance in Node.js. It covers topics such as TCP, latency, HTTP keep-alive, pipelining, the Node.js event loop, and timeouts, and introduces the Undici library. The speaker emphasizes the importance of reusing connections, minimizing event-loop blocking, and running benchmarks to measure performance impact. Undici is highlighted as a new client for Node.js that eliminates the need for multiple agents and offers easy configuration options.
1. Introduction to HTTP and Node.js
Today, I'm going to talk about HTTP clients, servers, and how to improve the throughput of our HTTP client in Node.js. As part of my job, I work with cloud servers and have experience building fast and scalable Node.js applications. I also maintain Node.js, which gives me a good grasp of what our users need, and I'm a co-author of Fastify.
Hi, everyone. I am Matteo Collina, and today I'm going to talk to you about HTTP: clients, servers, maybe microservices a little bit, and how we can double or maybe even triple the throughput of our HTTP client in Node.js.
So first, a little bit about me. I am Matteo Collina. I'm part of the Node.js Technical Steering Committee. I'm the co-creator of the Fastify web framework and the Pino logger. I'm a software architect and consultant by trade, and, you know, I'm technical director at NearForm. Follow me on Twitter at @matteocollina.
A couple of notes: I also had maybe 6 billion downloads on npm for the whole of 2020. I don't know, I was totally stunned by this. So maybe I know what I'm talking about, maybe not. Make of it what you want.
So what I do: I typically help companies build fast and scalable Node.js applications. This is one of the key parts of my job. I am also one of the maintainers of Node.js and a key part of its ecosystem, with those 6 billion downloads per year, so I probably have a good grasp of what our users need and what they are complaining about. I need to balance those two things all the time: on one side helping our clients, and on the other maintaining Node.js. This gives me a lot of perspective on what I need to do for the development of Node.js applications and the ecosystem. So the two sides of my job strengthen each other to some extent.
As part of my job, I work most of the time with cloud servers. So there is a client, typically a web browser or a mobile app, that talks to the cloud and, specifically, to one server, which can have multiple instances, but it's still the same thing that runs. It's what we call a monolith. As I said, I'm a Fastify co-author, so shameless plug here: use this thing, it actually works really well. This slide is actually not up-to-date, I'm sorry.
2. Introduction to Microservices and Node Core HTTP
This part discusses the use of a fast web server and framework for Node.js, which is suitable for building both small and large apps, including monoliths and microservices. The speaker highlights the need for microservices to scale teams and avoid overlapping responsibilities. They also address the issue of chattiness in microservices systems and emphasize the importance of communication between microservices. The speaker then introduces the Node Core HTTP as the focus of the presentation, explaining its role as the backing for popular HTTP clients. They discuss the process of creating a TCP socket and the potential latency involved. Additionally, they mention the concept of the congestion window for new sockets.
So, essentially, this is a really fast web server and web framework for Node.js. You can build small and big apps with it, and it works really well, both for monoliths and for microservices.
Now, why would you need microservices? Because you need to scale teams. Microservices are a clear way of scaling teams, so that different teams can maintain different parts of your system without stepping on each other's toes. It's actually great.
However, one of the problems of microservices systems is their chattiness. All the microservices chat a lot with each other, because you often need data that is managed by some other microservice. So you actually have a lot of communication between the various microservices. From time to time, somebody will call this a microservices mesh. And what we are going to focus on most of the time in this presentation is this link between microservices. I've been researching this problem for three, four, five years, something like that, so it has been brewing in my head for some time.
You can have an HTTP server doing anything, but let's consider a very basic one: a very simple server that just waits for a timeout of one millisecond. This simulates a very fast database that always replies with 'hello world' in one millisecond. And an HTTP client: Node core HTTP. Why are we focusing on Node core HTTP? Well, because Axios, node-fetch, request, got, they all use it as their backing. And so every single time you do these things, they create a TCP socket.
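For reference, a minimal sketch of the server described here, with the port as an assumption:

```js
// Minimal sketch of the server from the talk: replies "hello world" after a
// 1 ms delay to simulate a very fast database. The port is illustrative.
const http = require('http');

const server = http.createServer((req, res) => {
  setTimeout(() => {
    res.end('hello world');
  }, 1); // 1 ms simulated database latency
});

server.listen(3000);
```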
So, essentially, when the sender opens up a TCP socket, the two sides need to do a little bit of a dance. It typically takes one full round trip to establish the connection, which is quite a lot, okay? Depending on the physical distance between the two, it can take some time, maybe 10 or 20 milliseconds, something like that. We're talking about small numbers, but remember, you have maybe 200 milliseconds to respond to your client, or maybe 400, whatever your budget is. The more hops you do, the higher your latency gets. And once the three-way handshake has finished, you still haven't transferred any data, right? You have just created the socket. Consider that if you're using TLS or SSL and so on, it takes even longer. But that's not all, because once you create a TCP socket, there is a concept called the congestion window, which starts small for newly established sockets.
3. Understanding TCP and Latency
The server sends bytes to the client, which then needs to acknowledge them. As the congestion window grows, more bytes can be sent without acknowledgement. TCP's success lies in its ability to work on networks with varying bandwidth. The need for acknowledgements arises to prevent data loss, but it introduces latency.
So what happens, as you can see on the left, is that the server sends some bytes, then the client needs to ACK them, then the server sends more bytes, and so on and so forth. Once the congestion window has grown bigger, the server can send a whole lot of bytes without an ACK. This is the reason why TCP is so successful, by the way: it works on very low-bandwidth networks as well as very high-bandwidth ones. Why would you need all those ACKs? Because if you lose some messages in between, the window is the maximum amount of data you will lose. However, this comes at a cost, which is latency.
4. Maximizing Performance with HTTP Keep-Alive
To maximize bandwidth, it is crucial to reuse existing connections in order to avoid losing the work done by the network layer. In Node.js, the HTTP 1.1 feature called keep-alive allows for the reuse of HTTP sockets, which is particularly important for TLS. By using HTTP clients with keep-alive turned on, we can increase the performance and throughput of our applications. To test this, a scenario with one client making 500 parallel requests to a server was used, and the results showed significant improvements. However, these results may vary depending on the system, so it is recommended to run benchmarks to measure the actual impact.
So one of the key takeaways is that if you want the maximum bandwidth, you must reuse existing connections. Once you have established a connection and sent some data, the congestion window grows over time, which is one of the greatest features of TCP. If you don't reuse your connections, you are throwing away all this work that was done for you by the network layer.
So in order to do that in Node.js, you need to use an HTTP 1.1 feature called keep-alive, which you can see here: you set keepAlive to true on the agent, and you can set the maximum number of sockets to keep open. This enables you to keep your HTTP sockets alive and reuse them. It's even more important with TLS, because you can avoid re-establishing the full crypto context, the secure context, between the two sides. So it's actually very, very important for TLS as well.
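As a sketch, the agent configuration looks something like this; the socket limit and URL are illustrative, not prescriptive:

```js
// Sketch of a keep-alive agent as described above.
const http = require('http');

const agent = new http.Agent({
  keepAlive: true, // reuse sockets across requests instead of closing them
  maxSockets: 50,  // cap on concurrently open sockets
});

// Pass the agent on each request so its socket pool is used.
http.get('http://localhost:3000/', { agent }, (res) => {
  res.resume(); // drain the body so the socket can return to the pool
});
```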
So this is the theory. We should be able to increase the performance, the throughput of our applications just by using HTTP clients with keep-alive turned on. Is this the case? Well, let's see. The scenario: we have one client that calls one server with 500 parallel requests on the same route, which is more or less equivalent to 500 parallel inbound requests. The target server takes 10 milliseconds to process each request, and the client has a limit of 50 sockets. This is completely synthetic, okay? It doesn't match your system, so always measure this stuff. Don't trust me. Run your benchmarks.
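A hedged sketch of such a benchmark, assuming a local server on port 3000, might look like this:

```js
// Rough sketch of the scenario above: 500 concurrent GETs through a
// keep-alive agent capped at 50 sockets. Port and route are assumptions.
const http = require('http');

const agent = new http.Agent({ keepAlive: true, maxSockets: 50 });

function get() {
  return new Promise((resolve, reject) => {
    http.get('http://localhost:3000/', { agent }, (res) => {
      res.resume(); // drain the body so the socket can be reused
      res.on('end', resolve);
    }).on('error', reject);
  });
}

async function main() {
  const start = process.hrtime.bigint();
  await Promise.all(Array.from({ length: 500 }, get));
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`500 requests completed in ${ms.toFixed(0)} ms`);
}

main();
```

Running the same script with agent: false, which opens a fresh connection per request, shows the kind of difference the talk is about.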
5. HTTP 1.1 Pipelining and Reliability
Always use an agent. HTTP 1.1 pipelining is an obscure feature that can be used on the server but not with the Node.js HTTP client. It suffers from head-of-line blocking and only works well for small responses. It doesn't work well on unreliable links, but that is less of a problem on reliable data-center links.
Now, this is the difference between the two. If you forget everything else from this talk, remember this: always use an agent. That's it. That's the only thing you need to remember. The difference is so massive that you can't even consider not using one.
But can we still improve things? I've been researching this topic for a while, so I might have some more things to say. Well, yes, we can. There is something called HTTP 1.1 pipelining. Now, this is one of the most obscure features of HTTP 1.1, and people will tell you not to use it, that it's wrong, that browsers don't support it. But it's part of the standard, and you can actually use it on the server. The Node.js HTTP server supports pipelining out of the box; you don't need to do anything to enable it. The Node.js HTTP client, however, does not. So unless you use something else, you won't be able to use this technique.
In HTTP 1.1 pipelining, all responses must be received in order. This means that you suffer from head-of-line blocking: a slow request can stall the pipeline. Essentially, if the first thing you ask for is very slow, all the other requests will be backed up, waiting until the first one finishes. So this is a problem. HTTP pipelining also only works well for small responses, because the problem is always retransmits: the moment you start losing packets, everything goes nuts, and there's not much you can do. It doesn't really work well on unreliable links. However, the links inside our own data centers are actually reliable. If we are calling from one microservice to the next, those connections, those sockets, are very reliable. They don't fail. It's not somebody moving around with their iPhone, hopping between cells so the connection goes on and off. They're very reliable in the data center. So this is much less of a problem there.
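To make the mechanism concrete, here is a purely illustrative sketch, not from the talk, that pipelines two requests over one raw TCP connection to a local HTTP 1.1 server:

```js
// Illustrative only: HTTP/1.1 pipelining by hand over a raw TCP socket.
// Both requests are written back-to-back before any response arrives;
// the server must answer them in order, hence head-of-line blocking.
const net = require('net');

const socket = net.connect(3000, 'localhost', () => {
  socket.write('GET /first HTTP/1.1\r\nHost: localhost\r\n\r\n');
  socket.write('GET /second HTTP/1.1\r\nHost: localhost\r\n\r\n');
});

// Responses arrive strictly in request order on the same socket.
socket.on('data', (chunk) => process.stdout.write(chunk));
```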
6. Node.js Event Loop and Performance
The Node.js event loop works by scheduling I/O to be done asynchronously. The event loop waits for something to happen, then calls into C++, which in turn calls JavaScript. To improve application performance, it is important to minimize the time spent blocking the event loop. Flame graphs can be used to visualize function calls and minimize them.
Now, one more thing to be concerned with. One thing to remember about how the Node.js event loop works: whenever you get a TCP socket, whenever you do any I/O, essentially your JavaScript code schedules some I/O to be done asynchronously, and then a callback is queued on the event loop. In practice, what does this mean? The event loop waits for something to happen, then it calls into C++, and from C++ it calls JavaScript. When the JavaScript finishes, it processes next ticks and promises and so forth, returns to C++, and then the event loop starts waiting again. Now, there is a window between those two points where the event loop is blocked: while the C++ and JavaScript functions are executing. What does this mean? It means that if we want to improve the performance of our applications, we need to minimize the time during which we are blocking the event loop. This is the key strategy to increase throughput. And you can use flame graphs to visualize the functions that block the event loop and minimize them. It's pretty great; it works very well.
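As an aside not shown in the talk, one way to watch for event-loop blocking is Node's built-in event-loop delay histogram:

```js
// Sketch (not from the talk): observing event-loop blocking with the
// built-in perf_hooks event-loop delay histogram.
const { monitorEventLoopDelay } = require('perf_hooks');

const histogram = monitorEventLoopDelay({ resolution: 10 });
histogram.enable();

setInterval(() => {
  // Values are in nanoseconds; convert to milliseconds for readability.
  const p99 = histogram.percentile(99) / 1e6;
  console.log(`event loop delay p99: ${p99.toFixed(2)} ms`);
  histogram.reset();
}, 1000);
```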
7. Timeouts, Echo Resets, and Introducing Undici
Now, this is one of the problems we are trying to solve: timeouts and ECONNRESET errors. If you use agents, you might end up having ECONNRESETs and timeouts. The problem is that a socket might die, and you want to minimize this. By default, Node.js used a FIFO strategy, but recently a new scheduling strategy called LIFO was added to the HTTP agent in Node.js. The LIFO approach minimizes timeouts and ECONNRESETs by reusing the sockets that worked most recently. The speaker then introduces a new library called Undici, whose name comes from HTTP 1.1: eleven translates to 'undici' in Italian.
Now, this is the next problem we are trying to solve, and this is the bonus point: timeouts and ECONNRESET errors. If you use agents, you might end up having ECONNRESETs and timeouts. The problem is that the sockets in the pool might die, and if they die, you might want to reschedule the requests.
If they die, it's a problem, because it can happen that I say, oh, I'm sending data on this socket, I'm trying to reuse it, and when I use it, it dies. It's not available anymore, and I get an ECONNRESET. It's really bad. So you want to minimize this. By default, the original strategy for Node.js was FIFO, first in, first out, which means it rotated through all the sockets to try to create the least number of them.
Now, the problem with that approach is that the oldest sockets are the ones most likely to time out, because they are old. So, recently, we added a new scheduling strategy for the HTTP agent in Node.js. It's called LIFO: last in, first out, meaning we try to reuse the socket that was used most recently. This means we are actually going to create more sockets, because essentially we let more sockets expire. However, the LIFO approach minimizes the number of timeouts and ECONNRESETs, because you reuse the sockets that worked the last time. You know that they work, they're right there, so it's way, way more probable that your request goes through.
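In code, as a small sketch, the strategy is an option on the agent; note that recent Node.js versions default to LIFO:

```js
// Sketch: selecting the LIFO socket-scheduling strategy on a keep-alive
// agent. The socket limit is illustrative.
const http = require('http');

const agent = new http.Agent({
  keepAlive: true,
  maxSockets: 50,
  scheduling: 'lifo', // reuse the most recently released socket first
});
```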
Okay, so now that we know all of these things, let me introduce you to a new library called Undici. What is Undici? Well, Undici comes from HTTP 1.1: read the 1.1 as eleven, and eleven in Italian is 'undici'. So you can translate eleven to undici. That's why Undici.
8. Introduction to Undici
Undici is a brand-new client for Node.js built on Node core internals. It supports both HTTP and HTTPS, uses LIFO scheduling, and allows unlimited connections by default. It eliminates the need for multiple agents and provides easy configuration options.
And you know, it's also totally a Stranger Things reference, in case you're wondering. So what does Undici do? Undici is a brand-new client for Node.js, implemented from scratch using only Node core internals. It's great. You can just use it with a global agent that keeps your connections alive by default. It uses LIFO scheduling, no pipelining by default, and unlimited connections, so it will work more or less the way you're used to. It supports both HTTP and HTTPS at the same time; there's no need for shenanigans with multiple agents and so on. It just does it all. And that's pretty cool. You can configure the agent, or you can use the library directly. It works really well, from my point of view.
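A minimal sketch of using Undici follows; the URL, connection count, and pipelining depth are assumptions for illustration, so check the Undici docs for the API in your version:

```js
// Sketch: basic usage of undici.
const { request, Pool } = require('undici');

async function main() {
  // Simple request through undici's global dispatcher (keep-alive by default).
  const { statusCode, body } = await request('http://localhost:3000/');
  console.log('status:', statusCode);
  console.log('body:', await body.text());

  // Or a dedicated pool with explicit connection and pipelining limits.
  const pool = new Pool('http://localhost:3000', {
    connections: 50,
    pipelining: 10,
  });
  const res = await pool.request({ path: '/', method: 'GET' });
  console.log('pooled status:', res.statusCode);
  await res.body.text(); // drain the body
  await pool.close();
}

main();
```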