Video Summary and Transcription
Swift responsiveness is essential, and LLRT is a new JavaScript runtime optimized for serverless environments that offers improved performance and cost savings compared to other runtimes. LLRT achieves its speed by removing complexity, leveraging Rust, and optimizing the AWS SDK for Lambda. It starts almost six times faster than Node.js, and in the benchmark shown it delivers a 2.9x cost saving and a 3.7x time saving compared to Node.js.
1. Introduction to LLRT
Swift responsiveness is essential. Serverless services like AWS Lambda sometimes introduce latency. LLRT is a new JavaScript runtime specifically tailored for serverless environments. LLRT does not incorporate a just-in-time compiler, conserving CPU and memory resources. LLRT offers virtually negligible cold starts and supports ECMAScript 2020 with many Node.js APIs.
Hello, everyone. In today's world of modern applications, swift responsiveness is essential. Users expect an excellent experience where every action triggers an immediate response.
Serverless services such as AWS Lambda allow developers to build modern applications without the need to provision servers or additional infrastructure. However, these services sometimes introduce a bit of latency when provisioning a new execution environment to run the customer's code. This is referred to as a cold start. And even though production metrics show that cold starts typically occur for less than 1% of all invocations, sometimes even less, they can still be disruptive to the seamless user experience that we're targeting.
What if I told you that there is a solution to cold starts? What if I told you that you can run JavaScript applications on AWS Lambda with virtually negligible cold starts? My name is Richard Davison. I work as a partner solution architect, helping partners to modernize their applications on AWS using serverless and container technologies. And I am here to talk about the project that I've been building for some time called LLRT and how it redefines serverless latency.
So LLRT is short for Low Latency Runtime. And it's a new JavaScript runtime built from the ground up to address the growing demand for fast and efficient serverless applications. Why should we build a new JavaScript runtime? So JavaScript is one of the most popular ways of building and running serverless applications. It also often offers full stack consistency, meaning that your application backend and frontend can share a unified language, which is an added benefit. JavaScript also offers a rich package ecosystem and a large community that can help accelerate the development of your applications. Furthermore, JavaScript is recognized as being rather user friendly in nature, making it easy to learn, easy to read and easy to write. It is also an open standard known as ECMAScript, which has been implemented by different engines, which is something that we will discuss later in this presentation.
So how is LLRT different from Node, Bun, and Deno? What justifies the introduction of another JavaScript runtime in light of these existing alternatives? Node, Bun, and Deno represent highly proficient JavaScript runtimes. They are extremely capable and very performant. However, they're designed with general-purpose applications in mind, and they were not specifically tailored for the demands of serverless environments, which are often characterized by short-lived runtime instances with limited resources. They also each depend on a just-in-time compiler, a very sophisticated technological component that allows JavaScript code to be dynamically compiled and optimized during execution. And while a just-in-time compiler offers substantial long-term performance advantages, it carries computational and memory overhead, especially in an environment with limited resources. In contrast, LLRT distinguishes itself by not incorporating a just-in-time compiler, a strategic decision that yields two significant advantages. The first is that a just-in-time compiler is a notably sophisticated technological component, introducing increased system complexity and contributing substantially to the runtime's overall size. The second is that without the JIT overhead, LLRT conserves both CPU and memory resources that can be more effectively allocated to executing the code inside your Lambda function, thereby reducing application startup times. So again, a just-in-time compiler offers a substantial long-term performance increase, whereas the lack of one offers startup benefits. LLRT is built from the ground up with a primary focus: performance on AWS Lambda. It comes with virtually negligible cold starts; cold start duration is less than 100 milliseconds for a lot of use cases and tasks, even when making AWS SDK v3 calls.
It uses a rather recent standard of ECMAScript, so ECMAScript 2020, with many Node.js APIs.
2. LLRT Performance
LLRT is a JavaScript runtime that offers improved performance and cost savings compared to other runtimes. It uses a lightweight engine called QuickJS, which is less than one megabyte in size. LLRT is built in Rust and adheres to the Node.js specification. In a demo, LLRT performed significantly faster and consumed less memory compared to Node.js.
The goal of this is to make migration from Node.js as simple as possible. LLRT comes with what we call batteries included: the binary itself has some AWS SDK v3 clients already embedded, so you don't need to ship and provide those, which also has performance benefits. And speaking of performance benefits, there is also a cost benefit. More stable performance, mainly due to the lack of a just-in-time compiler, can lead to up to a 2x performance improvement versus other JavaScript runtimes and a 2x cost saving, even for warm starts.
So what makes this so fast? What is under the hood? LLRT uses a different JavaScript engine compared to Deno or Bun. Deno and Bun use the engines V8 and JavaScriptCore: V8 comes from the Chrome team, which created it for the Chrome browser, whereas Bun uses JavaScriptCore, which comes from Safari's WebKit. QuickJS, on the other hand, is a very lightweight engine. It's very capable, but it's also very lightweight: the engine itself, when compiled, is less than one megabyte. Compare this with JavaScriptCore and V8, which are over 50 megabytes inside Bun and Node. LLRT is also built in Rust, using the Tokio asynchronous runtime. Many of the APIs implemented inside the runtime adhere to the Node.js specification and are implemented in Rust. The whole executable is less than three megabytes, and that includes the AWS SDK.
I think it's time to take a look at a quick demo to see how it performs in action. Here I am inside the AWS Lambda console. In this example, I have imported the DynamoDB client and the DynamoDB document client to put events that come into AWS Lambda onto DynamoDB. I also add a randomized ID and stringify the event, and I simply return a status code of 200 and OK. Let's first execute this using the regular Node.js 20 runtime, and this time we see a cold start. So let's go to the test tab here and hit the test button. Now it has been executed, and if we examine the execution logs, we can see that Node.js executed with a duration of 988 milliseconds and an init duration of 366 milliseconds. In total, this is a little over 1.3 seconds, and we consumed almost 88 megabytes of memory by doing so. What I'm going to do now is go back to the code, scroll down to runtime settings, click on edit, and change to the Amazon Linux 2023 OS-only runtime. Save it, and now let's execute it with LLRT. As you can see, this was almost instant. Examining the execution logs, we see a duration of 29 milliseconds and an init duration of 38, giving a total duration of about 69 milliseconds. So 69 milliseconds versus 1,300 or slightly above for Node.js. While doing so, we only consumed about 20 megabytes of memory.
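The handler described in the demo can be sketched roughly as follows. This is a hedged reconstruction, not the exact code from the talk: the table name, item shape, and the `makeHandler` factory are assumptions. The document client is injected so the logic can be exercised without AWS credentials; in a real function you would pass in a `DynamoDBDocument` client from `@aws-sdk/lib-dynamodb`, which LLRT bundles.

```javascript
// Hedged sketch of the demo handler: put the incoming event on DynamoDB
// with a randomized ID, then return 200 OK.
const makeHandler = (docClient, tableName) => async (event) => {
  const id = Math.random().toString(36).slice(2); // randomized ID
  await docClient.put({
    TableName: tableName,
    Item: { id, event: JSON.stringify(event) }, // stringify the incoming event
  });
  return { statusCode: 200, body: "OK" };
};
```

A stub client with a `put` method is enough to unit-test this shape locally before deploying it behind either the Node.js or the LLRT runtime.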
3. LLRT Benefits
LLRT offers fast performance for warm starts and is optimized for latency-critical applications, data transformation, and integration with AWS services. It is not suitable for simulations or operations involving large sets of data. LLRT achieves speed by removing complexities, leveraging Rust, and optimizing the AWS SDK for Lambda. The runtime is lightweight and written mostly in Rust. It lacks a Just-in-Time compiler but provides instant performance benefits without relying on JIT profiling.
And notice that if I run the code again, for warm starts, it's also very fast. We have 45 milliseconds here, then 16, 13, 14, 9, and so on. So there's no sacrifice in warm performance either; in fact, duration can be up to two times lower than the Node.js equivalent, mainly due to the lack of a just-in-time compiler and a simpler engine with less complexity. Also notice that I didn't change a single line of code. All I did was change the runtime settings, and I prepared this demo by putting the LLRT bootstrap binary here. So I simply downloaded LLRT, renamed the binary to bootstrap, and put it together with my sample code.
Okay, let's get back to the presentation. What are good use cases for LLRT? Good use cases are latency-critical applications, high-volume functions, data transformation, integration with different AWS services, and even server-side rendered React applications. Also applications consisting of a lot of glue code, by which I mean applications that integrate with third-party sources or other AWS services: the glue between one service and the other. Where it's not good to use LLRT is when you're doing simulations, handling hundreds or thousands of iterations in loops, doing some sort of multicast operations, or transferring large objects or large sets of data in tens or even hundreds of megabytes. This is where the just-in-time compiler really shines, and that is a feature that is not available in LLRT.
But what is best right now is to measure and see, and I'm pretty confident that a lot of your use cases would benefit from running LLRT. And again, how can it be so fast? It has no JIT, and the AWS SDK is optimized specifically for Lambda. This means that we have removed some of the complexities that surround the AWS SDK: we cache object creation, we convert the SDK to QuickJS bytecode, and we leverage other techniques that optimize for cold starts on Lambda. For instance, we do as much work as possible during initialization, because Lambda runtimes get a CPU boost while being initialized. We also write most of our code in Rust. In fact, we have a policy that says as much as possible should be written in Rust: the more code we can move from JavaScript to Rust, the bigger the performance benefit. In contrast, almost all of Node.js's APIs are written in JavaScript, and they heavily depend on the just-in-time compiler of the V8 engine to achieve great performance. Since we lack this capability, writing most of the code in Rust gives us performance benefits while still keeping the size down: an instant performance benefit without having to rely on the JIT profiler to optimize the code over longer-running tasks. Basically, everything that you're using in LLRT is written in Rust: the console, the timers, crypto, hashing, all of that. There's just a small JavaScript layer on top, and of course your code will be running as JavaScript as well. It's also, again, very lightweight: only a few megabytes, and we try to keep it as lightweight as possible, minimizing dependencies but also minimizing complexity. So what's the catch? This is a very high-level compatibility matrix.
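The "do as much work as possible during initialization" technique described above is a general Lambda pattern, not LLRT-specific, and can be sketched like this (names and the setup contents are illustrative assumptions): expensive setup runs once at module scope, during the CPU-boosted init phase, and warm invocations reuse the result.

```javascript
// Sketch of init-time work: module-scope setup runs once per execution
// environment, while Lambda is boosting CPU for initialization.
const expensiveSetup = () => ({
  tableName: process.env.TABLE_NAME ?? "Events", // e.g. config resolution
  createdAt: Date.now(),                         // e.g. pre-computed state
});

// Runs once, during init; not on every request.
const shared = expensiveSetup();

// Warm invocations reuse `shared` instead of rebuilding it per request.
const handler = async (event) => {
  return { statusCode: 200, body: `table=${shared.tableName}` };
};
```

The same pattern is what makes caching SDK object creation pay off: the cost is absorbed during init rather than on the latency-critical invocation path.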
4. LLRT Usage
LLRT has trade-offs in terms of supported Node.js APIs. Not every API is fully supported, but the runtime is under constant development and available as a beta. To use LLRT, download the latest release from the GitHub page, add the bootstrap executable alongside your code, and select the custom runtime on Amazon Linux 2023 inside Lambda. LLRT runs on Arm or x86-64 instances, with a slight benefit to using Arm for cost savings and better performance.
You can see there's an exclamation mark here and a few checkmarks. So obviously, there have to be some trade-offs in order to achieve this level of performance. And the trade-off is that not every Node.js API is supported, though we support some of them. And even those are not always fully supported. So even though there's a checkmark here, it doesn't mean that, for instance, the full FS module or FS promises module is supported; it's partially supported. But we're constantly building this runtime, and it's available as a beta today that you can check out. I will have links to it later in this presentation.
And how do you use it? As you saw in the demo, I just download the latest release from the GitHub page, github.com/awslabs/llrt. I add the bootstrap executable together with my code. You can also use a layer, if that's your thing, or package it as a container image. I then select custom runtime on Amazon Linux 2023 inside Lambda as my runtime choice. LLRT runs on either Arm or x86-64 instances. There's a slight benefit to using Arm: a cost saving, and also slightly better performance. So this is something that I recommend.
5. LLRT Performance Analysis
LLRT starts almost six times faster than Node.js, showcasing the lightness of the engine. Benchmark numbers show a 23 times performance improvement for best-case cold starts and a 15 times improvement for the worst case. LLRT introduced only 109 cold starts compared to Node.js's 554. The warm-start duration span, from fastest to slowest at p99, is 158 milliseconds for Node.js versus 29 milliseconds for LLRT. LLRT provides a cost saving of 2.9 times and a time saving of 3.7 times compared to Node.js. Please test LLRT, but remember it's still experimental.
Now let's take a look at some benchmark data. As we saw in the demo, a very quick sample showed that the cold start benefits, and also the warm start benefits, were significant versus Node.js. This slide showcases some startup benefits when running on my local machine. As you can see here, highlighted by the arrow, LLRT starts almost six times faster than Node.js. This is a pretty unexciting demo where we basically just do a print, but it showcases the lightness of the engine: it doesn't have to load a lot of resources in order to start. So it can be even faster than Deno and Bun. But bear in mind that a lot of this speed comes from simplicity. It's very simple to introduce a new runtime with a limited API and say it's faster. But this is one of the trade-offs, right? We make it very lightweight, hence it's also naturally faster.
Let's now take a quick look at some performance numbers when running LLRT for a longer period of time. This is again doing a DynamoDB PUT, the same sample code that we saw in the demo, but now running for 32,000 invocations on ARM64 with 128 megabytes of memory. Notice the p99 latency, meaning that 99% of all invocations are below this number: 34 milliseconds for warm starts and 84 milliseconds for cold starts. In comparison, the fastest warm start is only 5.29 milliseconds and the fastest cold start is 48.85 milliseconds. If you compare this with Node.js 18, we have a p99 latency of 164 milliseconds for warm starts and 1,306 milliseconds for cold starts, and for the fastest times, 5.85 milliseconds for a warm start and 1,141 milliseconds for a cold start. This means there is a 23 times performance improvement for this exact demo for best-case cold starts, and a 15 times improvement for the worst case: best case versus best case, worst case versus worst case. Also notice the number of cold starts. In Lambda, even if cold starts are not super critical for your application, keeping them shorter also makes them less likely to occur: every time Lambda has to process two concurrent events and there is no ready instance available, it has to spin up a new one, introducing an additional cold start. In my example, LLRT introduced only 109 cold starts versus 554 for Node.js. Again, because the process is much shorter, cold starts are also less likely to occur in the first place. Also notice the duration span of the warm starts: a 158-millisecond spread from the fastest invocation to the slowest at p99 for Node.js, versus only 29 milliseconds for p99 with LLRT.
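The quoted speedups can be checked directly against the numbers above. A small sanity-check, assuming "23x" compares the fastest cold starts and "15x" compares the p99 (worst-case) cold starts:

```javascript
// Cold-start figures from the benchmark, in milliseconds.
const fastestColdMs = { llrt: 48.85, node: 1141 }; // best case each
const p99ColdMs = { llrt: 84, node: 1306 };        // worst case (p99) each

const bestCaseSpeedup = fastestColdMs.node / fastestColdMs.llrt; // ~23x
const worstCaseSpeedup = p99ColdMs.node / p99ColdMs.llrt;        // ~15.5x
console.log(bestCaseSpeedup.toFixed(1), worstCaseSpeedup.toFixed(1));
```

Both ratios line up with the 23x and 15x figures quoted in the talk.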
And again, this is due to the lack of a just-in-time compiler, making the execution much more consistent. If we take a look at the latency and cost breakdown, we can see a billed duration of 22 minutes and 19 seconds for Node.js, versus only 7 minutes and 48 seconds for LLRT, which translates to a cost saving of 2.9 times and a time saving of 3.7 times. The reason these two figures differ is how they're billed: provided (custom) runtimes are charged a bit differently in Lambda than managed runtimes. But we still have a cost saving of 2.9x for this particular example over 32,000 invocations.
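The 2.9x cost-saving figure follows from the billed durations quoted above, assuming cost scales linearly with billed duration at a fixed memory size:

```javascript
// Billed durations from the benchmark, converted to seconds.
const billedSecondsNode = 22 * 60 + 19; // 22 min 19 s
const billedSecondsLlrt = 7 * 60 + 48;  // 7 min 48 s

const costSaving = billedSecondsNode / billedSecondsLlrt;
console.log(costSaving.toFixed(1)); // → "2.9"
```

The separate 3.7x time-saving figure is not derivable from these two numbers alone, since custom and managed runtimes are billed differently (for example, init duration is treated differently).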
And that's it for me. I highly encourage you to test LLRT; you can follow the QR link here. It's still a very experimental runtime, so do not run it in production just yet, but we're building more capabilities every day, and we hope that you provide feedback. Again, I'm very thankful that you took the time to listen to me today.