The Need for Speed: How AWS New JS Runtime is Redefining Serverless Latency

In today’s world of modern applications, swift responsiveness is essential. Users expect seamless interactions where every action triggers an immediate response.

Serverless services such as AWS Lambda allow developers to build modern applications without the need to manage traditional servers or infrastructure. However, serverless services can introduce additional latency when new execution environments are provisioned and because, by design, they have fewer resources than traditional servers or containerized environments.

To mitigate this problem, AWS has developed an experimental JavaScript runtime, called LLRT, built from the ground up for a serverless environment. LLRT (Low Latency Runtime) is a lightweight JavaScript runtime designed to address the growing demand for fast and efficient serverless applications. LLRT offers more than 10x faster startup and up to 2x lower overall cost compared to other JavaScript runtimes running on AWS Lambda.

In this session you will discover how it's different from what's already out there, see its performance in action and learn how to apply it to your Serverless functions.

This talk was presented at Node Congress 2024.

FAQ

AWS Lambda is a serverless computing service that allows developers to run code without provisioning or managing servers. It automatically scales applications by running code in response to triggers such as changes in data, shifts in system state, or user actions.

A cold start in AWS Lambda refers to the latency introduced when provisioning a new execution environment to run the user code. This typically occurs in less than 1% of all invocations but can affect the seamless user experience.

LLRT is different from other JavaScript runtimes because it does not incorporate a just-in-time (JIT) compiler, which reduces system complexity and conserves CPU and memory resources. This design choice makes LLRT particularly suitable for serverless environments with limited resources and frequent cold starts.

LLRT was created to address the growing demand for fast and efficient serverless applications. JavaScript is popular for building serverless applications, but existing runtimes like Node.js are not optimized for serverless environments. LLRT aims to provide a solution with virtually negligible cold starts and better performance.

LLRT offers significant performance benefits, including virtually negligible cold starts (less than 100 milliseconds for many tasks) and up to 2x performance improvement compared to other JavaScript runtimes. It is also more cost-effective, with a potential 2x cost-saving for both cold and warm starts.

LLRT is ideal for latency-critical applications, high-volume functions, data transformation, integration with AWS services, and server-side rendered React applications. It excels in scenarios requiring quick startup times and efficient resource usage.

LLRT is not recommended for tasks involving simulations, handling large data sets, or performing thousands of iterations in loops, as these scenarios benefit more from a just-in-time compiler, which LLRT lacks.

To get started with LLRT, download the latest release from its GitHub page, package the bootstrap executable together with your code, and select the custom runtime on Amazon Linux 2023 inside Lambda. LLRT supports both ARM and x86-64 instances, with a slight performance and cost benefit for ARM.

As of now, LLRT is still in beta and not recommended for production use. The project is actively being developed, with new capabilities being added regularly. Users are encouraged to test it and provide feedback.

LLRT, or Low Latency Runtime, is a new JavaScript runtime specifically built to minimize cold starts and improve performance for serverless applications on AWS Lambda. It is designed to be lightweight and efficient, using a different JavaScript engine called QuickJS, and is written largely in Rust.

Richard Davison
25 min
04 Apr, 2024

Video Summary and Transcription
Serverless services like AWS Lambda allow developers to build modern applications without provisioning servers or additional infrastructure. LLRT is a low latency runtime designed specifically for serverless environments and JavaScript applications. LLRT uses a lightweight JavaScript engine called QuickJS, achieving fast execution and performance with minimal memory consumption. LLRT is ideal for latency-critical applications, high-volume functions, and integration with AWS services. It significantly improves performance, reducing cold starts and providing consistent warm start times. Users are encouraged to test LLRT and contribute to its development.

1. Introduction to LLRT

Short description:

Serverless services like AWS Lambda allow developers to build modern applications without provisioning servers or additional infrastructure. However, cold starts can introduce latency. LLRT is a low latency runtime designed specifically for serverless environments and JavaScript applications. LLRT does not incorporate a just-in-time compiler, conserving CPU and memory resources and reducing application startup times. It offers virtually negligible cold starts and uses ECMAScript 2020 with many Node.js APIs.

Hello, everyone. In today's world of modern applications, swift responsiveness is essential. Developers expect an excellent experience where every action triggers an immediate response. Serverless services such as AWS Lambda allow developers to build modern applications without the need to provision any servers or additional infrastructure.

However, these services sometimes add a bit of latency when provisioning a new execution environment to run the customer code. This is sometimes referred to as a cold start. And even though production metrics show that cold starts typically occur for less than 1% of all invocations, and sometimes even less, they can still be a bit disruptive to the seamless user experience that we're targeting.
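One common way to observe the cold starts described here is a module-scope flag: module-level code runs once per execution environment, and every later invocation served by that environment reuses it. The following is a minimal sketch in TypeScript. The name `consumeColdStartFlag` and the response shape are illustrative assumptions, not something shown in the talk or provided by an AWS API.

```typescript
// Module scope is evaluated once per execution environment, so this flag
// is true only for the first invocation that environment serves.
let isColdStart = true;

// Returns true exactly once per environment, then false for warm invocations.
// (Hypothetical helper name, used here for illustration.)
function consumeColdStartFlag(): boolean {
  const wasCold = isColdStart;
  isColdStart = false;
  return wasCold;
}

// In a deployed function this would be exported as the Lambda handler.
async function handler(_event: unknown): Promise<{ statusCode: number; body: string }> {
  const coldStart = consumeColdStartFlag();
  // Report whether this invocation paid the cold start cost.
  return { statusCode: 200, body: JSON.stringify({ coldStart }) };
}
```

Logging or returning this flag makes it possible to check what share of invocations are actually cold for a given workload.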

What if I told you that there is a solution to cold starts? What if I told you that you can run JavaScript applications on AWS Lambda with virtually negligible cold starts?

My name is Richard Davison. I work as a partner solution architect, helping partners to modernize their applications on AWS using serverless and container technologies. And I am here to talk about the project that I've been building for some time called LLRT and how it redefines serverless latency.

So LLRT is short for low latency runtime. And it's a new JavaScript runtime built from the ground up to address the growing demand for fast and efficient serverless applications. Why should we build a new JavaScript runtime? So JavaScript is one of the most popular ways of building and running serverless applications. It also often offers full stack consistency, meaning that your application back end and front end can share a unified language, which is an added benefit. JavaScript also offers a rich package ecosystem and a large community that can help accelerate the development of your applications. Furthermore, JavaScript is recognized as being rather user-friendly in nature, making it easy to learn, easy to read and easy to write. It is also an open standard known as ECMAScript, which has been implemented by different engines, which is something that we will discuss later in this presentation.

So how is LLRT different from Node, Bun, and Deno? What justifies the introduction of another JavaScript runtime in light of these existing alternatives? So Node, Bun, and Deno represent highly proficient JavaScript runtimes. They are extremely capable and they are very performant. However, they're designed with general purpose applications in mind, and these runtimes were not specifically tailored for the demands of serverless environments, often characterized by short-lived runtime instances with limited resources. They also each depend on a just-in-time compiler, a very sophisticated technological component that allows the JavaScript code to be dynamically compiled and optimized during execution. While a just-in-time compiler offers substantial long-term performance advantages, it often carries computational and memory overhead, which is especially costly when resources are limited.

So in contrast, LLRT distinguishes itself by not incorporating a just-in-time compiler, which is a strategic decision that yields two significant advantages. The first one is that, again, a just-in-time compiler is a notably sophisticated technological component, introducing increased system complexity and contributing substantially to the runtime's overall size. The second is that, without that JIT overhead, LLRT conserves both CPU and memory resources that can be more effectively allocated towards executing the code that you run inside of your Lambda function, thereby reducing application startup times. So again, a just-in-time compiler would offer a substantial long-term performance increase, whereas the lack of a just-in-time compiler can offer startup benefits.
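The trade-off between startup cost and long-term throughput can be made concrete with a simple linear cost model. This is only a sketch: the `RuntimeProfile` type, the function names, and all numbers below are made up for illustration and are not measurements of LLRT, Node.js, or any other runtime.

```typescript
// Linear cost model: total time = one-time startup cost + iterations * per-iteration cost.
interface RuntimeProfile {
  startupMs: number;      // one-time cost paid on a cold start
  perIterationMs: number; // steady-state cost of one unit of work
}

function totalMs(profile: RuntimeProfile, iterations: number): number {
  return profile.startupMs + iterations * profile.perIterationMs;
}

// Number of iterations after which the JIT-based runtime becomes faster overall.
function breakEvenIterations(interpreter: RuntimeProfile, jit: RuntimeProfile): number {
  const startupGap = jit.startupMs - interpreter.startupMs;
  const perIterationGain = interpreter.perIterationMs - jit.perIterationMs;
  if (perIterationGain <= 0) return Infinity; // the JIT never catches up
  return Math.max(0, startupGap / perIterationGain);
}

// Hypothetical numbers, chosen only to show the shape of the trade-off.
const interpreterOnly: RuntimeProfile = { startupMs: 50, perIterationMs: 0.02 };
const withJit: RuntimeProfile = { startupMs: 350, perIterationMs: 0.005 };

const breakEven = breakEvenIterations(interpreterOnly, withJit);
console.log(`break-even at about ${Math.round(breakEven)} iterations`);
console.log(`1,000 iterations: ${totalMs(interpreterOnly, 1000)} ms vs ${totalMs(withJit, 1000)} ms`);
```

Under this model, short-lived functions that do little work per invocation stay below the break-even point, which is where a runtime without a just-in-time compiler has the advantage; long-running, loop-heavy work ends up above it.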

LLRT is built from the ground up with one primary focus: performance on AWS Lambda. It comes with virtually negligible cold starts, and cold start duration is less than 100 milliseconds for a lot of use cases and tasks, even when making AWS SDK v3 calls. It uses a rather recent ECMAScript standard, ECMAScript 2020, with many Node.js APIs. And the goal of this is to make migration from Node as simple as possible.

2. LLRT Performance and Demo

Short description:

LLRT has some AWS SDK v3 clients embedded, leading to performance benefits and cost savings. It uses a lightweight JavaScript engine called QuickJS, which is less than one megabyte in size compared to over 50 megabytes for engines like V8 and JavaScriptCore. LLRT is built in Rust, adhering to Node.js specifications, and has a total executable size of less than three megabytes. A demo in the AWS Lambda console shows a cold start duration of over 1.2 seconds with the regular Node.js 20 runtime, consuming almost 88 megabytes of memory.

It comes with what we call batteries included. So the LLRT binary itself has some AWS SDK v3 clients already embedded, so you don't need to ship and provide those, which also has performance benefits. And speaking of performance benefits, there is also a cost benefit. More stable performance, mainly due to the lack of a just-in-time compiler, can lead to up to a 2x performance improvement versus other JavaScript runtimes, and a 2x cost saving, even for warm starts.

So what makes this so fast? What is under the hood? So it uses a different JavaScript engine compared to Deno or Bun. Deno and Bun use engines called V8 and JavaScriptCore. V8 comes from the Chrome browser and the Chrome team. So the Chrome team has created a JavaScript engine for its browser called V8, whereas Bun uses an engine called JavaScriptCore that is derived from Safari. But QuickJS, on the other hand, is a very lightweight engine. It's very capable, but it's also very lightweight. So the engine itself, when compiled, is less than one megabyte. If you compare this with both JavaScriptCore and V8, they're over 50 megabytes inside of Node and Bun. LLRT is also built in Rust, using the Tokio asynchronous runtime. Many of the APIs that are implemented inside of the runtime adhere to the Node.js specification and are implemented in Rust. The whole executable itself is less than three megabytes, and that is including the AWS SDK.

I think it's time to take a look at a quick demo to see how it performs in action. So here I am inside of the AWS Lambda console. In this example, I have imported the DynamoDB client and the DynamoDB document client to take an event that comes into AWS Lambda and put it in DynamoDB. I also add a randomized ID and stringify the event, and I simply return a status code of 200 and OK. Let's now first execute this using the regular Node.js 20 runtime. This time we see a cold start. So let's go to the test tab here and hit the test button. Now it has been executed. And if we examine the execution logs here, we can see that Node.js executed with a duration of 988 milliseconds and an init duration of 366 milliseconds. So in total, this is a little over 1.3 seconds, actually. And we consumed almost 88 megabytes of memory while doing so. What I'm going to do now is go back to the code. I scroll down to runtime settings, click on edit and change to Amazon Linux 2023, the OS-only runtime. Save it.
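The handler described in the demo could look roughly like the sketch below. The talk does not show the source in this transcript, so the item field names, the table name, and the `PutItem` abstraction are assumptions. The write is passed in as a function so the sketch runs without AWS access; in a real function it would wrap `DynamoDBDocumentClient` and `PutCommand` from `@aws-sdk/lib-dynamodb`.

```typescript
// Shape of the stored item; the field names are assumptions for illustration.
interface EventItem {
  id: string;
  content: string;
}

// Hypothetical stand-in for the document client's put operation.
type PutItem = (tableName: string, item: EventItem) => Promise<void>;

// Build the item as described in the demo: a randomized ID plus the stringified event.
function buildItem(event: unknown): EventItem {
  return {
    id: Math.random().toString(36).slice(2),
    content: JSON.stringify(event),
  };
}

// Create a handler that stores the incoming event and returns 200 / OK.
function createHandler(putItem: PutItem, tableName: string) {
  return async (event: unknown): Promise<{ statusCode: number; body: string }> => {
    await putItem(tableName, buildItem(event));
    return { statusCode: 200, body: "OK" };
  };
}

// Usage with an in-memory fake instead of DynamoDB ("demo-table" is a made-up name).
const saved: EventItem[] = [];
const handler = createHandler(async (_table, item) => {
  saved.push(item);
}, "demo-table");
```

Keeping the storage call behind a small function type also makes the same handler easy to exercise locally before comparing runtimes in the Lambda console.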
