Node.js startup snapshots

The video discusses the Startup Snapshot initiative in Node.js, which aims to improve startup performance by using precompiled and serialized snapshots of the Node.js heap. Deserializing the snapshot at runtime gives faster startup than initializing the heap from scratch. New features added between Node.js versions 18 and 20, such as fetch, Web Crypto, the File API, and Blob, have increased startup initialization costs. To mitigate this, Node.js uses strategies like lazy loading and precompiling internal modules. The video also covers how users can create their own snapshots using the `--build-snapshot` runtime option and the `--snapshot-blob` option. This is particularly useful for applications where startup performance is crucial, such as command-line tools. The video mentions that while heap snapshots are used for diagnostics, startup snapshots are designed to be rehydrated at runtime to speed up the startup process. Limitations of startup snapshots include the inability to serialize in-flight asynchronous operations, which must be completed before taking a snapshot.

From Author:

V8 provides the ability to capture a snapshot out of an initialized heap and rehydrate a heap from the snapshot instead of initializing it from scratch. One of the most important use cases of this feature is to improve the startup performance of an application built on top of V8. In this talk we are going to take a look at the integration of the V8 startup snapshots in Node.js, how the snapshots have been used to speed up the startup of Node.js core, and how user-land startup snapshots can be used to speed up the startup of user applications.

This talk was presented at Node Congress 2023.

FAQ

The Startup Snapshot is an integration within Node.js core that captures a snapshot of the initialized Node.js heap at build time, allowing for faster startup times by deserializing the snapshot at runtime instead of parsing, compiling, and executing JavaScript code.

The Startup Snapshot improves Node.js startup performance by capturing a snapshot of the initialized Node.js heap at build time. At runtime, this snapshot is deserialized, skipping the parsing, compiling, and execution steps, which reduces startup time significantly.

Between versions 18 and 20, Node.js added support for fetch, Web Crypto, the File API, Blob, various web standards, and new APIs under `util`, such as the argument parser and the MIME type parser.

Node.js uses JavaScript for some internal functionalities because it lowers the contribution barrier and reduces the cost of C++ to JavaScript callbacks. However, this can make it harder to maintain startup performance since JavaScript needs to be parsed and compiled before execution.

Node.js uses multiple strategies to control startup initialization costs, including lazy loading for experimental or less commonly used features, precompiling internal modules to generate a code cache, and using V8 Startup Snapshots for essential features to skip initialization code execution.

Yes, users can create their own Startup Snapshots for their applications. This can be useful for applications where startup performance is crucial, like command line tools. The process involves running a user-provided script to completion and taking a snapshot of the heap, which can then be deserialized at runtime.

Users can generate a Startup Snapshot without building Node.js from source by using the `--build-snapshot` runtime option of the official Node.js executable. This generates a snapshot blob from a given script, which can then be used with the `--snapshot-blob` option to deserialize the heap at runtime.
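
As a rough sketch of what a snapshot entry script could look like, the following uses the `v8.startupSnapshot` API that ships with recent Node.js releases. The file names `entry.js` and `snapshot.blob` and the `buildTable` helper are hypothetical and chosen only for illustration; the script also runs as a plain program when no snapshot is being built.

```javascript
// entry.js (hypothetical file name)
// Build the blob:    node --snapshot-blob snapshot.blob --build-snapshot entry.js
// Start from blob:   node --snapshot-blob snapshot.blob
const v8 = require('node:v8');

// Hypothetical "expensive" initialization whose result should end up in the snapshot.
function buildTable(size) {
  const table = new Map();
  for (let i = 0; i < size; i++) {
    table.set(`key-${i}`, i * i);
  }
  return table;
}

// Runs at snapshot build time, so the populated Map is part of the captured heap.
const table = buildTable(1000);

const snapshot = v8.startupSnapshot;
if (snapshot && snapshot.isBuildingSnapshot()) {
  // The function registered here becomes the entry point
  // when the process is later started from the snapshot blob.
  snapshot.setDeserializeMainFunction(() => {
    console.log(`table restored with ${table.size} entries`);
  });
} else {
  // Plain `node entry.js`: no snapshot involved, the table was just built from scratch.
  console.log(`table built with ${table.size} entries`);
}
```

The `isBuildingSnapshot()` guard is a design choice for this sketch: it lets the same file be used both as a snapshot entry point and as an ordinary script during development.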

Heap Snapshots are used for diagnostics and are not meant to be rehydrated, whereas Startup Snapshots are designed to be deserialized at runtime to speed up the startup process. Both use the same underlying V8 infrastructure but serve different purposes.

Startup Snapshots cannot be directly controlled by users in AWS Lambda environments. The responsibility for implementing this feature would fall on AWS, the provider of the Node.js runtime in this context.

The limitations of Node.js Startup Snapshots include the inability to serialize in-flight asynchronous operations, such as TCP connections to a database. All asynchronous operations must be completed before taking a snapshot.

Joyee Cheung
28 min
14 Apr, 2023


Video Transcription


1. Introduction to Startup Snapshot in Node

Short description:

I'm Joyee, working on the startup performance strategic initiative in Node. The initiative has been renamed to Startup Snapshot. Node has been adding new features that require additional setup during startup. From LTS 18 to the upcoming 20, support for fetch, Web Crypto, the File API, Blob, web streams, and new APIs under `util` has been added. Node core is about half JavaScript and half C++.

As mentioned, I'm Joyee. I work at Igalia, and I work on Node and V8. I've been working on the startup performance strategic initiative in Node for a while. The initiative has recently been renamed to Startup Snapshot, as we have done the integration within Node core and we are enabling this feature for userland applications, which is what I'm going to talk about today.

So let's get started. So a bit of history. The Startup Snapshot integration started while Node started gradually dropping the old small core philosophy and adding a lot more built-in features. This includes new globals, in particular, new web APIs, new built-in modules, and new APIs in existing modules. These new features either require additional setup during the startup of Node core or require additional internal modules to be loaded during the startup.

So to give you an overview, from the last LTS version 18 to the upcoming 20, we've added support for fetch, Web Crypto, the File API, Blob, a bunch of web streams, and a bunch of new APIs under `util`, such as the argument parser and the MIME type parser. The list is longer than that, but you get the idea: Node is growing a lot. Another part of this challenge is that Node core is written about half in JavaScript and half in C++. So a lot of those internals are actually implemented in JavaScript.

2. Startup Performance and Initialization

Short description:

Startup performance is harder to maintain because JavaScript code needs to be parsed and compiled before execution. To mitigate potential prototype pollution, Node creates copies of JavaScript built-ins for internal use. Node core uses multiple strategies to control startup initialization costs, including lazy loading, precompiling internal modules, and using V8 startup snapshots. The snapshots are serialized binary blobs capturing the V8 heap and execution contexts. They are used for isolates and contexts in Node.

The upside of this is that it lowers the contribution barrier. In some cases, it also reduces the C++ to JavaScript callback costs. But at the same time, it makes it harder to keep the startup performant. For one, the JavaScript code needs to be parsed and compiled before it can be executed, and that takes time. Also, most of the JavaScript code for initialization only gets run once during startup, because it's just initialization, so it doesn't get optimized by the JavaScript engine.

When implementing a library in JavaScript, we have to take potential prototype pollution into account. You don't want the user to blow up the runtime just because they delete something from a built-in prototype, like `String.prototype.startsWith`. So to mitigate this, Node needs to create copies of these JavaScript built-ins at startup for the internals to use. The internals don't actually use the prototype methods that we expose to users. All this can slow down the startup as Node grows.
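
To make the idea of private copies concrete, here is a minimal sketch of the technique. It is an illustration under simplifying assumptions, not Node's actual internal code; the helper names `StringPrototypeStartsWith` and `isNodeScheme` are used here only as examples.

```javascript
// Capture the built-in once, before any user code has a chance to tamper with it.
// call.bind turns the method into a plain function: fn(thisValue, ...args).
const StringPrototypeStartsWith =
  Function.prototype.call.bind(String.prototype.startsWith);

// Hypothetical internal helper that relies only on the captured copy.
function isNodeScheme(specifier) {
  return StringPrototypeStartsWith(specifier, 'node:');
}

// Simulate user code deleting the public method.
const descriptor = Object.getOwnPropertyDescriptor(String.prototype, 'startsWith');
delete String.prototype.startsWith;

// The public method is gone, but the internal helper is unaffected.
const publicMethodGone = typeof 'abc'.startsWith === 'undefined';
const stillWorks = isNodeScheme('node:fs');

// Restore the built-in with its original attributes so the rest of the program is unaffected.
Object.defineProperty(String.prototype, 'startsWith', descriptor);
```

The cost mentioned in the talk comes from doing this kind of capture for many built-ins during startup, which is work that has to happen before user code runs.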

So to keep the cost of the startup initialization under control, Node core uses multiple strategies. First, we do not initialize all the globals and built-ins at startup. For features that are still experimental, or too new to be used widely, or that only serve a specific type of application, we only install accessors that will load them lazily when the user accesses them for the first time. Second, when building releases, we precompile all the internal modules to generate the code cache, which contains bytecode and metadata, and we embed it into the executable, so that when we do have to load additional modules that the user requests, we pass the code cache to V8, and V8 can skip the parsing and the compilation and just use the serialized code once it validates the cache. And finally, for essential features that we almost always have to load, for example the web URL API, the fs module, which are also used by other internals, or widely used features like timers, we capture them in a V8 startup snapshot, which helps by simply skipping the execution of the initialization code and saving time during startup.
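
The first strategy, installing accessors that load a feature on first access, can be sketched as follows. This is a simplified illustration and not Node's internal implementation; `defineLazyProperty`, `loadExpensiveFeature`, and the `runtime` object are hypothetical names.

```javascript
let loadCount = 0;

// Hypothetical expensive feature; the counter records how often it is initialized.
function loadExpensiveFeature() {
  loadCount++;
  return { parse: (text) => text.split(',') };
}

// Install a getter that loads the value on first access and then
// replaces itself with a plain data property so later reads are cheap.
function defineLazyProperty(target, name, loader) {
  const install = (value) => {
    Object.defineProperty(target, name, {
      configurable: true,
      enumerable: false,
      writable: true,
      value,
    });
  };
  Object.defineProperty(target, name, {
    configurable: true,
    enumerable: false,
    get() {
      const value = loader();
      install(value);
      return value;
    },
    // Allow user code to overwrite the property without triggering the load.
    set(value) {
      install(value);
    },
  });
}

const runtime = {};
defineLazyProperty(runtime, 'csv', loadExpensiveFeature);

const countBeforeAccess = loadCount;        // nothing has been loaded yet
const fields = runtime.csv.parse('a,b,c');  // first access triggers the load
const sameObject = runtime.csv === runtime.csv; // later reads hit the data property
```

Programs that never touch `runtime.csv` never pay for its initialization, which is the point of the lazy-loading strategy.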

So this is kind of how the Node executable used to be built and run. Initially, we were just embedding the JavaScript code into the executable at build time. At runtime, we need to parse it, compile it, and execute it to get Node core initialized, before we can run user code and process system state to initialize the user app.

And then we introduced the embedded code cache. At build time, we precompile all the internal JavaScript code and generate the compiled code cache, and then we embed it into the executable. At runtime, we ask V8 to use the code cache and skip the parsing and compilation process. We still keep the internal JavaScript code as the source of truth, in case the code cache doesn't validate in the current execution environment, but most of the time the code cache is used, and we just skip the compilation process.

And now, with the startup snapshot integration, we just run the internal JavaScript code at build time to initialize a Node heap, and then we capture a snapshot and embed that into the executable. The other two are still kept as fallbacks, but at runtime we simply deserialize the snapshot to get the initialized heap. So there is no need to even parse, compile, or execute the internal code; you just deserialize the result.

So what exactly are these V8 startup snapshots? They're basically the V8 heap serialized into a binary blob. There are two layers of snapshots: one that captures all the primitives and the native bindings, and one that captures the execution contexts, like the objects and functions. Currently, Node uses the isolate snapshot for all the isolates that you can create from userland, including the main isolate and the worker isolates. We also have built-in context snapshots for the main context, the VM contexts, and the worker contexts, although the worker context snapshot currently only contains very minimal stuff.
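
The code cache step described above, where the cache is produced ahead of time and validated at load time with the source kept as a fallback, can be tried from userland with the public `node:vm` API. This is only an analogy to what Node does internally at build time; the source string and the file name `internal-example.js` are made up for the sketch.

```javascript
const vm = require('node:vm');

// Stand-in for an internal module's source code.
const source = '(function () { function add(a, b) { return a + b; } return add(2, 3); })()';

// "Build time": compile once and serialize V8's code cache into a Buffer.
const compiled = new vm.Script(source, { filename: 'internal-example.js' });
const codeCache = compiled.createCachedData();

// "Runtime": hand both the source and the cache to V8.
// The source stays the source of truth; the cache is only an optimization.
const restored = new vm.Script(source, {
  filename: 'internal-example.js',
  cachedData: codeCache,
});

// If V8 cannot validate the cache (for example after a V8 version change),
// this flag is true and the script was compiled from source instead.
const cacheRejected = restored.cachedDataRejected;

// Either way, the script produces the same result.
const result = restored.runInThisContext();
console.log({ cacheBytes: codeCache.length, cacheRejected, result });
```

A startup snapshot goes one step further than this: instead of skipping only parsing and compilation, it also skips executing the initialization code, because the resulting heap is restored directly.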
