Milo, a New HTTP Parser for Node.js

The talk at Node Congress 2024 introduced Milo, a new HTTP parser for Node.js, written in Rust. Node.js has stable implementations for HTTP 1 and HTTP 2, and work is ongoing for HTTP 3. Milo was developed to address the complexities and vulnerabilities in the existing LLHTTP parser. Unlike LLHTTP, which is backward compatible with HTTP 0.9 and 1.0, Milo strictly follows the latest RFCs for HTTP. Rust's macro system powers Milo's simpler state machine, providing flexibility and powerful code generation. Milo can be integrated into Node.js using WebAssembly, offering a performant solution without the need for data deserialization. Future plans for Milo include Node.js integration for WebAssembly, fixing performance issues in WebAssembly, and possibly implementing SIMD. The talk also highlighted the use of cbindgen for generating C++ header files and the benefits of a small memory footprint in Milo.

From Author:

Node.js HTTP parsing currently relies on llhttp, a parser which provides very good performance but currently poses some challenges for the health of the runtime.

Is it possible to create a modern, maintainable, well documented, secure and performant alternative? Yes it is!

Let me introduce you to Milo, a new Rust based HTTP parser which I plan to integrate into Node.js, and let me show you how you can help and be a part of its first Rust component.

This talk has been presented at Node Congress 2024, check out the latest edition of this Tech Conference.

FAQ

Future plans for Milo include Node.js integration for WebAssembly, fixing performance issues in WebAssembly, possibly implementing SIMD in WebAssembly, and migrating the LLHTTP bus test suite to ensure no regressions.

Node Congress 2024 is an event where experts and enthusiasts gather to discuss and share knowledge about Node.js and related technologies.

Paolo is a Node Technical Steering Committee member and Staff DX Engineer at NearForm, who presented at Node Congress 2024.

Milo is a new HTTP parser for Node.js, written in Rust, designed to be more performant and secure compared to the existing LLHTTP parser.

NearForm is a professional services company focused on delivering modern, performant, and elegant solutions to digital partners globally.

Node.js has stable implementations for HTTP 1 and HTTP 2, but is still working on the implementation for HTTP 3.

LLHTTP is the current default HTTP parser for Node.js, written by Fedor Indutny in 2019. It has been the default since Node.js version 12.

Milo was created to address the complexities and potential vulnerabilities in LLHTTP, offering a simpler and more secure alternative.

Milo is written in Rust. Unlike LLHTTP, which relies on LLParse to transpile a state machine from TypeScript to C, Milo leverages Rust's powerful macro system for its state machine.

Milo has a small memory footprint and opts for performance by default. It can analyze data on the fly without copying it, but developers can opt in to have unconsumed data copied for easier handling.

Paolo Insogna
23 min
04 Apr, 2024

Video Transcription

1. Introduction to Node Congress 2024

Short description:

Hello and welcome to Node Congress 2024. NearForm focuses on delivering modern and elegant solutions. Paolo introduces himself and talks about HTTP versions. Node has stable implementations for HTTP 1 and 2, and they're working on HTTP 3.

Hello and welcome to Node Congress 2024. This is Milo, a new HTTP parser for Node.js.

First of all, let me introduce NearForm. We are a professional services company which is focused on delivering the most modern, performant, and elegant solutions to our digital partners. We are active in several countries in the world and we're always looking for new talent, so please apply.

Being reckless sometimes pays off. Why is that? Let me prove it to you. First of all, I want to introduce myself. Hi again, I am Paolo. I'm a Node Technical Steering Committee member and Staff DX Engineer at NearForm. You can find me online at the end of the slide that you can see. And also on the right hand side you can see where I come from. I come from Campobasso in Italy, in the smallest region, which is Molise, that the rest of Italy pretends does not exist. But it's their loss, not mine. Go on.

We all love HTTP. Why is that? Because it's the most pervasive and most used protocol ever. Which version are you? Well, the thing is that despite being 30 years old, only three versions of HTTP actually exist. Two were only drafts, 0.9 and 1.0, so I don't count them as existing versions. The ones that made it to a final version are 1.1, 2 and 3. 1.1 is by far the most used, is the historical one, is the one that you also probably know and is still in place and will not go anywhere anytime soon. 2 was actually created to address some of the problems of the TCP socket by using the SPDY protocol. Though, the results were not really successful. Now we also have 3, which instead uses QUIC, which uses UDP, which makes things more complex, especially for system administrators. I'm sorry for you folks, really.

What about Node? Node has a stable implementation for HTTP 1 and HTTP 2. In that case you're good to go. About HTTP 3, we're still not quite there yet. We are still working on the QUIC implementation, but we will get there. That's a promise.

2. HTTP Parsing and Introduction to Milo

Short description:

Now let's focus on the topic of this talk, which is HTTP parsing. The current Node HTTP parser is called LLHTTP, written by Fedor Indutny in 2019. It has been the default since Node 12 and works brilliantly. LLHTTP is backward compatible with HTTP 0.9 and 1.0, which brings unnecessary complexity and vulnerabilities. To address these problems, Milo was developed as a solution. Milo is written in Rust, a flexible and performant language. The choice of Rust was deliberate to explore its potential for contributing to Node with Rust code.

Now let's focus on the topic of this talk, which is HTTP parsing. What is the current Node HTTP parser as of today? It is called LLHTTP. It was written by Fedor Indutny in 2019 and it has been the default since Node 12. It works brilliantly. On the right hand side you can see the state machine that it actually uses, made of 80 states, so it's very, very complex. The magic is in its underlying parsing machine generator, which is LLParse. LLParse takes as input a state machine definition in TypeScript, written in a very specific subset of the overall language, and generates a C state machine. In other words, LLParse transpiles from TypeScript to C. Bad signs today. You can easily see how such a transpiler can be hard to debug and to release.

Also, in addition, LLHTTP has always been backward compatible with HTTP 0.9 and 1.0, and this brings unnecessary complexity to address edge cases. It has also been lenient and tolerant towards broken HTTP implementations, like, I don't know, usually embedded devices or similar. This is very dangerous because it opens the door to vulnerabilities and other backdoors and so forth. These are the problems of LLHTTP which brought me to the decision to write Milo, as you will see in a bit.

Milo is the solution, of course, otherwise you wouldn't be here, so of course we have a solution. We start fresh. Sorry for the horrible pun, I really apologize. This is Milo. Not the Milo that you actually expected, but this guy was also Milo. What you're seeing, for people that don't know it, is a tamia, basically a Japanese squirrel, and this one in particular was named Milo. It was one of the pets of my wife, who at the time was my girlfriend, and it was also the very first one I chose to name my new software after. Basically, I now have the habit of naming my software after my current or former pets, and I have plenty of them. You know, cats, dogs, horses, fish, you name it, whatever. Anyway, this is Milo, or a Milo. I will show you the other Milo in a bit.

Actually, speaking of the last Milo, the one that you're actually here for, let's drop the bomb. Milo is written in Rust, period. Why is that? The language has proven flexible, powerful and performant enough to achieve this specific task. It's low level when it comes to performance, but it's not low level in the way you write it. For instance, I did not know Rust at all before writing Milo, and I purposely made this choice. I made an experiment with myself to see how hard it would be for a new contributor to embrace Rust in order to contribute to Node if Node contained some Rust code.
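
To make the idea of a strict, state-machine based parser more concrete, here is a minimal hand-written sketch in Rust. It is not Milo's code and does not use Milo's API: the `State`, `ParseError` and `RequestLine` types and the `parse_request_line` function are all hypothetical names invented for this illustration. It only handles a request line, assumes methods are uppercase ASCII letters, and accepts nothing but `HTTP/1.1` terminated by CRLF, so older versions and sloppy line endings are rejected instead of tolerated.

```rust
// Hypothetical sketch: none of these names come from Milo or llhttp.

/// Parser states for a single request line such as "GET /index.html HTTP/1.1\r\n".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Method,
    Target,
    Version,
    Done,
}

#[derive(Debug, PartialEq, Eq)]
enum ParseError {
    InvalidMethod,
    InvalidTarget,
    UnsupportedVersion,
    Incomplete,
}

/// Borrowed views into the input buffer: nothing is copied.
#[derive(Debug, PartialEq, Eq)]
struct RequestLine<'a> {
    method: &'a str,
    target: &'a str,
}

fn parse_request_line(input: &[u8]) -> Result<RequestLine<'_>, ParseError> {
    let mut state = State::Method;
    let mut start = 0; // index where the current token begins
    let mut method: &[u8] = &[];
    let mut target: &[u8] = &[];

    for (i, &byte) in input.iter().enumerate() {
        match state {
            State::Method => match byte {
                // Assumption for this sketch: methods are uppercase ASCII letters only.
                b'A'..=b'Z' => {}
                b' ' if i > start => {
                    method = &input[start..i];
                    start = i + 1;
                    state = State::Target;
                }
                _ => return Err(ParseError::InvalidMethod),
            },
            State::Target => match byte {
                b' ' if i > start => {
                    target = &input[start..i];
                    start = i + 1;
                    state = State::Version;
                }
                // Any visible ASCII character is accepted as part of the target.
                0x21..=0x7e => {}
                _ => return Err(ParseError::InvalidTarget),
            },
            State::Version => {
                if byte == b'\n' {
                    // Strict: only HTTP/1.1 followed by CRLF, no 0.9/1.0, no bare LF.
                    if input[start..=i] != b"HTTP/1.1\r\n"[..] {
                        return Err(ParseError::UnsupportedVersion);
                    }
                    state = State::Done;
                }
            }
            // Anything after the request line is left for the caller.
            State::Done => break,
        }
    }

    if state != State::Done {
        return Err(ParseError::Incomplete);
    }
    Ok(RequestLine {
        method: std::str::from_utf8(method).map_err(|_| ParseError::InvalidMethod)?,
        target: std::str::from_utf8(target).map_err(|_| ParseError::InvalidTarget)?,
    })
}
```

Calling `parse_request_line(b"GET /index.html HTTP/1.1\r\n")` yields the method and target as slices borrowed from the input, while `b"GET / HTTP/1.0\r\n"` is rejected with `ParseError::UnsupportedVersion`. A real parser has many more states; the point here is only that refusing legacy and malformed input keeps the state machine small.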
