English versionEN

What's Inside Biome's Linter?

With Emanuele, lead maintainer of Biome, we will explore the internals of Biome's analyzer, which fuels its linter. You'll learn how lint rules are made, what tools the analyzer can give and how to use them.

This talk has been presented at JSNation 2024, check out the latest edition of this JavaScript Conference.

FAQ

The Biome Analyzer is a versatile tool that goes beyond being just a linter or a CLI tool. It offers functionalities like formatting, analyzing, checking, import sorting, and even code transpiling.

Biome Analyzer is fast because it uses multi-threading to handle multiple files simultaneously, employs channels for efficient communication among threads, and uses aggressive caching to reuse memory tokens.

Multi-threading in Biome Analyzer involves spawning multiple threads for each file. Each thread is responsible for parsing and analyzing its own file, and then reporting diagnostics back to the main thread using channels.

Channels are used for communication between the main thread and multiple worker threads. Each thread sends its diagnostics and other information through these channels to the main thread, which then compiles and reports the results.

Reusable tokens refer to pointers to blocks of memory that are saved into a caching object. When a file is reparsed, these tokens are reused, saving memory and improving performance.

The Biome Analyzer can emit new diagnostics and perform other tasks like import sorting that complement the parsing phase. It can handle tasks that are out of scope for the parser, such as semantic validation.

Import sorting is a feature of the Biome Analyzer that sorts imports automatically when you save your file. This is done without the need for diagnostics and is part of the analyzer's assist functions.

Yes, the Biome Analyzer is LSP (Language Server Protocol) ready. It can be configured with an IDE or any editor that supports LSP, enabling features like automatic sorting of JSON keys or JSX element attributes.

Yes, Biome Analyzer can also be used as a CLI tool. You can configure it to impose refactors and make the CLI fail if certain conditions, like unsorted JSON keys, are not met.

Biome Analyzer is designed to handle large codebases efficiently. It can process thousands of files quickly by leveraging multi-threading and caching, providing a great developer experience.

tooling

Emanuele Stoppa

10 min

17 Jun, 2024

Comments

Video Summary and Transcription

Today, we're going to talk about the Biome Analyzer, which is not just a linter or a CLI tool. It takes advantage of multi-threading, channels for communication, and caching to achieve high performance. The analyzer complements the parser and provides features like import sorting and emitting new diagnostics. It is LSP ready, can automatically sort JSON keys, and can be used as a CLI tool for enforcing refactors. The Biome Analyzer showcases its impressive performance in handling large codebases in a video demonstration.

Available in Español: ¿Qué hay dentro del Linter de Biome?

1. Introduction to Biome Analyzer

Short description:

Today, we're going to talk about the Biome Analyzer. It's not just a linter or a CLI tool. Biome Analyzer is so fast because it takes advantage of multi-threading, uses channels for communication, and employs aggressive caching during parsing.

Hello, everyone. How's it going? So today, we're going to talk about the Biome Analyzer and what's behind it.

So before going forward, who I am. So my name is Emanuele Stoppa. I'm Italian. I live in Ireland. I like open source, games, traveling. And I'm also so into open source that I'm into two projects, Astro and Biome.

Today, we're going to talk about the Biome Analyzer. So what's really curious about the Biome Analyzer? Well, Biome Analyzer is so fast. I'm going to look at why it's that fast. It's not just a linter. It's much more. A linter is just a smaller thing. And it's not just a CLI tool. It's also something more. So, let's do it.

So, why Biome is so fast? So, there are, among other things, there are three things that I want to explain to you. And so, why is Biome so fast? Like, it takes advantage of multi-threading. So, it spawns multiple threads for each file. It uses this kind of channels to keep the communications among the different threads. And we use, like, aggressive caching during the parsing phase. Now, multi-threading. So, when each command that you run from the CLI, like formatting, analyzing, checking, this kind of stuff. So, each command that crawls your file system. What Biome does is that once it identifies those files that are eligible for being handled, let's say, Biome spawns a thread. So, each thread is responsible of its own file. And it parses it. It analyzes it. And it emits some signals that could be, like, if there's a diagnostic, if there's a code action, and more.

2. Working of Biome Analyzer

Short description:

Biome Analyzer uses channels for communication among threads and collects diagnostics using multiple senders and one receiver. It also employs reusable tokens to minimize memory usage during reparsing.

Now, all these threads, when they are spawned, they are not aware of each other. Like, they just do one job. At the end, they have to report something, like if there are errors, or if not, like any kind of information. In order to do so, we use channels.

So, you have all these files. For each file, we have these threads. There are n threads, depending on the operating system. Then we have the main thread. So, the main thread waits for all these threads. And it starts collecting information from all the threads.

So, using these channels, we have multiple channels with multiple senders, which are essentially the threads. And one receiver, which belongs to the main thread. And once there are diagnostics, we collect them. We collect if there are warnings, errors. If, like, we skipped like some diagnostics due to some restrictions or options and things like that. So, that's how the communication happens. And once all the threads are died, the main thread can resume its work and report everything to your console.

And then we have reusable tokens. So, essentially, what does that mean? So, once biome parses your file, it creates tokens and nodes. These are essentially pointers to a block of memory on your operating system. And these pointers are saved, like the references are saved into a caching object. Okay? Once a reparse of the same file happens again, let's say a code action occurred and that action changes this snippet from let to const. We do a reparse to make sure that there are no more triggered rules. When we reparse it, essentially, the nodes that belong to msecret equals and the string are reused. So, instead of creating a new node, we have it there already. So, we have that reference that says that msecret points to that block of memory. Let's just use it. Let's not create a new one. So, that's how for each document we reuse the same thing. So, like memory wise, there's no waste at all.

3. The Power of Biome Analyzer

Short description:

Biome's analyzer complements the parser and provides features like import sorting. It can emit new diagnostics during the parsing phase and can help with syntax errors. Import sorting is part of the same infrastructure and can be enabled in IDEs. The analyzer also lays the foundations for transpiling code, such as transforming JSX to JavaScript.

Neat, right? Now, before I said that it's more than a linter. Exactly. So, a linter is just a small part of the analyzer. Biome's analyzer is able to complement the parser. It's also able to provide other features like import sorting and even more.

So, how does it complement the parser? So, depending on the project, the project can decide that some things regarding the parsing phase can be out of scope, non-goals. For example, semantic validation or other things. And we can actually use the analyzer to complement the parsing phase by emitting new diagnostics using the analyzer.

In Biome, for example, we have a few syntax roles. That's how we call them that are emitted and triggered before the actual, for example, the actual linting or the actual formatting. So, if the code that you try to lint has some syntax errors, they are emitted and the linting phase actually stops. We can't or the formatting phase stops. We can't do that. So, that's how an analyzer can help even more.

Import sorting. Imports are actually coming from Biome analyzer. It's not a lint role. There's nothing about the linter. It's part of the same infrastructure. We just use it in a different way. And that's how it works. Like, you just enable your IDE to call the import sorting. And once you save, it just does that for you. No need for diagnostics or anything like that.

And even more, like, the analyzer is actually put the foundations to, like, transpiling the code. So, for example, transforming and transpiling JSX to JavaScript. So, some kind of a bubble thing. And assist. Like, nice trivia, the import sorting is actually an assist. So, in the future, in the near future of Biome, you will be able to see more of this analyzer.

4. Biome Analyzer: LSP Ready and High Performance

Short description:

The analyzer is LSP ready and can automatically sort JSON keys, JSX element attributes, and JS object attributes. Biome can also be used as a CLA tool to enforce refactors and ensure sorted JSON keys. Biome's speed is demonstrated in a video, where it handles large codebases with impressive performance.

So, the analyzer is actually LSP ready. So, like, you can configure it with a small pane with your IDE or any editor that supports the LSP. You will be able to see automatic sorting of JSON keys or JSX element attributes or JS object attributes. So, anything that can be, like, formatted that is opt in. Okay?

And since Biome is also a CLA tool, via configuration, you can actually impose these refactors and make the CLI fail, if, for example, the JSON keys of a certain JSON file isn't sorted. So, like, the best of two worlds.

And everything comes also to this video. Like, here you can see how fast is Biome. Like, I downloaded the webpack repository and the rollup repository and the TypeScript codebase of Rome tools. And see how fast it is. Like, we don't print diagnostics, because printing diagnostics is expensive. But see, like, 13,000 files. Even really big. Because the old TypeScript codebase contains a lot of big files for parsing purposes and testing. And it's freaking, freaking fast. So, like, great DX, I would say. Thank you.

Available in other languages:

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

The Future of Performance Tooling

JSNation 2022

21 min

The Future of Performance Tooling

Top Content

Emanuele Stoppa

Active contributor to the Biome project

Rome is a toolchain built in Rust that aims to replace multiple tools and provide high-quality diagnostics for code maintenance. It simplifies tool interactions by performing all operations once, generating a shared structure for all tools. Rome offers a customizable format experience with a stable formatter and a linter with over 150 rules. It integrates with VCS and VLSP, supports error-resilient parsing, and has exciting plans for the future, including the ability to create JavaScript plugins. Rome aims to be a top-notch toolchain and welcomes community input to improve its work.

tooling bundler

Conquering Complexity: Refactoring JavaScript Projects

JSNation 2024

21 min

Conquering Complexity: Refactoring JavaScript Projects

Phil Nash

DataStax

Today's Talk explores the complexity in code and its impact. It discusses different methods of measuring complexity, such as cyclomatic complexity and cognitive complexity. The importance of understanding and conquering complexity is emphasized, with a demo showcasing complexity in a codebase. The Talk also delves into the need for change and the role of refactoring in dealing with complexity. Tips and techniques for refactoring are shared, including the use of language features and tools to simplify code. Overall, the Talk provides insights into managing and reducing complexity in software development.

tooling

Improving Developer Happiness with AI

React Summit 2023

29 min

Improving Developer Happiness with AI

Watch video: Improving Developer Happiness with AI

Senna Parsa

Solutions Engineer at GitHub

GitHub Copilot is an auto-completion tool that provides suggestions based on context. Research has shown that developers using Copilot feel less frustrated, spend less time searching externally, and experience less mental effort on repetitive tasks. Copilot can generate code for various tasks, including adding modals, testing, and refactoring. It is a useful tool for improving productivity and saving time, especially for junior developers and those working in unfamiliar domains. Security concerns have been addressed with optional data sharing and different versions for individuals and businesses.

tooling artificial intelligence developer experience productivity best practices

Automate the Browser With Workers Browser Rendering API

JSNation 2024

20 min

Automate the Browser With Workers Browser Rendering API

Gift Egwuenu

Developer Advocate at Cloudflare.

The Talk discusses browser automation using the Worker's Browser Rendering API, which allows tasks like navigating websites, taking screenshots, and creating PDFs. Cloudflare integrated Puppeteer with their workers to automate browser tasks, and their browser rendering API combines remote browser isolation with Puppeteer. Use cases for the API include taking screenshots, generating PDFs, automating web applications, and gathering performance metrics. The Talk also covers extending sessions and performance metrics using Durable Objects. Thank you for attending!

tooling browser api cloudflare

Static Analysis in JavaScript: What’s Easy and What’s Hard

JSNation 2023

23 min

Static Analysis in JavaScript: What’s Easy and What’s Hard

Elena Vilchik

Software Engineer at Sonar, Switzerland

Static analysis in JavaScript involves analyzing source code without executing it, producing metrics, problems, or warnings. Data flow analysis aims to determine the values of data in a program. Rule implementation in JavaScript can be straightforward or require extensive consideration of various cases and parameters. JavaScript's dynamic nature and uncertainty make static analysis challenging, but it can greatly improve code quality.

tooling devtools debug

Workshops on related topic

Solve 100% Of Your Errors: How to Root Cause Issues Faster With Session Replay

JSNation 2023

44 min

Solve 100% Of Your Errors: How to Root Cause Issues Faster With Session Replay

WorkshopFree

Ryan Albrecht

You know that annoying bug? The one that doesn’t show up locally? And no matter how many times you try to recreate the environment you can’t reproduce it? You’ve gone through the breadcrumbs, read through the stack trace, and are now playing detective to piece together support tickets to make sure it’s real.
Join Sentry developer Ryan Albrecht in this talk to learn how developers can use Session Replay - a tool that provides video-like reproductions of user interactions - to identify, reproduce, and resolve errors and performance issues faster (without rolling your head on your keyboard).

tooling error monitoring debug