Challenges for Incremental Production Optimizations


We will look at the usual optimizations bundlers apply to production builds and how they can be combined with incremental builds.

This talk has been presented at JSNation 2024, check out the latest edition of this JavaScript Conference.

FAQ

Tobias Koppers is a developer working at Vercel, focusing on TurboPack.

TurboPack focuses on incremental builds because developers often need quick feedback during development, making fast incremental builds more beneficial than fast initial builds.

TurboPack is a new bundler being developed at Vercel, designed from scratch to optimize incremental builds, especially for Next.js.

Tree-shaking is an optimization technique that removes unused code from the final bundle, reducing its size by including only the required modules and exports.

Persistent caching in TurboPack ensures that cache data is saved and reused across builds, even if the build process starts in a fresh environment. This helps maintain incremental build efficiency.

TurboPack is stable for development use and is being used on Vercel.com. However, it is not yet fully optimized for production builds, and persistent caching is still under development.

Incremental builds are builds that only recompile the parts of the code that have changed, making the build process faster during development.

Unlike Webpack, TurboPack is designed from scratch to focus on incremental builds, aiming to make them as fast as possible. It also seeks to avoid losing cache when upgrading versions.

TurboPack renames long, meaningful names of exports to shorter names like A, B, and C for optimization, and ensures that import references are updated accordingly.

Some challenges include balancing incremental builds with production optimizations like tree-shaking, export mangling, module IDs, chunking, and scope hoisting, which often require whole application knowledge.

Tobias Koppers
32 min
13 Jun, 2024

Video Summary and Transcription
TurboPack is a new bundler similar to Webpack, focusing on incremental builds to make them as fast as possible. Challenges in production builds include persistent caching, incremental algorithms, and optimizing export usage. The compilation process can be split into parsing and transforming modules, and chunking the module graph. TurboPack aims to achieve faster production builds through incremental optimization and efficiency. Collaboration and compatibility with other ecosystems are being considered, along with the design of a plugin interface and tree-shaking optimization.

1. Introduction to TurboPack and Incremental Builds

Short description:

I work on TurboPack, a new bundler similar to Webpack but designed from scratch. Our mission is to focus on incremental builds, making them as fast as possible. We want developers to spend their time on incremental builds, and only have to do the initial build once. This talk covers the unique challenges of production builds.

So, my name is Tobias Koppers, and I work at Vercel and work on TurboPack. TurboPack is a new bundler we're working on, similar to Webpack, but we're designing it from scratch. We're using it for Next.js, and our mission for TurboPack from the beginning was to focus on incremental builds. We want to make incremental builds as fast as possible, even if we have trade-offs on initial builds, because we think that most developers spend their waiting time on incremental builds, because that is what you do while developing.

On the other hand, we also try to make every build incremental. We try to make sure that you only have to do your initial build once, and then spend the remaining time only on incremental builds. That also means if you upgrade the Next.js version or the TurboPack version, or if you upgrade dependencies, we don't want you to lose your cache. This also includes production builds, so we want to focus on production builds too. That's what this talk is about. Production builds have some unique challenges I want to cover, and go a little bit over that.

2. Challenges and Optimisations in Production Builds

Short description:

There are several common production optimisations for bundlers, including tree-shaking, export mangling, module IDs, chunking, scope hoisting, dead code elimination, minification, and content hashing. These optimisations have different challenges when it comes to incremental builds and whole application optimisations. For incremental production builds, we need to have at least two ingredients.

So, if we look at both sides, on one side we have TurboPack, which is really focused on incremental builds, and on the other side we have production builds, which are really optimised. That doesn't look too opposed, but in fact, there are some challenges that come with the optimisations we usually do in production builds. In development we can focus on making builds as fast as possible, even if we trade off bundle size and make bundles a little bit larger, but in production builds you don't want to make these trade-offs. You want to make production builds as optimised as possible, and then you basically have to trade off build performance for that. Bringing both together, incremental builds and production optimisations, is a bit of a challenge.

So let's look at some common production optimisations for bundlers. The one you probably know is called tree-shaking. You have your repository with a lot of files, and in your application, you only want to include the files that you're actually using on pages. Bundlers usually do that by following a dependency graph, by following your imports, and only including the files you actually reference. But it goes more low-level. Every file usually has multiple exports, and you maybe have some kind of utility library where you have a lot of function declarations. Tree-shaking also looks at these, looks into your source code to see which of these exports are actually used in your whole application, only includes them in your bundle, and basically throws away the remaining ones. That's actually the first challenge.
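The export-level analysis described above can be sketched roughly as follows. This is a simplified illustration, not TurboPack's implementation: the `ModuleInfo` shape, the `collectUsedExports` function and the file names are all hypothetical, and the sketch treats every import of a reachable module as used.

```typescript
// Hypothetical description of a module: the exports it defines and the
// names it imports from other modules.
type ModuleInfo = {
  exports: string[];
  imports: { from: string; names: string[] }[];
};

// Walk the graph from the entry modules and record, per reachable module,
// which of its exports are imported somewhere in the application.
function collectUsedExports(
  graph: Map<string, ModuleInfo>,
  entries: string[],
): Map<string, Set<string>> {
  const used = new Map<string, Set<string>>();
  const visited = new Set<string>();
  const queue = [...entries];
  while (queue.length > 0) {
    const id = queue.pop()!;
    if (visited.has(id)) continue;
    visited.add(id);
    if (!used.has(id)) used.set(id, new Set<string>());
    const mod = graph.get(id);
    if (!mod) continue;
    for (const imp of mod.imports) {
      let names = used.get(imp.from);
      if (!names) {
        names = new Set<string>();
        used.set(imp.from, names);
      }
      for (const name of imp.names) names.add(name);
      queue.push(imp.from);
    }
  }
  return used;
}

const graph = new Map<string, ModuleInfo>([
  ["page.js", { exports: [], imports: [{ from: "utils.js", names: ["formatDate"] }] }],
  ["utils.js", { exports: ["formatDate", "parseDate", "slugify"], imports: [] }],
  // Not reachable from the entry, so its import of `slugify` does not count.
  ["unused.js", { exports: [], imports: [{ from: "utils.js", names: ["slugify"] }] }],
]);

const used = collectUsedExports(graph, ["page.js"]);
console.log(Array.from(used.get("utils.js") ?? [])); // only formatDate is kept
```

Note that the result for `utils.js` depends on every module that is reachable from the entry, which is why this analysis needs the whole application.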

We have to look at your whole application to figure out which exports are used in your application, and looking at the whole application is basically the opposite of making it incremental. With incremental builds you usually want to look at one piece at a time, and, if something changes, you want to minimise the effects of that change. Basically, this whole application optimisation is a little bit opposed to that.

The next optimisation is called export mangling. As a good developer, you wrote these function declarations and gave them good names, meaningful long names that explain to your co-workers what they do, and, in production, you don't want to leak these long names into your bundles. You want the bundler or the tool to optimise it, and usually bundlers do that by renaming things to A, B, and C, something like that, so it's just more optimised. But there's also a problem with that. If you rename these exports in a file, you also have to rename them on the import side, so every module that is importing that module needs to reference not the long names but the mangled names, A, B, C. So basically you have this effect where you change one module, or your optimisation changes one module, and that affects a lot of other modules. This is also a little bit challenging for incremental builds, because you change one thing, and it bubbles up to multiple other changes, and yes, it's not really incremental.
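To make the ripple effect of export mangling concrete, here is a minimal sketch. It assumes short names are assigned in sorted order of the used exports; that naming scheme, and the `shortName` and `mangleExports` helpers, are assumptions for illustration only.

```typescript
// Generate short names in sequence: a, b, ..., z, aa, ab, ...
function shortName(index: number): string {
  let name = "";
  let n = index;
  do {
    name = String.fromCharCode(97 + (n % 26)) + name;
    n = Math.floor(n / 26) - 1;
  } while (n >= 0);
  return name;
}

// Map each used export of one module to a short name (assumed scheme:
// alphabetical order of the original names).
function mangleExports(usedExports: string[]): Map<string, string> {
  const mapping = new Map<string, string>();
  [...usedExports].sort().forEach((name, i) => mapping.set(name, shortName(i)));
  return mapping;
}

// Before: two exports of a utility module are used.
const before = mangleExports(["formatDate", "parseDate"]);
// After: some page starts using a third export of the same module.
const after = mangleExports(["escapeHtml", "formatDate", "parseDate"]);

// Exports whose mangled name changed: every importer of these has to be
// rewritten, even though those importers were not edited.
const renamed = Array.from(before.keys()).filter(
  (name) => before.get(name) !== after.get(name),
);
console.log(renamed);
```

In this sketch, starting to use one more export shifts the names of the existing ones, so a single change invalidates every module that imports them.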

Another optimisation is called module IDs. Usually some modules in your application need to be addressable at runtime, so you might want to get some export from that module and that stuff, and for that you have to give them a name at runtime. You could just give them the full path as name, but it's very long and verbose, and for production builds, we want something shorter, similar to export mangling. So usually in Webpack we just give them short numbers, and address them by that. The problem with that is that now you give every module a number, but you have to make sure that this number is unique in your whole application, and uniqueness is again a problem where you need to look at your whole application to figure out if this name is already taken, so if there is a conflict. So this is again a whole application optimisation.
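A minimal sketch of short numeric module IDs, assuming a single registry that hands out the next free number. The `ModuleIdRegistry` class and the paths are hypothetical and only illustrate why uniqueness needs shared, application-wide state.

```typescript
// Hands out short numeric IDs. Uniqueness is only guaranteed because all
// modules of the application go through this one shared registry.
class ModuleIdRegistry {
  private ids = new Map<string, number>();
  private next = 0;

  idFor(path: string): number {
    let id = this.ids.get(path);
    if (id === undefined) {
      id = this.next++;
      this.ids.set(path, id);
    }
    return id;
  }
}

const registry = new ModuleIdRegistry();
const buttonId = registry.idFor("/app/components/button.js");
const headerId = registry.idFor("/app/components/header.js");
// Asking again for a known path returns the same ID.
const buttonIdAgain = registry.idFor("/app/components/button.js");
console.log(buttonId, headerId, buttonIdAgain);
```

In this scheme an ID depends on the order in which modules are registered, so adding or removing one module can shift the IDs of others, which is exactly the kind of global effect that is hard to make incremental.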

Another optimisation is chunking. Usually your bundler doesn't put all your modules into a single file and serve that for all pages, because you would end up with a bundle of many megabytes, so we split it up, that's code splitting, into bundles per page. But we also do something like a common chunk or shared modules optimisation, where we figure out if there are modules shared between multiple pages or multiple chunks, and then we put them into a shared file so we don't have to load the same modules multiple times, and basically load the shared module once. Finding shared modules also requires looking at multiple pages at the same time, so this is again a whole application optimisation.

There is an optimisation called scope hoisting. As a good developer, you write many small modules, because that's organising your code well, and we basically don't want this abstraction of modules leaking into the runtime, so we want to get rid of many small modules at runtime, and basically include only what we actually need at runtime.
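Going back to the shared modules step of chunking, a rough sketch could look like this. The `splitSharedModules` function, the `minPages` threshold and the module names are assumptions for illustration; real bundlers apply more heuristics, such as size limits.

```typescript
// Given the modules each page needs, move modules used by at least
// `minPages` pages into one shared chunk and keep the rest per page.
function splitSharedModules(
  pages: Map<string, string[]>,
  minPages = 2,
): { shared: string[]; perPage: Map<string, string[]> } {
  // Count on how many pages each module appears.
  const counts = new Map<string, number>();
  pages.forEach((modules) => {
    Array.from(new Set(modules)).forEach((m) => {
      counts.set(m, (counts.get(m) ?? 0) + 1);
    });
  });

  const shared = Array.from(counts.entries())
    .filter(([, count]) => count >= minPages)
    .map(([m]) => m)
    .sort();
  const sharedSet = new Set(shared);

  const perPage = new Map<string, string[]>();
  pages.forEach((modules, page) => {
    perPage.set(page, modules.filter((m) => !sharedSet.has(m)));
  });
  return { shared, perPage };
}

const chunks = splitSharedModules(
  new Map([
    ["/home", ["react", "header", "hero"]],
    ["/about", ["react", "header", "team"]],
  ]),
);
console.log(chunks.shared, chunks.perPage.get("/home"));
```

The counting step has to see every page before it can decide what is shared, so adding one page can change the chunks of pages that were not touched.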

So the scope hoisting optimisation basically merges modules together under certain conditions, and the tricky thing is the conditions in this case. We have to figure out which modules are always executed in the same order in the whole application, and then we can basically merge them together because then we know this is the order that they're executed in. So basically, checking this condition, finding that something always happens in your whole application, is again a whole application optimisation.

Then there are some simple optimisations like dead code elimination, which just omits code that is not used, or minification, which just writes your code more efficiently, omits whitespace, comments and that stuff. Then the last optimisation is content hashing, where we basically put a hash at the end of every file name to make it long-term cacheable, which basically means you can just send an immutable cache header, and then the browser cache can cache that. And yes, those are basically all the eight optimisations I came up with.
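Content hashing can be sketched as below. The FNV-1a hash is used here only to keep the example dependency-free; it is an assumption for illustration, and real bundlers typically use stronger hashes. The `contentHash` and `hashedFileName` helpers are hypothetical names.

```typescript
// Small non-cryptographic hash (FNV-1a, 32 bit) rendered as 8 hex digits.
function contentHash(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Put the hash of the file content in front of the extension.
function hashedFileName(name: string, content: string): string {
  const dot = name.lastIndexOf(".");
  const base = dot === -1 ? name : name.slice(0, dot);
  const ext = dot === -1 ? "" : name.slice(dot);
  return `${base}.${contentHash(content)}${ext}`;
}

const v1 = hashedFileName("main.js", "a");
const v1Again = hashedFileName("main.js", "a");
const v2 = hashedFileName("main.js", "b");
console.log(v1, v2);
```

Because the file name changes whenever the content changes, the file behind a given name never changes, which is what makes an immutable cache header safe.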

So if we summarise that, like in this table, you see that a bunch of these optimisations, about half of them, are whole application optimisations, and a bunch of them also have effects where you change something and then the importer side of that changes. These are a bit complicated for incremental builds. But we'll look at this later. So in general, for incremental production builds, we need to have two ingredients, or at least two ingredients.
