End the Pain: Rethinking CI for Large Monorepos

Scaling large codebases, especially monorepos, can be a nightmare on Continuous Integration (CI) systems. The current landscape of CI tools leans towards being machine-oriented, low-level, and demanding in terms of maintenance. What's worse, they're often disassociated from the developer's actual needs and workflow.

Why is CI a stumbling block? Because current CI systems are jacks-of-all-trades, with no specific understanding of your codebase. They can't take advantage of the context they operate in to offer optimizations.

In this talk, we'll explore the future of CI, designed specifically for large codebases and monorepos. Imagine a CI system that understands the structure of your workspace, dynamically parallelizes tasks across machines using historical data, and does all of this with a minimal, high-level configuration. Let's rethink CI, making it smarter, more efficient, and aligned with developer needs.

This talk was presented at DevOps.js Conf 2024.

FAQ

The main challenges of CI in a large monorepo include managing the complexity of running multiple projects simultaneously, keeping pipeline execution fast and efficient, and maintaining the CI setup as the monorepo grows. This often requires sophisticated tooling and strategies to handle dependencies and parallelize tasks effectively.

Nx optimizes CI processes through features like affected commands, which only run tasks for projects touched by a change, and computation caching, which avoids repeating work that has already been done. Additionally, Nx supports fine-grained task distribution and dynamic scaling across multiple machines to improve efficiency and reduce CI times.
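To make the caching idea concrete, here is a minimal sketch of a task cache keyed by a hash of the task name and its inputs. It is an illustration only, not how Nx implements caching: `hashInputs`, `TaskCache`, and `TaskResult` are hypothetical names, and the FNV-1a hash is just a dependency-free stand-in for a real content hash.

```typescript
// Hypothetical sketch: none of these names are part of the Nx API.
type TaskResult = { output: string };

// FNV-1a string hash, used only to keep the example dependency-free.
// A real tool would hash file contents, configuration, and environment.
function hashInputs(parts: string[]): string {
  const text = parts.join("\u0000");
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

class TaskCache {
  private store = new Map<string, TaskResult>();
  // Counts how often a task actually had to be executed.
  public executions = 0;

  run(task: string, inputs: string[], execute: () => TaskResult): TaskResult {
    const key = hashInputs([task, ...inputs]);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      return cached; // Same task and same inputs: replay the stored result.
    }
    this.executions++;
    const result = execute();
    this.store.set(key, result);
    return result;
  }
}
```

Calling `run` twice with the same task name and inputs executes the callback once; changing any input produces a new key and a fresh execution.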

In monorepo management, the project graph tracks the dependencies between the projects in the repository. This allows tools like Nx to determine which parts of the monorepo are affected by a change, so that build and test runs only process the relevant projects.
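As a rough illustration of how a project graph can answer "what is affected by this change?", the sketch below walks the graph from the changed projects to everything that depends on them. The `ProjectGraph` shape and the `affectedProjects` function are assumptions made for this example and are not taken from Nx.

```typescript
// Hypothetical graph shape: each project maps to the projects it depends on.
type ProjectGraph = Record<string, string[]>;

function affectedProjects(graph: ProjectGraph, changed: string[]): string[] {
  // Invert the edges so we can ask "who depends on this project?".
  const dependents = new Map<string, string[]>();
  for (const [project, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      const list = dependents.get(dep) ?? [];
      list.push(project);
      dependents.set(dep, list);
    }
  }

  // Walk from the changed projects through all transitive dependents.
  const affected = new Set<string>();
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.pop()!;
    if (affected.has(current)) continue;
    affected.add(current);
    queue.push(...(dependents.get(current) ?? []));
  }
  return Array.from(affected).sort();
}
```

With a graph where `app` and `admin` depend on `ui`, a change to `ui` marks `ui`, `app`, and `admin` as affected, while an unrelated `docs` project is left alone.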

Nx can distribute tasks dynamically across multiple CI machines. It uses a coordinator on Nx Cloud infrastructure to distribute tasks based on the project graph, optimizing resource usage and reducing build times by balancing load across the available machines.
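One simple way to balance load using historical timings is a greedy "longest task first" assignment, sketched below. This is a swapped-in textbook heuristic, not the algorithm Nx Cloud uses, and it ignores dependencies between tasks for brevity. The `Task`, `Agent`, and `distribute` names, as well as the `avgSeconds` field, are hypothetical.

```typescript
// Hypothetical shapes: avgSeconds stands in for historical run-time data.
type Task = { id: string; avgSeconds: number };
type Agent = { tasks: string[]; totalSeconds: number };

function distribute(tasks: Task[], agentCount: number): Agent[] {
  if (agentCount < 1) {
    throw new Error("At least one agent is required");
  }
  const agents: Agent[] = Array.from(
    { length: agentCount },
    (): Agent => ({ tasks: [], totalSeconds: 0 })
  );

  // Place the longest tasks first, each on the currently least-loaded agent.
  const longestFirst = tasks.slice().sort((a, b) => b.avgSeconds - a.avgSeconds);
  for (const task of longestFirst) {
    const target = agents.reduce((min, agent) =>
      agent.totalSeconds < min.totalSeconds ? agent : min
    );
    target.tasks.push(task.id);
    target.totalSeconds += task.avgSeconds;
  }
  return agents;
}
```

Given one slow end-to-end task and several short ones, the slow task ends up alone on one agent while the short tasks are packed together on the other, which keeps the total wall-clock time close to the duration of the slowest task.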

Nx Agents are part of Nx's suite of tools for improving CI efficiency by distributing tasks across multiple machines. They enable dynamic scaling and fine-grained distribution, reducing the overhead of manual configuration and ensuring efficient use of resources.

Nx detects flakiness by leveraging caching: when a task produces different results for the same inputs, that indicates potential flakiness. It can then automatically rerun these tasks on different machines to confirm and address the issue, improving the reliability of the CI process.
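The detection rule described here (same inputs, different results) can be sketched in a few lines. This is an illustrative model only: `FlakinessTracker` and `Outcome` are hypothetical names, and `taskHash` is assumed to identify a task together with all of its inputs, as a cache key would.

```typescript
// Hypothetical sketch: not part of the Nx API.
type Outcome = "success" | "failure";

class FlakinessTracker {
  private seen = new Map<string, Set<Outcome>>();

  // taskHash is assumed to capture the task and every input it depends on.
  record(taskHash: string, outcome: Outcome): void {
    const outcomes = this.seen.get(taskHash) ?? new Set<Outcome>();
    outcomes.add(outcome);
    this.seen.set(taskHash, outcomes);
  }

  // A task is suspicious once identical inputs have produced both outcomes.
  isFlaky(taskHash: string): boolean {
    const outcomes = this.seen.get(taskHash);
    return outcomes !== undefined && outcomes.size > 1;
  }
}
```

A task that fails consistently for the same hash is treated as a genuine failure, while a hash that has been seen both passing and failing becomes a candidate for an automatic rerun.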

Juri Strumpflohner
25 min
15 Nov, 2023

Video Summary and Transcription

This talk discusses rethinking CI in monorepos, with a focus on leveraging the implicit graph of project dependencies to optimize build times and manage complexity. Nx Replay and Nx Agents are highlighted as ways to make CI more efficient by caching previous computations and distributing tasks across multiple machines. Fine-grained distribution and flakiness detection are discussed as methods to improve distribution efficiency and keep the setup clean. Enabling distribution with Nx Agents simplifies the setup process, and Nx Cloud offers dynamic scaling and cost reduction. Overall, the talk explores strategies to improve the scalability and efficiency of CI pipelines in monorepos.

1. Introduction to Rethinking CI in Monorepos

Short description:

Today, I would like to talk about how we could potentially rethink how CI works in monorepos. My name is Juri Strumpflohner, and I have been using monorepos for six years. I am also a core team member of Nx and a Google Developer Expert in web technologies and Angular.

All right. So today, I would like to talk a bit about how we could potentially rethink how CI works compared to the current CI situation that we have, with a particular focus on monorepos, and potentially large monorepos, and how we could optimize that. Before we go ahead, my name is Juri Strumpflohner. I've been using monorepos for probably six years already. For about four years, I've also been a core team member of Nx, which is a monorepo management tool. I'm also a Google Developer Expert in web technologies and Angular, and an instructor on Egghead, where I publish courses on web development and developer tools.

2. Considerations for CI in Monorepos

Short description:

When working with monorepos, we need to consider the local developer experience, automation and rules, and task pipelines. Current CI solutions are not optimized for monorepos and require low-level manual maintenance. Developers want a high-level way of defining their CI structure and need strategies to ensure scalability and manageable speed and throughput.

So when we move toward a monorepo, it doesn't come for free, right? There are some considerations that come into play. One big one is obviously the local developer experience. How do we structure a project in a monorepo? How do we make sure that we have consistency in how these projects are set up? Which versions do they use? How are they configured, so that we can have some team mobility between projects and so that the workspace is easier to maintain? Automation and rules around those projects are also a very important part, especially for the maintenance and longevity of such a monorepo.

There are also features like task pipelines and being able to run things in parallel. Clearly, in a monorepo we no longer run just one project, but potentially a series of projects with dependencies between them. So we need to be able to build the projects we depend on first, before we actually run our own project. Those are things that we don't want to do manually; we would rather have tooling support.

But today I would like to focus specifically on the elephant in the room whenever we talk about monorepos, which often doesn't get attention right away, and that is kind of a mistake: CI. The current CI situation is basically not optimized for monorepos. It is very machine-oriented, so we need to spell out the exact instructions that we want to process, in a very instructional kind of approach. It is very low level in that sense as well. It requires a lot of maintenance because, as I mentioned, we no longer run just one project and that's it. We run multiple projects. So we need strategies for tuning the CI to make sure that it still works even as our monorepo structure changes and more projects come into the monorepo.

It is also, I would say, a bit removed from what developers want. As a developer, I would want a more high-level way of defining my CI structure, my CI run, my CI pipeline, in the sense of saying, hey, I want to run all the projects that got touched in that PR, for instance, rather than having to fine-tune every single aspect of that project. And as I said before, current CI systems don't really work for monorepos, because they are designed to be much more general purpose and geared more toward single-project workspaces.
So today I would like to dive into some of these aspects, specifically looking at speed and throughput, because that is one major thing we need to pay attention to; otherwise our monorepo becomes a problem. If we have good collaboration going on locally within teams, but our pipeline takes over an hour for each PR, that's going to be a problem.
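The point made earlier in this section about building dependencies first, before running a project, amounts to a depth-first ordering over the dependency graph. Below is a minimal sketch of that idea, assuming a hypothetical `ProjectGraph` shape and a `buildOrder` helper that are not taken from any particular tool.

```typescript
// Hypothetical graph shape: each project maps to the projects it depends on.
type ProjectGraph = Record<string, string[]>;

// Returns the projects that must be built, dependencies first, target last.
function buildOrder(graph: ProjectGraph, target: string): string[] {
  const order: string[] = [];
  const done = new Set<string>();
  const inProgress = new Set<string>();

  function visit(project: string): void {
    if (done.has(project)) return;
    if (inProgress.has(project)) {
      throw new Error(`Circular dependency detected at ${project}`);
    }
    inProgress.add(project);
    for (const dep of graph[project] ?? []) {
      visit(dep); // Dependencies are scheduled before the project itself.
    }
    inProgress.delete(project);
    done.add(project);
    order.push(project);
  }

  visit(target);
  return order;
}
```

For an `app` that depends on `ui` and `utils`, where `ui` itself depends on `utils`, the resulting order is `utils`, then `ui`, then `app`. Projects with no dependency on each other could additionally run in parallel, which this sketch does not attempt.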
