Video Summary and Transcription
This talk explores the benefits of using Turborepo, a tool for supercharging the development experience. It highlights the advantages of monorepos, such as code reuse, shared standards, improved team collaboration, and atomic changes. Turborepo offers efficient task execution through caching and optimized scheduling. It simplifies monorepo development by running tasks in parallel and letting changes be coordinated in one place. Its turbo command-line tool handles smart task execution: you declare task dependencies, and it caches results across the monorepo.
1. Introduction to Turborepo
Hello, everyone. Thanks for joining me today. We're going to talk about how to supercharge your dev experience with Turborepo. A little bit about me. My name is Bruno Paulino. I'm a tech lead at N26, where my team builds a platform for engineers to build web applications and libraries. We'll discuss monorepos, multirepos, and how companies organize their code.
Hello, everyone. Thanks for joining me today. We're going to talk about how to supercharge your dev experience with Turborepo.
A little bit about me. My name is Bruno Paulino. I'm a tech lead at this cool company called N26. We're building the bank that we love to use. And I'm a software engineer focused on the web. And there, at N26, we're building the platform for all the engineers to build web applications on top of it. Not just web applications, but also web libraries. You could think of design system components, for example. And there, I don't like to call ourselves DevOps. We actually like to call ourselves DevEx. So we actually help web engineers to ship these as fast as possible to the browser. I'm also bpaulino0 on Twitter. So if you use Twitter and you want to follow me, please do. I'm there.
So let's just jump right in. This presentation is actually divided in two parts. In the first one, we are going to talk about monorepos, multirepos and how you can use them. And in the second part, we're going to actually talk about Turborepo. And then to close it, we're going to see a nice live demo. But before we start, I want to talk about multirepos, monorepos and how companies actually organize their code.
The most common approach is actually to have multirepos, right? You go to a company and they have several different projects, and they organize those projects in different repositories. That's the most sensible way of doing it. You have several different teams working on different projects. They have their own tooling, they have their own standards and so on, and that's pretty common and very reasonable, right? You want to give teams the independence of actually using their own tools, using their own ways of building software and shipping as fast as possible. But there is another way, right? There is a way of actually organizing your code in the same repository: monorepos. Don't confuse monorepos with monoliths.
2. Benefits of Monorepos
So you can still have a monorepo and you can still have microservices, for example, inside of the monorepo. The first reason is code reuse, making it easier to share code with teammates. The second reason is shared standards, allowing for consistent configuration and adoption across projects. Team collaboration is also improved, as engineers can easily share pull requests and work together without additional setup. Finally, atomic changes can be made, ensuring that all apps and packages work together seamlessly.
So you can still have a monorepo and you can still have microservices, for example, inside of the monorepo. The only difference from a monorepo to a multirepo approach is actually that you have all your apps or packages, libraries inside of the same repository.
But then you might ask yourself, why do I need to have a monorepo? Why do I need to put all my code together in the same repository instead of having them separately in individual repositories? There must be a good reason for other companies doing that, right? Like why monorepos are so hot in those companies, for example, Google, Netflix, Facebook, and Twitter, they all use monorepos in some shape or form. And there must be very good reasons for them not to go with the multi-repo approach, right? And let's talk about them.
The first one is actually code reuse. If you have everything under the same repository, then it's much easier for you to share code with your teammates. Like think about it, you could have a modules package where you have all your database modules, and you can also have a package there with your UI components. And then everybody else in your team can just reuse them without having to reinvent anything or to re-import anything or to install any extra package.
The other reason you might want to consider a monorepo is to have shared standards. In a monorepo, you can actually have true shared standards across your code base. Because think about it, you could have an ESLint config that is completely shared across all the packages, and then you can update it in one single place. And then every other package can benefit from it immediately. So if you're using TypeScript, for example, you can have one single TypeScript configuration shared across all of your projects. So it's also very easy to adopt and share stuff among the packages inside of the monorepo.
Another thing that works really well in the monorepo setup is team collaboration. So if you have everything under the same repository, you can easily share a pull request with other engineers and get feedback because they have already all the context around the code base. They don't have to set up anything. They don't have to install a different node version. They don't have to set up any SDK in the machine or anything like this because they have been working in the same code base. They have every tool installed already. So they don't have to figure out anything. If you need to do a pair programming session, for example, it's just much easier to do it. You can just share a screen, fire up the dev server, and then everything is in there. You don't need to clone anything. You don't need to do any setup. It's just much easier and much faster for a big team.
Another big selling point for monorepos is atomic changes. If you have a single repository, you can actually change several different apps and packages at the same time, in the same pull request. This way, you can actually guarantee that everything is going to work together.
3. Efficient Task Execution with Turborepo
In a monorepo, everything can be in the same pull request, ensuring that changes are coordinated and tested together. Code isolation is achieved through workspaces, which allow for self-contained packages with explicit dependencies. Turborepo, a build system built specifically for JavaScript and TypeScript, provides efficient task execution by caching previous runs.
This way, you can actually guarantee that everything is going to work together. In contrast, if you have multiple repositories, then you have to make sure that you can coordinate those changes with different teams. And then you have to make sure that you release a new version of the library or the app in order for you to move forward with a different change. In the monorepo, everything can be together in the same pull request. If you have tests, if you have build pipelines, everything in your CI system, you can actually guarantee that everything is going to work, or you don't merge that change. So it's more straightforward to keep everything in sync.
Last but not least, another important point is isolation. You might ask yourself, how can you have code isolation in a monorepo, where everything is inside of the same repository? This is possible because of workspaces. Today, npm, pnpm and Yarn all implement workspaces, which is a way of actually having self-contained packages. So all your packages, including libraries and apps, are fully self-contained with their own dependencies declared and installed separately. And the dependencies are all explicitly defined in each package. So you can actually make sure that those packages have all the dependencies they need to be properly built, tested and shipped to production.
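To make the idea of explicit, per-package dependencies concrete, here is a minimal sketch. It models three package manifests as plain TypeScript objects and checks that every workspace-internal dependency really exists in the workspace. The package names follow the demo shown later in the talk; the versions, the Manifest type and the findMissingWorkspaceDeps helper are invented for illustration, and the workspace:* range is pnpm's workspace protocol, used here on the assumption that the repository is managed with pnpm.

```typescript
// Minimal model of a package.json; real manifests have many more fields.
interface Manifest {
  name: string;
  version: string;
  dependencies: Record<string, string>;
}

// Hypothetical manifests: names mirror the demo, versions are invented.
const workspace: Manifest[] = [
  { name: "ui", version: "0.0.0", dependencies: {} },
  { name: "shop", version: "0.0.0", dependencies: { ui: "workspace:*" } },
  { name: "admin", version: "0.0.0", dependencies: { ui: "workspace:*" } },
];

// For each package, lists the workspace-internal dependencies that are not
// present in the workspace. An empty list means nothing is missing.
function findMissingWorkspaceDeps(packages: Manifest[]): Record<string, string[]> {
  const known = new Set(packages.map((pkg) => pkg.name));
  const missing: Record<string, string[]> = {};
  for (const pkg of packages) {
    missing[pkg.name] = Object.keys(pkg.dependencies).filter(
      (dep) => pkg.dependencies[dep].startsWith("workspace:") && !known.has(dep),
    );
  }
  return missing;
}

console.log(findMissingWorkspaceDeps(workspace));
```

Because each manifest lists its own dependencies, a check like this can run per package without knowing anything about the rest of the repository.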
All right. Now we see that we have got the packaging sorted, right? So it looks like we can actually have a solid monorepo thanks to workspaces. This is a very neat feature. It works really well across all the common package managers that we have now, including npm, pnpm and Yarn. But running tasks efficiently is quite tricky. It can be very challenging, or at least it was until now. Today we have Turborepo, which is a build system that was created specifically for the JavaScript and TypeScript ecosystem. Turborepo was built by Jared Palmer. He's a very prolific engineer and he's building Turborepo as an open source tool. Today Turborepo is part of Vercel and it's still being built in the open. So let's take a look at the features that Turborepo provides so that we can actually build this monorepo in a very efficient way. The first one is that Turborepo never does the same work twice. So if you run a build or a test or a linting task, Turborepo is going to remember that and is going to cache that for you. If you want to do it again for a package that didn't change, Turborepo will just immediately say, hey, you have done this task already. You don't have to run it again. I'll just show you the logs here from the previous run. And then everything just works the same way.
4. Improved Task Execution with Turborepo
Turborepo provides optimized scheduling and caching, enabling efficient task execution. It runs tasks in parallel, handles dependencies, and shares the cache across teams and CI systems. Turborepo has zero runtime overhead. In our live demo, we'll explore a demo repository using Turborepo as a base.
You're going to see that in our live demo at the end.
Another thing that's a big win when using Turborepo is optimized scheduling. Turborepo will figure out how many CPUs you have available and run as many tasks in parallel as possible. It can handle tasks that are completely independent as well as tasks that have dependencies. Turborepo ensures that dependent tasks are executed in the correct order. Turborepo's caching feature allows you to share the cache across your dev team and CI systems, improving performance. Turborepo has zero runtime overhead and does not affect your code when it goes to production.
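The scheduling idea can be sketched in a few lines. The function below groups tasks into waves, where every task in a wave has all of its prerequisites in earlier waves, so the tasks inside one wave could run in parallel. This is a simplified illustration and not Turborepo's actual scheduler, which may well start a task as soon as its own prerequisites finish instead of waiting for a whole wave. The task ids and edges are invented, and the package#task labels are just a naming convention for this example.

```typescript
// Maps each task to the tasks it must wait for.
type TaskGraph = Record<string, string[]>;

// Groups tasks into waves: all prerequisites of a task sit in earlier waves,
// so the tasks inside one wave are safe to run in parallel.
function planWaves(graph: TaskGraph): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let pending = Object.keys(graph);
  while (pending.length > 0) {
    const ready = pending.filter((task) => graph[task].every((dep) => done.has(dep)));
    if (ready.length === 0) {
      // Nothing can start: there is a cycle or a dependency that is not in the graph.
      throw new Error("Cycle or unknown dependency among: " + pending.join(", "));
    }
    ready.forEach((task) => done.add(task));
    waves.push(ready.sort());
    pending = pending.filter((task) => !done.has(task));
  }
  return waves;
}

// Illustrative graph, not read from a real repository.
const demoGraph: TaskGraph = {
  "ui#build": [],
  "ui#lint": [],
  "shop#build": ["ui#build"],
  "admin#build": ["ui#build"],
  "shop#test": ["ui#build", "shop#build"],
};

console.log(planWaves(demoGraph));
// [["ui#build", "ui#lint"], ["admin#build", "shop#build"], ["shop#test"]]
```

A real runner would additionally cap the size of each wave by the number of available CPUs.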
Let's jump right into our live demo. Here we have a demo repository that uses Turborepo as a base. We're using pnpm to manage the packages, but you can use Yarn or npm as well. In the monorepo, we have a package.json file where tasks and dependencies are declared. We also have a packages folder that contains all the packages.
5. Exploring the Demo Repository
Turborepo has zero runtime overhead and is just a dev dependency, not affecting your code in production. In our live demo, we'll explore a monorepo using Turborepo as a base. We'll tour the monorepo, highlighting the package.json file with monorepo-specific tasks and dependencies, the packages folder containing shared packages, and the apps folder with separate apps for different purposes.
We're also going to see that in action in our live demo very soon. Last, but not least, Turborepo has zero runtime overhead, which means that Turborepo is just a dev dependency. It doesn't ship anything to your code when it goes to production. In fact, all your packages don't even know that Turborepo exists, because Turborepo only exists at the monorepo level, and all your packages are completely independent and unaware of Turborepo.
So enough talking. Let's jump right into our live demo. Here we have the link for our repository. You don't have to code along, just follow me on the screen. But at the end, if you want to give a star to the repository, feel free.
Alright, so we're here in VS Code, and I have here a demo repository that I prepared for this talk. Here we have a monorepo using Turborepo as a base. In this monorepo, I'm using pnpm to manage the packages, but you could just as well use Yarn or npm in the same way. So let's have a little tour across the monorepo and understand how we can actually compose those packages.
Here at the root level, we have a package.json file. Just the same package.json file that you're used to seeing everywhere in JavaScript or TypeScript projects. The only difference here is that in a monorepo, you actually have tasks and dependencies declared here that are just meant for the monorepo. Okay, we're going to have a look at specific tasks later on. Here I have a packages folder with all the packages. So we have like a collection of packages here. The first one is eslint-config, it's basically the common ESLint config that I can actually share across all my packages in our monorepo. Then we have a ts-config, it's basically our TypeScript configuration that's also shared across packages and apps. Then we have a UI library, it's pretty much our design system, let's say. It's a very small library just for the sake of this demo, but you'll see that in action very soon. And then on top of it, we have the apps folder. And that's where we keep our apps separately. So we have packages, which is meant more for libraries, and then apps, which you can actually build a Docker image from, for example, or deploy to a serverless environment and so on. It doesn't really matter. In this case, we have two apps. We have the Shop app, that's our Next.js app, where customers go to buy t-shirts. And then we have our admin app.
6. Using Turborepo for Monorepo Development
Turborepo helps with monorepo development by executing tasks in parallel. The dev task in the root package.json file uses Turbo to run the dev script of every package in parallel. The console shows different outputs for admin dev, shop dev, and UI dev. The shop and admin apps can be accessed in the browser, with the admin app serving as the back office for employees. A common feature among these apps is the blue button, likely from the design system component library.
It's kind of like the back office. So employees can refund orders, ship stuff to the customer, and so on. So let's have a look at how Turborepo can actually help us with the development in our monorepo.
I'm going to open my terminal here. And then I'm going to run pnpm dev. This is going to fire up our dev server in our monorepo. And this is where Turbo is already helping us to execute tasks in parallel as much as possible. Remember that I told you that you can actually declare tasks and then Turbo can take care of running them in parallel? This is what's happening here in my package.json file at the root level of the monorepo. I have a dev task here that's using Turbo. It's our dependency here that's declared below as a dev dependency. I'm just using the latest version of Turborepo, and we have the dev script here calling turbo run dev --parallel, which means that, hey, just execute all the tasks called dev that we have in our packages in parallel and don't care about the dependencies between them. So just run them in parallel. And that's what Turbo is doing here.
If you look at the console, we can see that we have a bunch of different outputs here, even with different colors. So we have the admin dev, we have the shop dev, and we have the UI dev. So this is executing the dev command for all of those packages. So let's go to a browser and see what our shop and admin apps look like. So I'm going to go to localhost 3000 and that should be our shop. So as you can see here, we have the Turbo store and I can actually add stuff to the cart. It's just a dummy store. It's not doing much, but you can see there's a little button here. And then I'm going to go to localhost 3001 and that's where we have the Turbo admin, and that's kind of like our admin back office for employees. And here you can refund orders. That's also just a mock app to show you. But notice that a common thing among those apps is this blue button here. Right. It's actually quite blue. And they're very similar. So this is probably coming from our design system component library. Let's take a look.
7. Developing and Testing in a Monorepo
In VS Code, we can easily make changes to the UI library that are reflected in both the admin and shop apps. Development in a monorepo with Turborepo simplifies the process compared to separate repositories. We can also run test and lint tasks across all packages.
Let's go to VS Code again. And then let's take a look at packages, UI and then we have here src, button. So we have a very simple React button here. It doesn't do much, but we have some CSS here. Let's change this a little bit. So let's change it to some shade of black. And then let's go back to Firefox. We can actually see that reflected immediately. Right. So our dev environment is actually picking up changes across packages. Notice that I didn't make any change to my admin app. I actually changed my UI library. And then we can already see that reflected here. So see, it's already working the same way. And then if I go to my shop, we also see that the button is in a shade of black here. And that's pretty cool. So you can actually do development locally without having to set up anything extra. Imagine that you have your UI library in a different repository, and imagine that you're using that also in your admin app and your shop app. That's just much more difficult for you to do development, right? You'd have to release a new version of the library and then import this new version of the library in your app and so on. So it gets a little bit more complicated to do development. In a monorepo with Turborepo, things are much simpler to do.
Now back to VS Code. Since we made some changes here, let's take a look at how we can run test and lint. I'm just going to stop my dev server. And then I'm going to do a lint task here. I'm going to execute pnpm lint. And then let's see what happens. It executed a lint task. And we see here a bunch of other hints. So our lint here is configured to run across all of our packages.
8. Smart Task Execution in Monorepo
Turbo intelligently determines which lint tasks to execute based on the cache. It only executes necessary tasks, resulting in efficient monorepo development. Running the pnpm test command produces test output and also runs a build that was never explicitly requested.
So we have here UI lint, we have here shop lint and we have here admin lint. But notice down here that we see an output saying three successful, two cached and three in total. Which means that two of them have already been executed in the past. So Turbo already knows that, oh, just one package changed, which was the UI package. And if you look closely here, we can see that the UI lint here is a cache miss. So Turbo knows, hey, we don't have a cache for these hashes here. Turbo has a very smart hashing algorithm behind the scenes that's computing which files have changed, and then it knows where it needs to execute that task. So in this case, it is a cache miss. So it's going to execute the lint for this package. For the other packages, we have here shop lint. It's a cache hit. Notice that Turbo found this already in the cache. So it's not going to execute this. For the other package, admin lint, it's also a cache hit. So it has already been executed before. Because I executed that before showing you the demo, it's already cached. And that's pretty cool. Right. So now your monorepo is just executing what is actually necessary.
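The cache hit and cache miss behaviour can be illustrated with a small content-addressed cache. This is a sketch under stated assumptions and not Turborepo's real hashing: FNV-1a is a simple non-cryptographic hash standing in for whatever the tool uses, the fingerprint only covers the task name and file contents, and the TaskCache class, its method names and the file contents are all invented for this example.

```typescript
interface CachedRun {
  logs: string[];
}

// FNV-1a over UTF-16 code units: a stand-in hash, chosen only for brevity.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Fingerprint of a task run: the task name plus every input file's path and content.
function fingerprint(task: string, files: Record<string, string>): string {
  const parts = Object.keys(files)
    .sort()
    .map((path) => path + "\0" + files[path]);
  return fnv1a(task + "\0" + parts.join("\0"));
}

class TaskCache {
  private store = new Map<string, CachedRun>();

  // Runs `execute` only when no earlier run has the same fingerprint;
  // otherwise replays the logs recorded by that earlier run.
  run(
    task: string,
    files: Record<string, string>,
    execute: () => string[],
  ): { status: "hit" | "miss"; logs: string[] } {
    const key = fingerprint(task, files);
    const cached = this.store.get(key);
    if (cached) return { status: "hit", logs: cached.logs };
    const logs = execute();
    this.store.set(key, { logs });
    return { status: "miss", logs };
  }
}

const lintCache = new TaskCache();
const lint = () => ["lint: no problems found"];
console.log(lintCache.run("lint", { "src/Button.tsx": "color: blue" }, lint).status); // "miss"
console.log(lintCache.run("lint", { "src/Button.tsx": "color: blue" }, lint).status); // "hit"
```

Changing the content of a file changes the fingerprint, which is why only the package whose button was edited missed the cache in the demo.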
So now I'm going to run another command here: pnpm test. So let's test our packages. Let's see how that behaves. I execute the test task. And then we have a bunch of output here. So let's take a look. We have a different output now. We have the UI tests and the UI build. I didn't tell Turbo or pnpm to execute a build. Right.
9. Declaring Task Dependencies in Turbo
Learn how to declare dependencies between tasks in Turbo using turbo.json. The pipeline in the turbo.json file allows you to specify the order of task execution and their dependencies. By using the dependsOn key, you can ensure that tasks are executed in the correct order. Outputs can also be specified to define the expected output of each task.
I just told it to execute a test. But remember that I told you that you can actually declare dependencies between tasks. In this case, my test task is actually dependent on a build task. So it makes sure that the build is executed first. So let's take a look at how you can accomplish that with Turbo.
So once you have Turbo installed in your repository, the only thing that you need is a turbo.json file. It's this file here in the root of your monorepo, and that's the file where you declare everything related to Turborepo. And the most important thing that you have there is this pipeline here. Pipeline is where you declare your pipeline of tasks. And that's where you have to be very explicit about which tasks you want to run with Turbo and what their dependencies look like.
So for example, we have the build task here and then we have a key called dependsOn. dependsOn basically tells Turbo, hey, before you run the build task, make sure that you have executed anything else that it depends on. So in this case it's ^build (caret build), which means that, hey, execute the build task of any dependency that I depend on before you execute my build. So for example, if the admin app depends on the UI library, it's going to build the UI library first and then it's going to build the app next. That makes sure that the UI library will be ready before I build the app. The outputs key here declares what we expect to be an output of this task. So if I'm doing a build in a Next.js project, I can actually have several output folders, right? Next.js outputs several different folders. In this case, I want to be able to cache the dist folder and the .next folder, which means that you can actually retrieve those artifacts from the cache once you need them. The test task is no different. It follows the same standard. It has dependsOn and then it says ^build, same thing we had for build, but it also depends on build itself. Which means that whenever you execute a test task for any package, make sure that you execute a build first for any package that I depend on. Let's go back to our example for the app. So we have the shop app that depends on the UI library. If I want to test the shop app, I want to make sure that the UI library is ready and available for me. So I'm going to execute the build first and then I'm going to run the test. And that's also true for the library itself. It's going to build the library itself and then it's going to execute the test. Here outputs means that this task is expected to output something. Right.
10. Task Dependencies and Caching in Turbo
The test task doesn't output anything and only caches logs. The lint task doesn't output anything and can run in parallel. The dev command doesn't have dependencies, outputs, or caching.
In this case, it's an empty array. It doesn't output anything. So basically the test task doesn't have to cache anything besides the logs that it generates.
Then we have the lint task, which doesn't have a dependsOn, it just has the outputs. And here it declares an empty array again, which means that it doesn't output anything. It doesn't depend on anybody. Notice that we don't have a dependsOn key here, which means that you can run lint in parallel as much as possible.
Then we have the dev command. The dev command here also doesn't have any dependencies and also doesn't output anything. It just has cache set to false, which means that Turbo never caches anything for dev, because whenever you're doing development locally, you always want to have fresh output. All right.
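Putting the four tasks together, the configuration described above can be written down as a TypeScript object that mirrors the shape of turbo.json, followed by a small helper that expands a dependsOn list into concrete prerequisites. Treat this as a sketch: the dist/** and .next/** glob patterns are assumed (the talk only names the folders), the package graph and the prerequisites helper are invented for illustration, and the pipeline key is the one shown in the talk, which newer Turborepo releases may name differently.

```typescript
interface TaskConfig {
  dependsOn?: string[];
  outputs?: string[];
  cache?: boolean;
}

// Mirrors the turbo.json walked through in the talk; glob patterns are assumed.
const turboConfig: { pipeline: Record<string, TaskConfig> } = {
  pipeline: {
    build: { dependsOn: ["^build"], outputs: ["dist/**", ".next/**"] },
    test: { dependsOn: ["^build", "build"], outputs: [] },
    lint: { outputs: [] },
    dev: { cache: false },
  },
};

// Hypothetical package graph: package -> workspace packages it depends on.
const packageDeps: Record<string, string[]> = {
  ui: [],
  shop: ["ui"],
  admin: ["ui"],
};

// Expands one package's task into concrete "package#task" prerequisites.
// An entry starting with ^ refers to the same-named task in the package's
// dependencies; a plain entry refers to a task in the package itself.
function prerequisites(pkg: string, task: string): string[] {
  const dependsOn = turboConfig.pipeline[task]?.dependsOn ?? [];
  const result: string[] = [];
  for (const entry of dependsOn) {
    if (entry.startsWith("^")) {
      for (const dep of packageDeps[pkg] ?? []) {
        result.push(`${dep}#${entry.slice(1)}`);
      }
    } else {
      result.push(`${pkg}#${entry}`);
    }
  }
  return result;
}

console.log(prerequisites("shop", "test")); // ["ui#build", "shop#build"]
console.log(prerequisites("admin", "lint")); // []
```

The second call shows why lint can run fully in parallel: with no dependsOn entry there is nothing to wait for.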
11. Executing Tasks in Turbo
We execute multiple tasks across the monorepo using the turbo binary. In CI, Turbo executes tasks in parallel, considering dependencies.
Then we come back again to our package.json file. Remember that I showed you the dev command? It's the same for build, lint, test and CI. We're going to execute all of them. We have executed the lint and the test already, and then we're going to execute the CI task in our CI environment, and I'm going to show you what the output looks like there. So whenever you want to execute a task across your monorepo, you just execute it by calling the turbo binary. So it's just called turbo. And then you pass the command run, and then you name the task that Turbo needs to execute. In this case, it's test. So it runs just one specific task for all packages across your monorepo. In our CI, it's a little bit different. I'm telling Turbo to execute lint and test. So it's going to execute all of those tasks in parallel as much as possible. If they have any dependencies, then it will execute those task dependencies first, also in parallel where possible.
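For reference, the root scripts described here could look roughly like the object below, written as TypeScript so that it stays in one language with a small helper next to it. Only the dev script (turbo run dev --parallel) and the shape of the test and CI commands are spelled out in the talk; the exact build, lint and ci strings, the private flag and the tasksOf helper are assumptions made for this sketch.

```typescript
// A guess at the root package.json, expressed as an object literal.
const rootPackageJson = {
  private: true, // assumed; commonly set on a monorepo root
  scripts: {
    dev: "turbo run dev --parallel",
    build: "turbo run build", // assumed to follow the same pattern
    lint: "turbo run lint", // assumed to follow the same pattern
    test: "turbo run test",
    ci: "turbo run lint test", // assumed spelling of "lint and test"
  },
  devDependencies: {
    turbo: "latest",
  },
};

// Extracts the task names a root script hands to `turbo run`,
// ignoring flags such as --parallel.
function tasksOf(script: string): string[] {
  const words = script.trim().split(/\s+/);
  if (words[0] !== "turbo" || words[1] !== "run") return [];
  return words.slice(2).filter((word) => !word.startsWith("-"));
}

console.log(tasksOf(rootPackageJson.scripts.ci)); // ["lint", "test"]
console.log(tasksOf(rootPackageJson.scripts.dev)); // ["dev"]
```

Running pnpm test at the root would then invoke the test script, which hands the single task name test to Turbo.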
12. Caching and CI Behavior
Turbo caches the executed tasks and replays the logs. Turbo knows which tasks to execute based on the scripts declared in each package's package.json file. When pushing to GitHub, the CI environment fetches the remote cache and replays the logs.
All right, so I have shown you here in my terminal that Turbo is caching something already, right? We saw here that it had some cache before. So let me run a new command here. I'm going to execute the pnpm test command again. And now we have a different output. We see full turbo here, which means that Turborepo knows that nothing else changed in our package. We did change the button here, but we have already executed the test, right? So now Turbo has already cached this and it remembers for as long as you don't change the code. If I try to execute the lint again, same thing that we did before, it's also a full turbo. That's great. It's not executing anything else. It's very fast. You see, this is 100 milliseconds. It executed everything. And then it also showed us all the logs that it had before, because Turbo is caching them and replaying them to you, so you can see what happened before.
All right, so we have seen here how to do development with Turbo, how we can actually configure our tasks across the repository and how we can execute Turbo commands across the repo. So the thing that's missing here is how Turbo knows that it needs to execute a dev task or a lint task or a test task across our packages. The only thing that you have to make sure of is that your package has those tasks declared. So let's take a look at our UI package here and see what the package.json file looks like. So we have here the package.json file inside of the UI package, and here we have a bunch of scripts. But notice that we also have a dev command here, we have a test, and we have a lint. And that's how Turbo knows how to invoke those scripts. As long as they have the same name across the packages, Turbo will execute them.
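The name-matching rule is easy to picture: a task runs in every package whose package.json has a script with that exact name, and packages without it are skipped. The sketch below shows just that selection step. The script commands are invented placeholders (the talk only shows that the UI package has dev, test and lint scripts), and packagesWithScript is a hypothetical helper, not part of any tool.

```typescript
// Hypothetical per-package scripts; only the script names matter here.
const packageScripts: Record<string, Record<string, string>> = {
  ui: { dev: "tsc --watch", build: "tsc", test: "jest", lint: "eslint ." },
  shop: { dev: "next dev --port 3000", build: "next build", test: "jest", lint: "next lint" },
  admin: { dev: "next dev --port 3001", build: "next build", test: "jest", lint: "next lint" },
  "eslint-config": {}, // a config-only package with no scripts
};

// Lists the packages that declare a script with exactly this name.
function packagesWithScript(task: string): string[] {
  return Object.keys(packageScripts)
    .filter((pkg) => Object.prototype.hasOwnProperty.call(packageScripts[pkg], task))
    .sort();
}

console.log(packagesWithScript("lint")); // ["admin", "shop", "ui"]
console.log(packagesWithScript("deploy")); // []
```

This is why consistent script names across packages matter: a package that called its script check instead of lint would simply be left out of a lint run.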
All right, so now we know how Turbo figures this out across all of our packages. So let's commit this and then let's push it to GitHub and see how that behaves in our CI environment. So I'm pushing this straight to the main branch. And what do I expect there? I expect that my CI environment, being aware of the cache, is going to be able to fetch this remote cache that I generated locally here, and will just replay the logs. Remember that I told you that Turbo never does the same work twice? So it should also be true for our CI environment. If we have done a task already, either on our local machine or in a CI environment, then Turbo should be able to pick up this cache and just replay the logs. Let's head to GitHub and let's see how this looks. I'm now on the GitHub page of this repository and I can see here that the action has already been executed. So let's check it out.
13. Executing CI Tasks with Turbo
The checks step is where the pnpm commands run. The CI script, run with pnpm ci, executes linting and testing in parallel while respecting dependencies. Turbo fetches the cache generated locally and shares it across CI runners, even with different CI systems.
So we have a bunch of steps here, but the most important one is the checks. That's the one where we actually execute our pnpm commands. So we have a command here that we executed. Let's increase the font a little bit. And then we have a run pnpm ci. And that's our CI script. Let's take a look at VS Code again, and then we should be able to see the CI script here. You see turbo run with lint and test. So it's doing linting and testing at the same time.
Back to GitHub, let's take a look at the output here. So we see UI build, we see admin lint. We see UI lint, UI tests, a bunch of outputs here, and we expect that Turbo will execute them in parallel, but also respect the dependencies. So let's scroll down a little bit more and then let's see the output. It took like 1.3 seconds to execute all of those tasks. And then we see full turbo here again, which means that Turbo was able to fetch this cache that I generated locally here on my machine and then share it across all of my CI runners. And if you're not using GitHub Actions, but you're using GitLab CI, for example, that works in the same way. You can still share your cache across any other continuous integration system.
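One way to picture what happened in CI is a two-tier lookup: check the machine's local cache first, then fall back to a shared remote cache, and only run the task when both miss. The sketch below models both tiers as in-memory maps. It is an illustration of the idea only; the lookup function, the Store type and the abc123 key are invented, and nothing here reflects how Turborepo actually talks to its remote cache.

```typescript
// Maps a task fingerprint to the logs recorded when the task last ran.
type Store = Map<string, string[]>;

// Looks in the local store first, then in the shared remote store.
// A remote hit is copied into the local store for next time.
function lookup(
  hash: string,
  local: Store,
  remote: Store,
): { source: "local" | "remote" | "none"; logs: string[] } {
  const localLogs = local.get(hash);
  if (localLogs) return { source: "local", logs: localLogs };
  const remoteLogs = remote.get(hash);
  if (remoteLogs) {
    local.set(hash, remoteLogs);
    return { source: "remote", logs: remoteLogs };
  }
  return { source: "none", logs: [] };
}

// A developer machine uploaded this entry earlier (made-up key and logs).
const sharedRemote: Store = new Map([["abc123", ["ui:test: all tests passed"]]]);
// A fresh CI runner starts with an empty local cache.
const ciLocal: Store = new Map();

console.log(lookup("abc123", ciLocal, sharedRemote).source); // "remote"
console.log(lookup("abc123", ciLocal, sharedRemote).source); // "local"
```

In this picture, the CI runner in the demo finds every fingerprint in the shared tier, so it only replays logs instead of running lint, build and test again.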
And that's it. And that's what I wanted to share with you here. Thank you very much. And I'll see you around.