Video Summary and Transcription
This talk discusses automating code changes across 100 repositories, using tools like JSCodeShift and abstract syntax trees. The speaker shares a real use case of maintaining a design system library and making a breaking change to a component. The talk emphasizes the importance of automating repetitive tasks and using the power of abstract syntax trees for code changes. The Q&A session covers topics like source code formatting, TypeScript support, and cultural embedding of code mods. The talk concludes with insights on when automation is worth it and the limitations of code mods for monorepo changes.
1. Automating Code Changes for 100 Repositories
Hello everyone! Today, I will share how I automated code changes for 100 repositories. I work at Xebia as a software developer consultant and help organize conferences. Let me give you some details about the scale of the product. Marktplaats.nl is the largest marketplace in the Netherlands with millions of visitors and 90 million live ads. We run 59 standalone BFF services in production with more than 15 releases daily. We have a front-end platform team responsible for enabling engineers to deliver secure and efficient software.
So, hello everyone! How's it going? Do you enjoy the conference? Okay, cool.
How many developers are here? Wow, quite a lot. How many lazy developers are here? All right, a bit less, so let me rephrase. How many lazy developers who like to automate things? Wow, I like it, now we're talking!
Okay, today I will share how I automated code changes for 100 repositories. My name is Konstantin. I'm working at Xebia in the Netherlands as a software developer consultant, and I also help GitNation to organise the conferences. Feel free to reach out to me on Twitter or LinkedIn.
My talk is connected to my current customer, and to understand the problem and the solution better, I'll give you a few details about the scale of the product itself. I'm working at Marktplaats.nl. This is the largest marketplace in the Netherlands. It has millions of visitors daily. Today, it has 90 million live ads, with hundreds of thousands of new ads every day. If you come from the Netherlands, you know this product for sure. If you are from Germany, you can compare it to eBay DE.
For today, we run 59 standalone BFF services in production. On average, there are more than 15 production releases daily. And there are no limits for that. To support all the required features, we have more than 100 front-end-related repositories with JavaScript or TypeScript code. And to make it work properly, the code ownership is spread across different front-end teams. Yeah, those front-end teams are part of bigger domain teams, or full stack teams, you can call them. And those are strongly focused only on building business logic. So, ideally, nothing else.
And to support the product teams, we have a front-end platform team. This is the team I'm working in. Our goal, basically, is to enable other front-end engineers to deliver secure, reliable, and efficient software. So, my team is responsible for many things in the front-end platform itself: builds, deployments, the React server-side rendering engine, design system libraries, performance, security, monitoring, and observability. All of these are handled by the front-end platform team. This distributed setup has a lot of advantages.
2. Challenges and Updating Projects
At the same time, libraries publish npm packages, and a change in one library can lead to multiple updates in dependent projects. We use internal tools and Renovate bot to track and automate these updates. Although it may seem like overhead, we benefit greatly from this approach. Let's move on to a real use case: maintaining a design system library and making changes to a component called primary button. To update a project to the latest design system package, we need to fix the toast component by replacing primary button with button and adding a new property called kind.
At the same time, it obviously has some challenges, right? For example, the libraries publish npm packages, and a change in one library usually leads to multiple updates in dependent projects. Of course, we have some internal tools to track and orchestrate all these dependencies between the projects. And we use Renovate bot, for instance, to perform automated updates for them.
Well, that might look like overhead, maybe a hassle, but we benefit a lot from this approach. Yeah. If you want to discuss the architecture more in depth, you can find me after the talk or reach out to me on Twitter. My Twitter handle is at the bottom of the slide. You can follow me, so we can discuss architecture. But today I'm going to talk about other things.
Let's get closer to a more real use case example. Imagine you are maintaining a design system library that exports the primary button component. So far this component has only one property, a close handler. There's another project, for instance, some backend for frontend, which has a toast component. It doesn't really matter what this component does. You only need to know that it imports and uses primary button, which is provided by the design system package, version 1. And as the design system library maintainer, one day you decide to make some changes. So primary button no longer exists. It's replaced by just button, which is now exported instead. And the new component has an extra property, as you can see, the property kind, which is basically responsible for how the button looks. From the perspective of the design system package, it's a breaking change. That's why we released version 2, to follow semantic versioning.
Now, to update the project to the latest design system package, we need to fix our toast component. We basically need to replace primary button with button, add the new property kind to button, and use the value primary for it. Everything else should stay the same, like children or other properties. So far, it's obvious what to do. This Git diff looks really simple, but someone has to make those changes.
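To make the change concrete, here is a rough sketch of the toast source before and after the upgrade. The sources are held as strings so the snippet runs in plain Node. The package name `design-system`, the `onClick` prop, and the `Close` label are assumptions pieced together from the description in the talk, not the real code.

```javascript
// Toast as it might look against design system v1 (assumed names).
const toastBefore = `
import { PrimaryButton } from 'design-system';

export const Toast = ({ onClick }) => (
  <PrimaryButton onClick={onClick}>Close</PrimaryButton>
);
`;

// The same component after a manual upgrade to design system v2.
const toastAfter = `
import { Button } from 'design-system';

export const Toast = ({ onClick }) => (
  <Button kind="primary" onClick={onClick}>Close</Button>
);
`;

// Every remaining mention of the old name marks work still to be done.
const countLegacyUsages = (source) =>
  (source.match(/\bPrimaryButton\b/g) || []).length;

console.log(countLegacyUsages(toastBefore)); // 3: import, opening tag, closing tag
console.log(countLegacyUsages(toastAfter)); // 0
```

Three touch points in one tiny file is manageable by hand; the cost comes from repeating it across every file and every project that uses the component.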
3. Automating Code Changes with Abstract Syntax Tree
So we can automate code changes to support other frontend teams using an automated pipeline. By modifying the code, applying auto-fixes, and running tests, we ensure a smooth process. The most powerful way to update code is by using an abstract syntax tree, which allows us to manipulate the object representation of code. Tools like Babel and ESLint work with this concept. JSCodeShift is a powerful tool for effectively mutating nodes in the abstract syntax tree.
So we can wait until other frontend teams pick up those changes and fix these dependencies, but developers are lazy, right? You know it. And as a platform team, we don't want our changes to fail and somehow block other frontend teams. So we need to support them with that. But how can we do it? Many files use this component and library. And let's even multiply that by the number of projects that use, for instance, the design system.
So, automation to the rescue. This is the pipeline I use to automate such changes. The first step is code base preparation, like a GitHub Codespace or a checkout on your local machine. Then you modify the code the way you want. Then we apply auto-fixes like code styling or ESLint fixes. To make sure that nothing is broken, we run tests and the build. And the final step, of course, is to deliver or commit your changes.
Okay. Let's focus only on the code modification step. We can update code in many automated ways, but the most powerful way is based on the use of an abstract syntax tree. If you're not familiar with abstract syntax trees, you can basically think of one as an object representation of code after it has been parsed. So, for example, this is a function named greet, right? This function returns the string Hi Berlin. This is its abstract syntax tree, which looks like an array of nodes with attributes. The defined function corresponds to a function declaration node that has the identifier greet, and it doesn't have any params. So what is the body of this function? It is just an array of statements, but our function has only one return statement, which returns the string literal Hi Berlin.
Okay, so we can parse JavaScript or TypeScript code into an abstract syntax tree, or basically just an object. You can think of it like an object. That gives us the possibility to update the tree easily, because we can manipulate this object. For instance, we can add, remove, replace, or even update some nodes, and if we generate new source code from the mutated abstract syntax tree, we get new, modified source code. You know tools like Babel or ESLint? This is basically what's happening there under the hood. So let's get back to our example with the toast and the design system package. We can use Babel or another parser to convert source code to an abstract syntax tree, but how do we mutate the nodes effectively? We can traverse manually, search somehow, but there are better ways. There are several great tools that could help us with that, but I find JSCodeShift is one of the most powerful tools for that. And truly, I think that this is an underused and underestimated tool. This is basically a codemod toolkit that was open-sourced by Facebook in 2015, so it's not a new library.
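The five steps can be sketched as a list of shell commands per repository. This is a dry-run illustration only: the function `buildPipeline`, the `workdir` folder, and the commit message are hypothetical, dependency installation is left out, and the real pipeline behind the talk is not shown in the source.

```javascript
// Build the commands for one repository without executing anything.
// repoUrl and transformPath are placeholders supplied by the caller.
function buildPipeline(repoUrl, transformPath) {
  return [
    `git clone ${repoUrl} workdir`, // 1. prepare the code base
    `npx jscodeshift -t ${transformPath} workdir/src`, // 2. modify the code
    'npx eslint --fix workdir/src', // 3. apply auto-fixes
    'npm --prefix workdir test', // 4. make sure nothing is broken
    'git -C workdir commit -am "Apply codemod"', // 5. deliver the change
  ];
}

const steps = buildPipeline('git@example.com:org/repo.git', 'transform.js');
steps.forEach((command, index) => console.log(`${index + 1}. ${command}`));
```

Keeping the steps as data makes it easy to loop over a list of repositories and to log or review the commands before anything touches a real checkout.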
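As an illustration of the slide described above, the tree can be written out by hand as a plain object. The function name is assumed here to be `greet`, and the node type names follow Babel's naming (`StringLiteral`); other parsers label the same nodes slightly differently.

```javascript
// Hand-written AST for:  function greet() { return "Hi Berlin"; }
const ast = {
  type: 'Program',
  body: [
    {
      type: 'FunctionDeclaration',
      id: { type: 'Identifier', name: 'greet' },
      params: [],
      body: {
        type: 'BlockStatement',
        body: [
          {
            type: 'ReturnStatement',
            argument: { type: 'StringLiteral', value: 'Hi Berlin' },
          },
        ],
      },
    },
  ],
};

// Collect every node type in the tree, depth first.
function nodeTypes(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => nodeTypes(child, found));
  } else if (node && typeof node === 'object') {
    if (node.type) found.push(node.type);
    Object.values(node).forEach((child) => nodeTypes(child, found));
  }
  return found;
}

console.log(nodeTypes(ast));
```

Real parser output carries extra bookkeeping such as source locations, but the shape is the same: nested objects with a `type` and a few type-specific fields.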
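The parse, mutate, and print cycle can be shown without any library. The sketch below uses a hand-written tree and a toy printer that only understands the five node types involved; it stands in for what real tools do and is not how Babel or JSCodeShift are implemented. The replacement string `Hi Amsterdam` is arbitrary.

```javascript
// Toy printer: turns the tiny AST below back into source code.
function print(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(print).join('\n');
    case 'FunctionDeclaration':
      return `function ${node.id.name}() ${print(node.body)}`;
    case 'BlockStatement':
      return `{ ${node.body.map(print).join(' ')} }`;
    case 'ReturnStatement':
      return `return ${print(node.argument)};`;
    case 'StringLiteral':
      return JSON.stringify(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const ast = {
  type: 'Program',
  body: [
    {
      type: 'FunctionDeclaration',
      id: { type: 'Identifier', name: 'greet' },
      params: [],
      body: {
        type: 'BlockStatement',
        body: [
          {
            type: 'ReturnStatement',
            argument: { type: 'StringLiteral', value: 'Hi Berlin' },
          },
        ],
      },
    },
  ],
};

const before = print(ast);

// Mutation: update one node in place, then generate the source again.
ast.body[0].body.body[0].argument.value = 'Hi Amsterdam';
const after = print(ast);

console.log(before); // function greet() { return "Hi Berlin"; }
console.log(after); // function greet() { return "Hi Amsterdam"; }
```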
4. JSCodeShift and Code Mod Implementation
React is two years older than JSCodeShift and is still widely used. JSCodeShift is a popular tool used by many projects. It provides a simple API for traversing and modifying the code tree. The transformation function is a crucial part of JSCodeShift, and it accepts several parameters. The code is parsed into an abstract syntax tree at the start and generated back at the end, and all mutations happen in between. Let's implement a code mod for a breaking change in our design system library and apply it to the toast component.
But yeah, React is also two years older than JSCodeShift. And we are today still talking about React at this conference. JSCodeShift is used by many other projects. For instance, you can find it as a dependency in React, Remix, Blitz.js, Storybook, you name it. There are many of them.
And to traverse and modify your tree, JSCodeShift provides a simple API. For instance, the CLI takes the location of a transformation function as one of its parameters, and you need to provide a path to the files you want to modify. A lot of codemods are already written. You can find them on the Internet and on GitHub.
But let's create one for our specific use case. Well, this is the source code of the transformation function. It might look scary at first glance, but you will for sure understand it in the next couple of minutes. Bear with me. A transformation, or transform, is just one JavaScript function that accepts several parameters. You can see file info: these are details about the file being processed. Then API: this is an object that provides access to the helper functions. And options are extra parameters you can use for parsing or for the manipulation itself, so you can provide them from external calls, for instance. And this slide is crucial for understanding. On the first highlighted line, you see the code is parsed from basically a string, the source, to an abstract syntax tree. And on the last line, we generate our code back from the abstract syntax tree. So as you might already guess, all mutations happen in between these lines.
Okay, let's implement a code mod for the breaking change from our design system library and let's apply it to the toast component. This is the part of the abstract syntax tree that represents the node with our import declaration. You see the source is the design system package, and it has three specifiers, import specifiers, and we are looking specifically for the one for our primary button, which is in the middle.
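A simplified version of that import node, written as a plain object, might look as follows. The package name `design-system` and the neighbouring imports `Card` and `Icon` are made up for illustration; only the primary button in the middle comes from the talk.

```javascript
// Simplified AST for:
//   import { Card, PrimaryButton, Icon } from 'design-system';
const importDeclaration = {
  type: 'ImportDeclaration',
  source: { type: 'StringLiteral', value: 'design-system' },
  specifiers: ['Card', 'PrimaryButton', 'Icon'].map((name) => ({
    type: 'ImportSpecifier',
    imported: { type: 'Identifier', name }, // name exported by the package
    local: { type: 'Identifier', name }, // name used inside this file
  })),
};

// Locate the specifier the codemod has to replace.
const index = importDeclaration.specifiers.findIndex(
  (specifier) => specifier.imported.name === 'PrimaryButton'
);

console.log(index); // 1, the middle specifier
```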
5. Fixing Imports and Mutating the AST
So let's fix it with JSCodeShift transformations. First, we need to find the node responsible for imports from the design system package. We use the find method to locate the import declaration node type. We then replace the primary button node with a new one using a utility function from JSCodeShift. Next, we handle the primary button JSX element within the component by creating a new JSX element and copying the existing attributes from the legacy node. Finally, we copy all the children from the legacy node. That's it!
So let's fix it with JSCodeShift transformations. First, we need to find the node that is responsible for imports from the design system package. JSCodeShift provides an API for that, and in this case we use the find method. We are looking for the import declaration node type. The find method also accepts filter parameters. For instance, we are looking for a node whose source attribute has the value design system. Then, within this collection of imports, we look for the import specifier whose imported name is primary button. Quite simple so far.
And we have another method, a utility function from JSCodeShift, that helps us to replace the existing node with another one. This is a mutation operation on the AST. So we replace this primary button node with a new one. And to create the new one, we use another API: we can construct a new import specifier with the identifier button. After this step, basically, our import is fixed and our AST is mutated.
But how should we deal with the primary button JSX element within the component? We already know the concept. We know the approach. We are looking at all the JSX elements, and specifically for those with a primary button opening element, because it's not a self-closing element. As our component can use several primary button elements, we get a collection back from these methods. So for each of these elements, we need to mutate the legacy element into a new one.
So in order to do this, let's create a new JSX element. We're using the same approach from the JSCodeShift API: create a new JSX element, and here we provide button as the opening element. For our opening element, we can also add the attribute kind we were talking about before, and we provide the value primary. Don't forget to copy all existing attributes from the legacy node, right? In our case, it's the on click property. And of course, there is the closing button element. As the last step, we need to provide all the children, so we copy all the children from the legacy node. In our case, it's just the close text. Okay, final step. We are all familiar with this API by now.
6. Automating Code Changes and Takeaways
We just implemented the code mod to update the toast component. Now, we can apply the same code mod to the entire code base. We have more than a hundred projects, but with our automated pipeline, we can update the entire code base for all the front ends. Takeaways: automate repetitive tasks and use the power of abstract syntax trees for code changes.
We replace all legacy nodes with newly created ones. And that's it. We just implemented the code mod to update the toast component, right? Now, we can apply the same code mod to the entire code base. JSCodeShift accepts not only one file path but paths to many files, so this command will basically update all the usages of the primary button within one project. But how should we deal with other projects? We have more than a hundred projects. So yeah, remember, we have this automated pipeline, which I mentioned before. We can integrate the code mod into this pipeline. And with this approach we can update the entire code base for all the front ends. And yeah, takeaways. Automate repetitive tasks. Use the power of abstract syntax trees for code changes. And go home and write your first code mod if you haven't done it yet. Thank you. (Applause)
Thank you ever so much. Please join me over here for some questions.
7. Q&A on Source Code AST Transformation
Wow, we've had a lot of questions. Thank you ever so much to our audience members for submitting them. The first question is: does the source code to AST to source code transformation keep the original source code formatting? By default, yes. Next question. Do you create these transformers or are teams forced to also commit them? Well, originally I introduced this approach to create transformers within our frontend guild or chapter. Now I want the teams to use transformers in the same way. How good have you found TypeScript or TSX support is with JSCodeShift? Yeah, it's fully supported. Cool. Thank you. So, next one. Angular has nice tooling for this already.
Wow, we've had a lot of questions. Thank you ever so much to our audience members for submitting them. We'll crack right on and get through however many we can in the 12ish minutes we have.
The first question is does sourcecode-ast-sourcecode transformation keep the original source code formatting? That's a good question because we can use different parsers and different parsers provide different outputs. For instance, we have Babel parser, Recast or TypeScript parser and some of them provide your original indentations or locations of every node. By default, yes, but we can also provide this option parameter which is the third one. There we can also add some extra things like prettify code the way we want and so on. But to answer this question, yes. Excellent. Thank you.
Next question. Wow, we have so many. Do you create these transformers or are teams forced to also commit them? Well, originally I introduced this approach to create transformers within our frontend guild or chapter, I would say. Now I want the teams to use transformers in the same way, because it's really powerful and there are some edge cases where we need to update code in many places. And, yeah, this is already used by other developers. Basically, we have a CLI tool, and this CLI tool is integrated into our pipeline, so we can just provide a transformation function and it will apply the changes everywhere. By the way, this transform.js file can even be a URL, so JSCodeShift can download it from somewhere outside and apply your transformations. Interesting. And I think this might come up again in some of the other questions that have been asked. But before we get to those, how good have you found TypeScript or TSX support is with JSCodeShift? Yeah, it's fully supported. You can use the proper parser for that, and you can update types the same way you update other nodes. So yeah, TypeScript also has node types for everything in the AST. Cool. Thank you.
So, next one. Angular has nice tooling for this already.
8. Automation and Cultural Embedding of Code Mods
For example, automatically running code updates when updating libraries using ng update. Not yet, as our case is not really about updating libraries; that's rare. Codeshift has a big community and a website called Codeshift Community. You can provide a config that defines the migration from version 1 to version 2. It's important to ship code mods with breaking changes to keep consumers happy. Not every project needs code mods, but specific cases can be fixed with them. After the Q&A, you can ask Konstantin more questions directly. A good approach for testing codemods is to create input fixture files and compare the output using a testing framework.
For example, automatically running code updates when updating libraries using ng update. Do you use similar automation? Not yet, because our case is not really related to updating libraries. This is just an example, and it's really rare. But, for instance, Codeshift has quite a big community, and there is a website called Codeshift Community or something like that. They built a CLI which basically provides some extra API on top of JSCodeShift. You can provide a config where you define that you are migrating from design system version 1 to design system version 2, and this is the transformation function to apply. So, basically, you can ship transformation functions with the releases of your library, like ng update does. Cool.
There's a few that are kind of similar. I'll ask the one that's at the top of my list here. Which is how do you, or how do you suggest others, culturally embed writing code mods? Is there a concept like who writes breaking changes must supply a code mod? Well, it depends, of course. But if you want to make the consumers of your library or project happy, just ship it with code mods, right? You can release breaking changes, but no one will be happy about it. But if people can update smoothly, you will get more benefit from it. Yeah. And that's completely fair. I wonder, though, if there are more processes or mental models or cultural rituals that you build within your engineering team that encourage people to build code mods specifically when they make breaking changes? Yeah. The thing is that not every project or company needs code mods, right? Because it's quite hard code to write and it requires some skills. But, for instance, if you have a specific case, you can fix it with code mods. Just try once, see the smiles on the faces of other developers, just show them my presentation, and it will be okay.
And just a reminder for the audience, when we're done with our Q&A, we've still got some time. We can always go over to the speaker Q&A area and you're welcome to ask Konstantin some questions directly, because there are so many, I don't think we're going to get through them all in eight minutes. I think the next one, I think currently the most upvoted question is, is there any good tooling for testing codemods? Or approaches perhaps, not necessarily just tooling. Well, yeah, basically the approach is quite simple. You can just create an input file, like a fixture, together with the expected output. And basically you can provide those fixtures to a testing framework and run the codemod on them.
9. Testing and Mono-repos
So it will take the original file, compare it with expected, and test it. You can output new files for easy comparison. Regarding mono-repos, it doesn't make sense to run Codemod unless you have some edge case. If you have independent code that needs to be updated in multiple places, Codemod can be useful.
So it will take this original file, compare it with expected, and it will be tested. Cool. Oh, because you can output new files, so you can do an easy comparison. Awesome.
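The fixture idea can be sketched as a tiny helper that runs a transform on an input string and compares the result with the expected output. The helper `checkFixture` and the string-based stand-in transform are hypothetical and exist only so the snippet runs without dependencies; a real codemod would receive the jscodeshift `api` and be run inside a test framework.

```javascript
// Run a transform against an input fixture and compare with the expected output.
function checkFixture(transform, api, input, expected) {
  const output = transform({ path: 'fixture.js', source: input }, api, {});
  return String(output).trim() === expected.trim();
}

// Stand-in transform: a real codemod would work on the AST via api.jscodeshift.
const renameTransform = (fileInfo) =>
  fileInfo.source.replace(/\bPrimaryButton\b/g, 'Button');

const input = "import { PrimaryButton } from 'design-system';\n";
const expected = "import { Button } from 'design-system';\n";

console.log(checkFixture(renameTransform, {}, input, expected)); // true
```

In practice the input and expected strings live in fixture files next to the transform, which keeps each edge case readable as a before and after pair.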
What is your take on mono-repos or, asked differently, would you rather run a codemod in one mono-repo or many non-mono-repos? It doesn't really make sense to run Codemod in mono-repo unless you have some edge case. But for instance, if you are exporting one button, in the case I just showed, you can just use internal IDE tools to rename it everywhere, right? Yeah. So it doesn't really matter. But if you, for instance, have independent code, which has to be updated in many places, maybe, maybe you could use Codemod. But it's not so relevant to mono-repos. Yeah, makes sense. I think this is a forever developer question.
10. Automation Worth and Codemods
At what point is automation worth it? I recently spent a whole day writing a script for a boring task that would have taken me an hour or two manually. Codemods work with the as keyword to alias imports. JSCodeShift is an abstraction above the AST that supports different parsers. Transform functions can be parameterized for common reusable actions. The tools presented today may not be the most appropriate for monorepo changes.
At what point is automation, in your opinion, even worth it? Have you had any examples where it would take more time to build automation scripts than doing the manual work to update instances?
Well, it's kind of related to the topic of laziness, and also of curiosity. I could endlessly do this job, changing things here and there all the time. But I'm a software developer by nature, and I like to build these things. Of course, as a software development consultant, I have to sell my customers this approach. So I build some POC, for instance, and show how it works. I didn't spend a lot of time on it. Getting started with JSCodeShift is really easy. You can try it and get the benefit in one hour. It's more about integration into the current process. But once I achieved this and brought in this CLI, it was used multiple times, and it has already paid for itself.
I recently had a scenario where I spent a whole day writing a script to do something, and it was a boring, menial task. I didn't want to do it, but it would have only taken me an hour or two to do it manually.
How do codemods work with the as keyword to alias imports? I'm pretty sure this is part of the AST structure. So if it's parsed by Babel, for instance, it will be supported by JSCodeShift as well. JSCodeShift, once again, is just an abstraction above the AST which provides you some helper functions to traverse and mutate the tree. And the parser could be different. It can be TypeScript. It can be Babel. It can be another parser. So if it's supported in JavaScript, if it's supported in Babel, it will work with JSCodeShift as well.
Is it possible to parameterize the transform functions to create common reusable actions, for example, rename component.js, change props.js, and so on? Exactly. This is what the third parameter, options, is for. You can provide extra parameters, any custom parameters, and you can read them from that object. Awesome. This is a very flexible and extensible way to be doing tasks like this. It's kind of the vibe I'm getting. Every question is like, does it work in this context? Does it work in that context? You're like, yeah. Yeah, it does. I suppose I'll ask the inverse question, which is, in what circumstances have you found that the tools you present today are not necessarily the most appropriate way to make these kinds of changes? Or are there any? Well, the first one we just mentioned is the monorepo. So you have to think twice.
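In the AST, an aliased import keeps two names on one specifier: `imported` for the name the package exports and `local` for the name used in the file. The plain-object sketch below, with a hypothetical alias `Cta` and helper `renameImported`, shows how a codemod could rename only the imported side.

```javascript
// Simplified specifier for:  import { PrimaryButton as Cta } from 'design-system';
const aliased = {
  type: 'ImportSpecifier',
  imported: { type: 'Identifier', name: 'PrimaryButton' }, // exported by the library
  local: { type: 'Identifier', name: 'Cta' }, // used inside this file
};

// Rename only the imported side so existing usages of the alias keep working.
function renameImported(specifier, newName) {
  return { ...specifier, imported: { ...specifier.imported, name: newName } };
}

const updated = renameImported(aliased, 'Button');
console.log(`${updated.imported.name} as ${updated.local.name}`); // Button as Cta
```

A codemod that ignores `local` would miss JSX usages written with the alias, so checking both names is a sensible first step.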
11. Code Mod and Migration
Do you need a codemod or not? If your job is not repetitive and doesn't require automation, a codemod may not be a good choice. You can use a codemod to migrate from Enzyme to Testing Library and perform basic migrations. However, using the diff of the AST of two different library versions to generate transformers is not feasible. Thank you for the amazing talk and for all the questions from the audience.
Do you need a codemod or not? Second one: you don't even have a monorepo. You have one project. There's no need for a codemod there either. So if you have some job which is not repetitive, which does not require automation, then maybe a codemod is not a good choice. Yeah, a little bit of a heavy-handed approach, perhaps.
As we're rounding out and we have just a few minutes left, I'm going to take a scroll through the questions and find those which have kind of the most upvotes. So the ones where most people have asked for them to be asked.
Could you use a codemod to migrate from Enzyme to Testing Library? Well, you could probably implement one. I haven't checked whether it exists yet, so check first; maybe it's already there. But you probably could, because with a codemod you can basically do a lot. For instance, you can find a codemod that migrates class components to functional components in React's codemod CLI library. It doesn't support advanced cases, but it does some basic migration. If you need to use static analysis to understand what needs to be done, you can base some algorithm on that. So you can basically implement this codemod.
Cool. I think we might have time for one more really quick one, which is can you use the diff of the AST of two different library versions to generate transformers rather than writing them yourself. Well, not really, I guess, because you need to reverse the approach, right? You have to make a transform function out of these two source codes. It's kind of tricky. I guess it's time for AI for that. Yeah, yeah, absolutely. It saves us all the time writing code, perhaps any code at all, and then what do we do? Thank you so much for the amazing talk. Thank you so much for being so flexible in using up the time we have today when we asked you to fill this slot. Thank you on behalf of the whole audience. Thank you to everyone who asked questions. I know we didn't get round to all of them.