1. Introduction to Road to Zero Lint Failures
Today I'll be talking about our journey from thousands of Lint failures to zero. Lint rules ensure consistency, bug prevention, and maintenance. We run our Lint rules in three stages: during development, pre-commit, and pre-merge. For context, our codebase at LinkedIn is over 8 years old. We have over 24,000 files and over 100k commits. This is compounded by having a lot of Lint failures already existing in the codebase as we were starting this process.
Hi, and welcome to Road to Zero Lint Failures. I'm Chris Ng and I'm a Senior Staff Software Engineer at LinkedIn. Today I'll be talking about our journey from thousands of Lint failures to zero. So why Lint rules? Lint rules ensure consistency, bug prevention, and maintenance. They allow us to scale guidance to all the other teams within LinkedIn, to ensure that everyone is covered with the latest and greatest code standards and with edge cases that prevent common mistakes, and to ensure that the codebase is consistent and deterministic when we're doing migrations.
We run our Lint rules in three stages: during development, pre-commit, and pre-merge. This way you are alerted that you are violating a Lint rule as early as possible, before involving other engineers. So what's the problem, right? Let's add all the Lint rules. I think this is a common conception; unfortunately, in real life we face scaling issues.
So, code quality and scale. For context, our codebase at LinkedIn is over 8 years old. React was version 0.13 at the time this repo was created. We have over 24,000 files and over 100k commits. This is compounded by having a lot of Lint failures already existing in the codebase as we were starting this process. I believe there were over 7,000 Lint failures when we started the road to zero. This is also coupled with an ever-increasing number of Lint failures being introduced to the codebase, either through new Lint rules or through new code that does not fix existing Lint failures.
2. Road to Zero Lint Failures: Incentives and Tooling
We introduced rules to limit the introduction of Lint failures and implemented a two-step review process. However, these measures did not completely eliminate Lint failures. To address this, we implemented a new rule where every Lint rule must fix all existing errors before being enabled. We also focused on providing incentives for developers to fix Lint failures, such as technical support, shout outs, and a good developer experience. We made it easy for engineers to fix Lint failures by providing tooling that identified errors and their owners.
So ideally, you come to a campsite and you leave it better than you found it. That's kind of like the campsite analogy. The problem is what if you come to the campsite when it already looks like this, pretty dirty, lots of Lint failures. Are you very incentivized to clean this up? So what we found out was that a lot of people are not very incentivized to clean this up.
So we started adding some rules to limit the number of Lint failures getting introduced to the codebase. We started blocking commits when there were Lint failures in the files being changed. We set limits for certain teams, say, you can have 10 Lint failures for your particular team. We had a two-step review process where there was a group of people who were enforcing, for lack of a better term, the standards in the codebase. Some of these worked and some of them didn't, but none of them quite got us to zero Lint failures.
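The first two policies above can be pictured as a small pre-commit check. This is only a sketch of the idea, not the tooling we actually ran: the `LintFailure` shape, the `checkCommit` function, and the `teamLimits` map are all hypothetical names introduced for illustration.

```typescript
// Hypothetical shapes; none of these names come from the talk.
interface LintFailure {
  filePath: string;
  ruleId: string;
  team: string;
}

interface CommitCheck {
  blocked: boolean;
  reasons: string[];
}

// Decide whether a commit should be blocked, given the files it touches,
// the current lint failures, and an optional per-team failure limit.
function checkCommit(
  changedFiles: string[],
  failures: LintFailure[],
  teamLimits: Record<string, number>,
): CommitCheck {
  const reasons: string[] = [];
  const changed = new Set(changedFiles);
  const perTeam = new Map<string, number>();

  for (const failure of failures) {
    // Policy 1: a touched file must not carry lint failures.
    if (changed.has(failure.filePath)) {
      reasons.push(`${failure.filePath}: ${failure.ruleId}`);
    }
    perTeam.set(failure.team, (perTeam.get(failure.team) ?? 0) + 1);
  }

  // Policy 2: a team must stay within its allowed number of failures.
  perTeam.forEach((count, team) => {
    const limit = teamLimits[team];
    if (limit !== undefined && count > limit) {
      reasons.push(`team ${team}: ${count} failures exceeds limit of ${limit}`);
    }
  });

  return { blocked: reasons.length > 0, reasons };
}
```

A check like this only reports; whether it hard-blocks or warns is a policy decision layered on top.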
So let's visit the analogy again, right? How does this fail our analogy? So if we block commits and we have this two-step review system, but your manager now tells you, hey, you need to land something really, really fast, what do you think is going to happen? What we saw happen is that people code around the problem, either by asking someone for an exemption or by actually coding around it, introducing a new class or something like that, and then ship it, avoid all the issues, get it to production, get it to members as fast as possible, and get the impact that you want. This really ruins the campsite analogy.
That's why we introduced a new rule where every Lint rule that's introduced must have all of its existing errors fixed before we can enable it as an error-severity Lint rule. That means we stop the bleeding. There are no new Lint rules getting added that make existing files hard to maintain. But then, as a Lint author, you're like, what if there are 1,000 errors? I have to fix all 1,000 errors? I don't know how to fix all these existing errors. I don't have time to do this. What if I break something and cause a product issue? What we figured out was that these are the same issues that people who work on the codebase daily have to deal with when they see a Lint failure, and they're not incentivized to fix it. And so the Lint author kind of needs to be on the same page as the people who are getting linted.
So the road to zero Lint failures, it's all about incentives. That's kind of the story here. The way we ran it was carrot over stick. We provided people with lots of technical support. Every question would get an answer as soon as possible, most within the same hour, but we tried to keep it within the same day. We provided shout outs when someone cleaned up a Lint failure. We provided a very good developer experience.
We really targeted the actual developer fixing their Lint failures, as well as giving them visibility once they've fixed some Lint failures in the codebase, as well as recognition. The way we made it easy for engineers fixing Lint failures was providing tooling. We identified the errors, we identified the owners, we kept this list up to date and made it very simple to use, not a new tool for an engineer to learn. Just kind of the list for them to see what the failures are and who are the owners.
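The "fix all existing errors before enabling" rule from earlier in this section can be sketched as a small gate on rule severity. This is an illustration under assumptions, not our actual config code: `severityFor` and `buildRules` are hypothetical helpers, and keeping a not-yet-clean rule at `"warn"` (rather than `"off"`) is an assumption. The severity strings `"off"`, `"warn"`, and `"error"` are the standard ESLint ones.

```typescript
// Standard ESLint severity strings.
type Severity = "off" | "warn" | "error";

// A rule is promoted to "error" only when the codebase has zero existing
// violations of it; otherwise it stays at "warn" (an assumption here).
function severityFor(
  ruleId: string,
  existingViolations: Record<string, number>,
): Severity {
  const count = existingViolations[ruleId] ?? 0;
  return count === 0 ? "error" : "warn";
}

// Build the `rules` section of a config for a set of candidate rules.
function buildRules(
  ruleIds: string[],
  existingViolations: Record<string, number>,
): Record<string, Severity> {
  const rules: Record<string, Severity> = {};
  for (const ruleId of ruleIds) {
    rules[ruleId] = severityFor(ruleId, existingViolations);
  }
  return rules;
}
```

For example, `buildRules(["eqeqeq", "no-console"], { "no-console": 12 })` would keep `no-console` as a warning until its 12 existing violations are cleaned up, while `eqeqeq` can go straight to an error.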
3. Tooling, Visualization, and Reductions
We implemented a tool to visualize ESLint failures by team and lint rule, providing accountability and responsibility. The tool uses Checkup, which runs nightly tasks and generates structured reports. We updated our tooling to improve visualization and introduced a scorecard system to monitor lint failure reductions. Getting to zero lint failures was challenging, but focusing on individuals, giving shout outs, and keeping regressions minimal helped. We cleaned up over 6,000 lint failures, with 55 contributors, and saw a 30% increase in perceived code quality.
We were beneficiaries of a different initiative where we had a single owner per file. This ensures responsibility and accountability for every file in our codebase. So the first MVP version of our tool was literally just, I don't know what you'd call it, like a year-2000 webpage, where we break down all the ESLint failures by team, by lint rule, and also by team and lint rule, so that you can visualize for your team how many lint failures you are responsible for and where they are in your codebase. It updated nightly. It ran off someone's machine, actually, on his desk, which unfortunately was shut down one time during COVID, and we had to get someone to turn it back on.
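The three breakdowns described above are simple counts once every file has exactly one owner. Here is a minimal sketch of that aggregation; the `Failure` shape, the `ownerByFile` map, and the `"unowned"` fallback are hypothetical and not taken from our tool.

```typescript
// Hypothetical input shape: one entry per lint failure.
interface Failure {
  filePath: string;
  ruleId: string;
}

interface Breakdown {
  byTeam: Record<string, number>;
  byRule: Record<string, number>;
  byTeamAndRule: Record<string, Record<string, number>>;
}

// Count failures by team, by rule, and by team and rule, using a
// single-owner-per-file map to attribute each failure to a team.
function breakDown(
  failures: Failure[],
  ownerByFile: Record<string, string>,
): Breakdown {
  const byTeam: Record<string, number> = {};
  const byRule: Record<string, number> = {};
  const byTeamAndRule: Record<string, Record<string, number>> = {};

  for (const failure of failures) {
    // Files missing from the ownership map are grouped under "unowned".
    const team = ownerByFile[failure.filePath] ?? "unowned";
    byTeam[team] = (byTeam[team] ?? 0) + 1;
    byRule[failure.ruleId] = (byRule[failure.ruleId] ?? 0) + 1;

    const rulesForTeam = byTeamAndRule[team] ?? {};
    rulesForTeam[failure.ruleId] = (rulesForTeam[failure.ruleId] ?? 0) + 1;
    byTeamAndRule[team] = rulesForTeam;
  }

  return { byTeam, byRule, byTeamAndRule };
}
```

The single-owner assumption is what keeps this simple: with shared ownership, a failure would have to be split or double-counted across teams.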
We counted warnings, errors, and disables in each breakdown. So this is a view of the breakdown by lint rule, where we show you the file name, and if you click on the file name, it will bring you to the GitHub page. It's just that easy to find out what the lint failures are for your team and where they are. That way you understand how much you're committing to when you're trying to fix something.
We use a tool called Checkup, which is essentially like a Node runner, where every night it would run these tasks. It comes with some built-in tasks for JavaScript and ESLint, where we can run a plugin and it will run ESLint on your entire codebase and then give you a structured format. This is an example run of Checkup. It'll give you a SARIF file, which is a structured format. This is an example of the SARIF file, which we parsed in our tool to show you that nice diagram. Oh, yeah, and then we updated our tooling to the company standard for UIs. I believe this is Docusaurus, just so it's easier to visualize.
And every week we give each team a red, green, or yellow scorecard: green if you've decreased your lint failures or if you're at zero, yellow if there's no decrease or increase, and red if there's an increase in lint failures. This is very important because what we've seen is that a lot of people regress, not through their own fault or any particular kind of nefarious means, but just because stuff happens, and it's easy for us to see these regressions and try to fix them right away. This is an example of a scorecard we would send out. It's a manual email, and we shout out two to three engineers per email to really give them recognition when they fix some lint failures.
But really, the learning that we found was that getting to zero was very, very hard. This is the graph that we had through the journey, where you can see the last mile was very, very hard to get through. It took almost half the time just to finish the last couple hundred.
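Two pieces of the pipeline above are easy to sketch: flattening a SARIF report into a list of failures, and turning week-over-week counts into a scorecard color. This is a sketch under assumptions, not our actual code. The types cover only the standard SARIF 2.1.0 fields used here (`runs`, `results`, `ruleId`, `level`, `locations[].physicalLocation.artifactLocation.uri`); whether Checkup's output carries exactly these fields is assumed, and `flattenSarif`, `scorecard`, and the `"unknown"` fallbacks are hypothetical.

```typescript
// Minimal subset of the SARIF 2.1.0 log shape used by this sketch.
type SarifLevel = "none" | "note" | "warning" | "error";

interface SarifResult {
  ruleId?: string;
  level?: SarifLevel;
  locations?: {
    physicalLocation?: { artifactLocation?: { uri?: string } };
  }[];
}

interface SarifLog {
  runs: { results?: SarifResult[] }[];
}

interface ReportedFailure {
  filePath: string;
  ruleId: string;
  level: SarifLevel;
}

// Flatten every result of every run into one list, using the first
// reported location as the file path.
function flattenSarif(log: SarifLog): ReportedFailure[] {
  const failures: ReportedFailure[] = [];
  for (const run of log.runs) {
    for (const result of run.results ?? []) {
      const uri =
        result.locations?.[0]?.physicalLocation?.artifactLocation?.uri;
      failures.push({
        filePath: uri ?? "unknown",
        ruleId: result.ruleId ?? "unknown",
        // Falls back to "warning" here when no level is given.
        level: result.level ?? "warning",
      });
    }
  }
  return failures;
}

type ScorecardColor = "green" | "yellow" | "red";

// Green: count decreased or is zero. Yellow: unchanged. Red: increased.
function scorecard(previousCount: number, currentCount: number): ScorecardColor {
  if (currentCount === 0 || currentCount < previousCount) return "green";
  if (currentCount === previousCount) return "yellow";
  return "red";
}
```

Note that a team sitting at zero stays green even though its count did not decrease, which matches the scorecard rule as described.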
But sustainability matters, right? That last mile really matters. This is an example of a file with lots of lint failures. It's not really nice to work on this file, right, compared to a file without lint failures. A file without lint failures is much easier to understand, and there's no hidden work for you to do. Everything is clear-cut.
And so, what we learned: focusing on the individual really helped, as did giving personal shout outs and really trying to keep the regressions minimal. We cleaned up over 6,000 lint failures, which took a little over a year and kind of contributed to the whole community. We had 55 unique contributors to this effort. At the end of it, in our quarterly surveys, we saw an increase of 30% in how people perceive their code quality after this initiative. And we still added new lint rules while we were running this initiative, with over 80 lint rules added to our config and, again, over 45 contributors adding these lint changes. And that's it.