Video Summary and Transcription
Tibor Blanesy from Sonar presents advanced techniques for linting with ESLint, including the use of ControlFlowGraph to detect errors in code. The algorithm is based on liveness analysis, which identifies live variables at any point in the program. Additionally, the talk covers the computation of block sets using the difference between outset and kill set unionized with genset.
1. Advanced Techniques for Linting with ESLint
Hello! My name is Tibor Blanesy, I work on JavaScript static analysis at Sonar, and in this talk I would like to show you some more advanced techniques for linting with ESLint. Let's have a look at a function that returns a range of numbers between two values passed in the argument. If the argument is not provided, it will assume that the range should start from zero. We will use a representation of code called ControlFlowGraph to detect errors in the code. The base of the algorithm is liveness analysis, which tells us which variables are live at any given point in the program.
Hello! My name is Tibor Blanesy, I work on JavaScript static analysis at Sonar, and in this talk I would like to show you some more advanced techniques for linting with ESLint. Let's have a look at the following function, which I have found in the VS Code codebase. This function returns a range of numbers between two values passed in the argument.
If the argument is not provided, it will assume that the range should start from zero. When we use some static analysis tools, such as SonarQube, it will quickly show us that there is an issue with this code. For some reason, the value assigned to the variable from is never used later in the code. The logic handling the arguments is actually duplicated.
These kind of errors, when value written to the variable is not used is called a dead store. SonarQube provides following explanation why this is an issue. A dead store happens when local variable is assigned a value that is not read by any subsequent instruction. Calculating or retrieving a value only to then overwrite it or throw away could indicate a serious error in the code. Even if it's not an issue, it is at best a waste of resources. Therefore over-calculated values should be used. In the following couple of minutes I will try to explain how this kind of errors can be detected with static analysis.
First, we will use a representation of code called ControlFlowGraph. In this representation, node called basic blocks contains only statements which are executed sequentially. Jumps are represented as arrows between the blocks. So here we have a ControlFlowGraph for the function I showed earlier. We will only show part of the graph, which is relevant for the issue, to keep it small. In the next slide, I have the same ControlFlowGraph annotated in red, with events which are provided by ESLint when we write a custom rule. ESLint API provides two objects, CodePath, which represents the ControlFlow of the whole function, and ControlPathSegment for each basic block. ESLint then fires events for the start and end of the CodePath, and for the start and end of each basic block, which is a CodePathSegment.
So in the code, what we will write is the following object, which contains an event handler for the CodePath events. We don't have the time to go into implementation details, but in the following slides, I will quickly describe the basics of the algorithm. The base of the algorithm is liveness analysis, which tells us which variables are live at any given point in the program. The variable is live when value it is holding might be needed in the future. For each basic block, we will compute four sets of variables. The begin set with variables that are being read in the basic block, kill set, which contains variables that are being written in the basic block, inset with variables which are live at the beginning of the block, and outset with variables which are live at the end of the block. To compute these sets we will use following two rules. Outset of the current block is union of all insets of its successors.
2. Computing Block Sets
And inset of the current block is a difference between outset and kill set unionized with genset. We will compute the values of these sets by starting at the bottom of the graph and moving to the predecessors of each block.
And inset of the current block is a difference between outset and kill set unionized with genset. Now I will go through the basic blocks of the function I showed earlier, and we will compute the values of these sets. So we will start at the bottom of the graph to compute the sets of this basic block. So we will assume that from and to are being read later in the function, so genset is set to from and to and we will ignore that there is something written so kill set will be empty. From this we can compute the inset as being from and to and now we will move to the predecessors of this block.
Comments