Enhanced AST Static Analysis with TypeScript Language Server

Most ecosystem tools, such as bundlers and transpilers, are built on ASTs, and TypeScript provides one of the best developer experiences for working with a codebase.

This talk shares how beneficial type hints and the TypeScript Language Server can be during AST-based static code analysis, using the example of building a compile-time CSS-in-JS library.

This talk was presented at TypeScript Congress 2023.

FAQ

An Abstract Syntax Tree (AST) is a tree representation of the abstract syntactic structure of source code. It is used in static code analysis for various transformations such as removing specific code patterns, optimizing code, and ensuring code quality by abstracting the structure and making manipulation easier.

TypeScript's language server can enhance static code analysis by providing accurate type information and refactoring tools, which can be programmatically accessed to improve semantic analysis and code transformation tasks, ensuring type safety and efficiency in code modifications.

The code compilation process typically involves four stages: lexical analysis, syntax analysis, semantic analysis, and code generation. Each stage plays a crucial role in transforming source code into executable code.

A Babel plugin can traverse the Abstract Syntax Tree (AST), identify console.log call expressions, and remove them based on specific conditions. This process involves checking the callee of the call expression and manipulating the tree to exclude unwanted expressions, thereby modifying the source code during the build process.

Using TypeScript in static code analysis provides the advantage of type safety, which allows for more precise code transformations and optimizations. TypeScript's type system and language server provide tools and information that can enforce type constraints and facilitate more informed and reliable code edits.

TSMorph (published as ts-morph) is a library that wraps the TypeScript compiler API to provide a simpler interface for interacting with the TypeScript compiler. It is used during abstract syntax tree analysis to programmatically access type information and perform type checks or transformations based on that information.

AST is used in JavaScript development for build-time optimizations like minification and transpiling, formatting code, and custom transformations such as removing unused code. It allows developers to manipulate code structure programmatically, enhancing performance and maintainability.

Artur Kenzhaev
12 min
21 Sep, 2023

Video Summary and Transcription

Today's Talk discusses enhancing static code analysis using TypeScript language server and abstract syntax tree (AST). TypeScript can help with static analysis by providing types based on function signatures. By integrating TSMorph into a Babel plugin, we can check types for specific nodes in the abstract syntax tree. Enhancements to static analysis include checking console.log arguments and removing unnecessary expressions. TypeScript's type information can be used to compile CSS and extract it into a separate stylesheet, enabling better compilation and build time performance.

1. Enhancing Static Code Analysis with AST

Short description:

Today we'll discuss enhancing static code analysis using TypeScript language server and abstract syntax tree (AST). We can use AST to make code transformations, such as removing specific function calls. By implementing a Babel plugin, we can automatically remove all console.log calls from the code. AST and static code analysis enable build-time optimizations, minification, formatting, and transpiling to support older browsers.

Hi everyone, I'm Artur, tech lead on the Apps Platform team at a London-based company called Zoho. Today I'd like to talk about enhancing static code analysis using the TypeScript language server. But first, let's quickly discuss what an abstract syntax tree is and how we can use it for static analysis.

Imagine we want to implement an automatic code transformation during compilation that removes all console.log calls from the output. In theory, we could just use a regular expression to find every console.log in the code. But remember: solve a problem with a regular expression and now you have one more problem. And it would actually be quite tricky to write a regular expression that handles all the different cases. Instead, we can make code transformations using an abstract syntax tree.
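
To make the regular-expression pitfall concrete, here is a small sketch. The pattern and the sample source are illustrative assumptions, not code from the talk; they only show how a naive pattern breaks on nested parentheses and on string literals that happen to mention console.log.

```typescript
// A naive pattern (assumed for illustration): "console.log(" up to the
// first ")" and an optional trailing semicolon.
const naive = /console\.log\([^)]*\);?/g;

const source = [
  'console.log("simple");',                       // handled correctly
  "console.log(format(value));",                  // nested parentheses
  'const hint = "call console.log(x) to debug";', // text inside a string literal
].join("\n");

const stripped = source.replace(naive, "");
const strippedLines = stripped.split("\n");

// strippedLines[0] is ""                               (the simple call is removed)
// strippedLines[1] is ");"                             (the nested call leaves debris)
// strippedLines[2] is 'const hint = "call  to debug";' (the string literal is damaged)
console.log(strippedLines);
```

An AST-based transformation avoids both failures because it works on parsed call expressions rather than on raw characters.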

Let's take a look at the code compilation process. Compilation usually consists of four stages: lexical analysis, syntax analysis, semantic analysis and code generation. Today we'll focus on semantic analysis with the abstract syntax tree, and we'll implement our code transformation using the AST. Most of the tools in our ecosystem use the AST for analysis and code changes.

Let's take a look at our code again. We're going to use a tool called AST Explorer. On the right side you can see how our code is represented by an abstract syntax tree. The function call, which is console.log in our case, is represented by a call expression node in the tree, which contains other nodes such as callee and arguments. This call expression is part of the block statement body, so let's just remove it. Once we remove it, the block statement no longer contains any expressions in its body, so when we generate the code we get just an empty Hello World function in this particular case.

Going back to the diagram, what we did was transform our abstract syntax tree by removing the call expression node, and then generate code from the new AST.

Of course, we don't want to make these transformations manually, so let's implement a simple Babel plugin to remove all console.log calls. In AST Explorer, you can select the built-in Babel API to implement a Babel plugin, and you can see four parts on the screen: source code, AST, plugin implementation and output code. The original source code contains three console.log call expressions, which are represented in the abstract syntax tree. You can see these three expression statements as part of the block statement body. If we expand each of them, we find a call expression with callee and argument nodes.

Now, moving to the plugin, the implementation is quite straightforward. All we need to do is traverse the call expressions, check whether the callee is a member expression and, if it is console.log, remove the whole expression. In the end, we again get an empty Hello World function without console.log, because the calls were removed automatically.

You can use ASTs and static code analysis to implement build-time optimizations, minification, formatting, transpiling (for example, from new language features to older ones to support older browsers), and much more.
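
The transformation described above (find call expressions whose callee is the member expression console.log, remove them from the block statement body, then generate code from the new tree) can be sketched without any tooling. The node shapes below are a simplified, hand-written subset that borrows ESTree-style names; they are assumptions for illustration and not Babel's actual node types, and `removeConsoleLog` and `generate` are hypothetical helpers rather than the plugin shown in the talk.

```typescript
// Simplified node shapes (illustrative subset, not Babel's real types).
type Expression =
  | { type: "Identifier"; name: string }
  | { type: "StringLiteral"; value: string }
  | { type: "MemberExpression"; object: Expression; property: Expression }
  | { type: "CallExpression"; callee: Expression; arguments: Expression[] };

interface ExpressionStatement { type: "ExpressionStatement"; expression: Expression }
interface BlockStatement { type: "BlockStatement"; body: ExpressionStatement[] }
interface FunctionDeclaration { type: "FunctionDeclaration"; id: string; body: BlockStatement }

// True when the expression is a call whose callee is the member expression `console.log`.
function isConsoleLog(expr: Expression): boolean {
  if (expr.type !== "CallExpression") return false;
  const callee = expr.callee;
  return (
    callee.type === "MemberExpression" &&
    callee.object.type === "Identifier" && callee.object.name === "console" &&
    callee.property.type === "Identifier" && callee.property.name === "log"
  );
}

// Returns a new tree whose block body no longer contains console.log statements.
function removeConsoleLog(fn: FunctionDeclaration): FunctionDeclaration {
  const body = fn.body.body.filter((stmt) => !isConsoleLog(stmt.expression));
  return { ...fn, body: { ...fn.body, body } };
}

// Minimal code generation for the node kinds defined above.
function printExpression(expr: Expression): string {
  switch (expr.type) {
    case "Identifier": return expr.name;
    case "StringLiteral": return JSON.stringify(expr.value);
    case "MemberExpression": return `${printExpression(expr.object)}.${printExpression(expr.property)}`;
    case "CallExpression":
      return `${printExpression(expr.callee)}(${expr.arguments.map(printExpression).join(", ")})`;
  }
}

function generate(fn: FunctionDeclaration): string {
  const lines = fn.body.body.map((stmt) => `  ${printExpression(stmt.expression)};`);
  return lines.length > 0
    ? `function ${fn.id}() {\n${lines.join("\n")}\n}`
    : `function ${fn.id}() {}`;
}

// Builds the statement `console.log("<message>");` as a tree.
const logStatement = (message: string): ExpressionStatement => ({
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "console" },
      property: { type: "Identifier", name: "log" },
    },
    arguments: [{ type: "StringLiteral", value: message }],
  },
});

const helloWorld: FunctionDeclaration = {
  type: "FunctionDeclaration",
  id: "helloWorld",
  body: { type: "BlockStatement", body: [logStatement("Hello"), logStatement("World")] },
};

const before = generate(helloWorld);
const after = generate(removeConsoleLog(helloWorld));
console.log(after); // function helloWorld() {}
```

In a real Babel plugin the same callee check would sit inside a visitor for call expressions, and Babel would handle traversal and code generation.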

2. TypeScript's Role in Static Analysis

Short description:

TypeScript can help with static analysis by providing types based on function signatures. It offers a language server that allows tools to connect and access type information. TSMorph, a library wrapping the TypeScript compiler API, simplifies interaction. We can integrate TSMorph into a Babel plugin to check types for specific nodes in the abstract syntax tree. By enhancing the original plugin with TypeScript support, we can remove console.log arguments that are not numbers.

But how can TypeScript help with static analysis? Let's imagine now that we need to remove console.log calls as before, but keep the logging of arguments of number type. So, for example, we want to keep 1, 2, 3 or some constant if it is a number, but we want to remove strings such as "Hello World". With in-place values like 1, 2, 3, that's pretty straightforward, because you can get the value just from the syntax. But it gets quite complicated with function calls or external variables, like some constant here. We need some kind of type system in place, or we have to implement type inference ourselves, and TypeScript can provide such types.

In this example, TypeScript can automatically infer the correct type based on the function signatures. To let TypeScript not only check types from the command-line tool but also work with different code editors, TypeScript provides a language server, which is a separate process, like a backend. Tools can connect to this server to get information about types or to use the code refactoring tools TypeScript provides. And the beauty of the language server is that it can be used for more than code editing or type checking. Following the language server protocol, we can interact with it programmatically and benefit from TypeScript's knowledge about the project and its types at any level, including during our semantic analysis.

TSMorph is an amazing library which wraps the TypeScript compiler API and provides a simple interface to interact with it. Here is an example of how you can configure TSMorph in your program. Now we can use it during abstract syntax tree analysis.

Let's see how we can integrate TSMorph into the actual plugin. There is quite a lot of code here, so let's focus on the most important parts. We want a getTypeAtPos function which accepts the file name, the code, and the start and end of the position we are interested in. We can then create a virtual source file based on the file name and code, and read the type at the provided position using TSMorph. The idea is the same as in an IDE: if we need to see a type in our editor, we select the expression we are interested in and the editor shows the type for that cursor position. In this case, instead of a cursor we have a programmable interface, but the idea is mostly the same.

Now let's integrate TSMorph with our Babel plugin. We need to initialize the previously defined TSProcessor class. Then we can define a helper called getTSType that takes the file name, the source code, and the path to the node in the abstract syntax tree as arguments. Each path in the abstract syntax tree contains a node with information about where that node is positioned, from start to end. This way we can check the type for a specific node. In the end, we just need to return the type representation provided by TSMorph.

With these simple helpers defined, let's see how our original Babel plugin can be enhanced with TypeScript support. As a reminder, this was the original plugin implementation; now let's integrate it with TypeScript's language server. First of all, again, let's initialize our TSProcessor instance and define the getTSType helper. Next, we can get the code and file name from the Babel plugin state.
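
The code from the slides is not part of this transcript, so the following is only a sketch of the type-at-position idea described above. It uses the TypeScript compiler API directly (the API that TSMorph wraps) rather than TSMorph itself, and the function body, the virtual file path and the sample code are all assumptions for illustration. To keep the sketch self-contained it builds a single-file program with `noLib`, so globals such as `console` are not resolved, which does not matter for reading the argument types.

```typescript
import * as ts from "typescript";

// Sketch: return the (widened) type of the expression occupying [start, end) in `code`.
function getTypeAtPos(fileName: string, code: string, start: number, end: number): string | undefined {
  // noLib and an empty `types` list keep the program limited to the one virtual file.
  const options: ts.CompilerOptions = { noLib: true, types: [], target: ts.ScriptTarget.ES2020 };
  const host = ts.createCompilerHost(options);

  // Serve the virtual file from memory instead of reading it from disk.
  host.getSourceFile = (name, languageVersion) =>
    name === fileName ? ts.createSourceFile(name, code, languageVersion, true) : undefined;
  host.fileExists = (name) => name === fileName;
  host.readFile = (name) => (name === fileName ? code : undefined);

  const program = ts.createProgram([fileName], options, host);
  const checker = program.getTypeChecker();
  const sourceFile = program.getSourceFile(fileName);
  if (!sourceFile) return undefined;

  // Walk down the tree to the first node whose span matches the requested range exactly.
  const find = (node: ts.Node): ts.Node | undefined => {
    if (node.getStart(sourceFile) === start && node.getEnd() === end) return node;
    return ts.forEachChild(node, find);
  };
  const node = find(sourceFile);
  if (!node) return undefined;

  // Widen literal types such as `42` to `number` before printing the type.
  const type = checker.getBaseTypeOfLiteralType(checker.getTypeAtLocation(node));
  return checker.typeToString(type);
}

// Hypothetical input resembling the talk's example.
const sampleCode = [
  "declare function getCount(): number;",
  "const someConstant = 42;",
  'console.log("Hello World", getCount(), someConstant);',
].join("\n");

// Looks up the type of the last occurrence of `text` in the sample.
const typeOf = (text: string): string | undefined => {
  const start = sampleCode.lastIndexOf(text);
  return getTypeAtPos("/virtual/input.ts", sampleCode, start, start + text.length);
};

console.log(typeOf('"Hello World"')); // string
console.log(typeOf("getCount()"));    // number
console.log(typeOf("someConstant"));  // number
```

A Babel plugin could call such a helper with a node's start and end offsets, which is the role the talk assigns to the getTSType helper. Creating a program per lookup is wasteful; a real integration would create it once and reuse it.
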
At this stage, we have already done all the checks, such as confirming that this is a console.log expression, so we just need to go over the console.log arguments, get the type of each argument, and check whether that type is number. If it's not a number, we can remove the argument completely; if it is a number, we leave it in place.
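
The argument-filtering step can be sketched independently of Babel and of the type checker. In this sketch `Span`, `ConsoleLogCall`, `TypeLookup` and `keepNumberArguments` are hypothetical names, and the type lookup is a stub standing in for the real helper; it is assumed to return widened type names such as "number" or "string".

```typescript
// Hypothetical, simplified shapes: only the source offsets of each node are needed here.
interface Span { start: number; end: number }
interface ConsoleLogCall extends Span { arguments: Span[] }

// The type lookup is injected, so the filtering logic does not depend on how types are obtained.
type TypeLookup = (span: Span) => string | undefined;

// Rewrites one console.log call so that only arguments typed as `number` remain.
function keepNumberArguments(code: string, call: ConsoleLogCall, getTsType: TypeLookup): string {
  const kept = call.arguments
    .filter((arg) => getTsType(arg) === "number")
    .map((arg) => code.slice(arg.start, arg.end));
  const replacement = `console.log(${kept.join(", ")})`;
  return code.slice(0, call.start) + replacement + code.slice(call.end);
}

const input = 'console.log("Hello World", 123, someConstant);';

// Locates a piece of text in the input; a real plugin would read offsets from AST nodes.
const spanOf = (text: string): Span => {
  const start = input.indexOf(text);
  return { start, end: start + text.length };
};

const call: ConsoleLogCall = {
  ...spanOf('console.log("Hello World", 123, someConstant)'),
  arguments: [spanOf('"Hello World"'), spanOf("123"), spanOf("someConstant")],
};

// Stub lookup (assumption): a real plugin would ask the type checker instead.
const stubTypes: Record<string, string> = {
  '"Hello World"': "string",
  "123": "number",
  someConstant: "number",
};
const getTsType: TypeLookup = (span) => stubTypes[input.slice(span.start, span.end)];

const output = keepNumberArguments(input, call, getTsType);
console.log(output); // console.log(123, someConstant);
```

Injecting the lookup keeps the rewrite logic easy to test with a stub, while the real plugin would pass a function backed by the TypeScript type checker.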