Trick one is perfect hashing, which reduces the number of branches. We have an array of string_views called names containing "http", "https", "ws", "ftp", "wss", and "file", and corresponding scheme constants: HTTP, NOT_SPECIAL, HTTPS, WS (which corresponds to WebSockets), FTP, WSS, and FILE. These correspond to the WHATWG URL special scheme types. To get the scheme type, instead of comparing against each name in turn, we use a perfect hash: an algorithm that computes, directly from the input, the correct position inside the names array, so a single comparison confirms the match. This is one of the examples.
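The idea can be sketched like this: the hash `(2 * length + first_byte) & 7` happens to send each special scheme to a distinct slot, so one table lookup plus one string comparison replaces a chain of branches. The exact table layout below is an illustration in the spirit of the technique, not necessarily Ada's source:

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Scheme constants in slot order; NOT_SPECIAL covers everything else.
enum class scheme_type { HTTP, NOT_SPECIAL, HTTPS, WS, FTP, WSS, FILE_SCHEME };

// Each special scheme lands in a distinct slot under the hash below;
// unused slots stay empty.
constexpr std::array<std::string_view, 8> names = {
    "http", "", "https", "ws", "ftp", "wss", "file", ""};

scheme_type get_scheme_type(std::string_view scheme) {
  if (scheme.empty()) return scheme_type::NOT_SPECIAL;
  // Perfect hash: length and first byte pick the candidate slot directly.
  int slot = (2 * static_cast<int>(scheme.size()) +
              static_cast<unsigned char>(scheme[0])) & 7;
  // One comparison verifies the candidate; anything else is "not special".
  return scheme == names[slot] ? static_cast<scheme_type>(slot)
                               : scheme_type::NOT_SPECIAL;
}
```

Note that "http" (length 4, 'h' = 104) hashes to 0, "https" to 2, "ws" to 3, "ftp" to 4, "wss" to 5, and "file" to 6, matching the slot order above.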
The second trick is, of course, memoization tables. To reduce the number of branches and if statements, we combine bitwise operations with values precomputed in a table. For example, we have an is_bad_char table covering all 256 byte values that stores zero or one according to whether the byte is a bad character or not. This is a great example of improving the performance of a function at the cost of increasing the size of the binary.
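A minimal sketch of the table idea, classifying a byte with one load instead of a chain of comparisons (the table and function names here are illustrative; the WHATWG URL standard does strip ASCII tab, LF, and CR from input, which is what "bad character" means in this sketch):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string_view>

// Build a 256-entry table at compile time: 1 for "bad" bytes, 0 otherwise.
constexpr std::array<uint8_t, 256> make_bad_char_table() {
  std::array<uint8_t, 256> table{};
  table['\t'] = 1;  // ASCII tab
  table['\n'] = 1;  // line feed
  table['\r'] = 1;  // carriage return
  return table;
}

constexpr std::array<uint8_t, 256> is_bad_char = make_bad_char_table();

bool has_bad_char(std::string_view input) {
  for (unsigned char c : input) {
    // One table load per byte instead of several comparisons.
    if (is_bad_char[c]) return true;
  }
  return false;
}
```

The trade-off is exactly as described in the talk: the table costs 256 bytes of binary size, but the per-byte classification becomes branch-free on the character's value.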
The third trick is vectorization: do not process byte by byte when you can process 16 bytes at a time. Modern processors support 16-byte SIMD operations, so we don't need to iterate through an array one character at a time. For example, to determine whether a particular string contains a tab or a newline character, we check 16 bytes per iteration. I'm not going to dive into the details for the sake of time, but the information is out there, and such tricks can substantially reduce the execution time of a basic for loop.

On top of these optimizations, we provided an efficient bridge between the JavaScript and C++ implementations. This was done particularly for the Node.js integration, so that serialization costs, the string conversions between C++ and JavaScript, are reduced as much as possible. Passing multiple strings across the boundary is expensive, so we pass one string with offsets: we take an href and return eight integers that correspond to protocol end, username end, host start, host end, and so on. If you know the protocol end, you can take the substring of the href from zero to protocol end to get the protocol, and likewise for the username and the other components. These are the kinds of optimizations that improved the code base by 60, 70, 80 percent.
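The tab-or-newline scan can be sketched with SSE2 intrinsics as below. This is a minimal illustration of the 16-bytes-at-a-time idea, not Ada's exact code, which dispatches per architecture:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <emmintrin.h>  // SSE2 intrinsics

bool has_tab_or_newline(std::string_view input) {
  const __m128i tab = _mm_set1_epi8('\t');
  const __m128i lf = _mm_set1_epi8('\n');
  const __m128i cr = _mm_set1_epi8('\r');
  size_t i = 0;
  // Compare 16 bytes per iteration against all three characters at once.
  for (; i + 16 <= input.size(); i += 16) {
    __m128i chunk = _mm_loadu_si128(
        reinterpret_cast<const __m128i*>(input.data() + i));
    __m128i match = _mm_or_si128(
        _mm_or_si128(_mm_cmpeq_epi8(chunk, tab), _mm_cmpeq_epi8(chunk, lf)),
        _mm_cmpeq_epi8(chunk, cr));
    if (_mm_movemask_epi8(match) != 0) return true;
  }
  // Scalar tail for the last (fewer than 16) bytes.
  for (; i < input.size(); i++) {
    char c = input[i];
    if (c == '\t' || c == '\n' || c == '\r') return true;
  }
  return false;
}
```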
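The one-string-plus-offsets bridge might look like this on the C++ side; the struct, field names, and widths are assumptions for illustration (following the field names mentioned in the talk), not Ada's actual ABI:

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Offsets into the normalized href; passing these integers across the
// C++/JavaScript boundary is much cheaper than passing many substrings.
struct url_components {
  uint32_t protocol_end;  // end of "https:" in the href
  uint32_t username_end;
  uint32_t host_start;
  uint32_t host_end;
  // ...the real layout carries more offsets (port, pathname, search, hash)
};

// The JavaScript side does the equivalent slicing with
// href.substring(0, protocol_end), and so on for each component.
std::string_view protocol(std::string_view href, const url_components& c) {
  return href.substr(0, c.protocol_end);
}
```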
Here's an example JavaScript benchmark. It takes lines of input, tries to parse each one, adds the length of the resulting href to an accumulator, and counts the good and bad URLs. This is done to defeat dead-code elimination in V8's JIT compiler, so that the parsing work cannot be optimized away. The benchmark is available at github.com/ada-url/js-url-benchmark; please take a look at it, and if there's something that we missed, please take the time to create an issue on the GitHub repository. On Node 18.15.0, this benchmark ran at around 0.8 million URLs per second. At that time, Deno 1.32.5 was doing 0.9 million and Bun 0.5.9 was around 1.5 million; on Node 20.1.0, it's now 2.9 million URLs per second. The Ada C++ library is safe and efficient.
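The anti-dead-code-elimination pattern the benchmark uses can be sketched as follows. The real benchmark is JavaScript; this C++ sketch uses a hypothetical stand-in parser, not Ada's API, to show the structure:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in parser: succeeds only when the input has a scheme.
bool parse_url(std::string_view input, std::string& href) {
  if (input.find("://") == std::string_view::npos) return false;
  href = std::string(input);  // a real parser would normalize here
  return true;
}

struct bench_result {
  size_t good = 0, bad = 0, total_href_length = 0;
};

// Accumulating the href lengths and the good/bad counts makes every
// parse observable, so the optimizer (or V8's JIT, in the JavaScript
// original) cannot eliminate the work being measured.
bench_result run(const std::vector<std::string>& lines) {
  bench_result r;
  std::string href;
  for (const auto& line : lines) {
    if (parse_url(line, href)) {
      r.good++;
      r.total_href_length += href.size();
    } else {
      r.bad++;
    }
  }
  return r;
}
```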