Video Summary and Transcription
Today's Talk introduces Reassure, a performance monitoring tool for React and React Native codebases. It highlights the need for catching performance regressions early in the development process and identifies React misusage on the JavaScript side as a common source of performance issues. Reassure, developed by Callstack, is presented as a promising library that integrates with existing ecosystems and provides reliable render time measurements and helpful insights for code review. Considerations for operating in a JavaScript VM are discussed, including JIT, garbage collection, and module resolution caching. Statistical analysis using the z-score is mentioned as a method for determining the significance of measurement results.
1. Introduction to Performance Monitoring
Today, I'm going to talk about performance monitoring in React and React Native codebases with Reassure. Entropy is the increase of disorder, which distinguishes the past from the future. As developers, we fight against entropy by following a development cycle and addressing bugs. However, even with a well-designed workflow, negative reviews can still appear.
Hi, today I'm going to talk about performance monitoring and how to make it happen in your React and React Native codebases with Reassure. My name is Michał Pierzchała, I'm Head of Technology at Callstack, responsible for our R&D and open source efforts. I'm also a core contributor to a bunch of libraries, currently maintaining the React Native CLI and React Native Testing Library.
Let's start with some inspiration, shall we? Anyone heard of entropy? Not really this one. The real world entropy, described by physics like this. Or how Stephen Hawking framed it. You may see a cup of tea fall off a table and break into pieces on the floor, but you will never see the cup gather itself back together and jump back on the table. The increase of disorder, or this entropy, is what distinguishes the past from the future, giving a direction to time. Or in other words, things will fall apart eventually when unattended.
But let's not get too depressed or comfortable with things just turning into chaos, because we can and do fight back against it. We can exert effort to create useful types of energy and order, resilient enough to withstand the unrelenting pull of entropy, by expending this energy. When developing software, we kind of feel entropy is a thing. That's why we usually put in some extra effort and follow some kind of development cycle. For example, we start with adding a new feature. During development we sprinkle it with a bunch of tests. When done, we send it to QA. QA approves it and promotes our code to a production release. And we're back to adding another feature. But that's quite a simplified version of what we usually do. Let's complicate it a little bit. Among other things, we don't take into account that bugs may suddenly appear. Now our cycle becomes rather a graph, but that's okay because we know what to do. We need to identify the root cause, add a regression test so it never breaks again, send it to QA once again, ship it, and we're back to adding new features.
So we're happy with our workflow. It works pretty well. We're adding feature after feature, our app release is so well designed that even adding 10 new developers doesn't slow us down. And then we take a look at our app reviews to check what folks think. And a wild one-star review appears. And then another one comes in. And they just...
2. Challenges with Performance Monitoring
Our perfect workflow is not resilient to performance regressions. We need a way to spot them before they impact our users. Treating performance issues as bugs allows us to catch regressions early in the development process. To find the best tool for performance testing, we need to consider the impact and target the most likely regressions. Most performance issues originate from the JavaScript side, particularly from React misusage. We estimate that around 80% of the time spent fixing performance issues is in the JavaScript realm. We found a promising React performance testing library that is worth exploring.
they just keep on coming. And we start to realize that our perfect workflow based on science, our experience, and best practices, which was supposed to prevent our app from falling apart, is not resilient to a particular kind of bug: performance regressions. Our codebase doesn't have the tools to fight these. We know how to fix the issues once spotted, but we have no way to spot them before they hit our users.
So how was it, once again? Or... Performance will fall apart eventually when unattended. So if I don't do anything to optimize my app while adding new code and letting time go by, it will get slower. And we don't know when it will happen. Maybe tomorrow, maybe in a week, or in a year. If only there were an established way of catching at least some of the regressions early in the development process, before our users notice. Wait a minute, there is! If we start treating performance issues as bugs, we don't even need to break out of our development workflow. Regression tests run in a remote environment, on every code change, so we just need to find a way to fit performance tests in there, right?
But before we go on a hunt for the best tool, let's take a step back and think about impact and what's worth testing. As with any test coverage, there is a healthy ratio that we strive for, to provide us the best value for the lowest amount of effort. We want to make sure to target regressions which are most likely to hit our users. And apparently, we are developing a React Native app. By the way, did you know there's a font named Impact? You've probably seen it in memes. Anyway, take a look at the typical performance issues Callstack developers are dealing with daily: slow lists and images, SVGs, React context misusage, re-renders, slow TTI, just to name a few. If we look at this list from the origin-of-issue point of view, we'll notice that the vast majority of these come from the JavaScript side. Now, let's check the relative frequency. And what emerges is pretty telling. We estimate that most of the time our developers spend fixing performance issues, around 80%, originates from the JavaScript realm, especially from React misusage. Only the rest is bridge communication overhead and native code, like image rendering or database operations working inefficiently. But I'm not a fan of reinventing the wheel, so I've done my googling for a React performance testing library, and I found this package. It looks promising. Let's see what's inside. It's not quite popular, but that's okay. Last release was 9 months ago.
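As an aside, to make "React misusage" concrete, here is a minimal, hypothetical sketch of one of the most common culprits from the list above: a context value recreated on every render, which forces otherwise memoized consumers to re-render needlessly. The component and labels are invented for illustration.

```tsx
import React, { createContext, memo, useContext, useMemo, useState } from 'react';
import { Pressable, Text } from 'react-native';

type Theme = { color: string };
const ThemeContext = createContext<Theme>({ color: 'black' });

// memo() lets the label skip re-renders unless its props or its consumed
// context value actually change.
const ThemedLabel = memo(function ThemedLabel() {
  const theme = useContext(ThemeContext);
  return <Text style={{ color: theme.color }}>Hello</Text>;
});

export function App() {
  const [count, setCount] = useState(0);

  // Anti-pattern: passing `{ color: 'tomato' }` inline would create a new
  // object on every render of App, so ThemedLabel would re-render on every
  // press. Memoizing keeps the context reference stable instead.
  const theme = useMemo<Theme>(() => ({ color: 'tomato' }), []);

  return (
    <ThemeContext.Provider value={theme}>
      <Pressable onPress={() => setCount((c) => c + 1)}>
        <Text>Pressed {count} times</Text>
      </Pressable>
      <ThemedLabel />
    </ThemeContext.Provider>
  );
}
```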
3. Introduction to Reassure
We need a new library that integrates with our existing ecosystem, measures render times reliably, provides a CI runner, generates readable and parsable reports, and offers helpful insights for code review. Introducing Reassure, a performance regression testing companion for React and React Native apps. Developed by Callstack in partnership with Entain, Reassure enhances the code review process by integrating with GitHub. It runs Jest through Node with special flags to increase stability and uses the React Profiler to handle measurements reliably. Reassure compares test results between branches and provides a summary of statistically categorized results. Embracing stability and avoiding flakiness is key for benchmarks, especially in Node.js.
That's okayish. What else? It monkey patches React. That's not okay. It uses React internals as well. Well, that's a bummer. It's not a good fit for our use case and doesn't really look like a solid foundation to build on.
But, what do we actually need from such a library? Well, ideally, it should integrate with the existing ecosystem of libraries we're using. It should measure render times and render counts reliably, have a CI runner, generate readable and parsable reports, provide helpful insights for code review, and, looking back at the library we googled, have a stable design. And since there's nothing like this out there, we need a new library.
And I'd like to introduce you to Reassure, a performance regression testing companion for React and React Native apps. It's developed at Callstack in partnership with Entain, one of the world's largest sports betting and gaming groups. Reassure builds on top of your existing setup and sprinkles it with an unobtrusive performance measurement API. It's designed to be run in a remote server environment as a part of your continuous integration. To increase the stability of results and decrease flakiness, Reassure will run your tests once for the current branch and once more for the base branch. Delightful developer experience is at the core of our engineering design. That's why Reassure integrates with GitHub to enhance the code review process. Currently, we leverage Danger.js as our bot backend, but in the future we'd like to prepare a plug-and-play GitHub Action.
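To give a feel for that Danger.js wiring, here is a hedged sketch of what the bot setup might look like. It assumes Reassure exposes a `dangerReassure` helper that picks up the comparison report from the two test runs and posts it as a pull-request comment; verify the exact import and options against the current Reassure README.

```ts
// dangerfile.ts — hedged sketch of the Danger.js integration described above.
// Assumption: Reassure ships a `dangerReassure` helper that reads the
// comparison output from the two test runs (current branch vs. base branch)
// and posts it as a pull-request comment.
import { dangerReassure } from 'reassure';

dangerReassure();
```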
Now, let's see what it does. Reassure runs Jest through Node with special flags to increase stability. The measureRender function we provide runs the React Profiler to handle measurements reliably, allowing us to avoid monkey-patching React. After the first run is completed, we switch to the base branch and run the tests again. Once both test runs are completed, the tool compares the results and presents a summary, showing statistically categorized results that you can act upon. Let's go back to our example. Notice how we created a new file with a .perf-test.tsx extension that reuses our regular React Testing Library component test in a scenario function. The scenario is then used by the measurePerformance method from Reassure, which renders our counter component, in this case, 20 times. Under the hood, the React Profiler measures render count and duration times for us, which we then write down to the file system. And that's usually all you have to write: copy-paste your existing tests, adjust, and enjoy. Benchmarking is not a piece of cake, even in non-JS environments, but it's particularly tricky with Node.js. The key is embracing stability and avoiding flakiness.
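For illustration, here is a minimal sketch of such a perf test, assuming a hypothetical Counter component with an "Increment" button; the component, labels, and file path are invented, and the measurePerformance options shown should be checked against the current Reassure docs.

```tsx
// Counter.perf-test.tsx — a minimal sketch, not the talk's actual slide code.
import React from 'react';
import { fireEvent } from '@testing-library/react-native';
import { measurePerformance } from 'reassure';
import { Counter } from './Counter'; // hypothetical component under test

test('Counter perf', async () => {
  // The scenario reuses regular Testing Library interactions; Reassure runs it
  // on every measured pass and records render count and duration for us.
  const scenario = async (screen: any) => {
    fireEvent.press(screen.getByText('Increment')); // assumed button label
    await screen.findByText('Count: 1');            // assumed resulting text
  };

  // Render the component 20 times, as in the example from the talk.
  await measurePerformance(<Counter />, { scenario, runs: 20 });
});
```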
4. Considerations for JavaScript VM
Operating in a JavaScript VM, we need to consider JIT, garbage collection, and module resolution caching. Statistical analysis requires running measurements multiple times. The z-score is used to determine the statistical significance of results.
Operating in a JavaScript VM, we need to take JIT, garbage collection, and module resolution caching into account. There's the cost of concurrency, which our test runner embraces for speedy execution. We need to pick what to average and where to use percentiles. And a lot more. Take statistical analysis, for example. To make sure our measurement results make sense mathematically, running them once or twice is not enough. Taking other things into account, we've figured ten runs is a good baseline. Then, to determine the probability of the result being statistically significant, we need to calculate the z-score, which needs the mean value, or average, and the standard deviation. This gave me flashbacks from college, so I'm not gonna dive any deeper here.
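For readers who do want to dive a little deeper, here is a rough sketch of the statistics involved. The exact test Reassure applies may differ; this is the standard two-sample z-score computed from means and standard deviations of the per-run durations, with hypothetical numbers.

```ts
function mean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

function stdDev(xs: number[]): number {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / xs.length);
}

// Two-sample z-score: how many combined standard errors apart the two means
// are. The larger |z|, the less likely the difference is just noise.
function zScore(current: number[], baseline: number[]): number {
  const standardError = Math.sqrt(
    stdDev(current) ** 2 / current.length + stdDev(baseline) ** 2 / baseline.length,
  );
  return (mean(current) - mean(baseline)) / standardError;
}

// Hypothetical render durations in milliseconds, ten runs per branch.
const baselineRuns = [52, 50, 53, 51, 49, 50, 52, 51, 50, 52];
const currentRuns = [58, 57, 60, 59, 57, 58, 61, 59, 58, 60];
console.log(zScore(currentRuns, baselineRuns).toFixed(1)); // large |z| => likely a real regression
```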