Playwright has MCP support, Midscene has MCP support, ESLint has MCP support. All of these tools are leveraging MCP, and you can use it to do awesome things. In Playwright's case and in Midscene's case, you can basically tell your LLM: hey, go to this website, do this bunch of actions, and then leverage the tools you just used to do those things to generate me a test. And it will give you your test, which is pretty cool. Just, once again, make sure you check whether it's actually the outcome, the actual test, that you would have written yourself.
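As a rough sketch of what that wiring can look like, an MCP client config might register the Playwright MCP server something like this (the package name `@playwright/mcp` and the exact config shape are assumptions here; check the official Playwright MCP docs for the current details):

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```

Once the server is registered, the LLM gets browser tools (navigate, click, type, and so on) it can call, and those same tool calls are what it replays back to you as a generated test.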
How far are we from self-healing tests? In 2024, there was already a company that published an amazing blog post showing their version of what self-healing looked like. However, I feel like some of that might already be a bit outdated, because now we have things like Auto Playwright, CyPrompt, and Midscene that let you write tests in natural language, and they generate the test out of it. This is the closest I think we're going to get to self-healing tests for a while: you say "fill username field", "fill password field", and the model doesn't need to know where those things are at that moment. So even if you refactor your page, or if you tweak it, it will still find them. These tests can recover a bit, and they might be the closest you get to self-healing.
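To illustrate why natural-language steps survive refactors, here's a toy sketch. This is not how Midscene or any of these tools is actually implemented; the model does the semantic matching in practice, and here it's faked with a plain label lookup. The point is only that the step is resolved against whatever the page currently contains, instead of a hard-coded selector:

```javascript
// Two versions of the same login form: after a refactor,
// the ids and the element order change, but the labels survive.
const pageV1 = [
  { tag: 'input', label: 'Username', id: 'user' },
  { tag: 'input', label: 'Password', id: 'pass' },
];
const pageV2 = [
  { tag: 'input', label: 'Password', id: 'login-password' },
  { tag: 'input', label: 'Username', id: 'login-username' },
];

// A model-backed runner matches the instruction to an element semantically;
// this stand-in just looks for the element's label inside the instruction.
function resolveStep(instruction, elements) {
  const match = elements.find((el) =>
    instruction.toLowerCase().includes(el.label.toLowerCase())
  );
  return match ? match.id : null;
}

// The same natural-language step keeps working across the refactor:
console.log(resolveStep('fill username field', pageV1)); // → 'user'
console.log(resolveStep('fill username field', pageV2)); // → 'login-username'
```

A selector-based test pinned to `#user` would have broken on the second page; the natural-language step did not, and that recovery is what gets these tools close to self-healing.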
One thing that's important to note when we're talking about these tools is that they tend to be slower, because you're still leveraging an LLM: every step has to go to the model and back to do all the reasoning, the thinking, and figuring out where things are. So these tests come at a cost. The examples here are in Midscene, which is built by ByteDance.

And speaking of ByteDance, they also created this thing called UI-TARS, an open-source agent built on a VLM, a vision-language model, with advanced reasoning enabled by reinforcement learning. Interestingly enough, UI-TARS is super powerful; this thing is even better at playing Doom than I am, which, granted, isn't saying much. And the comparison between an LLM and a VLM is basically this: a VLM screenshots the entire page and looks at it, instead of getting a text representation of your HTML and reasoning over that. That's the difference between using an LLM and a VLM for your tests. ByteDance saw that in some scenarios with UI-TARS, the prompts don't have to be as descriptive as when you're using GPT. However, models like GPT are better at generating assertions. So there are pros and cons when we measure each of them, I would say.

So yeah, going back to where we were in the beginning: we started by looking at a graph that, interestingly enough, said that 13% of people don't test. If you're not testing, start testing, please. And that took us through this journey of me asking a bunch of friends how they write their tests and doing an analysis of where we are and where we're going. So let's recap: what is the actual state of JavaScript testing in 2025? First, everyone will keep doing their own thing and having different ways to test. I don't see that changing.