Video Summary and Transcription
This talk focuses on practical approaches to testing Node.js applications, including the use of DORA metrics and the testing-trophy strategy. It emphasizes covering critical flows with integration and end-to-end tests while weighing the cost and speed of different test types. The speaker recommends mocking third-party services and using snapshot testing, but warns about the potential for false positives. Playwright is suggested as a preferred tool, and the importance of automated test execution is emphasized.
1. Introduction to Testing Node.js Applications
My talk is "Testing: Do More with Less", a practical approach to testing your Node.js application. Before many teams can answer yes to the questions below, they need to go down some paths and put some things in place. So how do you make sure you're moving in the right direction? There are the so-called DORA metrics, which stands for DevOps Research and Assessment. It has four key metrics, and three of them are directly impacted by good testing.
Hi, everyone. Thanks for coming. My talk is "Testing: Do More with Less", a practical approach to testing your Node.js application.
First, try to answer this question honestly: is your code well tested? Do you feel comfortable deploying it automatically on a Friday evening and just going home? Who can raise their hand? Cool. We have some brave people here in the room. Nice. But does your release pipeline always stay as green as your Christmas tree? That's also important. That's cool, guys. Maybe next time we're going to do a talk together. I will learn from you.
But before many teams reach that point and can answer yes to those questions, they need to go down some paths and put some things in place. So how do you make sure that you're moving in the right direction? There are the so-called DORA metrics, which stands for DevOps Research and Assessment. It basically helps you understand how good your team or company is in terms of performance and velocity, so how good you are at shipping software.
It has four key metrics, and three of them are directly impacted by good testing. First is deployment frequency. It measures how often your team successfully deploys to production. You can imagine that if you still have manual testing, I hope not, but some companies still do, you cannot deploy very often. You probably deploy once every second week or once per month. Ideally, you should deploy on demand. Second is lead time for changes. Testing also has a big impact on this metric, because even if you have automated tests, imagine you have very flaky and slow end-to-end tests. It means that whenever you push to master and your deploy pipeline starts, it might take you hours, and if it fails, it might take you days or even weeks to get a green deploy pipeline. So again, you're very slow. And last but not least, where tests play an important role is the change failure rate. It measures how often your deployment causes a production issue, and it has a direct connection with test coverage. But not test coverage as in "oh, 80% of my lines are covered, I'm happy." No. This is the actual coverage.
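As a rough illustration of how two of these metrics fall out of raw deploy data, here is a sketch; the data shape and function names are invented for illustration and are not part of any DORA tooling:

```javascript
// Sketch: computing two DORA metrics from a hypothetical deploy log.
// A deploy is { date, failed }, where `failed` means it caused a production issue.
function deploymentFrequency(deploys, periodDays) {
  // Average successful deploys per day over the period.
  const successful = deploys.filter((d) => !d.failed).length;
  return successful / periodDays;
}

function changeFailureRate(deploys) {
  // Share of deployments that caused a production issue.
  if (deploys.length === 0) return 0;
  const failed = deploys.filter((d) => d.failed).length;
  return failed / deploys.length;
}

const deploys = [
  { date: '2024-05-01', failed: false },
  { date: '2024-05-02', failed: true },
  { date: '2024-05-03', failed: false },
  { date: '2024-05-04', failed: false },
];

console.log(deploymentFrequency(deploys, 7)); // ≈ 0.43 successful deploys/day
console.log(changeFailureRate(deploys));      // 0.25
```

Good testing pushes the first number up (you can deploy on demand) and the second one down.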
2. Optimizing Test Coverage and Approaches
The testing trophy helps you focus on writing the right tests. Unit tests have low cost and high speed, while end-to-end tests have high cost and low speed. Integration tests provide good confidence at a moderate cost. Start with writing integration tests and consider writing unit tests for specific cases. Talk to your business and product teams to identify critical flows.
Are you testing the right things? Are you covering the flows that bring the most value to the company? So how can you approach your testing, how can you write fewer tests and get higher confidence? I assume you have at least heard of, or maybe are already applying, the testing trophy. It became popular after Kent C. Dodds wrote an article about it, and he's doing trainings around it. The main idea of the testing trophy is to help you focus on writing the right tests. Each kind of test has its cost, gives you some level of confidence, and has some speed.
So, for unit tests, the cost is very low. It's very easy to write unit tests, especially nowadays with ChatGPT: you can ask and it will generate unit tests for you automatically. They give you, I would say, average confidence, because if you only have unit tests, it might not be enough to deploy automatically, but the speed is very high. You can execute them fast. On the other end of the spectrum are end-to-end tests. Their cost is very high, because it's not only the cost of writing the end-to-end test but also of maintaining it in the long term. You need special infrastructure, you need a special environment to run them in, and that environment should be stable. But they give you very high confidence: if your end-to-end test is green, you're very confident that everything works as expected. The speed, though, is low. Somewhere in between are integration tests. Their cost is very close to unit tests, and they give you very good confidence. From my experience, having only integration tests is sometimes good enough that you can live without end-to-end tests, and their speed is still high. And you can see that the testing trophy emphasizes that the number of integration tests should exceed the number of end-to-end and unit tests.
So, how do you approach this? Step number zero, you remember: enable static linters and type checks. It's free, and everyone should do it. Step number one: start with writing integration tests, and try to write an integration test for each happy and non-happy flow. Then you check your test coverage. Maybe you identify some rare edge cases for which there is no sense in writing an integration test; you can write unit tests to cover them. Or maybe within your application there is some reusable code used across multiple parts, so you can consider it a library, and it also makes sense to write unit tests for that. And last, in terms of testing, talk to your business people, talk to your product team. Ask them to identify the very few business-critical flows.
3. Testing Node.js Application Architecture
Cover business critical flows with end-to-end tests. Ensure observability in production with metrics, logging, and event tracking. This talk focuses on testing Node.js applications and the typical architecture. Unit tests cover small parts, but integration tests address interactions between components.
So, the flows where, if they don't work, the company loses millions per second. Only cover those business-critical flows with end-to-end tests. And then, let's be honest, not everything is possible to test. So you still need good observability in production: you should have metrics, you should have logging, you should track your events, like product events, so you can immediately identify any abnormality if you deploy something wrong.
In this talk, we're going to focus on steps one, two, and three. As I said, this talk focuses on testing Node.js applications, so this is the architecture of a typical Node.js application and how it works. You have a browser, and a client-side application that runs in the browser. It makes a request. The request goes through some cloud infrastructure, which does things like HTTPS termination, maybe authentication, rate limiting, a lot of stuff. Then the request lands in your Node.js application. Let's say it's Express. You have a chain of middlewares that parses and enriches the request. Usually you have what I call infrastructure middlewares: middlewares that don't really belong to your application but make it possible for your application to run in its environment. They do a lot of things, like parsing the incoming request, authenticating the user, maybe enabling localization, and so on and so forth. Then, finally, the request arrives in your code, the white box, the code that you actually wrote, the code that contains the business logic. This code needs to grab data from somewhere to do its manipulations, so it makes a bunch of API calls or maybe direct database calls. Those API calls again go through some cloud infrastructure to the APIs. When the data is back, you do something with it: you either return JSON, or maybe you server-side render HTML and return it to the browser. This is the type of application we're going to try to test.
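The middleware chain described above can be sketched with a hand-rolled stand-in for Express; every middleware here is fake and only illustrates the parse → authenticate → handle flow:

```javascript
// Minimal Express-style middleware chain, hand-rolled for illustration.
// Each middleware enriches the request, then the route handler runs.
function runChain(req, middlewares, handler) {
  let i = 0;
  function next() {
    const mw = middlewares[i++];
    if (mw) return mw(req, next);
    return handler(req);
  }
  return next();
}

// "Infrastructure" middlewares (all invented): parse the body, authenticate.
const parseBody = (req, next) => { req.body = JSON.parse(req.rawBody || '{}'); return next(); };
const authenticate = (req, next) => { req.user = { id: 'u1' }; return next(); };

// The white box: the business-logic handler you actually wrote.
const handler = (req) => ({ status: 200, json: { user: req.user.id, echo: req.body } });

const res = runChain({ rawBody: '{"q":1}' }, [parseBody, authenticate], handler);
console.log(res); // { status: 200, json: { user: 'u1', echo: { q: 1 } } }
```

The testing question in the next sections is essentially: how much of this chain do you include in each test?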
If you write unit tests, each unit test covers only a tiny blue rectangle, I hope you can see it. You need to write a lot of unit tests to cover the whole white space, and even then, you won't cover the interactions between the boxes. So let's think about how integration tests can help here. This is the typical scenario of an integration test: given some incoming request, for example an HTTP request from the browser, the application makes the specified API calls and returns the expected response.
4. Integration Testing Approaches
Capture behavior to ensure expected application functionality. Write focused integration tests for code you own, mocking internal requests and API interactions. Consider black box integration testing for better confidence, but exclude complex infrastructure middlewares if necessary.
So, if we capture this behavior, we can be pretty sure that the application does what we expect. Let's see how we can write integration tests given this scenario.
The first type of integration test is called focused integration test because it only tests the code that you own. You exclude the infrastructure middleware and API calls that are provided by external libraries or platform teams. In this case, you mock the internal request object and how your code interacts with API clients. Then, you make assertions on the expected response from your route handler.
However, this approach may exclude important things like complex chain of middlewares. To address this, you can use black box integration testing. With black box integration testing, you test the entire Node.js application as a single piece. You mock the incoming and outgoing HTTP requests and write assertions for the expected response. This approach provides better confidence that everything works as expected, including the chain of middlewares. However, it can be difficult to create if the infrastructure middlewares are complex or if you don't know what's happening inside them. In such cases, you can exclude them from the test coverage.
5. Black Box Integration Testing
Mock code interactions and assert expected response for focused integration tests. Black box integration testing covers entire Node.js app as one piece, mocking incoming and outgoing HTTP requests. Consider complexity and knowledge of infrastructure middlewares when deciding to include or exclude them from testing.
You mock how your code interacts with API clients, and then you assert that the response from your route handler looks as expected. So, it's a good approach, but you may lose some important things: if you have a complex chain of middlewares, if your middlewares are doing something that is very important for you, you might want to include those middlewares and API clients in the test coverage.
That's what I call black-box integration testing, because this type of test treats the whole Node.js application as one single piece. You don't really care about the internal implementation details. The main difference here is that you mock the incoming HTTP request, you mock all the outgoing HTTP requests that your Node.js application can make to downstream APIs, and then you write an assertion on the expected response. In this case, you have much better confidence that everything works as expected, including the chain of middlewares.
But sometimes it can be very difficult to set up, because from my experience, those infrastructure middlewares can make a lot of API calls on their own, or you don't really know what's happening inside them. So not mocking them can make your test so complex that you spend more time setting up all the required infrastructure than focusing on the actual test. It depends on your situation: if your infrastructure middlewares are not that complex, include them; otherwise, you can exclude them.
6. Testing Node.js Application in Browser
Include Node.js application tests in the same suite as browser tests by initiating requests from the browser; this requires running the Node.js app as a separate process. This approach is more complex but better than end-to-end tests, which include everything and can fail due to external factors.
And you might think: okay, I'm already doing browser testing. I have a client-side application, and I'm already using Cypress or Playwright to run my UI tests in the browser. Can I also include the tests of my Node.js application in the same test suite? Well, technically you can. You can initiate the request to your Node.js application from the browser and write assertions in the browser, inside Cypress or Playwright. The biggest disadvantage of this approach is that you need to run your Node.js application as a separate process, so the infrastructure for this type of test is more complex compared to the previous ones. But still, it's much better than end-to-end tests, because in an end-to-end test you include everything, and if something is wrong with the cloud infrastructure in your testing environment, or another team deployed a buggy API service, your test will fail.
7. Key Takeaways and Recommended Tools
Learn or adopt the testing-trophy strategy. Start with integration tests, then cover missing parts with unit tests. Reduce the number of end-to-end tests. Measure the impact of changing test practices. Tools I personally use: Jest, Vitest, nock, SuperTest, node-mocks-http, Cypress, Playwright.
So, a few key takeaways. First of all, learn or adopt the testing-trophy strategy. It's a good approach; it will guide you and help you focus on writing the right tests at the right moment in time. My own suggestion: start with integration tests. Write as many integration tests as possible and then cover the missing parts with unit tests. Again, we are talking about applications; if you're developing a library, unit tests are probably the best choice there.
Then, as much as possible, try to reduce the number of end-to-end tests. Especially at the beginning, when you've just created an application, they might feel like the easiest way to test the whole thing. But the more mature your application gets and the longer you need to maintain it, the more problems they cause. And last but not least: don't trust your feelings, try to measure. There are a lot of tools you can plug into your deploy pipeline that will measure how your metrics are doing and whether changing the way you write tests gives you a positive impact or not.
And some tools I personally use when I write tests. Test frameworks: Jest is the obvious choice, but recently I'm also using Vitest, which is faster. For mocking downstream API calls, I use nock. If I need to simulate an incoming HTTP call to my Node.js application, I use SuperTest. node-mocks-http is also a very nice module: if you want to write focused integration tests, tests that run inside the Express framework, you can use it to mock the Express request and response. And you can probably already use Cypress or Playwright for testing in the browser. So thank you very much. I hope this sparks some ideas and gives you some directions for improving your code. You can definitely do more with less. Thank you.
8. Mocking Third-Party Services and Snapshot Testing
Mock third-party services in end-to-end tests; don't rely on their actual implementation. Pact is contract testing that verifies your mocks: if you have Pact in your pipeline, use it. Snapshot testing gives confidence in response assertions but may produce false positives.
What do you think about mocking third-party services in end-to-end tests? I think everything that you don't own and everything that you don't trust should be mocked. So you should not rely on the actual implementation of a third-party service because usually they are unreliable. They can fail at the moment that you want to deploy your application. So you don't want to depend on them.
Okay. Interesting. I would have thought you would want to know if we can depend on them. So slightly counterintuitive to me! And this is an interesting one, because I don't totally understand what a Pact test is. Maybe you can explain it for me. So, a Pact test is contract testing. Basically, whenever you mock something, you eventually need to verify whether your mocks are still up to date with the actual API implementation. Pact allows you to have a contract between your Node.js application and the downstream APIs, and you can verify whether that contract is still valid. If the contract is still valid, it means your mocks are still valid. So the short answer is: if you have Pact in your pipeline, use it. It is quite difficult to set up, but it is a good approach.
So it's an advanced step, yeah. Not the easy way to go. Yeah. What do you think about snapshot testing? Snapshot testing. I didn't mention it, but whenever I write an assertion on the response, I personally like to write a snapshot test. It gives you very good confidence that you know exactly what your application is returning. Even if something changes that you don't expect but that impacts your response, the snapshot will immediately tell you there is a difference. But it might give you a lot of false positives: there might be some tiny change that is not relevant, but the snapshot will still fail.
9. Snapshot Testing and Maintaining Mocks
Snapshot testing can give false positives but is still useful. Maintaining mocks depends on API stability: if all teams follow the rule of no breaking changes, mocks stay reliable. APIs should not contain backward-incompatible changes. Cypress or Playwright? One-word answer.
So I personally use it, but I know that my team members, for example, don't like it, because you need to update snapshots too often. And sometimes you just update them without looking closely, even when something is actually broken. Yeah. But... Yeah, that's interesting. Snapshots are always a blessing and a curse.
Next one for you: how do you maintain your mocks if external APIs change over time? Yeah. So, the external APIs in this talk are usually your company's internal APIs, and usually in the company you have an agreement that you should not make backward-incompatible changes. So, ideally, your mocks should stay valid forever. If you don't trust the other teams in your company, you can implement Pact to verify. But in my experience over many years, if all the teams follow the rule that APIs cannot have breaking changes, your mocks stay up to date and reliable. You only change a mock when you start consuming a new feature: then you update the mock so it starts providing that feature. But if you don't rely on the new feature, you don't need to update the mock. Oh, okay. So it's more stable than people think? It is, from my experience. Many people are afraid of it, but it is quite stable, again, provided everyone follows the rule that APIs should not contain backward-incompatible changes.
We're going to start a war with this question, maybe. Cypress or Playwright? Okay. Well... You're only allowed one word answer. No explanation.
10. Playwright, Manual Testing, Monitoring, and TDD
Playwright is my personal preference for its ease of setup and CodeGen. Manual tests should be outsourced to users. Deploy fast, fix issues found by users. Monitoring tools like Datadog provide clear understanding of application behavior. TDD was tried in hackathons but not in real life.
We'll just say... Okay. Well, my personal preference recently is Playwright, because I find it much easier to set up, but that's personal flavor. In the past, I had good times writing Cypress tests, so I have nothing against Cypress. I'm a big fan of CodeGen, and CodeGen is really nice in Playwright: you can just click around and get the tests. It saves a lot of time when writing tests, and you can just get the machine to do it for you.
What about manual tests, and having manual tests as part of the system? Well, I think in the modern world, where every company tries to release new features as often as possible, I don't see why you should have manual tests in your pipeline. If you have something that doesn't make sense to automate, I would rather create a feature switch around it so it's not directly exposed to the end user, deploy to production, enable it for a limited number of beta testers, and do the manual testing in production. But don't let manual tests block your deployment. Outsource your manual testing to the users; your users can be your beta testers. I guess if you can deploy fast, you can fix whatever is found. Yes. If you have monitoring, you will immediately see that a user has some issue, and you can create a fix and deploy it fast.
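The feature-switch idea can be sketched as a simple gate; the flag store and all names below are invented (real flags typically live in a config service or a tool like Unleash or LaunchDarkly):

```javascript
// Hypothetical in-memory flag store; real ones live in a config service.
const flags = { newCheckout: { enabled: true, betaTesters: ['u1', 'u7'] } };

// Ship the code dark: deployed to production, visible only to beta testers.
function isEnabled(flagName, userId) {
  const flag = flags[flagName];
  if (!flag || !flag.enabled) return false;
  return flag.betaTesters.includes(userId); // limited rollout only
}

console.log(isEnabled('newCheckout', 'u1')); // true  (beta tester)
console.log(isEnabled('newCheckout', 'u2')); // false (everyone else)
```

Because the switch defaults to off, the deploy pipeline never has to wait for a manual test pass.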
What monitoring? This is my own question. Observability: what monitoring tools would you use in your workflow? At the moment, at eBay, we use a private cloud, so we have some proprietary tooling. But before that, for example, I was using Datadog. It's a very robust tool. It tells you everything about what's happening with your Node.js application, so you have a very clear understanding of what's going on. You have all the metrics. Yeah. Excellent.
Then, this is an interesting one. Going integration test first would sit in the way of doing TDD. What is your view on TDD? Okay, I'll be honest here. I tried TDD during the hackathons and workshops, but I never did it in real life.
11. TDD, Code Coverage, and Selling Tests
How much code coverage is enough? You need to understand the product and how users are using it to determine which flows should be covered by tests. The goal should not be a concrete percentage of coverage, but rather ensuring that tests prevent regressions. By investing time in testing now, you can save time in the future by avoiding regression issues.
I mean, when I do my day-to-day job. Nice.
I'll be curious. Again, people can't see this, but how many people are doing TDD? Wow, there's a chunk of people. Well, I see someone raising their hand, but he's not doing TDD I know for sure. He was getting called out. We need to see people's code to prove that. I hear a lot of people say they're doing it. Until there's a deadline coming. That's so funny.
How much code coverage is enough? I don't want to give a concrete number. Or how to define what to test and what not to, that's the second part, so maybe that's a better question. Yeah, you need to understand the product. I don't think test coverage can be answered from a purely technical standpoint. If you understand the product, how users are using it, and how the company makes money out of it, you can better understand which flows are more important and which flows should be covered by tests. I think that makes more sense than saying 80% or 100% of lines are covered. That's nonsense, and you all know it: you can write some silly tests that give you 100% coverage, but they won't prove anything. Ah-ha, okay.
I'm seeing quite a few questions about the percentage. I think that's because we often get told by a team lead to have coverage. I would say the percentage should not be the goal. From my experience, when you start writing integration tests, you naturally get around 70% to 80% coverage without extra effort. And then, if you really need more coverage, if there are some edge cases you still want to cover, just write unit tests. If it makes you feel more confident releasing, write those tests.
How do you sell tests to PMs or companies that are struggling? Well, to them, I sell protection against regressions. No product team likes regressions: when they release some new feature and something else breaks, they don't like it. Tests are your weapon against regressions, so that's one way to sell it. It means you invest a little more time now, but for the features developed in three months or half a year, you will spend less time, because you don't need to fight regressions.
12. Automated Test Execution and Conclusion
Determine the best moment to automatically run tests based on your team's needs. It is important to run tests on each pull request and in your deploy pipeline. Ensure that tests run automatically to avoid forgetting and maintain a safe development environment.
Purposefully break things. So they're like: look, I know things seem to be breaking a lot lately, but I have a solution. We're just going to test more, and magically it works. So I guess, yeah, they can't see the code, hopefully. As long as it's not one of those TPMs, we're fine. A nice lie.
Let's see, this is a nice one as well. When, in your opinion, is the best moment to automatically run the tests? I see some people do it on commit, or in pre-commit hooks, on push, or just before going to production. So when should people be running tests, or even which tests when? Well, if your tests are fast, you can run them on the go: as soon as you change the code, you can immediately run the tests. What we usually do in our team is run tests on each PR. You always run the tests before you push the code, that's obvious. Then you run the tests when you have a PR, because if the tests on a PR are failing, there is no sense in others spending time reviewing it. And of course you run the tests in your deploy pipeline. These are the moments in time. As for when you need to automatically run the tests: as soon as you have tests, just make sure they run automatically, so you don't forget. Even if you commit something on a Friday evening and head home, you're 100 percent sure the tests will run. Okay, so make sure it's automatic. Yes. That's all. So nobody has to hit a button, and then we're safe. Yes. Fantastic.
All right. I think that's probably all the time we have for questions today. So let's give Eugene another round of applause. Thank you. And you can go back and enjoy your day. Thank you very much, Eugene. Thank you. Have a good one. Bye-bye.