Hi, everyone. I'm Talia, and today we're going to talk about how to enable tests in production. We're going to talk about what testing in production is, how to set it up, and common pitfalls that people usually run into.
So, this is my contact information, my Twitter and my e‑mail, in case you guys have questions later. But a little bit about me. I'm a developer advocate at Split. And I used to be a test engineer, and I worked in QA and automation and testing for a while before I joined Split. And being a test engineer was really difficult for me, because most of the problems that I had revolved around staging and using this dummy environment, and staging isn't the same as production. So, I would have so many problems, and these are some of the problems that I dealt with that I'm sure most of you have dealt with too. If you've dealt with any sort of test environment, any sort of QA environment, anything that's not production, these are some of the things that made it really hard for me to do my job.
So, the first problem was data mismatch. The data in staging doesn't match production, which means test results don't always match. So, I used to work really hard on making sure I tested every single product requirement, and I would go through the documentation with the product owner, and I worked with my developers to fix all the bugs and make sure my end-to-end tests were passing, and then I would sign off on the feature, and as soon as it launched to production, there would be a bug. It's such a horrible feeling when there's all this pressure on you to make sure that your feature works in a dummy environment.
And then the next thing with data mismatch that happened to me was something called configuration drift. Let's say that you get paged one night because there's an incident for your app. You look at the logs and you identify the problem, but in order to fix it, you have to update a specific configuration in production, so you make the change in production and you go back to sleep. And although you fixed the issue, you've just created an even bigger divide between your staging and your production environments. That divide is called configuration drift, and many times, staging environments are not the same as production because of changes made during incident management, and every one of those changes widens the drift. And I felt like, what's the point of testing in staging if it's not gonna give me the same results as production?
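To make configuration drift concrete, here is a minimal sketch that compares two configuration snapshots and reports every key where staging and production disagree. The function name `config_drift` and the example keys (`db_pool_size`, `request_timeout_s`, `rate_limit`) are hypothetical, made up for illustration; a real setup would load these values from wherever each environment keeps its configuration.

```python
MISSING = "<missing>"  # placeholder shown when a key exists in only one environment


def config_drift(staging: dict, production: dict) -> dict:
    """Return {key: (staging_value, production_value)} for every key that differs."""
    drift = {}
    for key in sorted(staging.keys() | production.keys()):
        staging_value = staging.get(key, MISSING)
        production_value = production.get(key, MISSING)
        if staging_value != production_value:
            drift[key] = (staging_value, production_value)
    return drift


# Hypothetical snapshots: production was changed during an incident,
# and nobody copied the change back to staging.
staging = {"db_pool_size": 10, "request_timeout_s": 30}
production = {"db_pool_size": 50, "request_timeout_s": 30, "rate_limit": 100}

for key, (staging_value, production_value) in config_drift(staging, production).items():
    print(f"{key}: staging={staging_value} production={production_value}")
```

Every line this prints is a place where a test that passes in staging is exercising different settings than the ones your users will hit.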
The next problem I had was that staging was really slow. There was just really bad performance. And when you're writing tests in staging, you often have to add waits and sleeps because things take longer to load. For example, click on a button, wait 10 seconds for something to happen, perform this action, wait another 10 seconds for something to happen. Your user is not going to wait 10 seconds for something to appear. You know, in tech time, that's crazy talk. So that's not how my users are going to interact with my features in production. So why make that different in staging?
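As a rough illustration of the waits described above, here is a small sketch of a polling wait, which is one common alternative to a hard-coded 10-second sleep: it checks a condition repeatedly and returns as soon as the condition holds, or gives up at a timeout. The helper name `wait_until` and the simulated "element" are assumptions for the example, not something from a specific test framework.

```python
import time


def wait_until(condition, timeout_s=10.0, poll_interval_s=0.1):
    """Poll `condition` until it returns True or `timeout_s` elapses.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Simulated slow page: the "element" only appears 0.3 seconds after the click.
ready_at = time.monotonic() + 0.3

start = time.monotonic()
# A fixed wait would be time.sleep(10) here, paid in full on every run.
appeared = wait_until(lambda: time.monotonic() >= ready_at, timeout_s=10.0)
elapsed = time.monotonic() - start

print(f"appeared={appeared} after {elapsed:.2f}s")
```

Even with polling, the test only tells you how the slow environment behaves. The 10-second timeout is still there because of staging, not because of anything your users would see in production.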
Nobody cares if staging is down. This is another thing that I had to deal with. I would be assigned to test different issues, different hotfix tickets, and these were critical bug fixes that needed to get released to production immediately. So I would log into staging to test it, but staging would be down. So I have to ping the DevOps guy. But the DevOps guy says I need to open an IT ticket, and then the IT ticket has to get escalated by my manager.