The Evolution of Browser Automation


In this session, we’ll take a look at what has happened behind the scenes in browser automation throughout the years and what the future has in store for us. We will examine how web testing will develop and what challenges this will bring for conventional frameworks like Selenium or WebdriverIO, as well as new frameworks such as Cypress, Puppeteer and Playwright. Lastly, we will experiment with some new automation capabilities these frameworks provide to test some of the new web features.

This talk has been presented at TestJS Summit - January, 2021, check out the latest edition of this JavaScript Conference.

FAQ

Browser automation involves using software tools to automate actions within a web browser, such as clicking buttons or filling forms. It is important for testing web applications to ensure they work well across different browsers and configurations without manual intervention.

Selenium was created by Jason Huggins in 2004 at ThoughtWorks to test an expense tool on different browsers like IE and Firefox. WebDriver was developed by Simon Stewart in 2005 as a better tool for browser automation. These tools helped address the growing need for testing web applications across various browsers.

Browser automation tools have evolved from basic test runners within the browser to complex frameworks that support comprehensive testing capabilities. This includes integration with modern web technologies and support for mobile testing, driven by the dynamic nature of web app development and the need for more sophisticated testing solutions.

Conventional tools like Selenium use the WebDriver protocol allowing cross-browser automation, suited more for QA than developers. Non-standard tools, such as Cypress and Playwright, use custom JavaScript or browser APIs for more in-depth interaction with web apps, offering features attractive to developers.

The current WebDriver protocol often faces issues like slow response times due to its design, which requires commands to be translated by a driver. This can lead to inefficiencies especially when tests are run in the cloud or require high interaction with the browser.
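To make the "every command goes through a driver" point concrete, here is a minimal sketch of how a single framework-level click breaks down into separate HTTP requests under the W3C WebDriver protocol. The endpoint paths and the element reference key come from the W3C WebDriver specification; the helper names (`WebDriverRequest`, `findElementRequest`, `clickRequest`, `elementIdFromResponse`) are hypothetical and only build request descriptions, they do not talk to a real driver.

```typescript
// A plain description of one HTTP request sent to a browser driver.
// (Hypothetical type for illustration, not part of any framework's API.)
interface WebDriverRequest {
  method: "GET" | "POST" | "DELETE";
  path: string;
  body?: Record<string, unknown>;
}

// Step 1: ask the driver to locate the element via a CSS selector.
function findElementRequest(sessionId: string, selector: string): WebDriverRequest {
  return {
    method: "POST",
    path: `/session/${sessionId}/element`,
    body: { using: "css selector", value: selector },
  };
}

// Step 2: ask the driver to click the element that step 1 returned.
function clickRequest(sessionId: string, elementId: string): WebDriverRequest {
  return {
    method: "POST",
    path: `/session/${sessionId}/element/${elementId}/click`,
    body: {},
  };
}

// The specification returns element references under this fixed key.
const ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

// Extract the element id from the driver's response to step 1.
function elementIdFromResponse(response: { value: Record<string, string> }): string {
  const id = response.value[ELEMENT_KEY];
  if (id === undefined) {
    throw new Error("response did not contain an element reference");
  }
  return id;
}
```

Each request is a full round trip: the framework waits for the driver's answer to the first request before it can send the second. When the driver runs in a remote VM, that latency is paid once per command, which is one reason tests with many browser interactions feel slow in the cloud.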

A new WebDriver protocol is being developed to address current limitations by allowing more direct and efficient interaction with browsers. This includes capabilities for better introspection into network and DOM elements, aiming to improve the speed and reliability of browser automation.
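As a rough illustration of what "more direct" interaction can look like, the sketch below shows a bidirectional client in which many commands are in flight at once and the browser can push events without being asked. The message shape (`id`, `method`, `params` for commands, a `type` of `"event"` or `"error"` on incoming messages) is loosely modeled on the WebDriver BiDi draft and should be read as an assumption. `BidirectionalClient` and its methods are hypothetical names, and the transport is injected as a plain function so no real WebSocket is needed.

```typescript
type Params = Record<string, unknown>;

interface PendingCommand {
  resolve: (result: unknown) => void;
  reject: (error: Error) => void;
}

interface IncomingMessage {
  id?: number;
  type?: string;
  method?: string;
  params?: Params;
  result?: unknown;
  message?: string;
}

// Hypothetical client: correlates responses to commands by id and
// dispatches unsolicited events to registered listeners.
class BidirectionalClient {
  private nextId = 1;
  private pending = new Map<number, PendingCommand>();
  private listeners = new Map<string, Array<(params: Params) => void>>();
  private send: (raw: string) => void;

  constructor(send: (raw: string) => void) {
    this.send = send;
  }

  // Send a command without waiting for earlier ones to finish.
  command(method: string, params: Params = {}): Promise<unknown> {
    const id = this.nextId++;
    const promise = new Promise<unknown>((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
    });
    this.send(JSON.stringify({ id, method, params }));
    return promise;
  }

  // Register a listener for events the browser pushes on its own.
  on(method: string, listener: (params: Params) => void): void {
    const list = this.listeners.get(method) ?? [];
    list.push(listener);
    this.listeners.set(method, list);
  }

  // Feed every message received from the transport into this method.
  receive(raw: string): void {
    const message = JSON.parse(raw) as IncomingMessage;
    if (typeof message.id === "number") {
      const entry = this.pending.get(message.id);
      if (!entry) return; // response to an unknown or already settled command
      this.pending.delete(message.id);
      if (message.type === "error") {
        entry.reject(new Error(message.message ?? "command failed"));
      } else {
        entry.resolve(message.result);
      }
      return;
    }
    if (message.method !== undefined) {
      for (const listener of this.listeners.get(message.method) ?? []) {
        listener(message.params ?? {});
      }
    }
  }

  // Number of commands still waiting for a response.
  get pendingCount(): number {
    return this.pending.size;
  }
}
```

Because responses are matched by `id`, they may arrive in any order, and events such as log entries or network activity can arrive between them. That is the main difference from the request-and-wait model, where each command blocks until its HTTP response comes back.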

Browser automation is adapting to modern web applications by integrating with newer web technologies and APIs, allowing for more granular control and testing of dynamic and complex web applications. This includes using browser APIs to directly manipulate browser functionalities and improve test automation.

Christian Bromann
34 min
15 Jun, 2021

Video Summary and Transcription
Browser automation has evolved over the years, starting with Selenium and WebDriver. Tools like Cypress, TestCafe, Puppeteer, and Playwright use different approaches for automation. The new WebDriver protocol will enable sending and receiving thousands of commands and messages simultaneously. New testing types, such as performance and accessibility testing, will continue to emerge. The new WebDriver protocol combines the best of all three approaches and provides opportunities for testing and automating web applications.

1. Introduction to Browser Automation

Short description:

Hello, y'all. Thank you for joining the session. I'm Christian, working at Sauce Labs. Let's talk about browser automation and its misconceptions. Browser automation has evolved over the years, starting with Selenium and WebDriver. Jason and Simon merged the projects to overcome limitations. A working group at the W3C was formed to standardize the process.

Hello, y'all. Thank you for joining the session, and particularly big thanks to the TestJS Summit organizer and speaker committee for inviting me to open the conference. I'm very excited about all the great talks from experts around the world that we will get to see over the next two days. And I'm very happy to see such great events continue to take place despite the difficult global situation that we find ourselves in.

I would love to spend the next 25 minutes speaking a little bit about how browser automation has evolved over the last decade or so, and I hope it gives a little bit more context when you hear about automation tools in the upcoming sessions. But before we start, let me introduce myself. I'm Christian, and I work in the Open Source Program Office at Sauce Labs. Most people probably know me as the maintainer of WebdriverIO, which is a project that got me excited about automated testing and browser automation many, many years ago. Maintaining the project really taught me a lot about how browsers work in general and how open source and standards are being developed. And those are all topics that I'm fortunate to work on full time these days.

The reason I wanted to give this talk is because I see a lot of misconceptions about how browser automation actually works. It's an interesting challenge, especially for cloud vendors and cloud providers, because as a user there's not much delineation between your automation framework and the automation actually happening in the browser. So for instance, if your click doesn't happen even though the test script passes without errors, or if the script cannot find an element even though you clearly see that the element is there when you check it yourself, people kind of blame the framework first and then at some point the cloud vendor second. In reality, there are a lot of nuances and processes responsible for making that click happen in the first place, maybe in a VM that is miles and miles away from the machine that actually runs your test. So let's have a look at how a click command in the framework actually ends up being a click event in the browser, and to do so, I would like to start with a small recap.

Browser automation has been around for more than a decade, and there have been quite some interesting developments and influences over the years, especially with the web changing from how it was 15 years ago to what it is today. So let's recap what has happened so far and how we got to where we are right now. It all kind of started in 2004, with someone called Jason Huggins needing to test an expense tool at ThoughtWorks to make sure it worked on IE as well as on Firefox back then. He called that tool Selenium, and it's probably a project that you all know already. A year later, another actor jumped onto the scene, claiming to have built a better tool, which was called WebDriver. That guy was Simon Stewart. Both tools gained more and more popularity over the years as browser automation became a thing for testing web applications. So at some point, Jason thought it would be a good idea to create a company. That company was called Sauce Labs. It was apparent that both tools, the WebDriver project and the Selenium project, were great, but they had their limitations in specific areas. Selenium, which back then was running in the browser, had problems with cross-origin policies and automation around the browser in general. WebDriver had other limitations when it came to automating certain elements. Jason and Simon merged the projects and joined forces to overcome these limitations and provide the best experience possible at the time. Over the years, these frameworks gained more and more popularity, to the point where people had a clear idea about how automation works. And so, a working group at the W3C was formed to kind of standardize this process.

2. Evolution of Browser Automation

Short description:

The goal was to ensure consistency across browsers and draft a standard for browser automation. This led to the development of WebdriverIO and Appium. However, the web landscape changed with dynamic JavaScript-heavy applications, decoupled front-end and back-end, and the emergence of new web APIs. Tools like Cypress filled the gaps, while the recommended standard fell short. A new protocol was developed to address modern web app requirements. Conventional tools like Selenium use the WebDriver protocol, while non-standard tools offer their own advantages and limitations. Projects like Cypress, TestCafe, Puppeteer, and Playwright use different approaches for automation.

The goal here was really to make sure that a click in, for instance, Chrome was the same as a click in Firefox. And so, the people there started an effort to draft a standard with the requirements in mind that people had at that point in time when it comes to browser automation. This created a lot of confidence and traction in the ecosystem, and a lot of new projects were created and started to flourish. We see the WebdriverIO release in 2011, and we see other projects like Appium that bring the same principle into the mobile space.

What then happened was quite interesting. The web kind of changed a lot, and so did the way web applications are built. What before was a server delivering static websites has now become a more and more dynamic, JavaScript-heavy web application that uses frameworks like React, Vue, Angular, or Svelte. That has drastically changed a lot of the requirements people have when they test applications. Suddenly, front-end and back-end became more decoupled, and people really wanted to focus on testing only the front-end application, rather than deploying the whole stack. With the continuous development of more web APIs that became available in the browser, people had more and more use cases to test. A lot of these use cases were not really in focus when WebDriver or Selenium were developed. Luckily, during those times, we had companies like Cypress who stepped in and filled the needs of developers in a really extraordinary way. They tried to close the gaps, as did a lot of other tools that started to pop up in the ecosystem.

During all these developments, the standard that was supposed to solve these problems was finalized and became a so-called recommended standard. However, while it allowed you to run automation across all browsers, its original design was already behind, and it was clear that it wouldn't solve the problems that developers have building modern web applications today. So almost at the same time, a new effort was started to develop a new protocol, based on the experiences and learnings made creating the first one and on the new requirements that developers have building modern web apps today. So if we look into the ecosystem, we can pretty much group tools into two buckets. We have, on the one side, the more conventional tools like Selenium or WebdriverIO, and we have, on the other side, the so-called non-standard tools. And both groups have some interesting characteristics. Starting with the conventional ones, they are, as you might expect, all using the WebDriver protocol and therefore allow you to truly do cross-browser automation. Every command you can run in WebDriver is tested in every browser, like any other standard that you have on the web. However, given the way some front-end frameworks are built, it can still create some incompatibilities when testing web apps. Also, given the design of this protocol, which was originally to do everything a user would be able to do, it's not very well suited for developers who like to introspect all areas of the application. These tools aren't really that popular among devs and are more used by QA folks. However, many of them are openly governed, open-source projects with a long history and a large community. WebdriverIO, for instance, is part of the OpenJS Foundation alongside Node.js, Mocha, and webpack, and Selenium is a project of the Software Freedom Conservancy.

Now on the other side we have what I call the non-standard tools, which all have their own ways to automate the browser and their own set of advantages and disadvantages over each other. These custom approaches are usually based on some sort of JavaScript emulation or on the use of browser APIs. That limits them all to certain browsers, but it also provides them with capabilities that you would not have with WebDriver, and therefore they are much more interesting for developers who like to introspect web apps, the network, the DOM, things like that. What is interesting is that all these projects are funded by companies, and multiple people are working on them full-time. Looking at all these projects together, we see that we have tools like Cypress and TestCafe that take the approach of using web APIs for automation, we have Puppeteer and Playwright that rely on native browser APIs, and lastly we have Selenium, WebdriverIO and many other tools that rely on the WebDriver protocol. What's worth pointing out here is that some tools like Cypress or WebdriverIO actually use a mixture of two approaches. For some automation capabilities, Cypress needs to use browser APIs, for example to take screenshots of the browser.
