Automate the Browser With Workers Browser Rendering API

In this talk, we will explore how the Browser Rendering API can automate browser tasks, freeing developers from repetitive manual work. We will begin with an overview of Cloudflare Workers and how they enable running JavaScript at the edge. Then, we will discuss browser automation in detail, covering how to interact with the DOM, fill out forms, and scrape data from web pages. We will showcase real-world examples of how browser automation with Cloudflare Workers can improve the user experience of web applications, increase productivity, and automate tasks, such as generating screenshots and PDFs of web pages, testing web applications, and running performance audits. At the end of this talk, attendees will gain a better understanding of how to use the Browser Rendering API to automate browser tasks and take their web development skills to the next level.

Video Summary and Transcription
The Workers Browser Rendering API allows developers to automate tasks within a browser using the Cloudflare Developer Platform. This API is based on Puppeteer, a Node.js library, and has been adapted by Cloudflare to connect directly with their platform. It enables the execution of tasks like navigating to websites, capturing screenshots, and generating PDFs. Developers can utilize Cloudflare Puppeteer, a modified version of Puppeteer, for seamless browser automation. Cloudflare Radar uses this API for its URL scanner, showcasing its practical applications. The API supports session management, allowing the reuse of browser sessions to enhance efficiency. Developers can explore use cases such as taking performance metrics and storing screenshots in Cloudflare R2. For more information, documentation and examples are available on the Cloudflare website and blog.

FAQ

Cloudflare R2 is a blob storage service offered by Cloudflare. It allows users to store various types of data, including images and other files.

Cloudflare Workers supports JavaScript, TypeScript, and Python, allowing developers to create new applications or augment existing ones using languages they are already familiar with.

Common use cases include taking screenshots of websites, generating PDFs from web pages, testing web applications, gathering performance metrics, and automating tasks that require browser interaction.

The Workers Browser Rendering API allows developers to perform browser automation tasks within a headless browser instance using the Cloudflare Developer Platform. It can navigate to websites, take screenshots, create PDFs, and more.

Developers use the Workers Browser Rendering API through Cloudflare Puppeteer, a fork of Puppeteer, the Node.js library from the Chrome DevTools team. Cloudflare patched Puppeteer to connect directly to the Workers Browser Rendering API, enabling its use within Cloudflare Workers.

The Workers Browser Rendering API can automate the process of checking apartment availability on websites. For example, it can navigate to a specific page at a set time, take a screenshot of the availability page, and even send an email notification if an apartment is available.

Documentation and examples for the Workers Browser Rendering API can be found on the Cloudflare documentation website and the Cloudflare blog. These resources provide detailed guides and use cases to help developers get started.

The Workers Browser Rendering API is generally available and can be used by developers to automate browser tasks within their applications.

Cloudflare has added features like browser management and Durable Objects that allow developers to reuse browser sessions and extend session durations, making it easier to perform multiple tasks within a single worker.

Developers can integrate tools like Durable Objects and session management to extend browser sessions and improve performance when performing multiple tasks. Additionally, analytics and logs can help monitor the number of browsers spawned during worker integration.

1. Introduction to Browser Automation

Short description:

I'm excited to be here at JSNation, talking about automating the browser with the Workers Browser Rendering API. I'll show you how to perform browser automation tasks using the Cloudflare Developer Platform and share use cases and examples. I'm Gift Egwuenu, a developer advocate at Cloudflare. Let me tell you a short personal story about finding an apartment in the Netherlands. I had a specific apartment in mind and had to manually check the website every week.

Hi, everyone. I am really excited to be here at JSNation. I'm going to be talking about how to automate the browser with the Workers Browser Rendering API. Now, that's a mouthful. From now on, I'm going to refer to this as just the rendering API or browser rendering. I'm really sad I couldn't make it to the conference in person, but I hope you still enjoy my session.

In this session, I'm going to show you how you can perform browser automation tasks using the Cloudflare Developer Platform. And at the end, I'll show you some cool use cases and examples of tasks that you can automate with the browser. Again, I am Gift Egwuenu. I work as a developer advocate at Cloudflare. I'm really excited about the Developer Platform at Cloudflare, and I talk about it on Twitter and on my personal blog. So, if you'd like to follow me there, that would be great.

All right. So, I have a short personal story to tell before going into this talk, and it's based on my experience as an expat living in the Netherlands. I feel like this is going to be relatable if you live in the Netherlands, or in any major city that has a housing crisis. It's super difficult to find an apartment. It usually takes around one to three months, if you just relocated or moved to a new city, to get a nice apartment, right? And it's also quite expensive. In the Netherlands, last year, there was a housing shortage of over 390,000 homes. So it was very difficult to find an apartment.

And at the time, I was looking for an apartment in the Netherlands, specifically in Amsterdam, because I was living in a different city and I wanted to move to Amsterdam. And I had a specific apartment in mind that I wanted to move to. It's like a building complex with several types of apartments. Now the interesting bit about this specific place, which is called OurDomain, is how it works: you just go to the website, and once there is an available apartment, you pay for it, and once you are able to pay, you have the apartment.

Now, the problem is that every apartment in that building is essentially already booked, right? So if someone moves out, or there's an availability, they update the website every week on Wednesday at 12 p.m. So for the next month, I would go to the website on Wednesday at 12 p.m. to check if there was an availability, right? I was looking for a two-bedroom apartment. And of course, it took me a while to find one. I was manually checking the website every week, which was not nice. But thinking about the subject in question, browser automation: if I had known about it at the time, this is something I would have automated.

2. Automating Tasks with Browser Rendering API

Short description:

The Workers Browser Rendering API allows you to automate tasks within a browser, such as navigating to a website, taking screenshots, and creating PDFs. I used it to automate the process of finding an apartment: navigating to the website at the specific time and taking a screenshot of the availability page. Browser automation can also be combined with things like email notifications. Cloudflare Workers is a serverless environment that supports multiple programming languages like JavaScript, TypeScript, and Python. The Cloudflare developer platform offers the browser rendering API, which allows developers to control and interact with a headless browser instance. This API is based on Puppeteer, a Node.js library from the Chrome DevTools team.

But just in case anyone has this problem, this is how I solved it, using browser automation. So the Workers Browser Rendering API basically allows you to run automation within a browser. You could do things like navigate to a specific website, take a screenshot of the website, create a PDF out of the website, and so on.

So for my specific use case, instead of going to the website on Wednesday at 12 p.m. every single week, I wanted to automate this process, right? I wanted to let the browser do that for me, without me having to do it myself. So what did I do? I essentially wrote a browser rendering worker that would navigate to the specific website at the specific time. Honestly, I would just run this every single time, because I wanted to be sure that I could find an apartment, right? So what this does is navigate to the website and take a screenshot of the specific availability page. If there is an availability at that time, the website is going to tell me: oh, you're lucky, there is currently an apartment available. In fact, at this specific time, there were three different apartments available. So I was able to go to the website and pay for the apartment that I wanted.

Of course, this is just one cool example of what browser automation can help you with. There's so much more that you can do. For example, with this demo, I could go one step further and set up a cron job that would make this worker run on a specific day every week. Or I could also add email automation: if there's an apartment, instead of returning the result to the website, I could send myself an email and I would be automatically informed of the availability, right? So that's an actual use case that I think browser automation can help with.
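The cron-plus-email idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `checkAvailability` and `sendEmail` are hypothetical helpers (the first would wrap the browser rendering logic, the second any email service), and neither appears in the talk.

```typescript
// Hypothetical dependencies, injected so the sketch stays self-contained.
interface CheckDeps {
  // Would use Browser Rendering to read the availability page (assumption).
  checkAvailability(): Promise<string[]>;
  // Would call whatever email service you use (assumption).
  sendEmail(subject: string, body: string): Promise<void>;
}

// Runs one check and sends an email only when something is available.
// Returns true if a notification was sent.
async function runWeeklyCheck(deps: CheckDeps): Promise<boolean> {
  const apartments = await deps.checkAvailability();
  if (apartments.length === 0) {
    return false;
  }
  await deps.sendEmail(
    `${apartments.length} apartment(s) available`,
    apartments.join("\n"),
  );
  return true;
}

// In a Worker, a Cron Trigger would invoke the scheduled handler, roughly:
//   export default {
//     async scheduled(controller, env, ctx) {
//       ctx.waitUntil(runWeeklyCheck(deps));
//     },
//   };
```

Cron Triggers are evaluated in UTC, so a Wednesday 12 p.m. check in Amsterdam would need a schedule with the hour adjusted for the local offset.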

So with that, I want to talk about the underlying technology in the setup I just showed you, which is Cloudflare Workers. If you're not familiar, Cloudflare Workers is a serverless environment that allows you to create new applications or augment existing ones with languages you're already used to, like JavaScript, TypeScript, or even Python. So this is an example Cloudflare Worker. Basically, it returns an array of conference data, in this case for JSNation, to my browser. This example I'm showing you is written in JavaScript, but you can write it in TypeScript or in Python as well if you want.

Now moving forward, just a few slides. Yeah. The browser rendering API is one of the services that we offer in the Cloudflare developer platform. Essentially, it allows you as a developer to control and interact with a headless browser instance. And from that, you can create automated flows, for example, navigate to a website, take a screenshot, and so much more. As for the underlying technology: if you have ever explored the topic of browser automation, you have possibly heard of Puppeteer. Puppeteer is a Node.js library from the Chrome DevTools team. It essentially provides an abstraction over the DevTools protocol to help you control Chrome or Chromium. Now, the Workers Browser Rendering API is built on a fork of Puppeteer.
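The slide itself is not part of the transcript, but a Worker of the kind described could look roughly like the sketch below. The data and field names are made up for illustration.

```typescript
// Hypothetical conference data; the fields are illustrative only.
const conferences = [{ name: "JSNation", location: "Amsterdam" }];

// A Worker is an object with a fetch handler that receives the incoming
// Request and returns a Response.
const worker = {
  fetch(_request: Request): Response {
    return new Response(JSON.stringify(conferences), {
      headers: { "content-type": "application/json" },
    });
  },
};

// In a real Worker module, this object is the default export:
//   export default worker;
```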

3. Integrating Puppeteer with Cloudflare Workers

Short description:

Cloudflare created a fork of Puppeteer and patched it to connect to the Workers Browser Rendering API, enabling the use of Puppeteer within Cloudflare Workers. This integration arose from an internal need within the team to automate browser tasks. Cloudflare Radar, a product for monitoring internet trends, uses the Workers Browser Rendering API to implement its URL scanner. The rendering API combines remote browser isolation with Puppeteer to simplify running Puppeteer within a worker. A browser binding is required in the Cloudflare worker to interact with the remote browser instance. The Cloudflare Puppeteer library, an open-source fork of Puppeteer, is used in Cloudflare workers. Browser rendering is generally available for automating tasks within the browser, with additional features like browser management for reusing browser sessions.

Essentially, what we did at Cloudflare is create a fork and patch it to connect to the Workers Browser Rendering API instead, so that within your Cloudflare worker you can use the Puppeteer library directly. This was not possible before, by the way. So we are allowing you to use Puppeteer within workers with this integration.

And how did this come up? This was actually born from an internal need within the team. Within Cloudflare, we needed to do browser automation, but it wasn't possible within workers at the time. Many teams wanted to build tools that take automated screenshots or create automated PDFs. For example, one of the tools within Cloudflare, called Cloudflare Radar, which is a product for monitoring internet trends, uses the Workers Browser Rendering API to implement the URL scanner. Basically, it takes screenshots of any URL you input and generates a report on the performance and the security of the URL.

How does this work under the hood? I'll try to explain how we're using Puppeteer under the hood to allow you to use it within a worker. The rendering API uses something called remote browser isolation to simplify the process of running Puppeteer within a worker. Remote browser isolation basically allows you to interact with a web browser in a remote environment instead of on your own device. We've wrapped the Puppeteer library so that it works directly inside your project. That is how we're able to let you use Puppeteer within your worker. Once a WebSocket connection is established, our remote browser isolation handles all the incoming requests from your worker so that you can perform tasks like taking a screenshot or the other automation tasks you'd like to perform.

Also, within your Cloudflare worker, you need to set up something called a browser binding. Essentially, this is what gives a worker an authenticated endpoint to interact with a remote browser instance. This is an example of a worker script that uses the Cloudflare browser rendering API. What's happening here is that you import the Cloudflare Puppeteer library, and then you set up the instance by calling the puppeteer.launch function. Essentially, I'm launching Puppeteer, and I'm also navigating to a specific page in the browser. All of this is available on GitHub. Like I mentioned, we made a fork of Puppeteer. The specific library that we're using within Cloudflare workers is called the Cloudflare Puppeteer library, and we made it open source. And the team working on it always tries to maintain feature parity with the upstream project, which is the Puppeteer library from Google.

At the time of this talk, browser rendering is generally available. So if you're interested in checking it out, you can use it starting today to automate your tasks within the browser. We have also released additional features. For example, we have a feature called browser management, where you can reuse browser sessions. An interesting thing from my experience using it at the time: when I needed to perform two or more tasks within a specific worker, I found it difficult because of the limits that come with the browser rendering API. So we've made it easier for you to reuse a browser session. And I'm going to show you an example of how that works.
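The launch-and-navigate flow described above can be sketched as follows. To keep the block self-contained, the Puppeteer object and the browser binding are passed in as parameters and typed with minimal stand-in interfaces; in a real Worker they would come from `@cloudflare/puppeteer` and from the binding declared in your configuration. The function name and the binding name `MYBROWSER` are assumptions.

```typescript
// Minimal stand-ins for types from @cloudflare/puppeteer, declared here
// only so that the sketch is self-contained.
interface Page {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
}
interface Browser {
  newPage(): Promise<Page>;
  close(): Promise<void>;
}
interface Launcher {
  launch(binding: unknown): Promise<Browser>;
}

// In a real Worker you would `import puppeteer from "@cloudflare/puppeteer"`
// and pass it in together with the browser binding, e.g. env.MYBROWSER
// (an assumed name; use whatever your configuration declares).
async function getPageTitle(
  puppeteer: Launcher,
  binding: unknown,
  url: string,
): Promise<string> {
  const browser = await puppeteer.launch(binding);
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.title();
  } finally {
    // Always release the remote browser, even if navigation fails.
    await browser.close();
  }
}
```

Inside a fetch handler this might be called as `getPageTitle(puppeteer, env.MYBROWSER, "https://example.com")`.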

4. Using the Browser Rendering API

Short description:

We've added analytics and logs to track spawned browsers in the worker integration with the rendering API. Use cases for the browser rendering API include taking screenshots, generating PDFs, automating web applications, and gathering performance metrics. An example of using the API is taking a screenshot and storing it in Cloudflare R2 blob storage. The process involves launching a browser, opening a new tab, capturing the screenshot, and uploading it to R2.

We've also added analytics and logs so you can see the number of browsers that have been spawned during your worker's integration with the rendering API.

So what are some use cases? Why would you use this within your application? I already showed you a few, but I'll share a few more so that you're familiar with some good use cases for the browser rendering API. Obviously, I already mentioned taking a screenshot. If, for example, you are running a test and need a screenshot of a specific page, and you don't want to have to do that manually, that is one use case. You can use it to generate a PDF from a web page. You can use it to test web applications. And of course, you can also use it for gathering performance metrics.

I have a few minutes left, so I'll quickly show you some examples of how you would typically use the Workers Browser Rendering API within your application. The first example I'm going to show you is how to take a screenshot and store it in Cloudflare R2. R2 is a blob storage that we offer within Cloudflare that allows you to store things like images and so much more. So I'm going to take a screenshot of a page and store the image in R2. I already have a lot of this set up, so I'll quickly show you the code. I have my worker here, where I've already gone ahead and installed the Puppeteer library. I've also installed additional libraries that allow me to perform the tasks that I want to perform here. The things I'll call out: first, I'm launching the browser. So basically, I launch a browser and then open a new tab within the browser. Then, for the specific URL I'm going to pass, I take a screenshot, which is what I'm doing here. And after taking the screenshot, I upload the image to R2, which is what I do in this code block. So let me quickly test this out and show you.

So I'm going to run npm run dev. And once this is running, I will quickly turn something off and try this again. Once this is running, I will navigate to my URL here, and I also need to pass in the URL I want to capture as a parameter. For this specific example, I will be using the Hacker News website for my demo. So hackernews.com. And once I put that in, I should get a screenshot of the current Hacker News website. Obviously, this has been stored in R2, so what I'm pulling in is the public URL from R2. And you see the image generated here. So this is an example of what you can do with the browser rendering API.
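The demo code is not included in the transcript, but the steps described (read the `url` parameter, launch a browser, open a tab, take a screenshot, upload to R2) can be sketched like this. The interfaces are minimal stand-ins for the real `@cloudflare/puppeteer` and R2 binding types, and the function name, parameter names, and object-key scheme are assumptions.

```typescript
// Minimal stand-ins for @cloudflare/puppeteer types (self-contained sketch).
interface Page {
  goto(url: string): Promise<unknown>;
  screenshot(): Promise<Uint8Array>;
}
interface Browser {
  newPage(): Promise<Page>;
  close(): Promise<void>;
}
interface Launcher {
  launch(binding: unknown): Promise<Browser>;
}
// Stand-in for the R2 bucket binding; only put() is needed here.
interface Bucket {
  put(
    key: string,
    value: Uint8Array,
    options?: { httpMetadata?: { contentType?: string } },
  ): Promise<unknown>;
}

// Takes a screenshot of the page named in ?url=... and stores it in R2.
// Returns the object key that was written.
async function screenshotToR2(
  puppeteer: Launcher,
  browserBinding: unknown,
  bucket: Bucket,
  requestUrl: string,
): Promise<string> {
  const target = new URL(requestUrl).searchParams.get("url");
  if (!target) {
    throw new Error("Missing ?url= parameter");
  }
  const browser = await puppeteer.launch(browserBinding);
  try {
    const page = await browser.newPage(); // the "new tab"
    await page.goto(target);
    const image = await page.screenshot(); // PNG by default
    const key = `${encodeURIComponent(target)}.png`; // assumed key scheme
    await bucket.put(key, image, {
      httpMetadata: { contentType: "image/png" },
    });
    return key;
  } finally {
    await browser.close();
  }
}
```

A fetch handler could call this with `request.url` and then respond with the bucket's public URL for the returned key, which is what the demo appears to display.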

5. Generating PDFs with the Browser Rendering API

Short description:

To generate a PDF, create a new browser instance, navigate to the desired URL, wait for the page to load, wait for the DOM to load, and then create the PDF. Note that CSS may not load fully without a timeout.

Here's another example. I'll just quickly go through them so I don't waste a lot of time. The other thing you can do is generate a PDF. For this example, I am doing something similar, right, where I have created a new browser instance. I'm navigating to a specific URL on line one here. And for this specific URL, I'm using the JSNation schedule. So I want to quickly capture the schedule and turn that into a PDF, right? What I've done here is use await page.goto to navigate to the URL. I also had to pass waitUntil here. Why? Because the page will take a while to load, meaning that if you take a screenshot or make a PDF before the page has loaded, you might just end up with a blank screen. I'm also waiting for a specific selector within the page, so that the DOM has loaded and as much of the page's data as possible has been fetched before I make the PDF. And then on lines six to nine, I'm finally creating the PDF, and the result is what you have here. This is missing a lot of CSS. I am aware of that. A lot of CSS from the page is not showing. What I could do to fix that is set a timeout of maybe four or five seconds so that the CSS in the page is completely loaded before I create the PDF. But I didn't do that in this demo. That's also OK, because this page has tons of resources that would need to get downloaded.
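A rough sketch of the PDF flow just described: navigate with a `waitUntil` option, wait for a selector, then render. The stand-in interfaces mirror only the Puppeteer methods used, and the function name, the selector parameter, and the choice of `"networkidle0"` are assumptions, since the transcript does not say which value the demo passed.

```typescript
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

// Minimal stand-ins for @cloudflare/puppeteer types (self-contained sketch).
interface Page {
  goto(url: string, options?: { waitUntil?: WaitUntil }): Promise<unknown>;
  waitForSelector(selector: string): Promise<unknown>;
  pdf(options?: { printBackground?: boolean }): Promise<Uint8Array>;
}
interface Browser {
  newPage(): Promise<Page>;
  close(): Promise<void>;
}
interface Launcher {
  launch(binding: unknown): Promise<Browser>;
}

async function renderPdf(
  puppeteer: Launcher,
  binding: unknown,
  url: string,
  readySelector: string, // an element that only exists once content is rendered
): Promise<Uint8Array> {
  const browser = await puppeteer.launch(binding);
  try {
    const page = await browser.newPage();
    // "networkidle0" waits until the network has gone quiet, which gives
    // stylesheets and other resources a chance to finish loading.
    await page.goto(url, { waitUntil: "networkidle0" });
    await page.waitForSelector(readySelector);
    return await page.pdf({ printBackground: true });
  } finally {
    await browser.close();
  }
}
```

The bytes can then be returned from the Worker with a `content-type` of `application/pdf`.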

6. Extending Sessions and Performance Metrics

Short description:

Learn how to use Durable Objects to persist browser sessions, extend session time, and perform session management within the Workers browser rendering API. Explore taking performance metrics from applications and find more details in the documentation and blog posts. Thank you for attending!

Another demo I'd like to show is how you can use Durable Objects to persist browser sessions. So I mentioned that if you would want to perform more than one task within your worker, sometimes it might be difficult and it would also decrease performance. Well, we are now allowing you to reuse browser sessions by either using Durable Objects or by using the session management I mentioned earlier. So I'll show you a quick example of how to use Durable Objects. Mind you, you have a limit of two new browsers that you can open within an account and those will stay up for one minute. Meaning you have 60 seconds to do whatever task you want to do before the session is closed. But you can use Durable Objects to extend that a bit.

In this example, I'm using a Durable Object to create an alarm that extends my session for another 60 seconds so that I can do what I want to do. For example, I want to take a screenshot like I did in the first example, but I also want to take a screenshot at every possible screen dimension. That would run for more than 60 seconds, so I'm adding an additional 60 seconds to extend the browser session, and I'm doing that using Durable Objects. Secondly, one of the additional features you can use within Workers browser rendering is called session management, where you can reuse a session. Essentially, the difference between doing this and using it ordinarily is that you can now call puppeteer.connect. And instead of closing your browser session, you can just disconnect from it. Then, if you need to reuse the session for the next task you have, that's possible within your worker.

I'll skip over this last demo just because I don't have much time to go through it. But another thing you could do with the Workers Browser Rendering API is take performance metrics from an application. And this is an example of what I'm doing here with Hacker News. Those are different things that you could do with the browser rendering API. I know I've breezed through a lot of them, but I recommend that you check out the docs to learn more, see some examples, and try it out for yourself as well. If you have a use case that you would like to automate with the browser rendering API, I would advise that you check out the docs. We also have blog posts on the Cloudflare blog that go into a bit more detail about how all this works, so I recommend you look into those as well. Thank you so much for listening to my session. I hope you had a great time and also took something away from this. If you'd like to get access to the slides, you can scan the QR code on screen. Thank you. And have a wonderful rest of the conference.
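The connect-and-disconnect pattern for session reuse can be sketched as below. It assumes the Cloudflare Puppeteer fork's `sessions`, `connect`, and `launch` functions and uses minimal stand-in interfaces so the block is self-contained; the helper name `acquireBrowser` is made up for illustration.

```typescript
// Minimal stand-ins for @cloudflare/puppeteer types (self-contained sketch).
interface Browser {
  disconnect(): void | Promise<void>;
  close(): Promise<void>;
}
interface ActiveSession {
  sessionId: string;
  connectionId?: string; // set while some worker is connected to the session
}
interface SessionPuppeteer {
  sessions(binding: unknown): Promise<ActiveSession[]>;
  connect(binding: unknown, sessionId: string): Promise<Browser>;
  launch(binding: unknown): Promise<Browser>;
}

// Reuses an open session that nobody is connected to, or launches a new
// browser when no free session exists.
async function acquireBrowser(
  puppeteer: SessionPuppeteer,
  binding: unknown,
): Promise<Browser> {
  const sessions = await puppeteer.sessions(binding);
  const free = sessions.find((session) => !session.connectionId);
  if (free) {
    try {
      return await puppeteer.connect(binding, free.sessionId);
    } catch {
      // The session may have closed or been taken in the meantime;
      // fall through and launch a fresh browser.
    }
  }
  return puppeteer.launch(binding);
}
```

After finishing its work, the worker would call `browser.disconnect()` rather than `browser.close()`, so the session stays open for the next task.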

Gift Egwuenu
20 min
17 Jun, 2024
