Video Summary and Transcription
The talk covers the use of HTML5 video APIs and Canvas for creating interactive video experiences. It demonstrates how to manipulate video frames with Canvas to add effects like grayscale and text overlays. The video also explores real-time object detection in live streams using TensorFlow and HLS technology for adaptive bitrate streaming. Practical applications include enhancing user interactivity on web platforms by rendering video frames to Canvas elements, which avoids double streaming and sync issues. The talk shows how these technologies are applied on Mux's marketing website, making it more engaging with interactive elements. The speaker emphasizes the potential of these tools for real-time video manipulation and interactive video experiences directly in the browser.
1. Introduction to Canvas and HTML5 Video APIs
Hello, everyone, at React Summit. Today, we're going to talk about the Canvas and HTML5 video APIs and some cool stuff you can do with them. I'm Dylan Jhaveri from Mux, where we provide Video for Developers. We focus on creating easy-to-use APIs for video. If you're interested, let's chat.
Hello, everyone, at React Summit. I'm very excited to be talking to you here today. We're going to be talking about the Canvas and HTML5 video APIs, and some cool stuff that we found that you can do with them.
So quick intro, I'm Dylan Jhaveri, I work at Mux. If you have not heard of Mux, Mux is Video for Developers. Maybe you know of Stripe, Stripe is payments for developers, or you know Twilio, which is phone calls and text messages for developers. We like to be like those companies, where we're built first with developers in mind and try to make great, easy-to-use APIs, but we do this for video.
I'm not going to be talking too much more about Mux today, but if you are interested, come talk to me. I'd love to chat with you.
2. Introduction to React App and Player Component
Let's start with a simple demo of a React app using the player component and canvas. The player component is a video element that uses HLS technology for video streaming.
Cool, so now to jump into some code. So I have this CodeSandbox set up. CodeSandbox is a great tool, by the way. It's become one of my favorite pieces of software. I think there's some CodeSandbox folks here at this conference, so shout-out to you all. I love this product. And I'll be sharing this after so you can fork it, play with the code, do things yourself.
And let's just start out with a really simple demo. So this is a very straightforward React app. We have a few different routes, the five different examples I'm going to show, and we're using React Router and React DOM. And let's start with the first one, a simple demo. So right here we have simple.js. This is the component that we're rendering. We have this player component and then we have this canvas. And right now you can't see the canvas on the page, but that's what we'll be manipulating and doing some fun stuff with as we go along.
So real quickly, let's just take a look at this player component. So this player component is... it's really just a video element. But if you're familiar with video... how many of y'all have done video on the internet? Video streaming, video on demand or live streaming, anything like that. You might have used the video element before, and maybe you've used an MP4 file, and that can kind of work. But when you really want to do video streaming properly, what you need to do is use something like HLS. HLS is a technology that lets you download video in segments, at different bitrates and quality levels, according to the user's bandwidth. So that's something Mux does for you. We're not going to get too deep into that, but that's what we're using here on this video player.
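The talk doesn't show the player internals at this point, but roughly, "a video element with some extra JavaScript for HLS capabilities" can be sketched like this, assuming the hls.js library (an assumption; the talk doesn't say which HLS library the Mux player component uses):

```jsx
// Player.js — a minimal sketch, not the actual Mux player component.
import React, { useEffect, useRef } from "react";
import Hls from "hls.js";

export default function Player({ src, onPlay }) {
  const videoRef = useRef(null);

  useEffect(() => {
    const video = videoRef.current;
    if (video.canPlayType("application/vnd.apple.mpegurl")) {
      // Safari plays HLS natively, so just point the element at the playlist.
      video.src = src;
    } else if (Hls.isSupported()) {
      // Everywhere else, hls.js downloads the segments at the right bitrate
      // and feeds them to the video element via Media Source Extensions.
      const hls = new Hls();
      hls.loadSource(src);
      hls.attachMedia(video);
      return () => hls.destroy();
    }
  }, [src]);

  // The play event fires when playback begins; hand the video element
  // to the parent so it can copy frames onto a canvas.
  return (
    <video ref={videoRef} controls onPlay={() => onPlay(videoRef.current)} />
  );
}
```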
3. Exploring HTML5 Video Element and Canvas
So this is the HTML5 video element with extra JavaScript for HLS capabilities. When the play event fires, the onPlayCallback is called. The video is duplicated on a canvas element below. The code uses the video element and a canvas context to manipulate and draw images onto the canvas. The drawImage function copies each frame from the video element to the canvas. Let's take it one step further and look at the filter example.
So this is... It's really just the HTML5 video element. And then we're attaching some extra JavaScript to give it some HLS capabilities. And then when the play event fires, that play event is when the playback begins on the video, and we're gonna call this onPlayCallback.
So let's jump back into the component that's rendering this page. Zoom in a little bit here. Make sure you can see that. So right here, we have the player, onPlayCallback. And when that fires, see what happens. What we see is this video is playing in the video element. And then it's being duplicated on this canvas element right below.
Let's jump into some of this code. So when onPlay is called, we grab the video element and we create this context, this context ref. What this is, is sort of a handle onto the canvas element. We can then call functions on that context that allow us to manipulate that canvas element and change how it's displayed, and that's our hook into manipulating the actual canvas itself. So onPlay, we call requestAnimationFrame, which calls updateCanvas. And what that's going to do is just call this one-liner, drawImage; we pass that video element into it, and this tells the canvas to draw that image onto the canvas. These are the coordinates where to start drawing, and these are the dimensions to draw. And this is actually called recursively: every time this runs, we call requestAnimationFrame again, and the callback calls updateCanvas again. So you can see what's happening: we're basically copying that video element down onto the canvas right below it. So that's how that works. Quick recap of what we did there: video element, copy each frame, draw them onto the canvas. Pretty simple, right?
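Putting that together, here's a minimal sketch of the component and the recursive draw loop described above (the component names and the placeholder URL are illustrative, not the exact sandbox code):

```jsx
// Simple.js — a sketch of the "copy each frame to a canvas" demo.
import React, { useRef } from "react";
import Player from "./Player";

export default function Simple() {
  const canvasRef = useRef(null);
  const ctxRef = useRef(null);

  const onPlay = (video) => {
    // A handle onto the canvas: the 2D drawing context.
    ctxRef.current = canvasRef.current.getContext("2d");

    const updateCanvas = () => {
      // Copy the current video frame onto the canvas:
      // (0, 0) is where to start drawing, then the width and height.
      ctxRef.current.drawImage(
        video,
        0,
        0,
        canvasRef.current.width,
        canvasRef.current.height
      );
      // Recurse: schedule the next copy on the next animation frame.
      requestAnimationFrame(updateCanvas);
    };

    requestAnimationFrame(updateCanvas);
  };

  return (
    <>
      {/* Placeholder HLS playlist URL, not a real stream. */}
      <Player src="https://example.com/video.m3u8" onPlay={onPlay} />
      <canvas ref={canvasRef} width={640} height={360} />
    </>
  );
}
```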
So now let's take this one step further. Let's go to this filter example. So what the filter does... let's push play. Okay, same kind of thing, but you can see something else is going on here. What we're doing is the same kind of callback, updateCanvas.
4. Manipulating Canvas and Video Frames
We can manipulate and work with raw image data from the canvas. By iterating through the image data and adjusting color values, we can achieve effects like grayscale. Additionally, we can add text on top of the canvas, allowing for real-time modifications. This opens up possibilities for interactive video manipulation using browser APIs. Let's explore more examples, including grabbing individual frames from a video and manipulating them. The video we're using is Big Buck Bunny, a popular example in the video streaming community.
And what we do is we draw that image onto the canvas, then we extract the image data off the canvas. And now we have raw image data that we can actually manipulate and work with. We're going to iterate through that image data and mess with the color values: if we average out the red, green, and blue values, that's going to give us this grayscale effect. So we're actually manipulating the image from the video one frame at a time, and then putting it back, redrawing it onto the canvas. And you can see it has that effect. And you can see this canvas is always staying synced with the frame of video that the video element is rendering.
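A minimal sketch of that grayscale step, assuming it runs inside the same kind of updateCanvas loop as above (the helper name is illustrative):

```js
// Draw a frame, convert it to grayscale, and put it back on the canvas.
function drawGrayscaleFrame(ctx, video, width, height) {
  // 1. Copy the current video frame onto the canvas.
  ctx.drawImage(video, 0, 0, width, height);

  // 2. Pull the raw pixels back off the canvas: a flat RGBA array.
  const frame = ctx.getImageData(0, 0, width, height);
  const data = frame.data;

  // 3. Average red, green, and blue for each pixel to get a gray value,
  //    then write it back into all three channels (alpha untouched).
  for (let i = 0; i < data.length; i += 4) {
    const gray = (data[i] + data[i + 1] + data[i + 2]) / 3;
    data[i] = data[i + 1] = data[i + 2] = gray;
  }

  // 4. Redraw the manipulated frame onto the canvas.
  ctx.putImageData(frame, 0, 0);
}
```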
Okay. Pretty cool, right? So let's look at the steps we did there, where we took this a little bit further. Instead of just drawing each frame onto the canvas, after we do that, we're extracting the frame, manipulating the colors into grayscale, and then redrawing it back onto the canvas. Okay. So now we have a few more examples. Let's see what else we can do. It's going to get better and better each time.

Layla, this is my coworker Phil's dog. And let's look at this example. So now in the updateCanvas function, we draw the image, and then we're just going to call this fillText method on the canvas context. So what we're doing there is we're actually just adding text on top of the canvas. So we're rendering the video image onto the canvas, and then just adding text on top. Now you can imagine this could get pretty useful, right? If we have a video playing in this video element and we draw it onto the canvas, then we can do all these cool things like add text in real time, frame by frame, on the client side in the browser, all with these browser APIs. So that's where we're adding her name, as sketched below. Let's see what else we can do.
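The text overlay is just one extra call after the frame is drawn; a minimal sketch (the label, font, and position are illustrative):

```js
// Draw a frame, then paint a text label on top of it.
function drawFrameWithLabel(ctx, video, width, height) {
  ctx.drawImage(video, 0, 0, width, height);

  // fillText paints on top of whatever is already on the canvas,
  // so the label rides along with every copied frame.
  ctx.font = "48px sans-serif";
  ctx.fillStyle = "white";
  ctx.fillText("Layla", 20, height - 20);
}
```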
Okay. So now let's get into this one. This is called classify. So what we've looked at is that we can grab individual frames from the video in real time, draw them onto a canvas, and before we draw them onto the canvas, we can manipulate them, right? So what else can we do? When we have a raw frame of a video, let's think about what else we can do. So this video, if you don't recognize it, is Big Buck Bunny. It's sort of the canonical hello-world video example in the video streaming community. I've watched this video way too many times, and it makes a good example.
5. Real-time Object Detection and Use Cases
In this classify demo, we run machine learning object detection on each video frame, drawing rectangles around detected objects. We use the TensorFlow Coco-SSD model to detect objects in real time. By extracting image data from the canvas, we can map predictions and draw boxes with labels on the video. Although not perfect for animated content, it can detect real-life objects accurately. This opens up possibilities for real-time object detection in live video streams. Let's explore more use cases.
So I'm gonna use this for the purposes of this classify demo. And let's just push play here. What's happening is that on every frame of the video, we're running some machine learning object detection functionality on the image frame, and then we're drawing a rectangle around the detected object onto the frame. And right now it thinks this is a person; go a little further, now it thinks it's a bird. So we're actually detecting, frame by frame, what's going on with the objects in this video.
So let's take a look at the code. We draw the image onto the context, we extract the image data. And this is the same image data where we were manipulating the colors, but we have this extra call here, which is model.detect, and we pass in that image data. So model is something that comes from this TensorFlow Coco-SSD model, which is a TensorFlow model that will do object detection on images. It's made to work with images. And when we pass in this image data that we've extracted from the canvas, it's going to run the object detection and send us back an array of what they call predictions, okay? So now once we have an array of predictions, we can pass those into this outline stuff function that's going to map those predictions. Each one has the X, Y coordinates and the width and height of a bounding box. And then we can actually just draw those boxes, with the labels, directly onto that canvas element that we're already using to render the video.

So you can see it thinks it's a bird, still thinks it's a bird. And dog, we saw there was a dog there for a second; here it thinks that is a sports ball. So, you know, it's not the most accurate object detection for this animated content. Now it's a sheep, it kind of looks like a sheep, but we're actually able to do some pretty cool stuff. And remember, this is happening in real time. A lot of times when you're doing image detection on a video, you would do that out of band, on a server, once the video is finalized. But imagine this was a live stream, right? If we're dealing with a live stream of video, we'd be able to actually run this on the client and detect objects in real time. And, you know, the sky's the limit there, and we can do all kinds of things with the detection that we're doing.

Let's look at one more example of the classification. Let's pull up Layla, Phil's dog, again, and you can see here TensorFlow on a real live video. It gets the type of dog; it's actually pretty good at detecting real-life things. Animated things, animated giant bunnies, maybe not so much, but a dog it can get. So, to really quickly review what we did there: the key part to pay attention to is that once we get images into a canvas, we can actually extract that raw image data. And then this red circle, where we're doing live object detection, you could replace that with anything, right? Manipulate the colors, add text overlays. And then we can redraw those back onto the canvas with all the canvas APIs that are available. So that's what we did there, and there's a sketch of that loop below. Now, let's take a quick look at some real-world use cases of this.
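A minimal sketch of that detection loop using the @tensorflow-models/coco-ssd package, where each prediction carries a bounding box, a class label, and a confidence score (the drawing style and function structure are illustrative, not the exact demo code):

```js
// classify sketch: run Coco-SSD on each frame and draw boxes + labels.
import "@tensorflow/tfjs";
import * as cocoSsd from "@tensorflow-models/coco-ssd";

export async function startClassifyLoop(ctx, video, width, height) {
  const model = await cocoSsd.load();

  const updateCanvas = async () => {
    ctx.drawImage(video, 0, 0, width, height);

    // The same raw image data we used for the color manipulation...
    const imageData = ctx.getImageData(0, 0, width, height);
    // ...now handed to the model, which returns an array of predictions,
    // each shaped like { bbox: [x, y, width, height], class, score }.
    const predictions = await model.detect(imageData);

    // Draw each bounding box and its label back onto the canvas.
    for (const { bbox: [x, y, w, h], class: label, score } of predictions) {
      ctx.strokeStyle = "red";
      ctx.lineWidth = 2;
      ctx.strokeRect(x, y, w, h);
      ctx.fillStyle = "red";
      ctx.font = "16px sans-serif";
      ctx.fillText(`${label} ${Math.round(score * 100)}%`, x, y > 16 ? y - 4 : y + 16);
    }

    requestAnimationFrame(updateCanvas);
  };

  requestAnimationFrame(updateCanvas);
}
```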
6. Enhancing Marketing Website with Interactive Video
We recently did a design refresh on our marketing website, adding an API demo in the top hero section. Previously, we had a single video, but this time we wanted to make it more interactive. By using a strategy of copying frames from a video element and rendering them to canvas elements, we achieved the desired effects. This approach eliminated the need for double streaming, avoided playback sync issues, and allowed for a more interactive experience. If you're interested in video, let's chat!
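A minimal sketch of the frame-copying strategy the summary describes, elaborated in the transcript below (the function and element names are illustrative, not Mux's actual marketing-site code):

```js
// One hidden video element is the only network stream; every frame is
// copied to each canvas, so the canvases can never drift out of sync.
function mirrorVideoToCanvases(video, canvases) {
  const contexts = canvases.map((canvas) => canvas.getContext("2d"));

  const update = () => {
    contexts.forEach((ctx, i) => {
      ctx.drawImage(video, 0, 0, canvases[i].width, canvases[i].height);
    });
    requestAnimationFrame(update);
  };
  requestAnimationFrame(update);
}

// e.g. the "browser" and the "device" in the hero section are each a canvas:
// mirrorVideoToCanvases(videoElement, [browserCanvas, deviceCanvas]);
```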
We at Mux actually used this on our marketing website recently. So we recently did a design refresh on our marketing website, and we have this API demo in this top hero section, and you can see what's going on here. Previously on our marketing site, before this iteration, we had a similar sort of API demo, but it was all one video. So you can imagine if all of this here was just one video, with this device and the browser popping out; that worked pretty well, but we wanted to make it better this time.
What we were thinking is that you'll notice that as I'm hovering over this, it pops out. If I hover over the browser, the browser pops out and I can copy text here. I can interact with it. That's what we want to do. Let's say a developer comes here and wants to copy this text; we just wanted to make it more interactive. We also have these bleeding colors in the back, and we want those to bleed outside the bounds of this element, into the top header and into the bottom. And if this was just a static video, we wouldn't be able to get that effect.
So, the way we were able to pull this off: I have a Storybook example here. The way we were actually able to do this is through the strategy that I described. So let's inspect these elements. Okay, we inspect these elements, and you can see that this right here is a canvas. Let me replay this. And then we see that this right here is another canvas. And then if we look further down here in the DOM, we can actually see that there's a video element. So this is the video element that is streaming the video, and we're copying the frames of that video and rendering them to these two canvas elements in real time.

So what are the benefits of that strategy? Alternatively, we could pull off the same design by having this browser be one video element and this device be another video element. And that would work okay, except the downside is, number one, we're double streaming the same video, which is going to double the bandwidth for the user. More video data being downloaded seems unnecessary and repetitive. Number two is that the two videos could get out of sync, right? Like, if one video buffers because you're on a slow connection and the other one isn't buffering yet, you get this playback sync issue, so we'd probably have to write some JavaScript that keeps the playheads aligned and in sync, and that seems kind of buggy, not a great solution. So what we did is apply the strategy of taking this video element, grabbing the frames from that video element, and rendering them to the canvases. That way, these two canvases will always stay in sync and we're only downloading the video once. It works well.

And let's play this one more time. That's the solution we came to. So you'll notice now I can hover over this, hover over this, and the devices pop out and it's more interactive. I can copy code. And now this video, this is a happy birthday video for React Summit. It's a video I found online of kids crying when they blow out their birthday candles, and it's kind of funny. So happy birthday, React Summit. I'm excited to be here, excited to talk with you all. And if you have anything to talk about video, I'd love to chat. If you're adding video to your product, building video, doing cool things, please chat with me, and thanks for having me. Find me on Twitter, @dylanjha, and that is the end.