Video Editing in the Browser

Video editing is a booming market, with influencers all the rage on Reels, TikTok, and YouTube. Did you know that browsers now have all the APIs to do video editing in the browser? In this talk I'm going to give you a primer on how video encoding works and how to make it work within the browser. Spoiler: it's not trivial!

This talk was presented at React Summit 2023.


FAQ

During the pandemic, Christopher Chedeau spent a lot of time doing video editing and considered becoming a full-time YouTuber.

A video codec is a technology that compresses and decompresses digital video. It uses keyframes and delta frames to efficiently reduce video file sizes. Popular video codecs include H.264 (also known as AVC), AV1, VP8, and VP9.

Demuxing is the process of reading a video file's binary data, extracting the encoded frames, and sending them in the correct order to a codec for decoding. This involves handling video containers like MP4, AVI, and MKV.

Image compression is a process that reduces the size of an image file by using techniques like run-length encoding, Fourier transforms, and Huffman encoding. Popular image compression formats include JPEG, WebP, and PNG.

Christopher Chedeau found that Final Cut Pro lacked modern AI advancements, such as automatic background removal and transcribing spoken words into text.

Christopher Chedeau explored WebCodecs for encoding and decoding, TensorFlow.js for background removal, and Whisper for transcribing speech to text.

Christopher Chedeau, also known as Vjeux on the Internet, is a software engineer who has contributed to the React community by co-creating React Native, Prettier, Excalidraw, and CSS in JS.

Christopher Chedeau used Final Cut Pro for video editing.

The main challenges include handling large file sizes, implementing efficient image and video compression, dealing with stateful APIs, and ensuring performance comparable to traditional software like Final Cut Pro.

Christopher Chedeau's call to action is for developers to create a simplified and robust API for browser-based video editing, akin to the jQuery of video editing.

Christopher Chedeau
23 min
06 Jun, 2023

Video Summary and Transcription
This Talk discusses the challenges of video editing in the browser and the limitations of existing tools. It explores image compression techniques, including Fourier transform and Huffman encoding, to reduce file sizes. The video codec and frame decoding process are explained, highlighting the importance of keyframes and delta frames. The performance bottleneck is identified as the codec, and the need for specialized hardware for efficient video editing is emphasized. The Talk concludes with a call to create a simplified API for video editing in the browser and the potential for AI-powered video editing.

1. Introduction to Video Editing in the Browser

Short description:

Hey, everyone. Today, I want to talk about video editing in the browser. I spent a lot of time doing video editing during the pandemic. However, I realized that the existing tools didn't have the AI advancements I needed. I wanted to remove the green screen and shadows, and cut based on spoken words. On the other hand, I saw exciting developments in JavaScript, such as WebCodecs, TensorFlow.js, and Whisper. This talk will explain why I couldn't fully achieve a good video editor powered by AI. Let's start with thinking about making a video.

Hey, everyone. My name is Christopher Chedeau, also known as Vjeux on the Internet. And I've done a few things for the React community. I co-created React Native, Prettier, Excalidraw, CSS in JS, but today I want to talk about something different. I want to talk about video editing in the browser.

So during the pandemic, I spent a lot of time doing video editing. I was even thinking maybe I should become a YouTuber full-time. But then I realized that with this number of views, I should probably keep my job as a software engineer for a bit longer.

So what does it mean to edit videos? I used a tool called Final Cut Pro, and I felt that it was built many, many years ago and didn't have all of the AI advancements that we've seen recently. For example, I bought a $20 green screen, and I need to pick the green color and the range in order to remove it. As you can see, there are some shadows behind me in the picture, and they weren't properly removed. Then, in order to cut, I want to know what I'm actually saying so I know which part to cut. But I only got the sound waves and not the actual words spoken.

On the other side, I was looking at the JavaScript and browser news, and I saw a lot of super exciting stuff happening. We can start doing encoding and decoding with WebCodecs. TensorFlow.js lets you remove the background from the video. And Whisper lets you turn what I'm saying into actual words. So we seemingly had all of the building blocks to build a really good video editor powered by AI, but unfortunately, I wasn't able to get all the way there. And this talk is going to be the story of why.

So usually when I walk into a new project like this, there are some things that I think are true and that I'm going to base everything I'm doing upon. But there were three things in this case that were not true. The first one is that time only travels forward. The second is that when you encode one frame, you get one frame back. And finally, that WASM is faster than JavaScript for video decoding. So if you want to know why these are not true, buckle up. We're getting to it. So let's start with thinking about making a video.

2. Video Editing API and Image Compression

Short description:

Unfortunately, the desired API for video editing in the browser is not possible due to the large file sizes involved. A single image of a thousand by thousand pixels can already be around four megabytes in size. With 60 frames per second, a one-second video would be around 200 megabytes. This is too big for current browsers and computers. However, image compression techniques have been developed to address this issue, which will be discussed in the following minutes.

Unfortunately I cannot be here in person today, so what I decided to do was to bring some of sunny California to Amsterdam. For this, I put a palm tree in all of the pictures. So in this case, we have React Summit in the background, then moving to the foreground, and the palm tree fading away. So what would be the API that I would expect to be able to do that? I initially wanted a load video kind of API that takes a file path and returns a list of images. Then I'm going to modify the images: remove the background, cut and paste, and a bunch of other stuff. And then a save video API that would take the file path and a list of images and actually save the video.
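The API wished for above can be written down as a small sketch. The names `Frame`, `LoadVideo`, `SaveVideo`, and `editVideo` are hypothetical and do not exist in any browser; the calls are also synchronous here for brevity, which is an assumption of this sketch. In-memory stand-ins replace the real decoder and encoder so that the shape of the API can be exercised.

```typescript
// Hypothetical API shape from the talk. None of these names exist in any browser.

/** One fully decoded frame as raw RGBA pixels. */
interface Frame {
  width: number;
  height: number;
  pixels: Uint8ClampedArray; // width * height * 4 bytes
}

type LoadVideo = (filePath: string) => Frame[];
type SaveVideo = (filePath: string, frames: Frame[]) => void;

/** Load every frame, apply a per-frame edit, then save the result. */
function editVideo(
  loadVideo: LoadVideo,
  saveVideo: SaveVideo,
  inputPath: string,
  outputPath: string,
  editFrame: (frame: Frame) => Frame,
): void {
  const frames = loadVideo(inputPath);
  saveVideo(outputPath, frames.map((frame) => editFrame(frame)));
}

// In-memory stand-ins so the sketch runs without a real decoder or encoder.
const fakeDisk = new Map<string, Frame[]>();
const loadFake: LoadVideo = (path) => fakeDisk.get(path) ?? [];
const saveFake: SaveVideo = (path, frames) => {
  fakeDisk.set(path, frames);
};

// Two 1x1 frames: one red pixel, one green pixel, both fully opaque.
fakeDisk.set("in.mp4", [
  { width: 1, height: 1, pixels: new Uint8ClampedArray([255, 0, 0, 255]) },
  { width: 1, height: 1, pixels: new Uint8ClampedArray([0, 255, 0, 255]) },
]);

// Toy "edit": copy the frame and make every pixel fully transparent (alpha = 0).
const clearAlpha = (frame: Frame): Frame => {
  const pixels = new Uint8ClampedArray(frame.pixels);
  for (let i = 3; i < pixels.length; i += 4) pixels[i] = 0;
  return { ...frame, pixels };
};

editVideo(loadFake, saveFake, "in.mp4", "out.mp4", clearAlpha);
```

Note that this shape requires every decoded frame of the video to be held in memory at the same time.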

So unfortunately, this API cannot exist. Let's see why. Let's look at one image of this whole video. Not too big, not too small: a thousand by thousand image. How large is it actually to represent this? It's going to be one thousand by one thousand pixels, so about one megabyte per color channel. And then there's red, green and blue, so we are at three to four megabytes in size, depending on whether there's an alpha channel. And this is just for one image. Now, if you want 60 fps, for one second, you're going to be at around 200 megabytes for every single second. This talk right now is around 20 minutes, so this is going to be big. And this is actually going to be too big for the browser or any computer right now.

So what do we do? Fortunately, a lot of very smart people have worked on this for years. And what they built is a shrinking machine. Well, not exactly. What people have been doing is image compression. So I'm going to talk for the next few minutes about different types of image compression. And not because I find it interesting, which I do, but because they actually have a big influence on the actual API used for video encoding. So let's see the main ideas around image compression.
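The size estimate above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch, not part of any browser API: the function names are made up for illustration, and it assumes 4 bytes per pixel (red, green, blue, plus alpha), which is why it lands slightly above the rounded figures quoted in the talk.

```typescript
// Back-of-the-envelope size of uncompressed video, following the numbers in the talk.
// All names are illustrative; nothing here is a browser API.

const BYTES_PER_MEGABYTE = 1_000_000;

/** Size in bytes of one uncompressed frame. */
function rawFrameBytes(width: number, height: number, bytesPerPixel: number): number {
  return width * height * bytesPerPixel;
}

/** Size in bytes of uncompressed video of the given duration. */
function rawVideoBytes(bytesPerFrame: number, fps: number, seconds: number): number {
  return bytesPerFrame * fps * seconds;
}

// Assumption: 4 bytes per pixel (RGBA). With plain RGB it would be 3.
const frameBytes = rawFrameBytes(1000, 1000, 4);
const oneSecondBytes = rawVideoBytes(frameBytes, 60, 1);
const talkBytes = rawVideoBytes(frameBytes, 60, 20 * 60); // a 20-minute talk

console.log(frameBytes / BYTES_PER_MEGABYTE);     // 4 (MB per frame)
console.log(oneSecondBytes / BYTES_PER_MEGABYTE); // 240 (MB per second)
console.log(talkBytes / BYTES_PER_MEGABYTE);      // 288000 (MB, i.e. 288 GB)
```

At these sizes, a list of decoded frames for even a short clip would not fit in memory, which is the motivation for the compression techniques that follow.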
