Going Live from a Browser...with Another Browser


There are other ways to proxy live video from a browser to an RTMP endpoint, but what if we wanted to interact with that stream first? And not just interact writing obtuse ffmpeg filters, but just some good ol' HTML and CSS? Let's do that! We'll talk about how you can allow your streamers to go live directly from their browser using headless Chrome and ffmpeg.

This talk was presented at JSNation Live 2020.

FAQ

Mux offers online video infrastructure services for developers, including an API for live broadcasting.

Live chat involves direct communication between browsers with low latency, suitable for synchronous talking. Live broadcast is a one-to-many communication where one input stream is broadcasted to many viewers, typically with higher latency and without the expectation of direct communication back to the streamer.

GetUserMedia is part of the WebRTC API that allows access to the device's camera and microphone. Mux can broadcast content captured via GetUserMedia to a server using WebSockets, which is then encoded into RTMP for live broadcasting.

Live broadcasts are primarily powered by RTMP for ingesting live content and HLS for broadcasting content, allowing scalability and handling of video files via HTTP requests.

WebRTC is a technology that enables direct browser-to-browser communications with low latency. Mux uses WebRTC for live chats and can implement server-side WebRTC solutions for more complex broadcasting scenarios.

No, it's currently not feasible to convert WebRTC into RTMP directly in the browser due to limitations in accessing the necessary network stack through browser technologies.

Using Chrome in headless mode for broadcasting requires running one Chrome instance per stream, which can be resource-intensive and complex to orchestrate. It's not the most common approach due to these challenges.

Yes, Mux enables users to go live directly from the browser without the need to download third-party software like OBS.

Matt McClure
8 min
18 Jun, 2021

Video Summary and Transcription
This video explains how to go live from one browser to another using WebRTC and RTMP. Live chat involves low latency communication between browsers using WebRTC, while live broadcast uses RTMP and HLS for one-to-many streaming. The video discusses the limitations of converting WebRTC to RTMP directly in the browser and suggests using GetUserMedia to capture media and broadcast via WebSockets to a server, which then encodes it into RTMP. Another approach involves using Chrome in headless mode and the MediaRecorder API, although this method is resource-intensive. A more efficient method is using a full Docker container to capture and stream the screen with FFmpeg, avoiding the MediaRecorder API. This method offers more reliable streaming and flexibility in manipulating the stream. For more details, you can refer to Nick's talk from All Things RTC.

1. Introduction to Live Broadcast and WebRTC

Short description:

Hey everybody, my name is Matthew McClure. I'm one of the cofounders of Mux, and we do online video infrastructure for developers. Today we're talking about going live from the browser via another browser. Live chat and live broadcast are different in terms of communication and technology. Live chat uses WebRTC for low latency synchronous communication between browsers, while live broadcast uses RTMP and HLS for one-to-many streaming. We can't turn WebRTC into RTMP in the browser, but we can use a server-side WebRTC implementation. However, this approach may not be the easiest or most flexible for video processing on the server side.

Hey everybody, my name is Matthew McClure. I'm one of the cofounders of Mux, and we do online video infrastructure for developers. So one of our features is an API to live broadcast, and that's where we get a ton of questions from developers on how to help their customers go live. They're in a world where they want to just build an application in the browser, let the user just log in and immediately go live without needing to download third-party software like OBS or something like that to be able to do it. Totally makes sense.

But today we're not talking about just going live from the browser, we're talking about going live from the browser via another browser. This is also probably a bad idea for most use cases, but when you need this kind of thing, this can be a really great path forward. So we covered something similar, or another path to do this at React Summit. So we're going to quickly recap some of these high-level concepts, just to get on the same page. But if you want more information, you might want to check out that talk as well. You can just find it on YouTube.

So a common misconception is that live broadcast is the same as live chat. With live chat, you have two browsers, or a few browsers, that can communicate directly with each other at sub-500 milliseconds of latency so they can talk synchronously. Live broadcast, on the other hand, is one-to-many. So you have one input stream out to many viewers, and that can be 50 to a million viewers. Latency can be 10 seconds plus, and that's fine, because there's not really an expectation to be able to communicate back to that streamer. So because of those constraints, the same technology really doesn't work very well for both of them. Live chat is typically powered by browser technologies like WebRTC or proprietary implementations that allow you to communicate directly between the streamers so that you have as low a latency as possible. Live broadcast, on the other hand, is powered by technologies like RTMP and HLS. RTMP is kind of an old Flash implementation that has become the de facto standard for being able to ingest live content directly into a server, which then will transcode that content and broadcast it out via HLS. We won't get into the specifics of HLS, but for our purposes, it allows you to download video via GET requests in the browser, and you can just scale it as you would any other file transfer, which is really nice.
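To make the "video via GET requests" point concrete: an HLS player fetches a plain-text playlist over HTTP and then fetches each listed segment the same way. The following is a minimal sketch of reading such a playlist, not a full parser. The playlist text and segment names are made up for illustration, and only the `#EXTINF` tag is handled.

```typescript
// One media segment listed in an HLS media playlist.
interface Segment {
  durationSeconds: number;
  uri: string;
}

// Extract (duration, URI) pairs from playlist text.
// Real playlists carry many more tags; this sketch ignores them.
function parseMediaPlaylist(text: string): Segment[] {
  const segments: Segment[] = [];
  let pendingDuration: number | null = null;

  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#EXTINF:")) {
      // Format: #EXTINF:<duration>,[<title>]
      pendingDuration = parseFloat(line.slice("#EXTINF:".length));
    } else if (line !== "" && !line.startsWith("#") && pendingDuration !== null) {
      // A non-tag line after #EXTINF is the segment URI.
      segments.push({ durationSeconds: pendingDuration, uri: line });
      pendingDuration = null;
    }
  }
  return segments;
}

// Hypothetical playlist, as a player might receive it from a GET request.
const examplePlaylist = [
  "#EXTM3U",
  "#EXT-X-VERSION:3",
  "#EXT-X-TARGETDURATION:6",
  "#EXT-X-MEDIA-SEQUENCE:120",
  "#EXTINF:6.0,",
  "segment120.ts",
  "#EXTINF:6.0,",
  "segment121.ts",
].join("\n");

const parsedSegments = parseMediaPlaylist(examplePlaylist);
```

Each `uri` in the result is just another file to download, which is why this side of the pipeline scales like any other static file delivery.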

Okay, so let's just take WebRTC and then turn that into RTMP in the browser, is probably what you're thinking. Unfortunately, no, we can't get quite low enough in the network stack in a browser to be able to do it, so even in our current modern world of WASM and all these other goodies, we just can't quite get there. But let's talk about what technologies we can access. So whatever we're talking about here, it's all involving a server in some way, but the first way is we can take WebRTC and then use a server-side WebRTC implementation. If you'd asked me a year ago, I'd have said this is crazy, but this has gotten a lot better. Projects like Pion have come a really long way. It's a pure Go implementation. So this actually isn't that crazy anymore, but it's certainly not the easiest way that you can get this done. And if you want to be able to do anything interesting with the video on the server side via client-side technologies, this would kind of leave you out in the cold a little bit.

2. Running WebRTC on a Server via Chrome

Short description:

One option is to run WebRTC on a server via Chrome. Another is to use getUserMedia to capture the microphone and camera, broadcast it to a server via WebSockets, and encode it into RTMP. The headless Chrome approach means running one Chrome per input or output stream, which can be resource-intensive. However, open source projects like Jitsi use this method for one-to-many or few-to-many broadcasts. A further approach uses the chrome.tabCapture API, which shares its internals with the MediaRecorder API. This allows running Chrome in headless mode, which makes multi-tenant access easier and keeps browser features available, but it still relies on the MediaRecorder API.

So, to fix that last thing, what if we just took WebRTC and ran it on a server via Chrome? It can be done, but the problem is now you're running Chrome. Or we can take getUserMedia, which is just one of the WebRTC APIs and lets you get the microphone and camera. We'll broadcast that to a server via WebSockets, and then encode it into RTMP.
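The browser half of that getUserMedia path can be sketched roughly as follows. This is an illustration under assumptions, not the speaker's implementation: `RecorderLike` and `SocketLike` are simplified stand-in types defined here so the sketch runs outside a browser, and `relayRecorderToSocket` is a hypothetical helper name. In a real page the recorder would be a `MediaRecorder` wrapping the stream returned by `navigator.mediaDevices.getUserMedia`, and the socket would be a `WebSocket`.

```typescript
// Simplified stand-ins for the browser's MediaRecorder and WebSocket,
// declared here so the sketch does not depend on DOM typings.
interface ChunkEvent {
  data: { size: number };
}

interface RecorderLike {
  ondataavailable: ((event: ChunkEvent) => void) | null;
  start(timesliceMs?: number): void;
}

interface SocketLike {
  readyState: number;
  send(data: unknown): void;
}

const SOCKET_OPEN = 1; // same numeric value as WebSocket.OPEN

// Forward every non-empty recorded chunk to the server while the socket is open.
function relayRecorderToSocket(
  recorder: RecorderLike,
  socket: SocketLike,
  timesliceMs: number = 1000,
): void {
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0 && socket.readyState === SOCKET_OPEN) {
      socket.send(event.data);
    }
  };
  // A timeslice makes the recorder emit a chunk roughly every timesliceMs.
  recorder.start(timesliceMs);
}
```

The one-second default timeslice is an arbitrary choice for the sketch. Smaller values send chunks more often at the cost of more messages.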

So, you might be thinking, how does that work? Let's go back to this headless Chrome thing. If you want more information on the getUserMedia approach, you can go watch the other talk I mentioned. So, WebRTC to a server-side WebRTC via headless Chrome. Kind of cool. You can just have a chat, one-to-one or few-to-few, have headless Chrome join that chat, and broadcast that via RTMP. Really interesting. You want to hide that Chrome in the client of the other chatters, but that Chrome can then lay out the chat interface how it wants, add overlays, anything like that, right there.
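Because the server-side Chrome is rendering an ordinary web page, the broadcast layout can be plain CSS. As a small illustration, here is a sketch that picks a CSS grid for a given number of chat participants. The helper names `gridFor` and `gridCss` are hypothetical, and the near-square grid is just one possible layout policy.

```typescript
interface GridLayout {
  columns: number;
  rows: number;
}

// Choose a near-square grid large enough to hold every participant tile.
function gridFor(participantCount: number): GridLayout {
  if (participantCount < 1) {
    return { columns: 1, rows: 1 };
  }
  const columns = Math.ceil(Math.sqrt(participantCount));
  const rows = Math.ceil(participantCount / columns);
  return { columns, rows };
}

// Build an inline style string for the container that holds the video tiles.
function gridCss(participantCount: number): string {
  const { columns, rows } = gridFor(participantCount);
  return [
    "display: grid",
    `grid-template-columns: repeat(${columns}, 1fr)`,
    `grid-template-rows: repeat(${rows}, 1fr)`,
  ].join("; ");
}
```

The page would apply this style to the element containing the participants' video elements, and overlays are simply more elements positioned on top.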

So what about these downsides? You have to run one Chrome per input stream or per output stream. And so you have all the orchestration that comes with that. So if you use Chrome as your normal browser, you might notice it's resource-intensive. That also applies on the server. The bigger issue, though, is it's not the most beaten path. A lot of people are doing this. They're just not talking about it. The exception is open source projects like Jitsi, which if you're not familiar is like an open source Zoom competitor. That's how they do one-to-many broadcast, or a few-to-many broadcast.

So there are a few paths to get this done, which come with a few tradeoffs. One is to do this getUserMedia-style approach and then broadcast that to a server via WebSockets. You might be thinking, wait, why are we talking about this again? It's not actually getUserMedia. Now we're going to use chrome.tabCapture, but it uses a very similar API under the hood. It's the same internals as the MediaRecorder API, which is what we would use in that implementation. So here we take WebRTC, the same process where we have it in the browser: it joins, calls the tabCapture API, and broadcasts that via WebSocket to a server that encodes it into RTMP and sends it on to the rest of the workflow. Those can be on the same server, but that's kind of the high level. The pros are that you can actually run Chrome in headless mode, which means you get much easier multi-tenant access, you get all these browser features, and we can use the fancy WebSocket workflow. The downside is it still uses the MediaRecorder API, which is kind of a disaster.
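On the server end of that WebSocket, one common way to do the encoding step is to pipe the received chunks into an ffmpeg process that pushes to the RTMP endpoint. The sketch below only builds the argument list. It assumes an ffmpeg build with libx264, the encoder settings are illustrative rather than recommendations from the talk, and the RTMP URL is a placeholder.

```typescript
// Build ffmpeg arguments that read recorded media from stdin and publish to RTMP.
function buildFfmpegArgs(rtmpUrl: string): string[] {
  return [
    "-i", "-",              // read the incoming stream from stdin
    "-c:v", "libx264",      // encode video as H.264
    "-preset", "veryfast",  // favor encoding speed for live use
    "-tune", "zerolatency",
    "-c:a", "aac",          // encode audio as AAC
    "-ar", "44100",
    "-b:a", "128k",
    "-f", "flv",            // RTMP carries FLV-packaged media
    rtmpUrl,
  ];
}

// Placeholder endpoint and stream key, for illustration only.
const exampleFfmpegArgs = buildFfmpegArgs("rtmp://example.invalid/live/STREAM_KEY");
```

A server would spawn `ffmpeg` with these arguments and write each WebSocket message to the process's stdin, typically one ffmpeg process per incoming stream.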
