Video Summary and Transcription
Today, I'll be talking about gsplat.js and Gaussian Splatting, a revolutionary graphics pipeline that can render high-fidelity scenes at 144 FPS. Gaussian Splatting is a rasterization technique that converts scene data directly into an image using Gaussians. gsplat.js is a lightweight JavaScript library for Gaussian Splat rendering, with features like 4D rendering. The library aims to provide a simple and speedy way to view splats on the web, while more advanced applications can use mkkellogg's GaussianSplats3D. gsplat.js is open-source.
1. Introduction to gsplat.js
Today, I'll be talking about gsplat.js and Gaussian Splatting. Gaussian Splatting is a revolutionary graphics pipeline that can render high-fidelity scenes at 144 FPS, and gsplat.js is a JavaScript library for rendering it. In Gaussian Splatting, multiple pictures are used to estimate a 3D point cloud, and each point is then represented as a Gaussian, with all the Gaussians stored in one matrix. These Gaussians are rasterized into an image and trained until the images they produce resemble the original ones. The trained set of Gaussians can then be rasterized from any angle to generate an image.
Hi everyone, today I'll be talking about gsplat.js. What is it? Its history? How it works? And where it's going? But first, who am I? My name is Dylan. I'm a developer advocate at Hugging Face, where I build tools and create educational content, sometimes under my name and sometimes under IndividualKex.
Speaking of which, to answer the question, what is gsplat.js? I first need to answer, what is Gaussian Splatting? I have a 2-minute video on that, here it is. Gaussian Splatting. What's that? It's a way to render stuff at really high fidelity, really fast. It's a big deal because it's totally different from any existing graphics pipeline and is capable of rendering scenes that look like this, at 144 FPS. The original research paper is "3D Gaussian Splatting for Real-Time Radiance Field Rendering."
What does that mean? I'll explain how it works. Step one, take a bunch of pictures of stuff from different angles, then use an old algorithm called Structure from Motion to estimate a point cloud from those pictures. Step two, take every point in the point cloud and say, "You're a Gaussian now." "I'm a what?" A distribution that looks like this, but in 3D, and it can also be skewed, which is what I like to call multivariate. Multivariate. Everyone calls it that. We also assign a color and an alpha. Now we can put all these Gaussians into one giant matrix, with 16 columns and one row for every Gaussian. This is all the data we need to represent the scene.
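To make the "one giant matrix" idea concrete, here is a minimal sketch of packing Gaussians into a flat typed array, one row per Gaussian. The column layout (position, scale, rotation, color, alpha) and the names `LAYOUT` and `packGaussians` are assumptions for illustration; the layout does not reproduce the exact column count mentioned in the talk, and it is not the file format of any particular library.

```javascript
// Hypothetical column layout for one Gaussian (one row of the matrix).
// The fields and widths are assumptions for this sketch only.
const LAYOUT = {
    position: 3, // x, y, z center of the Gaussian
    scale: 3,    // extent along each of its own axes
    rotation: 4, // quaternion orienting those axes; with scale, this lets it stretch and tilt
    color: 3,    // r, g, b
    alpha: 1,    // opacity
};

// Total number of columns per row (14 for the layout above).
const COLUMNS = Object.values(LAYOUT).reduce((sum, width) => sum + width, 0);

// Pack an array of Gaussian objects into one Float32Array, row by row.
function packGaussians(gaussians) {
    const matrix = new Float32Array(gaussians.length * COLUMNS);
    gaussians.forEach((gaussian, row) => {
        let offset = row * COLUMNS;
        for (const field of Object.keys(LAYOUT)) {
            // Accept both arrays ([x, y, z]) and plain numbers (alpha).
            matrix.set([].concat(gaussian[field]), offset);
            offset += LAYOUT[field];
        }
    });
    return matrix;
}
```

A flat typed array like this is convenient because it can be handed to a worker or uploaded to the GPU without per-object overhead.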
So are we done? No. Step three, rasterization, meaning turn all these Gaussians into an image. How? The simplified version is, according to your camera perspective, project the Gaussians into 2D, then sort them by depth, then for every pixel, iterate over every Gaussian, front to back, calculate its contribution to that pixel, then blend them all together. Now we have an image. So are we done? No. Step four, training. These Gaussians don't have the right values, so we need to train them. Meaning, adjust the values of the Gaussians so that they produce images that look like the original images. This is a lot like training a neural network, but with zero layers, which is why it's so fast. The training also uses automated densification and pruning. Meaning, when a Gaussian is struggling to fit a detailed part of the scene, it splits into two Gaussians. And when a Gaussian's alpha gets too low, it gets removed. Now we have a trained set of millions of Gaussians that can be rasterized from any angle to produce an image.
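The per-pixel part of step three can be sketched as follows: sort the splats by depth, then blend them front to back. This is a simplified illustration, not the renderer's actual code. It assumes each splat's `alpha` already includes the Gaussian falloff at this pixel, and the names `compositePixel` and `renderPixel` are hypothetical.

```javascript
// Blend splats that are already sorted front to back into one pixel color.
// Each splat is { color: [r, g, b], alpha }, with alpha in [0, 1].
function compositePixel(sortedSplats, minTransmittance = 1e-4) {
    const color = [0, 0, 0];
    let transmittance = 1; // fraction of light not yet blocked by nearer splats
    for (const splat of sortedSplats) {
        const weight = splat.alpha * transmittance; // this splat's contribution
        color[0] += weight * splat.color[0];
        color[1] += weight * splat.color[1];
        color[2] += weight * splat.color[2];
        transmittance *= 1 - splat.alpha;
        if (transmittance < minTransmittance) break; // pixel is effectively opaque
    }
    return { color, transmittance };
}

// Sort a copy of the splats by depth (nearest first), then composite them.
function renderPixel(splats) {
    const sorted = splats.slice().sort((a, b) => a.depth - b.depth);
    return compositePixel(sorted);
}
```

Going front to back allows the loop to stop early once the pixel is effectively opaque, so splats hidden behind it cost nothing.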
2. Overview of Gaussian Splatting and gsplat.js
Gaussian Splatting is a revolutionary rasterization technique that converts data directly into an image. gsplat.js is a lightweight JavaScript library for Gaussian Splat rendering, similar to other rendering libraries but with added features like 4D rendering. The history of gsplat.js starts with Spaces, the community-built machine learning applications hosted at Hugging Face, and the need for a JavaScript library that makes it simpler to visualize splat results in machine learning demos.
Okay, now what? Well, this is extremely new. It's kinda like when traditional rasterization was first invented, and then Doom came along and added shadows. And everyone was like, wow, you added shadows. And then came reflections, normal maps, indirect lighting, you know. And this paper is basically reinventing step one. Now you may be thinking, isn't this the same as photogrammetry? No, because this is a rasterization technique, meaning it converts the underlying data directly into an image, without the need for ray tracing, path tracing, or diffusion. So why didn't it exist until now? Because even though it's a simple operation, for it to look as good as it does, you need millions of Gaussians, which requires several gigs of VRAM. So is graphics about to totally change forever? Or is this a niche application like photogrammetry? Let me know what you think.
So that's Gaussian Splatting. Now what's gsplat.js? It's a JavaScript library for Gaussian Splat rendering. It has a lot in common with other rendering libraries, like Three.js or Babylon.js. You can render a scene with this code, where you set up a scene, a camera, a renderer, and controls. And then, in an update loop, you update the controls and render the scene. Pretty simple. It also has some extra bells and whistles, like 4D rendering. Basically, a video you can look around. It's also very lightweight, under 1 MB, a lot smaller than other rendering libraries.
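The code shown in the video is not part of this transcript, so the following is only a sketch of the pattern described: create a scene, a camera, a renderer, and controls, then update the controls and render once per frame. The classes below are stand-ins written so the example runs on its own. Their names and methods are assumptions and are not taken from the library's actual API.

```javascript
// Stand-in classes so the sketch is self-contained. A real renderer would
// draw to a canvas; these only record what happened.
class Scene {
    constructor() { this.objects = []; }
}
class Camera {
    constructor() { this.angle = 0; }
}
class Controls {
    constructor(camera) { this.camera = camera; }
    update() { this.camera.angle += 0.01; } // pretend the user is orbiting
}
class Renderer {
    constructor() { this.framesRendered = 0; }
    render(scene, camera) { this.framesRendered += 1; }
}

// Run the update loop. `schedule` decides when the next frame runs;
// `maxFrames` exists only so the loop can be stopped in a demo or test.
function startLoop(scene, camera, renderer, controls, schedule, maxFrames = Infinity) {
    let frames = 0;
    const frame = () => {
        controls.update();              // apply user input to the camera
        renderer.render(scene, camera); // draw the scene from the camera's view
        frames += 1;
        if (frames < maxFrames) schedule(frame);
    };
    schedule(frame);
}
```

In a browser, `schedule` would typically be `(callback) => requestAnimationFrame(callback)`, so the loop runs once per display refresh.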
Now for the history. I'm not a graphics programmer or a JavaScript developer. But something really cool we have at Hugging Face is Spaces. These are machine learning applications made by the community, usually using Gradio, a Python library that makes it really easy to build machine learning web applications. One of its components is Model3D, which makes it easy to display 3D mesh results. And when Gaussian Splatting came along, I wanted to enable visualizing splat results. So I found this open-source JavaScript renderer, antimatter15/splat, by Kevin Kwok. And I was re-implementing this in Spaces. It was really painful. And I thought, it'd be nice if there was a JavaScript library that made this easier. So I made it. Hopefully it'll save others some time. By the way, earlier I mentioned that Gradio Model3D could visualize mesh results. Well, now it can also visualize splat results, enabling machine learning demos like this.
3. Working and Future of gsplat.js
In the demo, you can upload an image and generate a 3D Gaussian Splatting scene. The renderer in gsplat.js sorts splats using counting sort, running in an asynchronous Web Worker with WebAssembly, which keeps rendering fast. The library aims to provide a simple and speedy way to view splats on the web. For more advanced applications, mkkellogg's GaussianSplats3D, built on top of Three.js, is recommended. gsplat.js is open-source.
In this demo, you can upload an image and generate a 3D Gaussian Splatting scene. That's the history.
Now what about how it works? If you look at the project files, most things, like cameras, controls, and math, are all pretty standard 3D stuff. The Gaussian Splatting part is the renderer. Here's most of the rendering code. I'm not gonna walk through it. But something worth noting is that the biggest bottleneck of Gaussian Splat rendering is sorting the splats. For this, I'm using counting sort, with an asynchronous Web Worker, in WebAssembly. This makes gsplat.js very fast. But theoretically, it could be even faster with a GPU-based parallel radix sort, using newer technologies like WebGPU. And if you have experience with that, this is all open-source. So come contribute.
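As an illustration of why counting sort suits this job, here is a minimal sketch in plain JavaScript that orders splat indices by quantized depth. It is not the library's implementation: the talk describes running the sort in WebAssembly inside a Web Worker, which is omitted here, and the function name `sortByDepth` and the bucket count are assumptions.

```javascript
// Return splat indices ordered front to back (smallest depth first).
// `depths` holds one view-space depth per splat; `buckets` is the
// quantization resolution (an assumed default).
function sortByDepth(depths, buckets = 65536) {
    const n = depths.length;
    let min = Infinity;
    let max = -Infinity;
    for (let i = 0; i < n; i++) {
        if (depths[i] < min) min = depths[i];
        if (depths[i] > max) max = depths[i];
    }
    const scale = max > min ? (buckets - 1) / (max - min) : 0;

    // Pass 1: quantize each depth to a bucket and count bucket sizes.
    const keys = new Uint32Array(n);
    const counts = new Uint32Array(buckets);
    for (let i = 0; i < n; i++) {
        keys[i] = Math.min(buckets - 1, Math.floor((depths[i] - min) * scale));
        counts[keys[i]]++;
    }

    // Pass 2: a prefix sum turns the counts into each bucket's start offset.
    const starts = new Uint32Array(buckets);
    for (let b = 1; b < buckets; b++) {
        starts[b] = starts[b - 1] + counts[b - 1];
    }

    // Pass 3: write each index into the next free slot of its bucket.
    const order = new Uint32Array(n);
    for (let i = 0; i < n; i++) {
        order[starts[keys[i]]++] = i;
    }
    return order;
}
```

The sort runs in time linear in the number of splats plus the number of buckets, with no comparisons. The trade-off is that splats falling into the same bucket keep their input order, so the result is only as precise as the quantization.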
So what about the future of gsplat.js? The goal of this library is to easily view splats on the web, with a focus on speed and simplicity. If you're interested in heavier applications, like games, or hybrid mesh and splat rendering, I recommend mkkellogg's GaussianSplats3D, a Gaussian Splat renderer built on top of Three.js. In conclusion, Gaussian Splatting's pretty cool. And gsplat.js makes it easy to render on the web. And the best part, it's open-source. Thank you for watching.