Using MediaPipe to Create Cross Platform Machine Learning Applications with React


This talk introduces MediaPipe, an open-source set of machine learning solutions that lets you run machine learning models on low-powered devices and integrate them into mobile applications. It gives creative professionals a set of dynamic tools and makes it easy to build powerful, intuitive applications with little or no prior knowledge of machine learning. We will see how MediaPipe can be integrated with React, giving easy access to machine learning use cases when building web applications with React.

This talk was presented at React Advanced 2021.

FAQ

Shivay Lamba's talk at React Advanced focuses on using MediaPipe to create cross-platform machine learning applications with React.

More information about MediaPipe can be found on the MediaPipe documentation site at docs.mediapipe.dev and on the GitHub repository under Google's organization.

MediaPipe is Google's open-source cross-platform framework that helps build different kinds of perception pipelines, enabling the use of multiple machine learning models in a single end-to-end pipeline.

MediaPipe is important for web applications because it allows for the integration of machine learning models to enhance functionalities like face detection, hand tracking, and object detection across various platforms.

Common use cases of MediaPipe include face detection, hand tracking, selfie segmentation, face mesh, hair segmentation, object detection and tracking, human pose detection, and holistic tracking.

Face Mesh in MediaPipe is a solution that provides over 400 facial landmarks, which can be used to create applications like AR filters and virtual makeup.

You can integrate MediaPipe with React by using various NPM modules provided by the MediaPipe team, such as face mesh, face detection, hand tracking, and selfie segmentation, and incorporating them into your React application code.

MediaPipe enables cross-platform development by allowing developers to build a solution once and deploy it across multiple platforms such as Python, JavaScript, Android, and iOS.

Examples of MediaPipe in real-world applications include face mesh used in AR Lipstick try-on on YouTube, augmented reality movie filters, and Google Lens translation.

Shivay Lamba is a Google Summer of Code mentor at MediaPipe and a speaker at React Advanced.

Shivay Lamba
21 min
25 Oct, 2021

Video Summary and Transcription
MediaPipe is a cross-platform framework that helps build perception pipelines using machine learning models. It offers ready-to-use solutions for various applications, such as selfie segmentation, face mesh, object detection, hand tracking, and more. MediaPipe can be integrated with React using NPM modules provided by the MediaPipe team. The demonstration showcases the implementation of face mesh and selfie segmentation solutions. MediaPipe enables the creation of amazing applications without needing to understand the underlying computer vision or machine learning processes.

1. Introduction to MediaPipe and Machine Learning

Short description:

Shivay Lamba, a Google Summer of Code mentor at MediaPipe, introduces his React Advanced talk on using MediaPipe to create cross-platform machine learning applications with React. Machine learning is everywhere today, and it's important to use it in web applications as well. MediaPipe is Google's open-source cross-platform framework that helps build perception pipelines using machine learning models. It can process audio, video, image-based data, and sensor data, and includes features like end-to-end acceleration.

Hello, everyone. I'm Shivay Lamba. I'm currently a Google Summer of Code mentor at MediaPipe, and I'm so excited to be speaking at React Advanced on the topic of using MediaPipe to create cross-platform machine learning applications with React.

So a lot of this talk is going to center on machine learning, MediaPipe, and how you can integrate MediaPipe with React to create really amazing applications.

So without wasting any further time, let's get started.

The first thing, of course, is that today machine learning is literally everywhere. Look at any kind of application and you'll see machine learning being used there, whether it's education, healthcare, fitness, or even mining. You'll find applications of machine learning today in each and every industry known to humankind.

So that makes it so much more important to use machine learning in web applications as well. And today, as more and more web applications come to market, we are seeing a lot more machine learning use cases within web applications.

And let's actually look at a few of these examples. Over here we can see face detection happening on an Android device. Then you can see hands being detected in this iPhone XR image. Then you can see the Nest Cam, which everyone knows is a security camera. Then you can see some web effects, where this lady has facial effects applied to her face on the web. And you can also see the Raspberry Pi and other microchip-based devices that run on the edge.

And what do all of these have in common? That's the question. The thing that is common to all of these is MediaPipe.

So what exactly is MediaPipe? MediaPipe is essentially Google's open-source cross-platform framework that helps you build different kinds of perception pipelines. What that means is that we are able to use multiple machine learning models together in a single end-to-end pipeline to build something. And we'll also look at some of the common use cases very soon.
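To make the idea of a single end-to-end pipeline concrete, here is a minimal TypeScript sketch that chains two stages into one function. The stage names (`detectRegion`, `locateLandmarks`) and the data shapes are hypothetical stand-ins for models, not MediaPipe APIs, and real MediaPipe graphs are defined differently.

```typescript
// A stage turns one kind of data into another (for example, a frame into a region).
type Stage<I, O> = (input: I) => O;

// Compose two stages into one, so the output of the first feeds the second.
function chain<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return (input: A) => second(first(input));
}

// Hypothetical data shapes, for illustration only.
interface Frame { width: number; height: number; }
interface Region { x: number; y: number; size: number; }
interface Point { x: number; y: number; }

// Stand-in for a first model that finds a region of interest in the frame.
const detectRegion: Stage<Frame, Region> = (frame) => ({
  x: frame.width / 4,
  y: frame.height / 4,
  size: Math.min(frame.width, frame.height) / 2,
});

// Stand-in for a second model that places landmarks inside that region.
const locateLandmarks: Stage<Region, Point[]> = (region) => [
  { x: region.x, y: region.y },
  { x: region.x + region.size, y: region.y + region.size },
];

// The single end-to-end pipeline: frame in, landmarks out.
const perceptionPipeline = chain(detectRegion, locateLandmarks);
```

The caller only sees one function from frame to landmarks, which is the sense in which several models behave as one pipeline.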

It was previously used widely in a lot of research-based products at Google. But it has since been released publicly, and now everyone can use it, since it's an open-source project. It can be used to process any kind of audio, video, or image-based data, and also sensor data. It helps primarily with two things: dataset preparation for different kinds of machine learning pipelines, and building end-to-end machine learning pipelines. And the features included within MediaPipe include end-to-end acceleration, because everything actually happens on-device.

2. MediaPipe Solutions and Real-World Examples

Short description:

MediaPipe is a cross-platform-based framework that offers ready-to-use solutions for various applications. Some of the solutions include selfie segmentation, face mesh with over 400 facial landmarks, hair segmentation, object detection and tracking, facial detection, hand tracking, human pose detection and tracking, holistic tracking, and 3D object detection. These end-to-end solutions are popular and have real-world applications in AR, movie filters, Google Lens, and augmented faces. A live perception example demonstrates hand tracking using landmarks to denote the edges of the hand.

Secondly, you only have to build your solution once, and it can be used across different platforms, including Python, JavaScript, Android, and iOS. That is why we call it a cross-platform framework.

And then these are ready-to-use solutions. You just have to import them and integrate them into your code, and they are very easy to use. And the best part is that it is open source, so you can find all the different solutions and codebases in the MediaPipe repository under Google's organization on GitHub.
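As a rough illustration of "import and integrate", the TypeScript sketch below wires a solution object to a results callback. The `SolutionLike` interface is an assumption: its `onResults` and `send` methods are modeled on the general shape of the JavaScript solution packages and should be checked against whichever package you install. `connectSolution` and `FakeSolution` are hypothetical names used only so the sketch runs without a camera or model files.

```typescript
// Minimal structural type for a MediaPipe-style solution object.
// Assumption: the real solution classes expose onResults() and send() roughly
// like this; verify against the package you actually install.
interface SolutionLike<Input, Results> {
  onResults(listener: (results: Results) => void): void;
  send(input: Input): Promise<void>;
}

// Hypothetical helper: registers a results listener and returns a function
// that pushes one frame through the solution.
function connectSolution<Input, Results>(
  solution: SolutionLike<Input, Results>,
  handleResults: (results: Results) => void,
): (input: Input) => Promise<void> {
  solution.onResults(handleResults);
  return (input) => solution.send(input);
}

// A fake solution so the sketch can run on its own.
class FakeSolution implements SolutionLike<{ image: string }, { label: string }> {
  private listener: (results: { label: string }) => void = () => {};

  onResults(listener: (results: { label: string }) => void): void {
    this.listener = listener;
  }

  async send(input: { image: string }): Promise<void> {
    // A real solution would run a model here; the fake just echoes the input.
    this.listener({ label: `processed ${input.image}` });
  }
}
```

In a React component, a helper like this would typically be called inside `useEffect`, with the video or canvas element held in a `useRef`, so the solution is set up once after the component mounts.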

Now, looking at some of the most commonly used solutions. One of the most well-known is the selfie segmentation solution, which is also used in Google Meet, where you can apply different backgrounds or a blurring effect. It uses a segmentation mask to detect only the humans in the scene and extract only the information needed for them. Then we have Face Mesh, which has more than 400 facial landmarks, and you can build a lot of interesting applications with it, for example AR filters or makeup. Then we have hair segmentation, which allows you to segment out only the hair.

Then we have standard computer vision algorithms like object detection and tracking, which you can use to detect specific objects. Then we have face detection, and we also have hand tracking, which can track your hands; you could use it for things like hand-based gestures to control, let's say, your web application. Then we have human pose detection and tracking, which you could use to create some kind of fitness application or a dance application that can track you. Then we have holistic tracking, which tracks your entire body: your face, your hands, and your entire pose. So it's basically a combination of human pose, hand tracking, and face mesh.

Then we have some more advanced object detection, like 3D object detection, which can help you detect bigger objects like a chair, shoes, or a table. And there are a lot more solutions that you can go ahead and look at. These are all end-to-end solutions that you can directly implement. That is why MediaPipe solutions are so popular.
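The segmentation-mask idea behind background replacement can be illustrated with a per-pixel blend. This is a minimal sketch under stated assumptions, not how Google Meet or MediaPipe implements it: the mask is assumed to hold values in [0, 1] where 1 means "person", pixels are single grayscale values for brevity, and `applySegmentationMask` is a hypothetical name.

```typescript
// Blend a camera frame over a replacement background using a mask.
// Assumption: mask[i] is in [0, 1], where 1 means "person" and 0 means "background".
// Pixels are single grayscale values here to keep the sketch short.
function applySegmentationMask(
  foreground: number[],
  background: number[],
  mask: number[],
): number[] {
  if (foreground.length !== background.length || foreground.length !== mask.length) {
    throw new Error("foreground, background, and mask must have the same length");
  }
  return foreground.map((pixel, i) => {
    const alpha = Math.min(1, Math.max(0, mask[i])); // clamp to [0, 1]
    // Keep the person where alpha is high, show the background where it is low.
    return alpha * pixel + (1 - alpha) * background[i];
  });
}
```

Values between 0 and 1 give a soft edge around the person, which is why a mask is more useful here than a hard yes/no label per pixel.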

Let's look at some of the real-world examples where it's actually being used. We just spoke about the face mesh solution, which you can see here in the AR Lipstick try-on on YouTube. Then we have the AR-based movie filter that can be used directly in YouTube. Then we have some Google Lens surfaces where you can see augmented reality taking place. It's also used not only for augmented reality but for other kinds of inference, like Google Lens translation, which also uses MediaPipe pipelines behind the scenes. And you can see augmented faces, which again is based on the face mesh.

So let's look at a very quick live perception example of how it actually works. For this, we're going to look at hand tracking. Essentially, we take an image or a video of your hand and we put landmarks on it. What are landmarks? Landmarks are these dots that you see superimposed on your hand; they denote the different edges of your hand. So this is what the example is going to look like. So what would that simple perception pipeline look like? Essentially, first you take your video input.
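Superimposing landmarks on a frame comes down to mapping each landmark to a pixel position. The sketch below assumes each landmark carries `x` and `y` values normalized to the range [0, 1] relative to the frame's width and height; that normalization and the `toPixelCoordinates` name are assumptions made for illustration, not something the talk specifies.

```typescript
// A landmark with coordinates normalized to the frame size (assumed range [0, 1]).
interface NormalizedPoint { x: number; y: number; }

// The same landmark expressed in whole pixels.
interface PixelPoint { px: number; py: number; }

// Hypothetical helper: scale normalized landmarks to pixel positions
// so they can be drawn as dots on top of the image or video frame.
function toPixelCoordinates(
  landmarks: NormalizedPoint[],
  width: number,
  height: number,
): PixelPoint[] {
  return landmarks.map(({ x, y }) => ({
    px: Math.round(x * width),
    py: Math.round(y * height),
  }));
}
```

The resulting points could then be drawn as small circles on a canvas layered over the video frame.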
