Using MediaPipe to Create Cross Platform Machine Learning Applications with React


This talk gives an introduction to MediaPipe, an open-source machine learning framework that allows running machine learning models on low-powered devices and helps integrate those models with mobile and web applications. It gives creative professionals dynamic tools and makes machine learning approachable, so they can build powerful and intuitive applications with little or no prior machine learning knowledge. We will see how MediaPipe can be integrated with React, giving easy access to machine learning use cases when building web applications.

This talk was presented at React Day Berlin 2022. Check out the latest edition of this React conference.

FAQ

What is MediaPipe?
MediaPipe is an open-source, cross-platform framework designed for building perception pipelines, particularly focused on video and audio data. It enables developers to prepare datasets, run machine learning models, and visualize outputs for various applications.

How can MediaPipe be integrated with ReactJS?
MediaPipe can be integrated with ReactJS to create machine learning applications by utilizing JavaScript and specific NPM packages. Developers can embed MediaPipe solutions directly into their React code, facilitating the development of interactive web applications that leverage machine learning.

Where is MediaPipe used in real-world applications?
MediaPipe is used in various applications including facial recognition on devices like the iPhone XR, object detection in security cameras, and augmented reality features in apps like Google Meet for virtual backgrounds. Companies like L'Oreal use it for AR-based makeup trials.

What features does MediaPipe offer?
MediaPipe offers end-to-end acceleration utilizing system CPUs or GPUs, supports multiple platforms like JavaScript, Android, and iOS, and provides ready-to-use solutions for tasks like face mesh, object detection, and human pose tracking.

How do you use MediaPipe models in React?
To use MediaPipe models in React, you install the necessary NPM packages, import the MediaPipe functions, and integrate them with your application components. This setup allows for the utilization of machine learning features directly in the React application environment.
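
For instance, a minimal sketch of that setup, assuming the @mediapipe/hands NPM package (the package and API names below follow MediaPipe's published JavaScript solutions; treat the snippet as an illustration, not the talk's exact code):

// npm install @mediapipe/hands
import { Hands } from '@mediapipe/hands';

// The solution fetches its model files at runtime, typically from a CDN.
const hands = new Hands({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`,
});

hands.setOptions({
  maxNumHands: 2,
  minDetectionConfidence: 0.5,
});

// Inference results arrive through a callback.
hands.onResults((results) => {
  console.log(results.multiHandLandmarks);
});

From a React component you would typically create this instance once (for example inside a useEffect) and feed it webcam frames with hands.send({ image: videoElement }).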

Where can developers find MediaPipe demos and documentation?
Developers can explore various MediaPipe implementations and demonstrations on the official MediaPipe website (mediapipe.dev) and through specific demos listed under mediapipe.dev/demo. The site provides extensive documentation and sample code for different use cases.

Shivay Lamba
20 min
05 Dec, 2022

Video Summary and Transcription
Welcome to a talk on using MediaPipe for cross-platform machine learning applications with ReactJS. MediaPipe provides ready-to-use solutions for object detection, tracking, face mesh, and more. It allows for video transformation and tensor conversion, enabling the interpretation of video footage in a human-readable form. MediaPipe utilizes graphs and calculators to handle the perception pipeline. Learn how to use MediaPipe packages in React and explore a demo showcasing the hands model for detecting landmarks. Custom logic can be written on top of the detected landmarks to tell whether fingers are open or closed, which is useful for applications like American Sign Language recognition.

1. Introduction to MediaPipe

Short description:

Welcome to my talk at React Day Berlin 2022. I'm Shivay, presenting on using MediaPipe for cross-platform machine learning applications with ReactJS. MediaPipe is an open-source framework that allows for end-to-end machine learning inference and is especially useful for video and audio analysis. It provides acceleration using system hardware and can be used across multiple platforms.

Welcome, everyone, to my talk at React Day Berlin 2022. I'm Shivay, presenting virtually on the topic of using MediaPipe to create cross-platform machine learning applications with the help of ReactJS. I'm a Google Summer of Code mentor at MediaPipe and also a TensorFlow.js working group lead, and you can connect with me on Twitter at how-to-develop. Without wasting any further time, let's get started.

So today we see applications of machine learning everywhere, and this is especially true for web applications with the advent of libraries like TensorFlow.js and MediaPipe. There are a lot of full-stack applications that utilize machine learning capabilities in their web apps, and we are seeing those in production at startups and even at companies like LinkedIn, which use machine learning to power multiple applications. And that's because machine learning is so versatile that it can be used for a number of different applications. Here are some of the common areas where you see machine learning in use, and one thing is common among all of them. On the left-hand side we have people utilizing face detection on the iPhone XR. You can see points that are able to detect your hands. Then you can see some really cool web-based effects and facial expression detection. Then there is the Nest Cam, which uses its camera to detect objects. Then we have 'OK Google', the Google Assistant, and even devices like the Raspberry Pi and Coral Edge TPUs. All of them have one thing in common: they are powered by machine learning, and that with the help of MediaPipe.

So what is MediaPipe? MediaPipe is an open-source, cross-platform framework for building perception pipelines, dedicated especially to video- and audio-based perception. Think of it this way: if you want to build an end-to-end machine learning application, MediaPipe allows you not only to prepare your datasets but also to go through the entire machine learning inference, from getting the objects that will be used for detection to producing the visualizations and outputs of the model you are running. In a typical machine learning scenario, you start by taking some input data, run a machine learning model on top of it, and then get some inference. MediaPipe provides an end-to-end pipeline for doing machine learning inference, and it is especially useful for analyzing video or audio data. Today we'll see examples where you could use it in a live video, live audio, or camera-based scenario. There are, of course, a lot of features that come out of the box. It provides end-to-end acceleration, meaning MediaPipe can use your system's CPU or GPU. The underlying technology, especially if you're using it with JavaScript, is WebAssembly on the backend, which means you can leverage your system hardware to accelerate and improve the performance of machine learning inference. One of the great things is that you need just one MediaPipe pipeline and one MediaPipe model, and it can be used in multiple places, because MediaPipe is supported across multiple platforms, including JavaScript, Android, and iOS, and you can even deploy it on platforms like the Raspberry Pi for IoT or edge applications. And there are ready-to-use solutions.

2. Exploring MediaPipe Solutions

Short description:

We'll explore ready-to-use MediaPipe solutions that cover object detection, tracking, face mesh, human pose tracking, and more. These solutions are being used in various applications, such as virtual exercise tracking and augmented reality-based lipstick try-ons. MediaPipe provides end-to-end machine learning pipelines that can be easily integrated into your programs. Check out MediaPipe.dev for more information and examples of how MediaPipe is used in JavaScript and other platforms.

That means we'll be exploring some of these MediaPipe solutions in a bit. And these are completely ready to use. You just have to import them inside of your functions, inside of your programs.

For example, if you're using JavaScript, you just have to import the actual function and you'll be able to use it very quickly. All of the different solutions that we'll be exploring are completely open source, so in case you are interested in digging deeper, you can check out their code base and adapt them for your own use case.
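
As a sketch of what such an import looks like, here is the face mesh solution, assuming the @mediapipe/face_mesh NPM package (names follow the published JavaScript API; the option values are illustrative):

import { FaceMesh } from '@mediapipe/face_mesh';

const faceMesh = new FaceMesh({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh/${file}`,
});

faceMesh.setOptions({
  maxNumFaces: 1,
  refineLandmarks: true,
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5,
});

faceMesh.onResults((results) => {
  // Each entry in multiFaceLandmarks is the set of 3D landmarks for one face.
  console.log(results.multiFaceLandmarks);
});

The other JavaScript solutions follow this same shape: construct the solution, call setOptions, register an onResults callback, then send it frames.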

And here are some of the solutions that are currently there. Again, a quick reminder: when we talk about solutions, these are essentially end-to-end machine learning pipelines. That means everything from detecting to classifying and getting your inference up and running is handled by these machine learning pipelines provided by MediaPipe. You have some standard models that you will also see in Python, things like object detection and object tracking, but also things like face mesh or human pose tracking. All of these are being used by a lot of different startups, for example ones that let you do your gym exercises and keep track of them virtually with just the help of your webcam, doing things like rep counts, or acting like an e-physiotherapist in your home.
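
To make the exercise-tracking idea concrete, here is a hedged sketch of a rep counter built on the pose solution's landmarks. It assumes the @mediapipe/pose NPM package; the landmark indices follow MediaPipe's published pose topology, and the angle thresholds are invented for illustration:

import { Pose } from '@mediapipe/pose';

// Landmark indices from the MediaPipe pose topology.
const LEFT_SHOULDER = 11;
const LEFT_ELBOW = 13;
const LEFT_WRIST = 15;

// Angle at vertex b of the triangle a-b-c, in degrees.
function angleBetween(a, b, c) {
  const rad =
    Math.atan2(c.y - b.y, c.x - b.x) - Math.atan2(a.y - b.y, a.x - b.x);
  const deg = Math.abs((rad * 180) / Math.PI);
  return deg > 180 ? 360 - deg : deg;
}

let reps = 0;
let armExtended = false;

const pose = new Pose({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/pose/${file}`,
});

pose.onResults((results) => {
  if (!results.poseLandmarks) return;
  const lm = results.poseLandmarks;
  const elbowAngle = angleBetween(lm[LEFT_SHOULDER], lm[LEFT_ELBOW], lm[LEFT_WRIST]);

  // Count one bicep curl per extended-to-curled transition (thresholds are arbitrary).
  if (elbowAngle > 150) armExtended = true;
  if (armExtended && elbowAngle < 50) {
    armExtended = false;
    reps += 1;
    console.log(`Reps: ${reps}`);
  }
});

Frames would be fed in the same way as with the other solutions, via pose.send({ image: videoElement }) on each webcam frame.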

Then we have face detection, which is being used by companies like L'Oreal for augmented reality-based lipstick try-ons. So a lot of these solutions are already being used in production. There are even more solutions, and you can check them all out on the MediaPipe website, MediaPipe.dev. Things like the selfie segmentation solution have been used in Google Meet to put up virtual backgrounds. These are just some of the solutions that you can directly use and embed inside of your program. And here are some examples we can share. On the left-hand side you can see one very similar to the L'Oreal case, where you can try on augmented reality-based lipstick. Then you can see some augmented reality-based movie trailers on YouTube, and Google Lens, which is able to place augmented reality-based objects in front of you using computer vision. So these are some examples where MediaPipe is being used not just in JavaScript applications, but on other platforms as well. But of course, I'd also like to break down how this inference actually takes place in the first place. For that, let's take a look at a live perception example, the hand tracking algorithm. The idea is that if you hold your hand in front of a webcam, the pipeline should be able to detect what we call landmarks. You take an image or a video of your hand, and the machine learning model, or more precisely the MediaPipe pipeline, will produce these specific landmarks, which usually denote the different joints inside of your hand. You'll then be able to overlay the landmarks on top of your hand, so that the pipeline detects your hand, finds the exact location of each landmark, and superimposes them onto it. That's what we mean by localizing your hand landmarks.
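
Putting this into React, here is a minimal sketch of such a hand-landmark component. It assumes the @mediapipe/hands, @mediapipe/camera_utils, and @mediapipe/drawing_utils NPM packages; the component name, the canvas size, and the isIndexFingerOpen heuristic are illustrative, not the talk's exact demo code:

import React, { useEffect, useRef } from 'react';
import { Hands, HAND_CONNECTIONS } from '@mediapipe/hands';
import { Camera } from '@mediapipe/camera_utils';
import { drawConnectors, drawLandmarks } from '@mediapipe/drawing_utils';

// Custom logic on top of the landmarks: in image coordinates y grows downward,
// so with the hand upright, a fingertip (index 8) above its middle joint
// (index 6) reads as an "open" index finger.
function isIndexFingerOpen(landmarks) {
  return landmarks[8].y < landmarks[6].y;
}

function HandTracker() {
  const videoRef = useRef(null);
  const canvasRef = useRef(null);

  useEffect(() => {
    const hands = new Hands({
      locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`,
    });
    hands.setOptions({ maxNumHands: 2, minDetectionConfidence: 0.5, minTrackingConfidence: 0.5 });

    hands.onResults((results) => {
      const canvas = canvasRef.current;
      const ctx = canvas.getContext('2d');
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      ctx.drawImage(results.image, 0, 0, canvas.width, canvas.height);
      // Superimpose the detected landmarks and their connections on each hand.
      for (const landmarks of results.multiHandLandmarks ?? []) {
        drawConnectors(ctx, landmarks, HAND_CONNECTIONS, { color: '#00FF00' });
        drawLandmarks(ctx, landmarks, { color: '#FF0000' });
        console.log('index finger open:', isIndexFingerOpen(landmarks));
      }
    });

    // Feed every webcam frame into the pipeline.
    const camera = new Camera(videoRef.current, {
      onFrame: async () => {
        await hands.send({ image: videoRef.current });
      },
      width: 640,
      height: 480,
    });
    camera.start();

    return () => {
      camera.stop();
      hands.close();
    };
  }, []);

  return (
    <>
      <video ref={videoRef} style={{ display: 'none' }} />
      <canvas ref={canvasRef} width={640} height={480} />
    </>
  );
}

export default HandTracker;

Heuristics like isIndexFingerOpen, comparing a fingertip to the joint below it, are one way to build the open/closed finger detection mentioned in the summary, for example as a building block for recognizing American Sign Language letters.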

Check out more articles and videos

We constantly curate articles and videos that might spark your interest, skill you up, or help you build a stellar career.

A Guide to React Rendering Behavior
React Advanced 2022
25 min
Top Content
This transcription provides a brief guide to React rendering behavior. It explains the process of rendering, comparing new and old elements, and the importance of pure rendering without side effects. It also covers topics such as batching and double rendering, optimizing rendering and using context and Redux in React. Overall, it offers valuable insights for developers looking to understand and optimize React rendering.
Building Better Websites with Remix
React Summit Remote Edition 2021
33 min
Top Content
Remix is a web framework built on React Router that focuses on web fundamentals, accessibility, performance, and flexibility. It delivers real HTML and SEO benefits, and allows for automatic updating of meta tags and styles. It provides features like login functionality, session management, and error handling. Remix is a server-rendered framework that can enhance sites with JavaScript but doesn't require it for basic functionality. It aims to create quality HTML-driven documents and is flexible for use with different web technologies and stacks.
React Compiler - Understanding Idiomatic React (React Forget)
React Advanced 2023
33 min
Top Content
Joe Savona
Mofei Zhang
The Talk discusses React Forget, a compiler built at Meta that aims to optimize client-side React development. It explores the use of memoization to improve performance and the vision of Forget to automatically determine dependencies at build time. Forget is named with an F-word pun and has the potential to optimize server builds and enable dead code elimination. The team plans to make Forget open-source and is focused on ensuring its quality before release.
Using useEffect Effectively
React Advanced 2022
30 min
Top Content
Today's Talk explores the use of the useEffect hook in React development, covering topics such as fetching data, handling race conditions and cleanup, and optimizing performance. It also discusses the correct use of useEffect in React 18, the distinction between Activity Effects and Action Effects, and the potential misuse of useEffect. The Talk highlights the benefits of using useQuery or SWR for data fetching, the problems with using useEffect for initializing global singletons, and the use of state machines for handling effects. The speaker also recommends exploring the beta React docs and using tools like the stately.ai editor for visualizing state machines.
Routing in React 18 and Beyond
React Summit 2022
20 min
Top Content
Routing in React 18 brings a native app-like user experience and allows applications to transition between different environments. React Router and Next.js have different approaches to routing, with React Router using component-based routing and Next.js using file system-based routing. React server components provide the primitives to address the disadvantages of multipage applications while maintaining the same user experience. Improving navigation and routing in React involves including loading UI, pre-rendering parts of the screen, and using server components for more performant experiences. Next.js and Remix are moving towards a converging solution by combining component-based routing with file system routing.
(Easier) Interactive Data Visualization in React
React Advanced 2021
27 min
Top Content
This Talk is about interactive data visualization in React using the Plot library. Plot is a high-level library that simplifies the process of visualizing data by providing key concepts and defaults for layout decisions. It can be integrated with React using hooks like useRef and useEffect. Plot allows for customization and supports features like sorting and adding additional marks. The Talk also discusses accessibility concerns, SSR support, and compares Plot to other libraries like D3 and Vega-Lite.

Workshops on related topics

React Performance Debugging Masterclass
React Summit 2023
170 min
Top Content
Featured Workshop (Free)
Ivan Akulov
Ivan’s first attempts at performance debugging were chaotic. He would see a slow interaction, try a random optimization, see that it didn't help, and keep trying other optimizations until he found the right one (or gave up).
Back then, Ivan didn’t know how to use performance devtools well. He would do a recording in Chrome DevTools or React Profiler, poke around it, try clicking random things, and then close it in frustration a few minutes later. Now, Ivan knows exactly where and what to look for. And in this workshop, Ivan will teach you that too.
Here’s how this is going to work. We’ll take a slow app → debug it (using tools like Chrome DevTools, React Profiler, and why-did-you-render) → pinpoint the bottleneck → and then repeat, several times more. We won’t talk about the solutions (in 90% of the cases, it’s just the ol’ regular useMemo() or memo()). But we’ll talk about everything that comes before – and learn how to analyze any React performance problem, step by step.
(Note: This workshop is best suited for engineers who are already familiar with how useMemo() and memo() work – but want to get better at using the performance tools around React. Also, we’ll be covering interaction performance, not load speed, so you won’t hear a word about Lighthouse 🤐)
Concurrent Rendering Adventures in React 18
React Advanced 2021
132 min
Top Content
Featured Workshop (Free)
Maurice de Beijer
With the release of React 18 we finally get the long awaited concurrent rendering. But how is that going to affect your application? What are the benefits of concurrent rendering in React? What do you need to do to switch to concurrent rendering when you upgrade to React 18? And what if you don’t want or can’t use concurrent rendering yet?

There are some behavior changes you need to be aware of! In this workshop we will cover all of those subjects and more.

Join me with your laptop in this interactive workshop. You will see how easy it is to switch to concurrent rendering in your React application. You will learn all about concurrent rendering, SuspenseList, the startTransition API and more.
React Hooks Tips Only the Pros Know
React Summit Remote Edition 2021
177 min
Top Content
Featured Workshop
Maurice de Beijer
The addition of the hooks API to React was quite a major change. Before hooks most components had to be class based. Now, with hooks, these are often much simpler functional components. Hooks can be really simple to use. Almost deceptively simple. Because there are still plenty of ways you can mess up with hooks. And it often turns out you can improve many of your components with a better understanding of how each React hook can be used. You will learn all about the pros and cons of the various hooks. You will learn when to use useState() versus useReducer(). We will look at using useContext() efficiently. You will see when to use useLayoutEffect() and when useEffect() is better.
React, TypeScript, and TDD
React Advanced 2021
174 min
Top Content
Featured Workshop (Free)
Paul Everitt
ReactJS is wildly popular and thus wildly supported. TypeScript is increasingly popular, and thus increasingly supported.

The two together? Not as much. Given that they both change quickly, it's hard to find accurate learning materials.

React+TypeScript, with JetBrains IDEs? That three-part combination is the topic of this series. We'll show a little about a lot. Meaning, the key steps to getting productive, in the IDE, for React projects using TypeScript. Along the way we'll show test-driven development and emphasize tips-and-tricks in the IDE.
Web3 Workshop - Building Your First Dapp
React Advanced 2021
145 min
Top Content
Featured Workshop (Free)
Nader Dabit
In this workshop, you'll learn how to build your first full stack dapp on the Ethereum blockchain, reading and writing data to the network, and connecting a front end application to the contract you've deployed. By the end of the workshop, you'll understand how to set up a full stack development environment, run a local node, and interact with any smart contract using React, HardHat, and Ethers.js.
Designing Effective Tests With React Testing Library
React Summit 2023
151 min
Top Content
Featured Workshop
Josh Justice
React Testing Library is a great framework for React component tests because there are a lot of questions it answers for you, so you don’t need to worry about those questions. But that doesn’t mean testing is easy. There are still a lot of questions you have to figure out for yourself: How many component tests should you write vs end-to-end tests or lower-level unit tests? How can you test a certain line of code that is tricky to test? And what in the world are you supposed to do about that persistent act() warning?
In this three-hour workshop we’ll introduce React Testing Library along with a mental model for how to think about designing your component tests. This mental model will help you see how to test each bit of logic, whether or not to mock dependencies, and will help improve the design of your components. You’ll walk away with the tools, techniques, and principles you need to implement low-cost, high-value component tests.
Table of contents:
- The different kinds of React application tests, and where component tests fit in
- A mental model for thinking about the inputs and outputs of the components you test
- Options for selecting DOM elements to verify and interact with them
- The value of mocks and why they shouldn't be avoided
- The challenges with asynchrony in RTL tests and how to handle them
Prerequisites:
- Familiarity with building applications with React
- Basic experience writing automated tests with Jest or another unit testing framework
- You do not need any experience with React Testing Library
- Machine setup: Node LTS, Yarn