Web Apps of the Future With Web AI

Web AI is the practice of running machine learning models directly in the browser using JavaScript, WebAssembly, and WebGPU. This approach offers significant benefits such as enhanced privacy, low latency, and the ability to operate offline. TensorFlow.js allows developers to deploy models such as object recognition, text toxicity detection, and face mesh for real-time applications. Practical examples include background blurring in video conferencing and remote physiotherapy using pose estimation models. Web AI also improves accessibility by automatically generating captions for images that lack alt text. Popular models include YOLO for object detection and the MediaPipe LLM Inference API for language tasks. Books like 'Deep Learning with JavaScript' and 'Learning TensorFlow.js' are recommended for beginners.

From Author:

AI is everywhere, but why should you care, as a web developer? Join Jason Mayes, Web AI Lead at Google, who will get you on track by demystifying common terminology ensuring no one is left behind, and then take you through some of the latest machine learning models, tools, and frameworks you can use right in the browser via JavaScript to help you bring your creative web app ideas to life for almost any industry you may be working in. By moving AI to the client side, there is no reliance on the server after the page load, bringing you benefits such as privacy, low latency, offline solutions, and lower costs which will be of growing importance as the field develops. This talk is suitable for everyone with a curiosity for web and machine learning, so come along and learn something new to put in your web engineering toolkit for 2024.

This talk was presented at JSNation 2024.

FAQ

Jason Mayes is the Web AI Lead at Google.

Web AI is the art of using machine learning models client-side in a web browser, running on your own device's processor or graphics card using JavaScript and surrounding web technologies like WebAssembly and WebGPU for acceleration.

Web AI runs machine learning models on the client side in the web browser, using the device's processor or graphics card, whereas Cloud AI executes models on the server side and requires an active internet connection to access the server's API.

Benefits of using Web AI include enhanced privacy, the ability to run offline, low latency, lower costs, and a frictionless user experience.

Web AI can operate offline on the device itself, making it possible to perform tasks even in areas with low or no connectivity after the page has loaded.

Practical examples include remote physiotherapy using browser-based pose estimation models, product placement verification in supermarkets, background blurring in video conferencing, and real-time facial feature recognition for augmented reality.

TensorFlow.js is a JavaScript library for training and deploying machine learning models in the browser and on Node.js.

Popular Web AI models include object recognition, text toxicity detection, selfie depth estimation, face mesh, hand tracking, and large language models.

Web AI can improve accessibility by using models to automatically fill captions for images that lack alt text, among other applications.

Web AI can work in any browser that supports WebAssembly or WebGPU, allowing it to run on a wide range of devices including mobile phones.

Jason Mayes
32 min
13 Jun, 2024

Comments

  • Francisco Baptista
    Great keynote Jason!! TeamSportz is expanding our use of pose estimation to deliver exercises to help athletes recover from injuries. We should talk

Video Transcription

1. Introduction to Web AI in JavaScript

Short description:

I'm Jason Mayes, web AI lead at Google. Start investigating machine learning on the client side in JavaScript to gain superpowers in your next web app. Web AI is the art of using ML models client-side in a web browser, different from cloud AI. AI will be leveraged by all industries in the future. Upskill in this area now for unique benefits in JavaScript.

I'm Jason Mayes, web AI lead here at Google. Today I've come to you as a fellow JavaScript engineer to share with you a story about why you should start investigating machine learning on the client side in JavaScript to gain superpowers in your next web application.

First, let's formally define what I mean by Web AI, a term I coined back in 2022 to distinguish it from the cloud-based AI systems that were popular back then. Web AI is the art of using machine learning models client-side in a web browser, running on your own device's processor or graphics card, using JavaScript and surrounding web technologies like WebAssembly and WebGPU for acceleration. This is different from cloud AI, where the model executes on the server side and is accessed via some sort of API instead, which means you need an active internet connection to talk to that API at all times to use those advanced capabilities.
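To make the "WebAssembly and WebGPU for acceleration" part concrete, here is a minimal feature-detection sketch. The function name `pickBackend`, the `env` parameter, and the returned labels are assumptions made for illustration, not part of any library mentioned in the talk. It relies only on two standard facts: browsers with WebGPU expose `navigator.gpu`, and WebAssembly is exposed as a global `WebAssembly` object.

```javascript
// Pick the fastest acceleration path the environment appears to offer.
// `env` stands in for the browser globals so the logic can be exercised
// anywhere; in a page you might pass { navigator, WebAssembly }.
function pickBackend(env) {
  // Browsers that support WebGPU expose it as navigator.gpu.
  if (env.navigator && env.navigator.gpu) return 'webgpu';
  // WebAssembly is exposed as a global object with an instantiate() function.
  if (env.WebAssembly && typeof env.WebAssembly.instantiate === 'function') {
    return 'wasm';
  }
  // Plain JavaScript on the CPU is the slowest path but needs no extra support.
  return 'cpu';
}

// Hypothetical environments, from most to least capable.
console.log(pickBackend({ navigator: { gpu: {} }, WebAssembly }));
console.log(pickBackend({ navigator: {}, WebAssembly }));
console.log(pickBackend({}));
```

The presence of `navigator.gpu` only shows that the API exists. A real application would probably also request an adapter and fall back if none is available.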

As web developers and designers, we have the privilege of working across industries when we work with our customers. In a similar manner, artificial intelligence is likely to be leveraged by all of those industries in the future to make them more efficient than ever before. In fact, in a few years' time, customers will expect AI features in their next product to keep up with everyone else who is already doing it. So now is the perfect time to upskill in this area as you can get unique benefits when doing this on-device in JavaScript.

2. Advantages of Client-side AI in Web Applications

Short description:

Privacy: No data needs to be sent to the server for classification, protecting user's personal data. Ability to run offline on the device itself. Low latency enables real-time model execution. Lower cost by running AI directly in the browser. Frictionless experience for end users. Reach and scale of the web. Growing usage of client-side AI libraries. Real-world example of video conferencing solution with background blur. Cost savings of using client-side AI in video segmentation.

What are those? Well, first up is privacy. No data from things like the camera, the microphone, or even text needs to be sent to the server for classification, which protects the user's personal data. A great example of this is shown here by IncludeHealth, who use browser-based pose estimation models to perform remote physiotherapy without sending any imagery to the cloud. Instead, only the resulting range of motion and statistics from the session are sent, allowing the patient to perform the check-up from the comfort of their own house.
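As a rough illustration of "only the resulting range of motion and statistics are sent", the sketch below reduces a session of pose keypoints to a few numbers on the device. This is not IncludeHealth's implementation. The `{ x, y }` keypoint shape, the `shoulder`/`elbow`/`wrist` field names, and both function names are assumptions chosen for the example.

```javascript
// Angle in degrees at joint b, formed by the segments b->a and b->c.
// Each point is { x, y } in image coordinates (an assumed keypoint shape).
function jointAngle(a, b, c) {
  const v1 = { x: a.x - b.x, y: a.y - b.y };
  const v2 = { x: c.x - b.x, y: c.y - b.y };
  const dot = v1.x * v2.x + v1.y * v2.y;
  const cross = v1.x * v2.y - v1.y * v2.x;
  // atan2(|cross|, dot) gives the unsigned angle between the two vectors.
  return Math.atan2(Math.abs(cross), dot) * 180 / Math.PI;
}

// Reduce a whole session of frames to the few numbers that leave the device.
function summariseSession(frames) {
  const angles = frames.map((f) => jointAngle(f.shoulder, f.elbow, f.wrist));
  const minAngle = Math.min(...angles);
  const maxAngle = Math.max(...angles);
  return {
    frames: angles.length,
    minAngle,
    maxAngle,
    rangeOfMotion: maxAngle - minAngle,
  };
}

// Two hypothetical frames: a straight arm, then the elbow bent at a right angle.
const sessionFrames = [
  { shoulder: { x: 0, y: -1 }, elbow: { x: 0, y: 0 }, wrist: { x: 0, y: 1 } },
  { shoulder: { x: 0, y: -1 }, elbow: { x: 0, y: 0 }, wrist: { x: 1, y: 0 } },
];
const summary = summariseSession(sessionFrames);
console.log(summary);
```

Only the small `summary` object would need to be uploaded. The camera frames and raw keypoints never leave the browser.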

You also have the ability to run offline on the device itself, so you can even perform tasks in areas of low or no connectivity at all after the page load. Now, you might be wondering why a web app would need to do all that stuff offline. Well, in this great example by Hugo Zanini, he performs a product placement verification task using a web app in supermarkets for a retail customer he was working with. We all know how bad the Wi-Fi connections are in supermarkets. He leveraged TensorFlow.js right in the browser, so the app can work entirely offline and then sync the data back when connectivity returns later on.
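The "work offline, sync later" pattern can be sketched as a small queue. This is an assumption about how such an app might be structured, not a description of Hugo Zanini's code. The class name `OfflineQueue` and the `send` callback are hypothetical.

```javascript
// Holds results captured while offline and uploads them once a connection exists.
class OfflineQueue {
  constructor() {
    this.pending = [];
  }

  // Record a result locally; nothing touches the network here.
  enqueue(record) {
    this.pending.push(record);
  }

  // Try to upload everything in order. `send` is a caller-supplied async
  // function (for example a fetch() wrapper) that rejects when an upload fails.
  // Records that fail stay queued for the next attempt.
  async flush(send) {
    let sent = 0;
    while (this.pending.length > 0) {
      try {
        await send(this.pending[0]);
      } catch (err) {
        break; // Still offline or server unreachable: stop and retry later.
      }
      this.pending.shift();
      sent += 1;
    }
    return sent;
  }
}

// Hypothetical usage: one on-device result, uploaded by a stand-in sender.
const queue = new OfflineQueue();
queue.enqueue({ shelf: 'A1', placementOk: true });
queue
  .flush(async (record) => console.log('uploaded', record))
  .then((count) => console.log(`${count} record(s) synced`));
```

In a browser, `flush` could be triggered from the window's `online` event, and `pending` could be persisted in IndexedDB so that queued results survive a page reload.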

Next is low latency, which can enable you to run many models in real time, as you don't have to wait for the data to be sent to the cloud and then get an answer back again. As such, our body pose and segmentation models, for example, can run at over 120 frames per second on a laptop with a mid-range GPU, with great accuracy, as you can see on this slide. You've also got lower cost, as you don't need to hire and keep running expensive cloud-based GPUs 24/7, which means you can now run generative AI directly in the browser, like this large language model on the left-hand side, without breaking the bank. And we're seeing production-ready web apps benefit from significant cost savings too, like the example on the right for advanced video conferencing features such as background blurring.

And even better, you can offer a frictionless experience for your end users, as no install is required to run a web page. Just go to a link and it works. In fact, Adobe did exactly that here with Adobe Photoshop web, enabling anyone anywhere to use their favourite creative features on almost any device. When it comes to the object selection tool shown on this slide, embracing client-side machine learning can provide Adobe's users with a better experience by eliminating that cloud server latency, resulting in faster predictions and a more responsive interface. And on that note, it also means you can leverage the reach and scale of the web itself, with over six billion browser-enabled devices capable of viewing your creation. So whether you're levelling up your next YouTuber livestream to become a different persona, capturing detailed facial movements to drive a game character using nothing more than a regular webcam, all client-side in the browser, or using the latest in generative AI, where you can even run diffusion models in the web browser at incredible speeds with new browser technologies like WebGPU, now enabled by default in Chrome and Chromium-based browsers, things are about to get really exciting with regards to what we can expect from a web app in the future.

So even if you're not yet using client-side AI, I want to illustrate how fast this is growing and why you should take a look. I've only got statistics for Google's Web AI libraries, so worldwide usage is probably higher than this, but in the past two years alone we've averaged 600 million downloads per year of TensorFlow.js and MediaPipe-based web models and libraries, bringing us to over 1.2 billion downloads in that time for the first time ever, and we're on track to go even higher in 2024 as usage continues to grow. So now is really the time to be part of this growth yourselves. In fact, we've seen this steady growth since 2020 as more and more developers just like you have started to use Web AI in production use cases. And speaking of real-world examples, let's take a deeper dive into a typical video conferencing solution.

There goes my notifications. Many of these services provide background blur or background replacement in video calls for privacy. So let's crunch some hypothetical numbers for the value of using client-side AI in a use case like this. First, a webcam typically produces video at 30 frames per second. So assuming the average meeting is about 30 minutes in length, that's 54,000 frames you have to process every single meeting. Now, if you have a popular service, you might have a million meetings per day, which means 54 billion segmentations every single day. Even if we assume a really ultra-low cost of just $0.0001 (one hundredth of a cent) per segmentation, that would still be $5.4 million a day that you would have to spend on the cloud, which is around $2 billion a year just for those GPU costs.
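The back-of-the-envelope estimate above can be reproduced in a few lines. The function and parameter names are illustrative. The per-segmentation cost is taken as $0.0001, which is the value at which the quoted totals of $5.4 million a day and roughly $2 billion a year come out.

```javascript
// Hypothetical cloud bill for segmenting every frame of every meeting.
function cloudSegmentationCost({ fps, meetingMinutes, meetingsPerDay, dollarsPerSegmentation }) {
  const framesPerMeeting = fps * meetingMinutes * 60; // frames in one meeting
  const segmentationsPerDay = framesPerMeeting * meetingsPerDay;
  const dollarsPerDay = segmentationsPerDay * dollarsPerSegmentation;
  return {
    framesPerMeeting,
    segmentationsPerDay,
    dollarsPerDay,
    dollarsPerYear: dollarsPerDay * 365,
  };
}

// The assumptions used in the talk: 30 fps, 30-minute meetings, 1M meetings/day.
const estimate = cloudSegmentationCost({
  fps: 30,
  meetingMinutes: 30,
  meetingsPerDay: 1000000,
  dollarsPerSegmentation: 0.0001,
});
console.log(estimate);
```

With these inputs the daily figure comes to about $5.4 million and the yearly figure to about $1.97 billion, which is the "around $2 billion" quoted above.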

