AI Right in the Browser With Chrome’s Built-in AI APIs


In this talk, Thomas Steiner from Google Chrome's AI team dives into the various built-in AI APIs that the team is currently exploring. First, there's the exploratory Prompt API that allows for free-form interactions with the Gemini Nano model. Second, there are the different task APIs that are fine-tuned to work well for a particular task like translations, language detection, summarization, and writing or rewriting. After introducing the APIs, the final part of the talk focuses on demos, applications, and use cases unlocked through these APIs.

This talk was presented at JSNation 2025. Check out the latest edition of this JavaScript conference.

FAQ

You can contact Thomas Steiner on social networks at tomayac or via email.

The polished version of the talk was presented at Google I/O and can be accessed via a QR code provided during the talk.

The jsnation-rocks.glitch.me website is used for demonstrating AI APIs in Chrome as part of the talk.

The language of the demo content was detected as English with 99% confidence.

The speaker is Thomas Steiner.

Thomas Steiner is currently based in Google Spain.

The main topic is AI in the browser using Chrome's built-in AI APIs.

The Prompt API is used for creating a multimodal detector that can analyze images and provide true or false classifications.

Thomas Steiner demonstrates the language detector, translate API, summarizer API, and prompt API in Chrome.

The Summarizer API is used to summarize text into bullet points for easier reading.

Thomas Steiner
31 min
12 Jun, 2025

Comments

  • Va Da: Browser local LLMs is the future!
Video Summary and Transcription
Thomas Steiner discusses AI in Chrome, language detection, translation, and summarization using Chrome APIs. He troubleshoots slow performance with the Summarizer API and introduces the Prompt API for text formatting. The development of a multimodal image detector, model interaction enhancements, and utilizing image recognition for prompt responses are demonstrated. The exploration of multimodal conversations with the Prompt API, seamless conversations with PWA, and cross-browser compatibility for Chrome APIs are highlighted.

1. Thomas Steiner on AI in Chrome

Short description:

Thomas Steiner talks about AI in the browser with Chrome's built-in AI APIs. He mentions a polished version of the talk presented at Google I/O and encourages the audience to view it. The JS Nation version is described as uncut with possible glitches and issues. A live demo of a site built by Thomas is showcased on glitch.me.

I'm here, Thomas Steiner. Well, Google Germany is no longer true; it's Google Spain now, I just recently relocated. But anyway, AI right in the browser with Chrome's built-in AI APIs is what I want to talk about. If you have questions after the talk, there's obviously the Q&A. But you can also reach me at tomayac on all the social networks, except X. I refuse to be on X anymore. So if there's anything you have after the talk, just feel free to reach out. There are ways to find me on the other social networks, or just email. Everything works.

All right, and with that, let's get going. I want to tell you that there is actually a polished version of this talk. So at Google I/O this year, I had a prerecorded talk called Practical Built-in AI with Gemini Nano in Chrome. As a prerecorded talk, and I'm not sure if any one of you has ever had the pleasure of doing a prerecorded talk, they make sure that everything is right, there are no glitches, there are no ums. There are a few, but they cut them out. If you want to see the polished version of this talk, this is the QR code that you want to scan. And I highly recommend you do.

But this JS Nation version is like the uncut, rock and roll, rough cut. So there will be glitches, there will be ums, there will be things that might go wrong. At Google I/O for the live talks, they established a rule this year, where if a demo works, you applaud. If a demo doesn't work, you applaud harder. So if anything goes wrong, please applaud harder. And with this, I just actually want to leave the slides and jump right into my live demo here. So I built a little site, jsnation-rocks.glitch.me. It's probably one of the last Glitch demos that I have, because Glitch is dying. It's super sad. But luckily, it's still up today. And I just took my bio that I used for the presentation here. So you can see who I am and blah, yada yada.

2. Chrome APIs for Language Detection

Short description:

The speaker demonstrates Chrome APIs for language detection by creating and using a language detector. The detector is utilized to identify the language of user-generated content with an example of detecting English with confidence levels.

So I want to show you some of the APIs that we have in Chrome now. You can test them. They're in origin trial, so you can test them on actual real users. And the first thing that I want to show you is, well, let's assume we don't know what the language of this is. Let's assume this was just some user-generated content that we got in. So we need some sort of language detector.

Detector equals await LanguageDetector dot create. So we just create us a language detector, so we can see what we can do with it. So it has expected input languages. It has an input quota, which is infinity. So we can throw as much as we want at it.

So let's actually do something with this language detector. So languageDetector dot, not destroy, but detect. And then I give it my bio. And the bio is really just this div here. So I just made it a global variable. So whenever I say bio, it's the contents of this div. And this is all async, so we need to await it.
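For reference, a minimal sketch of roughly what is being typed into the DevTools console here, assuming the origin-trial shape where LanguageDetector is exposed as a global and bio refers to the div holding the speaker bio (names and exact surface may still change while the API is in development):

```js
// Create a language detector and inspect what it supports.
const detector = await LanguageDetector.create();
console.log(detector.inputQuota);              // Infinity in this demo
console.log(detector.expectedInputLanguages);  // as mentioned above; exact property shape is an assumption

// Detect the language of the bio text; the call is async.
const results = await detector.detect(bio.innerText);
console.log(results); // e.g. [{ detectedLanguage: 'en', confidence: 0.99 }, ...]
```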

3. Language Detection and Translation in Chrome

Short description:

The speaker demonstrates the language detection process in Chrome, identifying English with confidence levels and exploring translation capabilities between English and Dutch.

And then we can see we get back an array with two entries. And if you open the first one, we can see with 99% confidence, the detector thinks that this is English. And then the other entry is, well, with the remaining whatever, 0, 0, 0, it thinks that this is not determinable. So that's great. We have our first demo that worked.

So let's actually just store this in a variable. So we have const lang equals this. And we need to re-create this and take the first entry. And then we take the detectedLanguage. So hopefully this works. So now in lang we have English. All right, so first demo worked. That's kind of boring-ish.
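In code, that step from the demo looks roughly like this (a sketch; the result array is ordered by confidence, so the first entry is the best guess):

```js
// Keep just the language tag of the most confident detection result.
const lang = (await detector.detect(bio.innerText))[0].detectedLanguage;
console.log(lang); // 'en'
```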

So what can I do with this? Well, of course, you can translate. And we in Chrome now have a Translate API. So we have const translator equals await Translator dot create. And I can just create a translator. Here I need to give it some additional parameters. So I need to tell it what is the source language. And it's, of course, the lang that we had before. And we need a target language.

4. Translation and Summarization with Chrome APIs

Short description:

The speaker demonstrates the simplicity of using the translation API in Chrome for English and Dutch languages and introduces the Summarizer API for creating bullet point summaries.

And of course, we are here in the Netherlands, so we make it NL. So now we have a translator from our detected input language, which was English, to Dutch. So now we can, of course, translate something with it. And again, I give it my bio dot innerText. And again, this is async. And I will not try to pronounce this, but hopefully you can see that it is something like Dutch. It's using the same models that we have in Chrome for the browser-level native translation. So the same neural network translation models that we have had for many years. So yeah, you can see this is super easy to use. Just two lines of code, and you have a translation API directly powered in the browser.
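A sketch of those two lines, assuming the origin-trial Translator global and the lang variable from the detection step:

```js
// Create a translator from the detected language to Dutch.
const translator = await Translator.create({
  sourceLanguage: lang, // 'en', detected above
  targetLanguage: 'nl',
});

// Translate the bio; translate() resolves with the translated string.
const dutchBio = await translator.translate(bio.innerText);
console.log(dutchBio);
```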

Thank you. Thank you. This is the second API. And now I want to show you one other, third API. So here, my bio, that's like two paragraphs. But some of you are generation TikTok, so two paragraphs, oh my god, that's super complex, right? So can I just summarize this and make it bullet points? And it turns out we can, because in the browser now we have the Summarizer API. So we can say, summarizer equals await, and you can see already the pattern here, Summarizer dot create. And now we have a summarizer. And you can see already there's, well, let me actually expand it. You can see already there is the format, which is markdown. There's the length, which is short. By default, there is the type, which is key points. So let's actually try it and hope that it summarizes my text here. So summarizer dot summarize, bio dot innerText. And as always, we need to await the result. And if this works, we should get a summary of this.
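A sketch of the Summarizer calls from the demo; the options shown in the console (key points, markdown, short) are spelled out explicitly here, assuming the origin-trial Summarizer global:

```js
// Create a summarizer; these options mirror the defaults seen in the demo.
const summarizer = await Summarizer.create({
  type: 'key-points',
  format: 'markdown',
  length: 'short',
});

// Summarize the bio into a few markdown bullet points.
const summary = await summarizer.summarize(bio.innerText);
console.log(summary);
```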

5. Troubleshooting Slow Summarizer API in Chrome

Short description:

The speaker encounters slow performance issues with the Summarizer API in Chrome Canary but eventually succeeds after troubleshooting.

It was slow in the last demo that I tried, so let's see. All of this is running in Chrome Canary today. So hopefully they didn't break it today. But let's give it a little bit more time.

Come on, Summarizer. Summarize quicker. Yes. Yes. So it's hanging. Sometimes it works if I just reload. So let's actually do this. So let's go back to what we had before. So the summarizer, and then summarize the innerText. Let's see. Still nothing? Come on.

OK. You know what? Let's just restart Canary. This also sometimes works. And you can see I made the switch to Tahoe. So hopefully there's nothing that interplays here with macOS Tahoe, or macOS 26, as they call it now. OK, so let's see. Now with a completely fresh session. So we need to create the summarizer and then summarize. One last chance to work. No, it refuses to work today. OK, so. Oh! That's amazing. OK, so it took a lot longer than usual, but it did work. Thank you very much for clapping harder. This really makes it a lot easier.

6. Introducing Chrome's Prompt API

Short description:

The speaker discusses the formatting challenges of Markdown text and introduces the Prompt API in Chrome, demonstrating its use and internal workings.

So let's see. This is actually Markdown, so it's not super easy to parse. But we can see there are bullet points. And the first bullet point goes from here to here. Then a new line, then we have the next bullet point that goes from here to here. And then the last bullet point. So I just summarized my two paragraphs into three bullet points that are easy to digest for people who don't like reading complex text like two-paragraph bios.

All right, so with this moment of suspense here, this moment of tension, everything has worked, eventually. But there's more. So I want to show you another API that we are working on in Chrome. And it's the Prompt API. And for this, I can actually close my Developer Tools, so I have a little bit more space here. So I can see down there, there's a field where I can enter stuff. So I can say, I don't know, what can I do in Amsterdam if I have one day? I can see the LLM does the LLM thing. Well, it's not formatted properly. So you can see it starts with a heading and then some content and then some bold text and Markdown and so on. So that's kind of interesting.

But you've seen this before. What I want to show you is something that makes this a little bit more useful. But before we go there, let me actually show you how all of this works internally. We have HTML here, no framework. I listened to Alex's talk. So it's all vanilla JavaScript, vanilla HTML, vanilla CSS, CSS in CSS, not CSS-in-JS. We have a basic form. We have a basic label, an input, a button that is of type submit. And we have an output here. Then this simple HTML is hooked up to, for now, script basic. So what we do in script basic is we get references to the DOM, so form, input, and output. Then we have an event listener for submit.

7. Developing Multimodal Image Detector

Short description:

The speaker explains event listener usage, demonstrates the prompt streaming method for AI, and introduces a JSON Schema for response constraints.

So some of you might see this code for the very first time. So that's how you have an event listener in JavaScript. So we listen for the submit event, prevent the default. And then comes the actual secret here. So we have the input, take the value, trim it. If there's something left, sorry, if there's nothing left, so if there's an empty input string, essentially, we return. But if we continue, we reset the innerHTML of our output.

And then comes the actual AI code. So here, we have LanguageModel dot create. And you've seen the pattern before. So the thing dot create gives you the language model in this case. And then before, I showed you the non-streaming variants. So I showed you translate instead of translateStreaming. I showed you summarize instead of summarizeStreaming. But here, I want to show you the promptStreaming method, which gives you a readable stream. And you can then just iterate over each of the chunks of the stream and append them to the HTML. So that's super easy to use. And you have seen the result here.
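A condensed sketch of what that basic script does, assuming form, input, and output are the elements described above and LanguageModel is the Prompt API global from the origin trial (names are assumptions based on the description, not the demo's exact file):

```js
// Grab references to the form, the text input, and the output element.
const form = document.querySelector('form');
const input = document.querySelector('input');
const output = document.querySelector('output');

form.addEventListener('submit', async (event) => {
  event.preventDefault();
  const prompt = input.value.trim();
  if (!prompt) return;    // Nothing to do for an empty prompt.
  output.innerHTML = '';  // Reset the output area.

  // Create a session and stream the response chunk by chunk into the page.
  const session = await LanguageModel.create();
  const stream = session.promptStreaming(prompt);
  for await (const chunk of stream) {
    output.append(chunk);
  }
});
```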

But next, I want to slowly develop this talk and make it, as I said, something more useful. So I want to work on making this a multimodal detector of images where you could ask questions about the image. And not just ask questions about the image, but also have a really true or false classifier. So let me hook up the next script, which is this guy here, JSON Schema. Let me just take the file name and then go here and put that over here. So what we have now is we've introduced a JSON Schema. So everything from here to here is the same. But now we have a response constraint. And the simplest possible JSON Schema is just this object that says, my response should be true or false. And then you can see everything else is the same just here. For the promptStreaming call, we give it the response constraint.
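A sketch of that change, assuming the constraint is passed as a responseConstraint option on promptStreaming (the exact option name is an assumption about the origin-trial surface; session, prompt, and output come from the previous sketch):

```js
// The simplest possible JSON Schema: the model must answer with a boolean.
const responseConstraint = { type: 'boolean' };

// Everything else stays the same; only the options argument is new.
const stream = session.promptStreaming(prompt, { responseConstraint });
for await (const chunk of stream) {
  output.append(chunk);
}
```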

8. Enhancing Model Interaction with Image Prompts

Short description:

The speaker demonstrates using true or false responses for model interaction and combines text and image prompts for enhanced usability.

So now, instead of having a long text, the model will be forced to just respond with true or false. So let's actually try this now. So I can ask something like, is Amsterdam located in Europe? Answer with true or false. And then I can submit that. And the model, thankfully, responds correctly with true down there. So you can see, well, this is starting to become useful because now, instead of having this, oh yes, Amsterdam is located in Europe, blah, blah, yada, yada, LLM kind of response, you boil it down to just a true or false value, which, of course, is a lot easier to digest and to parse if you work with generated data.
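Because the constrained output is just the string true or false, it can be parsed straight into a boolean and branched on. A small hedged example, using the non-streaming prompt() for brevity:

```js
// Ask a yes/no question and force a boolean answer.
const answer = await session.prompt(
  'Is Amsterdam located in Europe? Answer with true or false.',
  { responseConstraint: { type: 'boolean' } }
);

// The output conforms to the schema, so it parses into an actual boolean.
const isInEurope = JSON.parse(answer); // true
if (isInEurope) {
  console.log('Yes, Amsterdam is in Europe.');
}
```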

All right, so that's OK useful. But next, I want to show you how you can make this really useful. And you can see this image here. So this is my usual avatar that I have on most pages. So let me hook up the final script here, which is this one. Go back to here. And we take that script now. And let's have a look at what is happening here. So now we get a reference to the image. We still have the response constraint from before. But now our prompt becomes a little bit more complex because now what I'm doing is I'm combining a text prompt with an image prompt. So you can see here, I've just renamed my variable from prompt to textPrompt. But it's essentially still the same.

But then down there, you can see my prompt is now a little bit more complex. So this is the entire prompt. You can see it's an array with an object that has a role of user. And then a content, which is an array again. And then we have the two prompts here. So we have the text prompt, so type text and a type image. And just give it the reference to the image from the HTML. Everything down there is the same. You can see for now, I have commented out my response constraint. So now I can ask questions about this image. So I can say things like, what do you see here? And then Submit.
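A sketch of that multimodal prompt structure, following the shape described here (the { type, value } content items follow the Prompt API explainer; declaring expectedInputs for images at create time is an assumption that may be required for image input):

```js
// Reference to the avatar image in the HTML.
const img = document.querySelector('img');

// Sessions that accept images may need to declare that up front (assumption).
const session = await LanguageModel.create({
  expectedInputs: [{ type: 'image' }],
});

const textPrompt = input.value.trim();

// One user turn that combines a text part and an image part.
const fullPrompt = [
  {
    role: 'user',
    content: [
      { type: 'text', value: textPrompt },
      { type: 'image', value: img },
    ],
  },
];

const stream = session.promptStreaming(fullPrompt);
for await (const chunk of stream) {
  output.append(chunk);
}
```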

9. Utilizing Image Recognition for Prompt Response

Short description:

The model successfully identifies image content and provides true or false responses for specific queries, showcasing its practical applications.

And if everything works, the model should hopefully respond with something. Yes, now it starts. So it says, this is a selfie of a man standing on a beach looking at the ocean. He has short, brown hair. Well, that's a compliment. And a beard. Well, I was not well shaved that day. But you can see already it has understood what is on the image. So that's kind of cool.

But now let's bring our response constraint back in. And now we forced the model to just respond with true or false. So now we can go like here and say, I need to reload. And now I can say like, is there a person on this image? Question mark. I submit. And the model says true. So yes, you can see, well, this is useful. I've built a little classifier here that just tells me, is there a person in this image? Yes, I know.

But of course, I can make this more complex, like does the person on this image wear a hat? And we can see what the model says. It says true. So yes, amazing. It's down there, a little bit hard to see. But you can see how this can be useful. So if you work in any kind of space where you need to, let's say, before people upload an image, classify does this show, I don't know, a driver's license, you can ask the model, hey, does this depict a driver's license? And then respond with just yes or no. So before verifying manually that the picture is a driver's license, for example, if you need to verify people, you can just have this LLM do the work client side for you. Yeah, you can see how this can be actually very useful.
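Putting the image input and the response constraint together, here is a hedged sketch of the driver's-license idea mentioned above, as a client-side pre-check before upload (isDriversLicense and the element selector are hypothetical names, not from the demo):

```js
// Hypothetical helper: true if the model thinks the image shows a driver's license.
async function isDriversLicense(imageElement) {
  const session = await LanguageModel.create({
    expectedInputs: [{ type: 'image' }], // assumption, see the earlier sketch
  });
  const answer = await session.prompt(
    [
      {
        role: 'user',
        content: [
          { type: 'text', value: 'Does this image depict a driver\'s license? Answer with true or false.' },
          { type: 'image', value: imageElement },
        ],
      },
    ],
    { responseConstraint: { type: 'boolean' } }
  );
  return JSON.parse(answer);
}

// Usage: gate the upload flow on the client-side pre-check.
// if (await isDriversLicense(document.querySelector('#upload-preview'))) { /* allow upload */ }
```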

10. Exploring Multimodal Conversations with Prompt API

Short description:

Exploring Prompt API capabilities with multimodal input and context retention for ongoing conversations.

But yeah, response constraint in combination with multimodal input, that's pretty cool. I took the Prompt API and built something a little bit more complex. So it's like a mini Gemini or a ChatGPT, or what have you, in your browser. And it's an installable PWA, so we can actually just install it. So Chrome is doing the install thing. So now it is installed. And I can create a new conversation now.

And I can add an image. And I happen to have prepared one. So this is just Google making sure that I'm not uploading anything illegal. So you can see, this is the I amsterdam sign. So I can say, what do you see here? And the AI will look at the image and think about it. And yeah, you can see, this image shows the Amsterdam sign, blah, blah, blah, Rijksmuseum, and so on. So it has understood what is in the image.

I can say, what is the building in the background? And I can see it continues with the session. So it has the context of this. All right, so this is kind of neat. But let's actually kill this. So I close the application. It's entirely away. Let's actually turn off Wi-Fi, which is a scary thing to do. So I'm shaking. My Wi-Fi is off now. OK, so you can see it's off. Let's relaunch the PWA. And it has restored where I was before. And I can continue my conversation. I can say, what is being exposed there? And you can see it continues. Well, it's not super, super, super catching up with the context here. What is exposed in the museum? Well, it still sort of is hanging on the picture.
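Conceptually, the context retention shown here comes from reusing one session across turns. A minimal sketch; replaying saved turns as initialPrompts to restore a conversation after a relaunch is an assumption based on the Prompt API explainer, not something shown in the demo:

```js
// One session keeps the conversation context across prompts.
const session = await LanguageModel.create({
  expectedInputs: [{ type: 'image' }], // assumption, needed for the image turn
});

await session.prompt([
  {
    role: 'user',
    content: [
      { type: 'text', value: 'What do you see here?' },
      { type: 'image', value: document.querySelector('#iamsterdam') }, // hypothetical element id
    ],
  },
]);

// Follow-up questions reuse the same session, so earlier turns stay in context.
const followUp = await session.prompt('What is the building in the background?');
console.log(followUp);

// To restore a conversation after closing the PWA, previous turns could be replayed
// when creating a fresh session (assumption):
// const restored = await LanguageModel.create({ initialPrompts: savedTurns });
```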

11. Seamless Conversations with PWA

Short description:

Continuing seamless conversations offline with Gemini's humor and PWA functionality for local AI.

Well, it does finally give me some Dutch art, Dutch history, other art. So you can see I've completely closed the application, reopened it. I was offline. And I could continue the conversation here. And then, of course, if you go up here, you can see I can close this and then create a new conversation.

So tell me a joke. And you can see how it updates the conversation in real time. And then I come back here, switch conversations again, and is it open today? And it tells me it actually doesn't know. So it switches conversations easily. I can go back to my joke conversation here. Let's close this one, open that one. Let's close this one, open that one. And I can just say another. And all of this entirely offline.

Well, amazing. Gemini's humor is just great. So yeah, I'm entirely offline, everything working with an installable PWA that I have in my browser. And it's like my private, completely locally running AI right in the browser. So with that, let's actually go back to there. So this was that demo. So let me go back to here. And I think I need to turn on Wi-Fi again for the presentation to let me enter presentation mode. Reconnecting, let's go here. This is public. You can access the demo here, jsnation-rocks.glitch.me. This is also public. I gave you a little bit of time to take the QR code, but I also share my slides, so no need to take individual pictures if you don't feel like it.

12. Utilizing Chrome APIs and Early Preview

Short description:

Guide to Chrome APIs utilization, early preview program, cross-browser testing, and Origin Trial for API access.

And yeah, so you're asking yourself, how can I use all of this? Well, we have tons of documentation at developer.chrome.com/docs/ai/built-in. Scan this code to see all of it. So there's on the left-hand side a lot of links about all the different APIs that I've shown you. There are some more that are also still in development. And above all, please, if you're interested in this, join our early preview program.

So there's the link for that. There's a simple Google form. You fill that out. It's mostly for us to get information about the users of this. What are the possible use cases that you would have for this? So we will be sending out surveys. We will be asking you, how does this feel? Do you think this is the right way to address the problem? And this is in Chrome today.

Microsoft Edge also has an implementation not based on Gemini Nano, but based on Phi-4, so a Microsoft model, but the same API, essentially. So you can already today test on two different browsers. And some of those APIs are actually right now already in Origin Trial. So Origin Trial is a concept that we have in Chrome. Edge also has a similar concept, where a website can say, I got a special token; I put the token in my meta tag or send it as an HTTP header. And then Chrome, if it sees that token, unlocks the APIs for its users.
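For completeness, a sketch of wiring up an origin trial token from JavaScript; the meta tag and Origin-Trial response header are the documented mechanisms, and the token string here is a placeholder:

```js
// Inject an origin trial token at runtime; equivalent to
// <meta http-equiv="origin-trial" content="..."> in the page's <head>,
// or to sending an Origin-Trial HTTP response header from the server.
const otMeta = document.createElement('meta');
otMeta.httpEquiv = 'origin-trial';
otMeta.content = 'YOUR_ORIGIN_TRIAL_TOKEN'; // placeholder from the origin trial signup
document.head.append(otMeta);
```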

Q&A

API Testing and Chrome AI Usage

Short description:

Testing APIs in Origin Trial, presentation slides link, Chrome AI local model usage.

So you can test all of those APIs. Sorry, not all of those, the ones that are in Origin Trial, with actual real users, and get a feeling of how they would react to all of those APIs. Those APIs are shown in some more use cases, again, in the talk. So if you want to see the polished, non-JS-Nation rough-cut version, again, check out that talk.

And with that, bedankt. It was a pleasure to be here. The slides are here. Thank you. The slides are here. So goo.gle slash JSNation built-in AI, if you want to get the slides. So this is your one link that you want to get. And then everything else is linked from the deck.

And I think we have time for some questions now. Thank you very much, again. So the first one, let's start with an easy one. Does the Chrome AI use local models or remote instances of Gemini? OK. So in Chrome, we have Gemini Nano, which is a proprietary model built by Google, by the Gemini team, that on the first attempt to use one of those APIs is downloaded dynamically to the browser. So it's downloaded once, and then shared by all of the pages that want to use that API. So if you work on different applications, chances are, the more popular those APIs get, the model will already be there. Perfect. So basically, yeah, you can use it locally.
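A hedged sketch of how a page can check whether the shared model is already on the device and observe the one-time download, following the pattern in the built-in AI documentation (the availability values and the downloadprogress event are assumptions about the current origin-trial surface):

```js
// Check whether Gemini Nano is available, needs downloading, or can't run here.
const availability = await LanguageModel.availability();

if (availability !== 'unavailable') {
  const session = await LanguageModel.create({
    monitor(m) {
      // Fires while the shared model is downloaded the first time any page needs it.
      m.addEventListener('downloadprogress', (e) => {
        console.log(`Model download: ${Math.round(e.loaded * 100)}%`);
      });
    },
  });
  console.log(await session.prompt('Hello!'));
}
```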

Cross-Browser Compatibility and Model Performance

Short description:

Ensuring Chrome APIs work on all browsers, expanding to mobile devices with model support, device compatibility challenges, and future development goals.

OK, now there's a question that has so many votes. So let's go for this one. How do you guys make sure, if it's a Chrome-only API, that it also lands in other browsers? We don't want browser lock-in for APIs. Yes, so neither do I. So first, of course, I mentioned it before, it is now also starting to land in Edge. We do have a lot of conversations with the Mozilla team, with the Safari team, asking for them to join the conversation early on to get their opinions on what they think about those APIs and how we can put them on a standards track. So everything is working in the open, so there's a paper trail of all of those different steps, where we ask at the standards-positions repositories for the other browser vendors' opinions. And yeah, so in the end, of course, even within Chrome, so on Chrome for Android, today you can't use those APIs because Android doesn't support running the model yet. Our response so far is the Firebase AI Logic SDK, where essentially if the model is supported locally, it will use the local model, or else it will run on the server, which obviously comes with a privacy difference. So if before your pitch was that everything is processed locally, of course, when you go to the server, yeah, this can have an impact on whether you can still run your application with that. Of course, our objective is in the end to make the APIs run on all of our platforms, so including Android, including ChromeOS. So it works on macOS, Linux, and Windows. And yeah, so as I said, it's our objective. My personal objective, I always tell people, I'm not doing Chrome DevRel that much, I'm doing web DevRel. So I hope that this becomes an interoperable API relatively soon. Nice. That's nice to hear.

OK, so the next question is, what kind of sacrifices have to be made in terms of performance on these on-device models? Can we expect to be able to run these on mobile devices without issues at some point? So yeah, so the models are relatively big. So it's 4.39 gigabytes on disk, which means it requires a minimum of 6 gigabytes of RAM, GPU RAM, to be able to run. There is an approach called early exit, where essentially you have different layers of the LLM, and if the model determines that it has a good enough response, it can exit early, which means it's less computationally expensive. So there's some work to make it happen on mobile as well, so that you can just early-exit and get a little less quality compared to if you were to run through all of the layers. But eventually, the pipelines get better. The models get better. Devices slowly get better. So I know Alex, and he's fully right, it will be a long, long time until this will be runnable on a, what was the price tag, $250, I think, was the mobile phone cost average or something. So it will be a long time until we reach those devices. But for those devices, the Firebase SDK is there, and you can still use the cloud. OK, OK. Well, at least at some point, we will have something to run there.

Browser APIs and Model Safety Measures

Short description:

No license or subscription needed for browser APIs. Gemini Nano restrictions in various languages. Safety measures for model responses and language expansion plans.

But otherwise, APIs. So let's go for one other question, and then we can stop asking here. And you can always go to the spot and ask more. But for this one, do we need any license or subscription for this? No. So this is all coming with the browser for free. The compute happens at the client. So if you will, they pay with their battery and their GPU and CPU usage. But for you, it's free. There's no license that you need to sign. Other than for Gemini Nano, there are some general usage restrictions. You can't use Gemini Nano to ask it, I don't know, how to build a bomb or something. So this, by the way, is also the limitation for other languages so far. So you will notice it will very often tell you, oh, I can't respond to this yet, when you ask it to answer in French, for example. So languages are hard. So there are internal safety measures so that the model doesn't tell you, hey, I want to kill myself, is it more efficient to jump off a bridge or shoot myself in the head? So the model should, of course, not tell you what is more efficient. But getting this right, so making sure that the model blocks this, is actually right now the bottleneck for unlocking more languages. So slowly, as more languages get supported, we will target, obviously, by popularity. So it will be the classic FIGS, so French, Italian, German, Spanish, then Japanese, and so on. But we do hope that eventually we will have a lot more languages supported by those models. I hope so, too. Nice. So then now, thank you, Thomas, for all the amazing answers. Thank you. And you can go to the spot for asking more questions if you have any for him.
