The recommended way is WebGPU. WebGPU is a browser API that lets JavaScript execute operations directly on the GPU. And then the dtype is basically the quantization, so the compression that you want to use with your model. In this case we're using full-precision fp32, but it could also be quantized to 8-bit, 4-bit, whatever you want.
And then you have the pipe. You take any input, run it through the model, and you get the output. The input and output types depend heavily on the task you're going to use.
OK, let's see how we can solve my problems with Transformers.js. Let's start with listening and talking. For the first part, we have a task called automatic speech recognition, which is basically transcribing audio into text.
So we can use the Whisper Tiny model, a very small transcription model from OpenAI. We create the transcriber, take any audio input, run it through, and we get back the text output.
On the other end, we can use Kokoro. Kokoro is a very small model that synthesizes text into speech. To run it, we can use a library called Kokoro.js, a small abstraction on top of Transformers.js. It was also written by Joshua, so the API is very similar.
What we can do here is take any text, define a voice, and it will return the audio. OK, let's try that. I have a little pipeline that records what I say, transcribes it, synthesizes it, and plays it back.
So I can say, hello, how are you? And if the audio works, hello, how are you? Perfect. So we now have an end-to-end pipeline, but the problem is, it's not actually intelligent. So how can we add intelligence to that pipeline? The closest thing we have to intelligence right now is large language models.
Now, there are quite a lot of large language models out there. Most of us use models in this upper left corner. We have GPT, we have Gemini, we have Claude. But those are closed-source models.