In this talk, we'll build our own Jarvis using Web APIs and langchain. There will be live coding.
This talk was presented at JSNation 2023; check out the latest edition of this JavaScript conference.
The project uses the Web Speech API for speech recognition, OpenAI's GPT-3.5 Turbo model for processing text, and the Speech Synthesis API for converting text to speech.
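The middle step of that pipeline is an ordinary HTTP request. As a rough sketch (not code from the talk), sending the recognized text to OpenAI's Chat Completions endpoint could look like this; the `apiKey` parameter and the system prompt are assumptions, not something the talk specifies:

```javascript
// Build the request body for OpenAI's Chat Completions endpoint.
// The system prompt here is illustrative, not from the talk.
function buildChatRequest(userText) {
  return {
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "You are a helpful voice assistant." },
      { role: "user", content: userText },
    ],
  };
}

// Send the recognized speech to the model and return its reply.
// Requires a fetch-capable runtime (browsers, Node 18+).
async function askOpenAI(userText, apiKey) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildChatRequest(userText)),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Keeping the body builder as its own pure function makes the request easy to inspect or test without an API key.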
Tauri is a tool that allows you to create native desktop applications using web technologies like HTML and JavaScript, with Rust as the backend. It can be used to turn the AI assistant into a native desktop app.
Tejas Kumar is the founder of a developer relations consultancy that helps developer-oriented companies build and maintain strong relationships with developers through strategic discussions, mentorship, hiring, and hands-on execution.
Although the AI assistant uses non-standard APIs requiring prefixes, it could potentially be used in production with custom grammars and further development.
The consultancy operates on the philosophy of 'DevRel, not DevSell,' emphasizing building genuine relationships with developers rather than trying to sell them products.
The consultancy works on projects that involve building tools and technology for fun and learning, such as creating a voice-activated AI assistant using web APIs and JavaScript.
The main goal of the AI assistant project is to have fun while learning about JavaScript and AI, rather than building a product to sell.
You can support Tejas Kumar's DevRel work by following him and engaging with his content.
The project uses web APIs, JavaScript, Vite for the dev server, and Visual Studio Code for coding.
Tejas Kumar's consultancy helps developer-oriented companies build great relationships with developers through high-level strategic discussions, mentorship, hiring, and hands-on execution such as writing documentation and giving talks.
The demo uses Chrome because the Speech Recognition API works reliably there, although it can be made to work in other browsers with differing implementations.
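Because Chrome ships the recognition API under a `webkit` prefix, a small feature-detection shim keeps the rest of the code browser-agnostic. A minimal sketch (the helper name is ours, not from the talk):

```javascript
// Return whichever SpeechRecognition constructor the environment
// provides, or null if speech recognition is unsupported.
function getSpeechRecognition(globalObj) {
  return (
    globalObj.SpeechRecognition ||
    globalObj.webkitSpeechRecognition || // Chrome's prefixed version
    null
  );
}

// In a browser, pass `window`; the guard lets this file load elsewhere.
const Recognition =
  typeof window !== "undefined" ? getSpeechRecognition(window) : null;

if (typeof window !== "undefined" && !Recognition) {
  console.warn("Speech recognition is not supported in this browser.");
}
```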
Hi, I'm Tejas Kumar, and I run a small but effective developer relations consultancy. What that means is we help other developer oriented companies have great relationships with developers. And we do this through high level strategic discussions, and mentorship, and hiring. Or we do it through low level, hands on execution, like we literally sometimes write the docs, do the talks, etc.
In that spirit, it's important for us to stay in the loop, and be relevant and relatable to developers in order to have great developer relationships. And sometimes to do that, you just have to build stuff. You see, a lot of conferences these days are a bunch of DevRel people trying to sell you stuff, and we don't like that. It's DevRel, not DevSell.
And in that spirit, we're not going to sell you anything here, we're just going to hack together. The purpose is to have some fun, to learn a bit, and so on. What we're gonna do in our time together is build a voice-activated AI assistant, like Jarvis from Iron Man, using only web APIs, just JavaScript. We'll use Vite for a dev server, but that's it, this works. We're gonna be using some non-standard APIs that do require prefixes and such, but if you really wanted to, you could use this in production: you could supply your own grammars and so on. The point today, though, is not that, it's to have fun while learning a bit and also vibing a little bit, all in the spirit of celebrating JavaScript and AI.
We're going to use the Web Speech API for speech to text and the Speech Synthesis API for text to speech. We'll give the text to OpenAI's GPT-3.5 Turbo model and then speak the response. It's a straightforward process using browser APIs that have been around for a while.
So with that, let's get into it by drawing a plan in tldraw. We're gonna go to tldraw, and what do we want to do? Well, first we want speech to text. This is using the Web Speech API. From there, we want to take that text and give it to OpenAI, the GPT-3.5 Turbo model. From there, we want to speak. So text to speech from OpenAI's response. This is the plan. We want to do this with browser APIs. We want to reopen the microphone after the model talks and have it loop back to the start. This is what we want to do. Let's draw some lines. So it's really just speech to text, an AJAX request, and text to speech. Not necessarily hard. The speech recognition API we're going to use was actually introduced in 2013; it's been around for a while. And then there's the Speech Synthesis API. Both of these exist in JavaScript in your browser runtime. They're just ready to use. What we're going to do is use them to fulfill this diagram.
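The diagram above can be sketched as a single loop: listen, send the transcript to the model, speak the reply, then reopen the microphone. This is our own rough sketch, not the talk's live-coded version; `askOpenAI` stands in for a helper that calls the Chat Completions endpoint, and the browser-only APIs (`SpeechRecognition`, `speechSynthesis`) mean this runs in Chrome, not Node:

```javascript
// Pull the latest final transcript out of a SpeechRecognitionEvent's
// results list (a list of result alternatives).
function latestTranscript(results) {
  const last = results[results.length - 1];
  return last[0].transcript.trim();
}

// Wire up the full loop from the diagram. `askOpenAI(text)` is a
// hypothetical async helper returning the model's reply as a string.
function startAssistant(askOpenAI) {
  const Recognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new Recognition();
  recognition.lang = "en-US";

  recognition.onresult = async (event) => {
    const text = latestTranscript(event.results);
    const reply = await askOpenAI(text);

    const utterance = new SpeechSynthesisUtterance(reply);
    // Reopen the microphone once the assistant finishes speaking,
    // closing the loop in the diagram.
    utterance.onend = () => recognition.start();
    speechSynthesis.speak(utterance);
  };

  recognition.start();
}
```

Restarting recognition in the utterance's `onend` handler (rather than immediately) keeps the microphone from picking up the assistant's own voice.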