No, I didn't. You might be wondering at this point: how hard is that to actually use? It's pretty straightforward. It fits on a single slide, so let me walk you through this code.
First, you import the MediaPipe LLM inference API, as you can see here. Next, you define where your large language model is hosted. You would have downloaded this model from one of the links on the previous slides and hosted it on your own CDN.
Now you can define a new asynchronous function that will load and use the model. Inside it, you specify the fileset URL that defines which MediaPipe runtime to use. Here it's the default one that Google provides and hosts, which is safe for you to use too; however, you can download this file and host it on your own server for privacy reasons, if you prefer.
Now you use the fileset URL from the prior line to initialise MediaPipe's FilesetResolver, which actually downloads the runtime for the generative AI task you're about to perform. Next, you load the model by calling LlmInference.createFromModelPath, to which you pass the fileset and the model URL you defined above. As the model is a large file, you must await the load, after which it returns the loaded model, which you can assign to a variable called llm.
Now that you've got the model loaded, you can use it to generate a response to some input text, as shown on this slide, and store the result in a variable called answer. With that, you can log the answer, display it on screen, or do something else useful with it. That's pretty much it. Now just call the function above to kick off the loading process and wait for the results to be printed; the whole sample is reconstructed in the sketch below.
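Here is a hedged reconstruction of that slide's code as one runnable sketch, assuming the @mediapipe/tasks-genai npm package and the createFromModelPath method the talk describes. The model URL is a placeholder for wherever you host your downloaded model, and the fileset URL is the CDN-hosted default used in the official samples.

```js
// Sketch of the walkthrough above, assuming the @mediapipe/tasks-genai package.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

// Placeholder URL: the model file you downloaded and re-hosted on your own CDN.
const MODEL_URL = 'https://your-cdn.example.com/gemma-2b-it-gpu-int4.bin';

// Default Google-hosted MediaPipe runtime files; you can self-host these instead.
const FILESET_URL = 'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm';

async function runLlm() {
  // Resolve and download the runtime for generative AI tasks.
  const fileset = await FilesetResolver.forGenAiTasks(FILESET_URL);

  // Load the model; it's a large file, so await it before continuing.
  const llm = await LlmInference.createFromModelPath(fileset, MODEL_URL);

  // Generate a response for some input text and log it.
  const answer = await llm.generateResponse('Compose a haiku about the web.');
  console.log(answer);
}

// Kick off the loading process and wait for the result to be printed.
runLlm();
```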
Now, the key takeaway here is that while there are some scary-sounding names like FilesetResolver, anyone here could take these ten lines of code or so, run them, and then build around them with your own JavaScript knowledge for your own creative ideas, even if you're not an AI expert yet. So do start playing with these things today.
Now, you can imagine turning something like this into a browser extension whereby you could highlight any text on a web page, right-click, and convert a lengthy blog post into a form suitable for social media, or define some word you don't understand, all in just a few clicks, for anything you come across, instead of going to a third-party website to do so. In fact, I did exactly that in this demo, again made in just a few hours over a weekend, entirely in JavaScript, client-side in the browser. There are so many ideas waiting to be created here, and we're at the very beginning of a great adventure, with much more waiting to be discovered.
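To make that concrete, here is a minimal sketch of how such an extension's background script could start, using Chrome's real contextMenus API in a Manifest V3 service worker. The "contextMenus" permission is assumed, and handleSelection is a hypothetical helper standing in for however you route the text to the page hosting the on-device model.

```js
// Hypothetical Manifest V3 background service worker ("contextMenus"
// permission assumed). Adds a right-click menu item for selected text.
chrome.runtime.onInstalled.addListener(() => {
  chrome.contextMenus.create({
    id: 'rewrite-for-social',
    title: 'Rewrite selection for social media',
    contexts: ['selection'],
  });
});

chrome.contextMenus.onClicked.addListener((info) => {
  if (info.menuItemId === 'rewrite-for-social' && info.selectionText) {
    // Hypothetical helper: hand the selected text to wherever the MediaPipe
    // LLM is hosted, e.g. via chrome.runtime messaging to a side panel.
    handleSelection(info.selectionText);
  }
});
```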
In fact, people are already using these models to do more advanced things, like talking to a PDF document to ask questions about its contents without having to read it all yourself, as shown in this demo by Nico Martin. This is a great time-saver, and it's a really neat use of large language models combined with RAG (retrieval-augmented generation) techniques: you extract the sentences that matter from the PDF, then pass them as context for the LLM to answer from, so the response is actually grounded in the document. Again, this is all working locally on your device.
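As a toy illustration of that flow, not Nico Martin's actual implementation, here is a sketch that picks the PDF sentences most relevant to a question and feeds them to the model as context. Real systems rank sentences with embeddings; this uses naive keyword overlap purely to show the shape of the technique. It assumes llm is the LlmInference instance loaded earlier and that pdfText has already been extracted (for example with pdf.js).

```js
// Toy RAG sketch: retrieve relevant sentences, then answer from that context.
// Assumes `llm` is the LlmInference instance loaded in the earlier sketch.
async function askPdf(llm, pdfText, question) {
  // Crude sentence split; real pipelines use proper chunking.
  const sentences = pdfText.split(/(?<=[.!?])\s+/);
  const queryWords = new Set(question.toLowerCase().split(/\W+/));

  // Score each sentence by how many question words it shares (a stand-in
  // for embedding similarity), then keep the top five.
  const topSentences = sentences
    .map((s) => ({
      s,
      score: s.toLowerCase().split(/\W+/)
        .filter((w) => queryWords.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 5)
    .map(({ s }) => s);

  // Use the retrieved sentences as the context the LLM answers from.
  const prompt =
    `Answer using only this context:\n${topSentences.join('\n')}\n\n` +
    `Question: ${question}\nAnswer:`;
  return llm.generateResponse(prompt);
}
```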
Okay. So you've got all these AI models that make you feel like a superhero, but how can they actually help you? Well, by selecting the right model for the right situation, you can give your customers superpowers of their own when you apply those models to their industries. In fact, at Google we often need to explore which model to use for a given task, so we've created a system called Visual Blocks to do that more efficiently. It's a framework I worked on with my team that allows you to go from an AI-powered idea to a working prototype faster than ever before. It's built around JavaScript web components, so it's super easy for anyone who knows JavaScript to extend.