AI Will Revolutionize UI

And it's not the way you think!

This talk was presented at React Summit US 2024. Check out the latest edition of this React conference.

FAQ

How are AI tools used in UI development?
AI tools are used to revolutionize how interfaces are created and managed, allowing the AI to interact with data, make UI decisions, and provide a more customized user experience.

What framework is used in the demonstration?
TanStack Start is used in the demonstration to build a React app that interacts with AI models, allowing for live coding and creating dynamic UI elements based on AI interactions.

What do tools allow the AI to do?
Tools allow the AI to access and manipulate data, make multiple requests, and manage the UI dynamically, enabling it to perform tasks such as fetching data, filtering results, and presenting information effectively.

What is Zod used for?
Zod is used to define the schema of the data that the AI tools will handle, ensuring the structure of the data being processed is consistent and reliable, which helps manage how responses are structured.

Can the setup perform mutations, such as adding items to a cart?
Yes, the AI setup can be extended to perform mutations such as adding items to a cart, by allowing the AI to interact with data and perform actions based on user inputs and tool functions.

How can local LLMs be integrated?
Local LLMs served by Ollama can be integrated by connecting them to the system directly via API calls, allowing them to perform the same tasks as cloud-based models while running locally to avoid vendor costs.

What role do vector databases and embeddings play?
Vector databases and embeddings facilitate efficient data retrieval and processing, allowing the AI to quickly search for and use relevant data, which improves the speed and accuracy of its responses and interactions.

What is RAG?
RAG stands for Retrieval Augmented Generation, a method used to give the AI data context by retrieving relevant information from a database to generate a more informed response.

How does the demo produce game recommendations?
The AI uses a JSON database of video games and a tool to fetch data based on user queries, allowing it to provide tailored video game recommendations by interacting with the backend and using RAG.

Jack Herrington
27 min
19 Nov, 2024

Video Summary and Transcription
AI will revolutionize UI through the use of tools. Building recommendation systems for video games using AI libraries and data. Integrating TanStack Start to control server functions and override fetch. Implementing RAG and using a vector database for response generation. Potential problems with context and data requests. Requesting tools and upgrading system context. Trying simulation games and using multiple steps for data retrieval. Client-side tool handling and tool request handling on the UI. Exploring the Ollama application and direct posts to the AI. Giving AI tools and accessing data. AI evolution and the TinyTroupe project. Handling large databases and local models. RAG limitations and contextualized information. RAG live data retrieval and AI instruction. Exploring vector databases and embeddings. Jack's thoughts on vector databases and applause for the presentation.

1. AI Revolutionizing UI with Tools

Short description:

AI will revolutionize UI through the use of tools. We're going to give our AI tools and start with a TanStack React app. It's going to talk to our LLM and connect to our data. We're going to allow it to use tools to pick what kind of UI we want to show. Now let's see how easy it is to implement a React chatbot in Start.

So, AI will revolutionize UI through the use of tools. And when I talk about tools, I was talking to a few people about it, and they're like, oh, you mean like Copilot or Cursor? Nope, not that. Like V0 or Leonardo or Midjourney? Nope, not that. So, what we're going to do is we're going to give our AI tools. So, we're going to start with a TanStack React app. I'm going to do this live coding. Wish me luck. Yeah.

And our TanStack React app, if this is the first time you've seen TanStack Start, that's exciting on that basis alone. It's going to talk to our LLM. In this case, it's going to talk to OpenAI. And then we're going to use a tool, a set of tools, and we're going to give it a tool to connect to our data. So, it's going to have our data in it, which is really cool. But more importantly, I think, at the end, we're going to actually allow it to use tools to pick what kind of UI we want to show. So, this is what you're going to see by the end of all this. So, all right.

So, now we've all kind of seen in the past a React chatbot. So, let's see how easy it is to actually implement this in Start. All right. Can you see the code? Can you see my dog? Hey, Murph. All right. Come here, you. There we go. I'm working on two different screens. So, this is like... Okay. So, I'm going to use a little bit of cheat code here. I'm going to give myself some... There we go. So, the first thing I'm going to do is I'm going to bring in AI React.

2. Building a Recommendation System for Video Games

Short description:

Vercel's AI library provides a React package that works with TanStack and manages the UI, handles input change, and submits to the AI endpoint. We can use shadcn/ui's input. We will build a recommendation system for video games using the games.json data. We will connect it with the backend using OpenAI and provide a system prompt.

So, Vercel has an AI library with a React package. It actually works with things other than Next.js. Now, it also works with TanStack. And what it's going to do is manage the UI for us. So, it's going to give us messages, that's a transcript of our conversation with the AI, the input field, and handleInputChange, the onChange handler for the input, and then handleSubmit. And that's actually going to submit back to our AI endpoint. So, let's go and change the JSX so we actually use it.
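Here's a minimal sketch of that hook in use, assuming the 'ai/react' entry point from the Vercel AI SDK as it existed around this talk (newer releases ship it as '@ai-sdk/react'); the component name and the shadcn/ui import path are illustrative:

```tsx
import { useChat } from 'ai/react';
import { Input } from '~/components/ui/input'; // shadcn/ui input (path assumed)

export function Chat() {
  // messages: the transcript; input + handleInputChange: the controlled field;
  // handleSubmit: posts the transcript back to the chat endpoint.
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      <Input value={input} onChange={handleInputChange} placeholder="Ask about our games..." />
    </form>
  );
}
```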

All right. Now, we're going to bring in shadcn/ui's input. Anybody like shadcn? Well, yeah. That also works with it. So, that's cool. Let's take a look. All right. So, now we got some things right here. Now, what we're going to do is we're going to build a recommendation system for video games. So, we are a video game selling company, and we got a list of all the video games that we want to offer to our customers. So, let me bring that up. Over here, we got games.json.

So, eventually, at the end of all this, we're going to allow the customer to get recommendations about the games that we have. So, that's really the difference between this and a kind of stock UI. Okay. So, the next thing we want to do is actually connect this with the backend. So, we're going to connect it with OpenAI. So, we're going to bring in OpenAI. And then we're going to give it a system prompt.

3. Integrating TanStack Start and Overriding Fetch

Short description:

TanStack Start allows you to create server functions that can be called from the client. You can control the method used, such as POST or GET, and return a stream to the client. TanStack Start is distinct from Next.js in that it allows streaming. We override fetch to send all chat messages to the server function. OpenAI is used to answer questions.

And we're going to do… And this is the cool part about TanStack Start in particular. So, in TanStack Start, you actually use createServerFn to create server functions that you can then call from the client. Really nice. And you can actually control the method that it's going to use to do that. So, in this case, we're just going to say this is a POST. You can also use GET, so you can do caching and all that good stuff. And then you give it the body of a function. So, in this case, we're going to take the messages, our transcript with the chatbot, or with the AI, and then we're going to use streamText to get out of it a stream that we're then going to send back to the client. Because we all know that as it's sending stuff back, you could be waiting for a while. So, you want to actually stream it. So, that's one of the really cool things that distinguishes TanStack Start from Next.js: Next.js' current server functions don't allow you to return a stream. So, that's why we're using TanStack Start in this case.
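A sketch of that server side, assuming the createServerFn signature from the TanStack Start alpha current at the time of the talk (the API has shifted across releases), plus the AI SDK's streamText; the model name is illustrative:

```ts
import { createServerFn } from '@tanstack/start';
import { streamText, type CoreMessage } from 'ai';
import { openai } from '@ai-sdk/openai';

// POST so the transcript goes in the request body; GET would enable caching.
export const chat = createServerFn('POST', async ({ messages }: { messages: CoreMessage[] }) => {
  const result = await streamText({
    model: openai('gpt-4o-mini'), // model choice is an assumption
    system: 'You are a helpful agent for a video game store.',
    messages,
  });
  // TanStack Start lets us hand a stream straight back to the client,
  // which current Next.js server functions do not.
  return result.toDataStreamResponse();
});
```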

Now, to interface this, we need to do a little bit of hacking. So, I'm going to bring in an override for fetch. So, I'm going to create a new function. It looks exactly like fetch. It takes a URL and takes some kind of options data. And it turns out the body in there is a string that has all the messages in it automatically. And we're just going to send that on to our server function. So, let's go down here. And we'll go over here to fetch. And we'll override fetch with that chat override fetch. All right. Let's take a look. So, let's ask something. What's a good FPS game? There you go. Okay. So, this is actually going off just to OpenAI.
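A sketch of that override, assuming useChat's custom fetch option: we peel the messages out of the JSON body useChat would have POSTed and hand them to the server function instead.

```ts
// Looks exactly like fetch, but never leaves the app: the body useChat builds
// is a JSON string containing the messages, so we forward it to our server fn.
const chatOverrideFetch: typeof fetch = async (_url, options) => {
  const { messages } = JSON.parse(options!.body as string);
  return chat({ messages }); // the TanStack Start server function from above
};

const { messages, input, handleInputChange, handleSubmit } = useChat({
  fetch: chatOverrideFetch,
});
```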

4. Implementing RAG and Using a Vector Database

Short description:

The implementation is called RAG, which stands for Retrieval Augmented Generation. It uses data to create a response. A vector database is used to match a query and provide a list of games. The LLM determines the best response based on the context and the provided list.

And it's getting back kind of who knows what. This is whatever list of FPS games it was trained on back in the day. Or recently. I don't know.

So, what are we going to do to actually connect it to our data in particular? So, let's go back and I'll actually continue playing here. All right. Cool. Keep on going. Yeah, yeah. Okay. So, we don't need to watch the video, because I had a video preloaded that would do all of this in case OpenAI decided not to work. Don't need that. We can do it live.

Now, what this is actually implementing is called RAG. Anybody ever heard of RAG in the context of AI? Yes? Okay. Well, it stands for Retrieval Augmented Generation. So, the idea is that you're going to give it some data, a context of data in the traditional sense, and it's going to go and use that data to create a response. So, normally, you might have a context in there like, you are a helpful assistant, and all that. And what happens in the traditional model is that we take the input query, good FPS games, and then we use a vector database.

So, there's a whole bunch of those on the market. It's going to give us back a list of games that it thinks match that query. And then we're going to give that to the LLM as context. And that's going to then fire off to the LLM, and it says, okay, cool, from the list that you gave me, I think maybe Halo is the right one for that. So, in reality, as you give it a context, you're going to give it the original context plus a whole bunch of stuff on the end of it.
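That classic shape, which is not what we build here, looks roughly like this; embed() and vectorDb are hypothetical stand-ins for whichever embedding model and vector store you use:

```ts
// Hypothetical embedding helper and vector store, for illustration only.
declare function embed(text: string): Promise<number[]>;
declare const vectorDb: {
  search(vector: number[], opts: { topK: number }): Promise<{ text: string }[]>;
};

async function buildRagPrompt(query: string) {
  const queryVector = await embed(query);                          // "good FPS games" -> vector
  const matches = await vectorDb.search(queryVector, { topK: 5 }); // nearest games
  // The retrieved games get appended to the original system context.
  return [
    'You are a helpful agent for a video game store.',
    'Here are games that may be relevant to the user:',
    ...matches.map((m) => m.text),
  ].join('\n');
}
```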

5. Potential Problems with Context and Data Requests

Short description:

We don't want to give the AI a huge context with a lot of additional information. There are problems with this approach, including the risk of providing incorrect data and the inability to make multiple data requests. Additionally, the AI lacks the ability to manage the UI.

We don't want to do that, because there are some problems with it. One, that context can get freaking huge. Two, we don't know whether we gave it the right data. Three, the AI can't actually make multiple requests for data. It can't come back to us and say, oh, wait, hold on, I need a little more information here, which tools can do. And also, it can't manage the UI.

6. Requesting Tools and Upgrading System Context

Short description:

In our code, we're going to make a tool request to our old JSON database and receive results. We then upgrade the system context to use the games tool, which is imported from AI and returns a list of games.

All right. So, what we're going to do in our thing is we're going to do that tool request. That's going to go to our database, our old JSON database, and we're going to give it back some results. That's going to go back to itself. And that says, cool, we think you might like that. All right. And in order to get that system context, we then upgrade the system context to say, like, hey, use games if you need the list of games.

All right. Let's go over to our code. Hey, we're already there. Nice. All right. Okay. So, we need to upgrade with tools. So, we're going to bring in a games tool. So, I'll just kind of reshuffle this a little bit here. We need some imports. We need a system change. All right. Okay. Cool. So, now we got a system. We're telling the system you have that games tool. And we'll take a look at that games tool. Well, the games tool is in a list of tools, and it uses tool, which we're going to import from AI. Thank you very much, Cursor. And we're going to say we want to return a list of games. And then, literally, we just give back our games. And that's it. Okay.
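A sketch of that games tool, using the AI SDK's tool helper and Zod; the games.json import path mirrors the demo:

```ts
import { tool } from 'ai';
import { z } from 'zod';
import games from './games.json'; // our whole catalog, as in the demo

export const tools = {
  games: tool({
    description: 'Get the list of video games available in the store',
    parameters: z.object({}), // no arguments: it returns the whole list
    execute: async () => games, // literally just give back our games
  }),
};
```

These tools then get passed alongside system and messages in the streamText call, and the system prompt picks up the extra line: use the games tool if you need the list of games.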

7. Trying a Simulation Game and Multiple Steps

Short description:

Let's give it a try. We make a request for a good simulation game. The AI in its current mode provides a single response, but if we ask again, we get back our data. This connects our data with our LLM. Now, let's talk about the client side. We make a request for games, get a tool response, and then we can take multiple steps to accomplish our mission as the AI.

So, let's give it a try. Okay. So, what about a good simulation game? So, that first request is a request for our tools, which is interesting. It's kind of stuck there. So, what happens is that the AI in its current mode, or our interface with the AI, says, okay, it's going to be a single response. I'm going to make a request. I'm going to get a response back, and that's it. We're done. So, if I ask that again, what about a good simulation game? Now, it actually gets back our data. So, that's Planet Coaster 2 and Farming Simulator and the stuff that we have in our JSON. So, this is actually really cool. This is actually connecting our data with our LLM, which is great.

So far so good. So, now that's the server side of the equation. Now let's talk about the client side of the equation. What's actually happening here? Oh, is it? Oh, sorry. This is the first time and only time I'm ever going to give this lecture. So, you're getting a one-off. Okay. So, we're making a request for games. We get back a tool response, and then we sort of stop. So, the little secret sauce here to make this work in a way that actually makes sense is we go and say that you can take multiple steps in order to accomplish your mission as the AI. So, let's go to that. We'll say max steps. We'll say you can take 10 steps. So, it could do a request. It could do other tools.

8. Using Multiple Steps and Tool Requests

Short description:

We'll say you can take 10 steps. It could do a request. It could do other tools. Let's try it again in Arc with a good FPS game. Now, it's made the tool request and can get the data back in one go. So far, so good. We make the tool request, get the response, and based on that, we get the data we want.

We'll say you can take 10 steps: it could do a request, it could do other tools, it could do all kinds of stuff in that series of 10.
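In the AI SDK that's a single option on the streamText call; a minimal sketch, with the other options as in the earlier server sketch:

```ts
const result = await streamText({
  model: openai('gpt-4o-mini'),
  system,
  messages,
  tools,
  // Allow tool call -> tool result -> final answer inside one request,
  // instead of stopping after the first tool response.
  maxSteps: 10,
});
```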

So, let's go over to Arc and try it again. What about a good FPS game? Now it's made that tool request. That's the first message. And then, subsequently, because we've given it multiple steps, it can actually go and get us back our data, all in one go. We don't have to ask multiple times.

So far, so good. Now we can get on with the job of the UI, because what's happening here is that we make that tool request. We get back that tool response. And then, based on that tool response, it says, cool, this is the data that you actually want. And what actually happens in real life is that there's a messages array. We saw that when we get back the response from chat: it has first my request for a recommendation, then the assistant's response, which is JSON-encoded text. Then my tool response, again JSON-encoded text, goes back to OpenAI. And then finally, we get the assistant saying what we want. Okay. Cool.

9. Client Side Tool Handling

Short description:

We have finished the server side and now we're going to do the client side. We're going to give the AI a tool to show game recommendations using Zod to define the schema. We also need to handle the tool request on the client by overriding the hook and lying back to the UI.

Okay. Cool. So far, so good. We have finished the server side of this. Next up, we're actually going to do the client side of the house, which is really cool.

So, what we're going to do is give our AI some tools that say: here is how I want you to represent that data. So, we're going to give it one tool. We're going to say, hey, if you want to show the games recommendation, then use this tool to do that. All right. Eventually, I'll get which side of the screen this is on.

So, we're going to give it a tool. We're going to give it the show games tool. So, this is a tool that is going to allow the AI to actually show the list of games. Now, the way that you tell it what parameters you want for this function, because tools are basically functions, is you use Zod. Zod is a system where you can define the schema of an object. So, in this case, we're going to say that we want the games to be an array of IDs and the reason why you think that game is good for me.
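A sketch of that showGames tool definition; note there is no execute function, because in the AI SDK a tool without execute is forwarded to the client for the UI to handle:

```ts
import { tool } from 'ai';
import { z } from 'zod';

export const showGames = tool({
  description: 'Show a list of recommended games to the user as cards',
  parameters: z.object({
    games: z.array(
      z.object({
        id: z.string(),     // which game to show
        reason: z.string(), // why the AI thinks this game fits the request
      })
    ),
  }),
  // no execute: the client will "handle" this call by rendering the cards
});
```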

Now that you've got that, we need to go and upgrade our system a little bit. So, the new one is going to say you can use show games. So, let's drop that in there. All right. Cool.

Now, that's not the end of the story, though, because we want to handle the tool request on the client. So, there's a little bit of an override that we need to do to our hook. So, over here, we're going to say if you get a tool call, and that tool call is called show games, then what I want you to do is basically just lie back to the UI and say that this is going to be handled by the UI. So, this is how we basically say: you don't need to handle this at all, we're going to handle this. It's basically the equivalent of calling the chat function on the server.
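A sketch of that hook override, assuming useChat's onToolCall callback; returning a value marks the call as handled so the conversation can continue:

```ts
const { messages, input, handleInputChange, handleSubmit } = useChat({
  fetch: chatOverrideFetch,
  // The "lie": claim the tool call is done so the SDK doesn't wait on a
  // server-side result; the real work happens when we render the cards.
  onToolCall: ({ toolCall }) => {
    if (toolCall.toolName === 'showGames') {
      return 'Handled by the UI';
    }
  },
});
```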

10. Tool Request Handling on the UI

Short description:

We handle the tool request on the UI by overriding the hook and lying back to the UI. We check the list of messages for tool call invocations, specifically for 'show games'. If found, we use the game cards component to display the games.

So, this is how we'd handle it on the UI.

And the last thing we need to do is look at that list of messages coming back and see, were there any tool call invocations? So, we'll go. We'll boost this up a little bit. And we'll say we'll go through all the messages and we'll see, do you have any tool invocations? And if so, then is any one of them show games? And if they are, then we use our cool game cards component to actually show all those games. So, let's bring in Show Cards. Cool. And let's give it a go. All right.
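A sketch of that render pass, dropped into the Chat component's JSX from the earlier sketch; GameCards stands in for the demo's card component (its props are assumed), and toolInvocations is the AI SDK's per-message record of tool calls:

```tsx
{messages.map((m) => (
  <div key={m.id}>
    {m.content}
    {/* Did this message invoke showGames? If so, render the cards. */}
    {m.toolInvocations?.map((invocation) =>
      invocation.toolName === 'showGames' ? (
        <GameCards key={invocation.toolCallId} games={invocation.args.games} />
      ) : null
    )}
  </div>
))}
```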

11. Exploring Simulation Games and Local LLMs

Short description:

How about a simulation game under 30 bucks? This is all the data from our JSON. You can try it out for yourself with your OpenAI key. The cool thing is you don't have to use OpenAI, you can also use local LLMs via Ollama.

So, how about a simulation game under 30 bucks? There you go. Nice, right? So, this is all the data from our stuff. This is coming from our JSON. It's got our images in it. You have all this code. I'll put a link up that will allow you to go and access this code. You can try it out for yourself.

If you have an OpenAI key, you can see for yourself how to do tools both on the client and the server. But I think the really cool thing here is, and this is actually what was gonna save my butt before I got my connectivity going, is that you don't actually have to do this with just OpenAI. You can actually use local LLMs for this. So, does anybody have any experience using Ollama? No? Yes, okay. I like it. Because you're gonna get a demo. Deal with it.

12. Exploring Ollama Application and Direct Post to AI

Short description:

I've installed the Ollama application, which has a variety of local LLMs that you can download and use without paying any money. Let's try another round here, requesting an FPS game. It might take a while due to the slow LLMs on my machine. In the meantime, let's take a look at the code. We're connecting directly to the AI using a post to the Ollama endpoint, bypassing the Vercel AI SDK.

So, we're gonna try. Okay. So, I've installed this cool Ollama application. And there we go. Oops. So, we can see all of the different models that I put in. Some of these models are LLMs. Other ones are for doing vector database work, you know, building embeddings. So, there's a whole bunch of cool local LLMs that you can download right onto your machine. And you can use them without actually paying any money to any of these AI vendors. It's great stuff.

Okay. So, what we're gonna do is another round on here. I'll go to slash direct. I should have called that like Ollama. And I'm gonna do exactly the same thing. I'm gonna say, hey, give me an FPS game under 30 bucks. We'll see how it comes out. I don't know. Actually, are there any? Okay. This actually might take a little bit because these LLMs are really slow, particularly on a machine like mine that only has 16 gigabytes. But let me go and take a look at the code while we're waiting on that.

So, over here, this is actually really cool. We're connecting directly, not through the Vercel AI SDK at all. We're just using a direct post, a call to the AI, which is literally a post to the endpoint of Ollama. So, this is literally just 127.0.0.1, the /api/chat endpoint.
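A sketch of that direct call; Ollama listens on localhost port 11434 by default, and its /api/chat endpoint accepts a tools array for models that support tool calling (the model name here is an assumption):

```ts
const response = await fetch('http://127.0.0.1:11434/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'llama3.1', // any locally pulled model that supports tools
    messages,          // same { role, content } transcript shape
    tools,             // JSON-schema tool definitions
    stream: false,
  }),
});

const { message } = await response.json();
// If the model decided to use a tool, message.tool_calls holds the
// invocations; we run them locally and post the results back.
```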

13. Giving AI Tools and Accessing Data

Short description:

We're giving the AI tools to retrieve data, with support for different versions and vendors. It provides direct and efficient access to functionality on your machine. The easiest way to do RAG is through tools, giving the AI access to your data and allowing it to choose a tool. Vercel's AI SDK, the OpenAI library, or Ollama can be used for this.

And we're giving it all this stuff. We're even giving it some tools to tell it how to actually get the data back. So, it's doing all that RAG support locally. And we'll see. Hey, there it goes.

Now, one of the interesting things, as you start getting into the nitty gritty on this, is that all of these AIs actually use different ways of sending back the tool call invocation. And the reason I actually wrote all the code to go directly to it was that the Vercel AI library only understands one format. It only understands the OpenAI version. So, this one actually understands a couple of different versions, based on different vendors. And of course, it gets you access to all that data, all that functionality, directly and on your machine. So, it's very cool. It just runs very slow, which is not great.

All right. So, cool. So, what have we learned? Well, we learned that we're going to do RAG, and that tools are the easiest way to do RAG, because they're much more efficient than just preloading a massive context. You give that AI access to all of your data, however you want. And you can use the tools on both the server and the UI. And you can allow the AI to choose whichever tool is right for the job. And of course, you don't have to use Vercel. You can also use the OpenAI library or Ollama.

14. AI Evolution and TinyTroupe

Short description:

I'm Jack Herrington, and I have a YouTube channel where you can find all the code for free. With the example of selling games, AI can evolve to add items to cart and perform mutations. Parameters can be used to let the AI handle tasks like creating a to-do list. Additionally, I am working with TinyTroupe, a Microsoft project that enables virtual agents to communicate and connect with tools.

All right. Yeah. I'm Jack Herrington. I have a YouTube channel. You should check it out. And yeah, of course, all the code is available to you for free. Just check it on out.

I'm going to give a shout out to Brett for our first question here. With your example of selling games, Jack, do you see it evolving into having the AI add things to cart? Oh, absolutely. Yes. Yeah, yeah. In addition to going and doing queries, you can also, of course, do mutations. So, a classic, if you want to just kind of take this code and extend it, you can make like a to-do list and then have a little open field where you can say, oh, you know, I had to do blah, blah, blah, blah. And then one of the really cool things is you can give those tool functions, as I said, parameters. So, you can say, hey, okay, the first parameter would be like a message and maybe another one's like a priority or something like that, and let the AI do all that work. And then in addition, you could actually have client stuff as well to kind of show the new to-do in a really cool way. So, I think there's all kinds of interesting ideas where you can go with this. Yeah, it's really exciting. It's kind of infinite possibilities here. Yeah, yeah.

Oh, I actually, I've been working with this thing called TinyTroupe. Yeah, it's crazy, crazy. It's from Microsoft, and it's this interesting little thing that allows you to create virtual agents and have them talk to each other in a focus group. And so, actually, Rachel Nabors, she was here last year, and I are thinking about having that also connect with tools, so those little agents can actually go off and do stuff. So, it's pretty exciting. That's awesome. Yeah, I love these little things. I mean, we live in an amazing world. We really do. Yeah, yeah.

15. Handling Large Databases and Local Models

Short description:

To handle a large database of games, additional parameters can be used to filter the games list. Local AI support for client-side rendering depends on the AI's ability to handle tools. While some AI models in browsers may not support tools, client-side tool calls can still be handled by the server. When using local models, it is recommended to use smaller models for faster performance.

How would this work if you have, let's say, 50,000 games in your database? Can the model store and cache all the games before the request is made? Right, so you'd have additional parameters on the games request, in this case, and then you could filter down your games list based on that. So, you could say, okay, well, give me the genre of game that you're looking for, and then perhaps a price filter or something like that. And then we do the price filtering and the genre filtering in the tool and send it back. And that's why you give it multiple steps: it can go, oh, you know, I didn't find anything cool in that one, give me another one, and make subsequent requests.
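A sketch of that filtered tool, assuming genre and price fields on the catalog records:

```ts
import { tool } from 'ai';
import { z } from 'zod';
import allGames from './games.json'; // 50,000 games never leave the server

export const games = tool({
  description: 'Search the game catalog by genre and price',
  parameters: z.object({
    genre: z.string().optional(),    // e.g. "FPS" or "simulation"
    maxPrice: z.number().optional(), // upper price bound in dollars
  }),
  // Only the filtered subset goes back into the model's context.
  execute: async ({ genre, maxPrice }) =>
    allGames.filter(
      (g) =>
        (!genre || g.genre === genre) &&
        (maxPrice === undefined || g.price <= maxPrice)
    ),
});
```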

That's really cool. Yeah, yeah, it is neat. This might tie into what you were saying about local AIs. Can this be done in a client-side rendered app? I mean, yeah, actually, well, the AI would have to support tools. I don't think the Google AI that they have in the browser currently does that. So, you have to come back to a server somewhere, unfortunately. Yeah. But beyond that, you'd still be able to handle all the client-side tool calls. So, you just need the server to actually get the AI going. Yes.

We got, by the way, thank you, audience, I've gotten a swarm of questions. Holy moly, I love it. We're not going to have time for all of them, so apologies if yours doesn't hit. I'll be in the thing afterwards, right? Yeah. Yeah, there you go. It's perfect. Do you have tips for speeding up the local models? Or is there just a trick for running local? Sure. The mini ones are always faster than the larger ones. The larger ones actually don't even work on my Mac. I only have like a 16 gig M1, I think, so it was all I could afford at the time. So, yeah, I would say always use the smaller models, but yeah, Ollama is a fantastic way to get into this and not actually pay anything. If you wanted something that's really cheap, actually, I would say GPT-4o mini is a much cheaper option than GPT-4o direct, like a lot cheaper.

16. RAG Limitations and Contextualized Information

Short description:

RAG can be a flexible alternative to other methods that store and pull data. By providing a contextualized prompt, RAG allows for personalized and business-specific information. Tools can be used to scale the contextualization process and retrieve live data.

So I would definitely use that. And for my money, it's just as capable. And it's actually faster and cheaper. I like it. It's all good things. All good things. All good things. A little more diving into the code: how did you parse the response for the UI? Is the structure deterministic? How do you guarantee the response structure from the LLM? That was that Zod stuff. So, yeah, that was the Zod return. We're giving it basically the structure of what we want back: we want an array of products, we want the IDs, and we want the reasons. And that's why Zod is brought into that mix. What they're using Zod for is that you can use that object structure to create the structure that you want to see coming out the other side. And then you can use an accessor to basically say, cool, turn that into JSON schema, and they pass the JSON schema on. So OpenAI, or the LLMs on Ollama, don't actually know anything about Zod. They know about JSON schema. Zod is just a convenient way to create JSON schema. Cool. Cool.
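The AI SDK does that conversion internally, but a sketch with the standalone zod-to-json-schema package shows the idea:

```ts
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

const parameters = z.object({
  games: z.array(z.object({ id: z.string(), reason: z.string() })),
});

// What the LLM actually receives: plain JSON Schema, no Zod in sight.
console.log(JSON.stringify(zodToJsonSchema(parameters), null, 2));
```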

What limitations are there to RAG compared to the other methods you've mentioned, where all the data is stored and pulled from? I mean, those are both just two alternative RAG methods. What you're doing is basically saying: cool, the user has given me a prompt, and we want our information to be contextualized to us. I mean, a hundred percent of the time, you probably want AI that's not going off and making just random requests, right? You want something that actually is contextualized to your business. And so one way to do that is just give it a huge context. I heard just recently somebody was taking their entire code base and all the documentation from it, creating a single text file, and throwing that all into one context. It seems absurd. Or, in this case, tools. And I think tools basically give you that scaling factor, because you can give it a tool and say, hey, give me a call, tell me what you want, and I'll go get it for you. And you can do it live.

17. RAG Live Data Retrieval and AI Instruction

Short description:

RAG enables live retrieval of various data sources. The system prompt plays a crucial role in constraining the AI's nature and providing specific instructions. Using tools, you can retrieve and display games. Consider trying a Discord bot with Q&A functionality. The link to the code is yet to be decided.

And that's one of the really cool things too. Like if, as an example, you want to go off and get, I don't know, a news feed, or, you know, look through Bluesky posts or whatever, you could do that live. Cool.

Yeah. How important a role does the system prompt play here? You want to hint the AI with as much of the job as you want done and kind of constrain the nature of the AI. And so that's why, at the beginning, we have: okay, you're an agent, your job is around video games, kind of setting the stage for it. And then from there, you've got some tools. You got the ability to go get games with a tool. You got the ability to show games with a tool. And that was it. I mean, that's actually a really short prompt. You could tell it, you know, be nice, or you can be spicy and mean if you want that.
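The wording below is illustrative, but it's the shape of the short system prompt described: set the stage, then point at the tools.

```ts
const system = [
  'You are a helpful agent for a video game store.',      // set the stage
  'Use the games tool if you need the list of games.',    // server-side tool
  'Use the showGames tool to show recommendations to the user.', // UI tool
].join(' ');
```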

Oh yeah, that's actually something I want to try at some point. Yeah. I would totally put this on a Discord bot with Q&A like this. Yeah. Yeah. Quick question. Where are you going to post the link to your code, if anywhere? I don't know. I need to talk to the conference organizers about that. Okay.

18. Exploring VectorDB and Embeddings

Short description:

Jack shares his thoughts on vector databases and embeddings. He highlights the speed of creating embeddings locally, making Ollama ideal for processing a large amount of data. Additionally, he mentions that Postgres and SQLite have vector search capabilities nowadays, and some embedding engines like Nomic allow for customizable precision. Unfortunately, the Q&A session has ended, but Jack receives applause for his presentation.

We'll check the Discord later. Yeah. And maybe at the top of my GitHub list. Great.

Yeah. We got a last couple of minutes left here. So a few more questions. Jack, do you have any thoughts about VectorDB and embeddings versus other tools? Yeah, I like vector embeddings. One thing that's really neat: with that Ollama stuff I showed, the LLMs run locally really slowly, but creating vector embeddings is actually incredibly fast. So if you've got a ton of data you need to crunch through, creating those vector embeddings locally and then passing them on to Pinecone or whatever you're going to use, Ollama is a fantastic case for that. Two other things I would say. Postgres, SQLite, it's amazing. They all have vector search built in nowadays, which is pretty cool. I was really impressed by SQLite actually having vector search built into it. And then also some of the embedding engines, like I think Nomic is one of them, give you this ability to basically say, okay, it's going to give you, I don't know, 4,096 numbers or whatever coming out of this for each one, and you can actually slice as many of those as you want. So if you want a low-precision thing, you might take the first 128 or 256 and leave off the rest. And if you want more precision, you just keep more. But of course, there is a tradeoff there between the size of the embedding and the quality of the results. That's really nifty. We're really in the weeds here. Yeah. No, this is great. Unfortunately, we're out of time for the Q&A here. But everyone, one more time, please, a round of applause for Jack. That was fantastic. Thank you, Jack. Thank you.
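As a sketch of that slicing idea: Ollama's /api/embeddings endpoint returns a full-length vector (nomic-embed-text is one local embedding model), and truncating to the first N dimensions is commonly followed by re-normalizing for cosine search. The model name and dimension count here are assumptions.

```ts
const res = await fetch('http://127.0.0.1:11434/api/embeddings', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'nomic-embed-text',      // a local embedding model
    prompt: 'open-world FPS games', // the text to embed
  }),
});
const { embedding } = await res.json(); // full-precision vector

// Low-precision variant: keep the first 256 dimensions, then re-normalize
// so cosine similarity still behaves.
const sliced: number[] = embedding.slice(0, 256);
const norm = Math.sqrt(sliced.reduce((sum, x) => sum + x * x, 0));
const lowPrecision = sliced.map((x) => x / norm);
```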