AI on Demand: Serverless AI


In this workshop, we discuss the merits of serverless architecture and how it can be applied to the AI space. We'll explore options for building serverless RAG applications, a more Lambda-esque approach to AI. Then we'll get hands-on and build a sample CRUD app that lets you store information and query it using an LLM, with Workers AI, Vectorize, D1, and Cloudflare Workers.

This workshop was presented at DevOps.js Conf 2024.

FAQ

The speaker is Nathan Disidore, an engineer at Cloudflare who works in the AI space, specifically on its vector database.

The primary focus of the workshop is to teach participants about AI on demand, serverless architectures, and how to build a retrieval augmented generation (RAG) application.

The workshop utilizes Cloudflare Workers, HonoJS, LangChain, and Vectorize for building serverless AI applications.

Vector DB is a database optimized for storing and querying high-dimensional vectors, often used in AI and machine learning applications for tasks like similarity search.
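The similarity search a vector database performs can be illustrated with a minimal sketch: compare embedding vectors by cosine similarity and return the closest matches. This is a brute-force illustration in plain JavaScript, not Vectorize's API; the vectors and helper names are made up.

```javascript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), so identical directions score 1.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force similarity search: score every stored vector against the
// query and keep the topK best. Real vector DBs use approximate indexes
// to avoid scanning everything.
function similaritySearch(query, stored, topK = 2) {
  return stored
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

In practice the vectors come from an embedding model, so "similar" means semantically similar text, images, and so on.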

The workshop is scheduled to last three hours, with built-in breaks to ensure participants can stretch, use the facilities, and stay hydrated.

Serverless computing is an architecture where the cloud provider dynamically manages the infrastructure, allowing developers to run code without managing servers.

Benefits include ease of deployment, built-in scalability, cost-effectiveness due to its usage-based pricing, and low latency due to edge network execution.
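As a sketch of how little code a serverless deployment can involve, here is a minimal Cloudflare Worker-style fetch handler. The route and response text are invented for illustration; consult Cloudflare's Workers docs for the exact module format.

```javascript
// Minimal Worker-style module: the platform invokes fetch() once per
// request, so the developer never provisions or manages a server process.
const worker = {
  async fetch(request) {
    const url = new URL(request.url);
    if (url.pathname === "/hello") {
      return new Response("Hello from the edge!", { status: 200 });
    }
    return new Response("Not found", { status: 404 });
  },
};
// In a real Worker this object would be the module default export:
// export default worker;
```

Deploying something like this is typically a single terminal command, which is the "ease of deployment" benefit described above.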

RAG is a method that combines information retrieval and text generation to provide more contextually relevant responses, often used in AI applications like chatbots and recommendation engines.
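The retrieve-then-generate flow can be sketched in a few lines. The `retrieve` and `generate` functions here are stand-in stubs, not the Vectorize or Workers AI APIs:

```javascript
// Sketch of retrieval augmented generation: fetch relevant context first,
// then hand it to the language model alongside the user's question.
async function answerWithRag(question, { retrieve, generate }) {
  // 1. Retrieval: find documents relevant to the question.
  const docs = await retrieve(question);
  // 2. Augmentation: splice the retrieved context into the prompt.
  const prompt = `Context:\n${docs.join("\n")}\n\nQuestion: ${question}`;
  // 3. Generation: let the model answer with the context in view.
  return generate(prompt);
}
```

In the workshop's stack, `retrieve` would be backed by a vector database similarity search and `generate` by an LLM call.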

LangChain is an ecosystem that simplifies building AI applications by allowing developers to chain different operations and interact with various AI models and vector databases.
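The "chaining" idea is essentially function composition over prompt and model steps. This is a hand-rolled illustration of the concept, not LangChain's actual API; the step functions are hypothetical stubs:

```javascript
// Illustration of chaining: each step's output feeds the next step's input.
const chain = (...steps) => async (input) => {
  let value = input;
  for (const step of steps) {
    value = await step(value);
  }
  return value;
};

// Hypothetical steps: format a prompt, call a model, post-process the output.
const formatPrompt = async (q) => `Answer briefly: ${q}`;
const callModel = async (prompt) => `(model saw: ${prompt})`;
const trimOutput = async (text) => text.trim();

const pipeline = chain(formatPrompt, callModel, trimOutput);
```

LangChain packages this pattern with ready-made integrations for models, prompts, and vector stores, so you swap the stubs for real components.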

To set up an API token in Cloudflare, go to your profile, navigate to API Tokens, create a new token with the necessary permissions for Workers AI, and use this token in your application for authentication.
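Once created, the token is typically supplied as a bearer credential on each request. Here is a hedged sketch against a Workers AI-style REST endpoint; the account ID, model name, and `env` shape are placeholders, so check Cloudflare's documentation for the exact URL and payload.

```javascript
// Sketch: build an authenticated request to a Workers AI-style REST endpoint.
// env.ACCOUNT_ID and env.API_TOKEN would come from environment/secret bindings,
// never hard-coded into source.
function buildAiRequest(env, prompt) {
  const model = "@cf/meta/llama-2-7b-chat-int8"; // placeholder model name
  return new Request(
    `https://api.cloudflare.com/client/v4/accounts/${env.ACCOUNT_ID}/ai/run/${model}`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${env.API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    }
  );
}
```

Inside a deployed Worker you would usually use the `env.AI` binding instead of the raw REST API, but the token flow above is what external clients use.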

Nathan Disidore
163 min
14 Feb, 2024

Comments

  • Volodymyr Huzar
    Maersk
    It was a nice workshop, but it's sad that it cannot be reproduced without a special Cloudflare account, which was available only during the online session

Video Summary and Transcription

The workshop explores the intersection of serverless and AI, discussing the basics, benefits, scalability, and challenges of serverless. It also delves into AI architecture components, vector databases, and the use of context in querying. The workshop demonstrates the building process using HonoJS and LangChain, setting up Cloudflare Workers and Wrangler, and loading data into a vector database. It also covers the creation of a chatbot with Cloudflare Workers AI and the use of API tokens and environment variables. The workshop concludes with information on pricing and recommendations for further questions.
Available in Español: IA a demanda: IA sin servidor

1. Introduction to Serverless and AI

Short description:

Welcome to the workshop! I'm Nathan Disidore from Cloudflare, and today we'll be exploring the intersection of serverless and AI. We'll cover the basics of serverless and AI, discuss how they can work together, and have a hands-on exercise. The workshop is scheduled to last three hours, and we'll take breaks. To participate, you'll need a thinking cap, an editor for JavaScript coding, Node installed, and a free tier Cloudflare account. Fill out the form I provide to access the shared account. Let's get started!

Welcome, welcome, everyone. And thanks for joining us. And in case no one has told you yet, happy Valentine's Day, if that's our thing, and celebrate it wherever you are. We're happy to have you today, and we're gonna have a little bit of fun with the Valentine's Day theme here. We can see even right away, our little robot friend is giving us a little bit of love off the bat.

So yeah, again, thanks for joining. If you're looking for the workshop or course on AI on demand, then you are in the right spot. And let's get this party started. Quick little intro as to who I am, what I'm about, and why you should even pay attention to me in the first place. My name is Nathan Disidore. I'm one of the engineers here at Cloudflare who works in our AI space, actually, working on the vector DB that we have. And if you don't know what that is, we'll get into it a little bit more in just a few minutes. But yeah, I've been at Cloudflare for a little over four years now, in a variety of roles at the company: most recently AI, our serverless offerings before that, and before that a more traditional back-end role, dealing with things like Kafka clusters that process a couple of terabytes, trillions of messages, every day, alert notification services, those kinds of internal tooling things. We're happy to have you.

And yeah, let's talk a little bit, to kind of start things off, about what to expect here. We'll set the stage, or give you the core syllabus, so to say. Basically, what we're going to do to kick things off is go over some slides. I definitely want to make this interactive, and again, we'll get into that in just a minute here. I don't want to be talking at you; we'll make this more of a dialogue. But yeah, we'll go over some basic concepts to set the stage for what we're going to be doing in the more hands-on portion of all this. And once we've done that, we will get into the live exercise part, and you all can build something of your own to test this stuff out in the real world. Here is what our agenda looks like. The bullet points we're going to hit: first, we're going to talk about what serverless is. A fair few of you are probably already familiar with the concepts there, but it's a quick little primer refresher for people who aren't as familiar or haven't ever used it before themselves. We'll chat about AI, which I imagine a little bit more people are unfamiliar with; we'll do a little bit of a pulse here in a second to see what that looks like. And then we'll see what it looks like to bridge those concepts, how serverless and AI can work together. And it's not an easy thing to make happen. Oh, hey, Christina, and worldwide viewers here. Yeah, we'll chat about how we're able to marry these two concepts into something that works together. And then we'll get down to business and actually get hands-on. So my hope is that today the takeaway will be, one, if you haven't learned what the building blocks of an AI application architecture are, you'll come away with that.
But more importantly, the crux of this talk is how we're going to be able to actually apply some of the concepts of serverless to traditional AI architecture, and semantic search specifically. So again, if you're unfamiliar with semantic search, we'll cover that in the AI section of our primers here. But this is what I hope you take away from what we're going to talk about over the next three hours. And yeah, maybe that's a good thing to call out, so it's a good segue: this workshop is scheduled to last three hours. That's a long time; we're going to be here for quite a while. So, for your sake and for mine, I'm definitely going to be cognizant of time. And we've got a couple of built-in breaks to make sure that we're able to stretch, use the facilities, and maybe get snacks or something like that. Because we definitely want to stay hydrated and keep healthy and fed and all that stuff as well. Here are a few things that we are going to need to make this workshop a success. We definitely want you to have your thinking cap on, so that you're hopefully in a learning attitude and learning spirit to pick up what we're throwing down. And for the live portion, we're definitely going to want some editor that we're able to use to actually do the live coding. We are going to be coding in JavaScript. If you don't fully understand it, that's okay; these concepts aren't exclusive to JavaScript at all. It's just what makes things a little bit easier. And yes, we will also need Node installed, because we are going to be doing JavaScript coding, and a Cloudflare account. I see a question: what kind of account do we need? It's a great question. All you need is a free tier account. And there's a form that I'm going to give you in just a second, too.
That will give me the information I need to add you to a shared account; we need some special privileges to make this work. So if you go here, set up an account, and fill out this Google form with the email address that you used to set up the account, then I can add you to a shared account that we're all going to use for this exercise, and you'll have the privileges that you need to actually make this work. Let me copy and paste that to chat as well, because that'll probably be easier for everybody to follow along with. But the QR code is there if you're able to scan that as well. You can work on that in the background; it doesn't have to happen right now. We've got a decent amount to cover before we get there. But if you can have all these things ready by the time we get to the interactive portion, it will really help speed things along here. And I realized I sent that as a direct message. Let me try that again. There we go. Excellent.

2. Understanding Serverless and its Benefits

Short description:

Let's kick off with a poll to understand everyone's background. It seems that most people are comfortable with JavaScript, which is great for what we're doing. Not many are currently using Serverless, but that's expected. People are positive towards AI. Now, let's dive into Serverless. It's a controversial term, but from the customer's perspective, it refers to infrastructure-less deployments that are highly distributed. It's often based on microservices and function as a service. The benefits include ease of deployment and scalability. AWS Lambda is a popular serverless platform.

Cool. So let us actually get started here. And again, I want to encourage you: this is a three-hour-long class, so you are 100% welcome to ask questions. I'm not as familiar with Zoom, but I assume there's a raise-hand feature or something like that. I want to make this interactive; we're having a dialogue here. And maybe that's a good segue to kick off a quick poll. I'd like to know a little bit about y'all's backgrounds here. So I'm launching a poll. I'm not exactly sure how this surfaces on your side, but you should be able to see some questions that give a general feel for your current knowledge. Let's see where people lie here. All right, let's see what people had to say. Boy, we lucked out: a lot of people are comfortable with JavaScript here, so that's excellent for what we're trying to do. Again, nothing here is exclusive to JavaScript; it's just the stack that we're going to be working with today. The nice thing about even some of the APIs that we're using is that they're pretty language agnostic, especially in the AI space. Python seems to be one of the de facto standards, at least for prototyping and whatnot, so there are definitely options there. But it looks like y'all are pretty comfortable with JavaScript, so I love to see that. Yeah, this is going to be the interesting one, I feel like, because I realize this is a DevOps conference, and serverless is trying to abstract away a lot of that opsiness, but not in a bad way at all. So it looks like most people aren't currently using serverless, and that's honestly what I expected, especially at this conference. I'm not an advocate one way or the other; it's a right-tool-for-the-right-job kind of situation. People are pretty positive towards AI. I would not blame you at all if you weren't; there are definitely conversations to be had on both sides there. But again, there's a time and a place, and it's worth it.
This is an AI conference, or AI workshop, so I kind of figured people would be a little bit more positive towards it. But I'm not going to advocate one way or another; I'll let you decide that yourself. And I think that is fine by me. We'll call that good. Now we kind of know what the commonality is and what people's backgrounds are, and it'll set the stage a little bit for where we're going to get to here.

Let's get into serverless. So yeah, what is serverless? It turns out it's pretty controversial to come up with a definition here. I showed a couple of co-workers these slides, and they had their own opinions. And I guess it also depends a little bit on whether you're looking from the point of view of the platform or the customer. But at least in my eyes, this definition seems to fit: it's basically infrastructure-less deployments of whatever application you're trying to get out, almost always in a highly distributed fashion. I put an asterisk on infrastructure-less and gave it the old "sure, whatever you say" Jennifer Lawrence GIF here, because infrastructure-less... it's a lie. It's always somebody else's computer at the end of the day that you're running on. But it's infrastructure-less from the point of view of the customer. It's almost always some micro runtime that lives on a platform as a service, which in turn runs on whatever network that platform owns. I highlighted the micro here because, one, it's kind of fun to say, and "runtime" sounds buzzworthy, but also because a lot of times you're running microservices on these serverless deployments. So you're really doing function as a service, which is most often what you hear about when you're targeting a serverless-style deployment. Why would you ever want to do something like that? That's a good question, especially the abstracted-away part. Well, I think one of the definite benefits is the ease of deployment. We can actually give a diagram of what a traditional, especially AI, deployment looks like a little bit later, and we'll see it's pretty complicated.
Serverless in general does remove a lot of the cognitive burden there, just by making deployment usually a one-line, one-terminal-command operation to get things out. Scalability is kind of built into the definition of serverless, especially the distributed network part. And yeah, someone in chat has it exactly (sorry, I'm really bad at pronouncing names here): function as a service, as with Lambda. This is exactly it. Yep. AWS Lambda is a very popular serverless platform to run on top of, but function as a service is definitely what you're looking at there.
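The function-as-a-service model discussed above boils down to shipping a single handler that the platform invokes per event. A minimal Lambda-style sketch, with the event shape simplified for illustration:

```javascript
// Minimal function-as-a-service handler: the platform owns the servers,
// scaling, and routing; the developer ships only this function.
const handler = async (event) => {
  const name = (event && event.name) || "world";
  return {
    statusCode: 200,
    body: JSON.stringify({ message: `Hello, ${name}!` }),
  };
};
// A real FaaS platform would import this as the module's exported handler.
```

Pricing in this model is usage-based: you pay per invocation rather than for an always-on server, which is the cost-effectiveness benefit mentioned earlier.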

Watch more workshops on topic

Leveraging LLMs to Build Intuitive AI Experiences With JavaScript
JSNation 2024
108 min
Featured Workshop
Roy Derks
Shivay Lamba
2 authors
Today every developer is using LLMs in different forms and shapes, from ChatGPT to code assistants like GitHub CoPilot. Following this, lots of products have introduced embedded AI capabilities, and in this workshop we will make LLMs understandable for web developers. And we'll get into coding your own AI-driven application. No prior experience in working with LLMs or machine learning is needed. Instead, we'll use web technologies such as JavaScript, React which you already know and love while also learning about some new libraries like OpenAI, Transformers.js
LLMs Workshop: What They Are and How to Leverage Them
React Summit 2024
66 min
Featured Workshop
Nathan Marrs
Haris Rozajac
2 authors
Join Nathan in this hands-on session where you will first learn at a high level what large language models (LLMs) are and how they work. Then dive into an interactive coding exercise where you will implement LLM functionality into a basic example application. During this exercise you will get a feel for key skills for working with LLMs in your own applications such as prompt engineering and exposure to OpenAI's API.
After this session you will have insights around what LLMs are and how they can practically be used to improve your own applications.
Table of contents:
- Interactive demo implementing basic LLM powered features in a demo app
- Discuss how to decide where to leverage LLMs in a product
- Lessons learned around integrating with OpenAI / overview of OpenAI API
- Best practices for prompt engineering
- Common challenges specific to React (state management :D / good UX practices)
Working With OpenAI and Prompt Engineering for React Developers
React Advanced Conference 2023
98 min
Top Content
Workshop
Richard Moss
In this workshop we'll take a tour of applied AI from the perspective of front end developers, zooming in on the emerging best practices when it comes to working with LLMs to build great products. This workshop is based on learnings from working with the OpenAI API from its debut last November to build out a working MVP which became PowerModeAI (A customer facing ideation and slide creation tool).
In the workshop there'll be a mix of presentation and hands-on exercises to cover topics including:
- GPT fundamentals
- Pitfalls of LLMs
- Prompt engineering best practices and techniques
- Using the playground effectively
- Installing and configuring the OpenAI SDK
- Approaches to working with the API and prompt management
- Implementing the API to build an AI powered customer facing application
- Fine tuning and embeddings
- Emerging best practice on LLMOps
Building AI Applications for the Web
React Day Berlin 2023
98 min
Workshop
Roy Derks
Today every developer is using LLMs in different forms and shapes. Lots of products have introduced embedded AI capabilities, and in this workshop you’ll learn how to build your own AI application. No experience in building LLMs or machine learning is needed. Instead, we’ll use web technologies such as JavaScript, React and GraphQL which you already know and love.
Building Your Generative AI Application
React Summit 2024
82 min
Workshop (Free)
Dieter Flick
Generative AI is exciting tech enthusiasts and businesses with its vast potential. In this session, we will introduce Retrieval Augmented Generation (RAG), a framework that provides context to Large Language Models (LLMs) without retraining them. We will guide you step-by-step in building your own RAG app, culminating in a fully functional chatbot.
Key Concepts: Generative AI, Retrieval Augmented Generation
Technologies: OpenAI, LangChain, AstraDB Vector Store, Streamlit, Langflow
High-performance Next.js
React Summit 2022
50 min
Workshop
Michele Riva
Next.js is a compelling framework that makes many tasks effortless by providing many out-of-the-box solutions. But as soon as our app needs to scale, it is essential to maintain high performance without compromising maintenance and server costs. In this workshop, we will see how to analyze Next.js performances, resources usage, how to scale it, and how to make the right decisions while writing the application architecture.

Check out more articles and videos

We constantly curate articles and videos that might spark your interest, skill you up, or help you build a stellar career

Scaling Up with Remix and Micro Frontends
Remix Conf Europe 2022
23 min
Top Content
This talk discusses the usage of Microfrontends in Remix and introduces the Tiny Frontend library. Kazoo, a used car buying platform, follows a domain-driven design approach and encountered issues with granular slicing. Tiny Frontend aims to solve the slicing problem and promotes type safety and compatibility of shared dependencies. The speaker demonstrates how Tiny Frontend works with server-side rendering and how Remix can consume and update components without redeploying the app. The talk also explores the usage of micro frontends and the future support for Webpack Module Federation in Remix.
Full Stack Components
Remix Conf Europe 2022
37 min
Top Content
RemixConf EU discussed full stack components and their benefits, such as marrying the backend and UI in the same file. The talk demonstrated the implementation of a combo box with search functionality using Remix and the Downshift library. It also highlighted the ease of creating resource routes in Remix and the importance of code organization and maintainability in full stack components. The speaker expressed gratitude towards the audience and discussed the future of Remix, including its acquisition by Shopify and the potential for collaboration with Hydrogen.
Understanding React’s Fiber Architecture
React Advanced Conference 2022
29 min
Top Content
This Talk explores React's internal jargon, specifically fiber, which is an internal unit of work for rendering and committing. Fibers facilitate efficient updates to elements and play a crucial role in the reconciliation process. The work loop, complete work, and commit phase are essential steps in the rendering process. Understanding React's internals can help with optimizing code and pull request reviews. React 18 introduces the work loop sync and async functions for concurrent features and prioritization. Fiber brings benefits like async rendering and the ability to discard work-in-progress trees, improving user experience.
Building a Voice-Enabled AI Assistant With Javascript
JSNation 2023
21 min
Top Content
This Talk discusses building a voice-activated AI assistant using web APIs and JavaScript. It covers using the Web Speech API for speech recognition and the speech synthesis API for text to speech. The speaker demonstrates how to communicate with the Open AI API and handle the response. The Talk also explores enabling speech recognition and addressing the user. The speaker concludes by mentioning the possibility of creating a product out of the project and using Tauri for native desktop-like experiences.
AI and Web Development: Hype or Reality
JSNation 2023
24 min
Top Content
This talk explores the use of AI in web development, including tools like GitHub Copilot and Fig for CLI commands. AI can generate boilerplate code, provide context-aware solutions, and generate dummy data. It can also assist with CSS selectors and regexes, and be integrated into applications. AI is used to enhance the podcast experience by transcribing episodes and providing JSON data. The talk also discusses formatting AI output, crafting requests, and analyzing embeddings for similarity.
The Rise of the AI Engineer
React Summit US 2023
30 min
The rise of AI engineers is driven by the demand for AI and the emergence of ML research and engineering organizations. Start-ups are leveraging AI through APIs, resulting in a time-to-market advantage. The future of AI engineering holds promising results, with a focus on AI UX and the role of AI agents. Equity in AI and the central problems of AI engineering require collective efforts to address. The day-to-day life of an AI engineer involves working on products or infrastructure and dealing with specialties and tools specific to the field.