Generative AI in Your App? What Can Possibly Go Wrong?

Generative AI models can produce varied and even unexpected outputs, which makes them non-deterministic and harder to test. When you integrate these models into your app, it can be challenging to maintain a high level of quality in the AI outputs, and even to ensure that their results don’t crash the flow of your app. Come relive my journey of discovery into how I drastically improved results from OpenAI’s ChatGPT API for use within my company’s product. In this talk I will share many tips to help you use AI models like ChatGPT more effectively within your own apps, including testing strategies and how to avoid many of the issues I ran into.

This talk was presented at TestJS Summit 2023.

FAQ

GenAI is used in applications to enhance product features, particularly in the context of writing tests and integrating AI into codebases. It helps in understanding and planning the flow of apps, managing dependencies, and improving the overall quality of the product.

Verity's mission is to help people make better decisions. The current focus is on improving decision-making within product teams, which include engineers and product managers.

Integrating OpenAI models into applications can be challenging due to the models' limitations. Understanding these limitations is crucial, as the models are very good at certain tasks but might perform poorly in others.

Before the official ChatGPT API was available, an experimental setup used Docker and Puppeteer on an EC2 instance to create a makeshift API. However, this setup faced issues like Captcha and automatic logouts, making it unreliable for production use.

Verity encountered difficulties with JSON parsing using OpenAI's API, as the AI would often return non-standard JSON formats. They experimented with different prompts and settings to improve JSON parsing accuracy.
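
As a rough illustration of the JSON problem, a model reply often wraps the JSON in a Markdown code fence or surrounds it with prose. The sketch below shows one tolerant way to recover it. The helper name `extractJson` and the fallback order are assumptions for illustration, not the approach described in the talk.

```typescript
// Three backticks, built at runtime so this example can itself live inside a fenced block.
const FENCE = "`".repeat(3);

// Hypothetical helper: tries to recover a JSON value from a model reply that may
// wrap it in a Markdown code fence or surround it with explanatory prose.
function extractJson(reply: string): unknown {
  // 1. If the reply contains a fenced block, look only inside the first one.
  const fencePattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);
  const fenced = reply.match(fencePattern);
  const candidate = fenced ? fenced[1] : reply;

  // 2. Try to parse the candidate as-is.
  try {
    return JSON.parse(candidate.trim());
  } catch {
    // Not clean JSON; fall through to the bracket search below.
  }

  // 3. Fall back to the span between the first opening and last closing bracket.
  const start = candidate.search(/[{\[]/);
  const end = Math.max(candidate.lastIndexOf("}"), candidate.lastIndexOf("]"));
  if (start === -1 || end <= start) {
    throw new Error("No JSON found in reply");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}
```

A helper like this only handles wrapping. Replies with trailing commas or single-quoted keys would still fail to parse and need a retry or a stricter prompt.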

Verity employs integration testing and contract testing to manage costs and improve testing efficiency. They also explore dynamic consumer-driven contract testing to handle the non-deterministic nature of AI responses.
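
One way to read "contract testing" here is to pin down the shape the app expects from the model and check recorded replies against it, rather than paying for live API calls in every test run. The sketch below is an assumption about what such a check could look like. The `DraftDocument` shape and its field names are hypothetical.

```typescript
// Hypothetical contract: the shape the app expects from a "draft a document" reply.
interface DraftDocument {
  title: string;
  sections: string[];
}

// Type guard that checks an already-parsed reply against the contract.
function isDraftDocument(value: unknown): value is DraftDocument {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const sections = record["sections"];
  return (
    typeof record["title"] === "string" &&
    Array.isArray(sections) &&
    sections.every((section) => typeof section === "string")
  );
}
```

A test suite can then run `isDraftDocument` over a folder of saved replies. Exact wording may differ between runs, but the contract still has to hold.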

The introduction of GPT-4 improved response accuracy from OpenAI's API to 80%, but challenges with JSON parsing and response consistency persisted, prompting further adjustments and tests.

Verity uses function calling APIs to increase control over AI responses, explores different data formats like markdown, and implements retry strategies to handle errors and inconsistencies in AI outputs.
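
A retry strategy for inconsistent output can be sketched as "generate, validate, and try again on failure". This is a minimal sketch under assumptions: the helper name `withRetries`, the default of three attempts, and the absence of any backoff are illustrative choices, not details from the talk.

```typescript
// Hypothetical helper: calls `generate` until `validate` accepts the output
// or the attempts run out. `validate` should throw on invalid output.
async function withRetries<T>(
  generate: () => Promise<string>,
  validate: (raw: string) => T,
  maxAttempts: number = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // A failed request and an invalid reply are handled the same way: try again.
      return validate(await generate());
    } catch (error) {
      lastError = error;
    }
  }
  throw new Error(
    `No valid response after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}
```

For example, `withRetries(() => callModel(prompt), (raw) => JSON.parse(raw))` would retry whenever the reply is not valid JSON, where `callModel` stands in for whatever function sends the prompt. Each retry costs tokens, so the attempt limit is worth keeping small.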

Todd Fisher
29 min
07 Dec, 2023

Video Summary and Transcription

Today's Talk discusses the application of GenAI in apps, using Docker to make ChatGPT work on any machine, challenges with JSON responses, testing AI models, handling AI API and response issues, counting tokens and rate limits, discovering limitations and taking a reactive approach, reliability and challenges of AI APIs, and the use of GPT and AI Copilot in software development.

1. Introduction to GenAI in Apps

Short description:

Today, I want to talk about GenAI in our apps as a product feature. We'll discuss its application in codebases and the importance of quality. Understanding the flow of an app and its dependencies, specifically the limitations of OpenAI models, is crucial for better decision-making and overall product quality.

Awesome. So today I want to talk a bit about GenAI in our apps as kind of a product feature. So a little bit different than the previous talk that Olga gave, which was pretty awesome, where you use GenAI to write tests. In this case, we want to talk a bit about kind of the other side of where GenAI might be applied in your codebase, which essentially trickles into your tests, right?

So with that said, I want to briefly cover the company I currently work at, Verity. I've been here for about a year. We do a bunch of AI stuff. Our ultimate mission is to make better decisions, kind of from the human existential, let's make better decisions wherever we can. But of course, that's a big picture thing. So we really want to focus right now on making better decisions on product teams. So engineers, product people, that type of thing. So if that sounds at all interesting, go to this website. If not, that's fine too. That's more just context: we've been working with this thing for about a year. And with that, there's of course a lot of GenAI stuff that we've been using. And so that's just a bit of context for what I'm about to share today, because really there's a lot of cool things you could do with AI, but there's a lot of weird stuff when you actually put AI into codebases and try to make things work from a product perspective.

So, with that said, quality is very important. There are a lot of aspects of quality in codebases out there. Testing is one aspect of many. There are a lot of different ways we could test. We won't get into that because I think we all have a good sense of what those are: unit testing, manual testing, automated testing, and a billion other things. Now, another important aspect of quality is this idea of planning out or architecting the flow of an app. Understanding that is key: understanding how users will flow through your app, understanding where things may break down, that type of thing. So, I'm going to touch a bit on that as well today. And then the third one is understanding dependencies, specifically their behaviors and limitations. So, when you start using AI in your app as a product feature, you have to really understand those dependencies. In this specific case, the case study I'm sharing with you today with the Verity stuff, we're using the OpenAI models. And with that, it's critical to understand that the OpenAI models are very good at certain things, but very bad at other things. And so, the better we understand the limitations, the better the overall quality we can deliver, because at the end of the day, whether it be tests or designing things out, quality is really what we're trying to solve for. And so, with that said, this can apply to other LLMs and other models out there, but we have specifically been using OpenAI, so these examples will be in OpenAI format.

2. Using Docker to Make ChatGPT Work on Any Machine

Short description:

One of the features of Verity was the ability to create and generate documents. We also had example code that demonstrated the usage of ChatGPT. However, there was no ChatGPT API for codebases, which was a big downside. To address this, we came up with the idea of using Docker to make it work on any machine. Let's explore this concept further.

So, with that said, one of the features of Verity that we built early on before ChatGPT came out was this idea of creating a document, generating or drafting a document, and that worked out pretty well. Let me just go to the next one.

So, example code here. I'll just briefly cover this to see if this works. Oh, look at that. Cool. Awesome. So, typical import of SDK stuff up here, and then we have a completions call. That's essentially how you actually call the OpenAI model. And then in here we have this prompt right here. This is essentially what you would see in any ChatGPT type of thing, where you type in some sort of command, some sort of prompt. That is covered right here. So, with that said, we have this example here. And of course, if you ran that, you would get something like "localhost is my server, what's yours" to complete "roses are red, violets are blue", and so forth, right? Sorry, I feel like this thing is in the way. Let me get up here more. And so, like I said, if you go to the playground in ChatGPT, it's kind of the same idea there. So, those two things work the same way.
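
The slide itself is not reproduced in the transcript, so the following is only a minimal sketch of what a completions call like the one described might look like. It uses `fetch` against OpenAI's REST completions endpoint rather than the SDK import mentioned in the talk. The model name, the `max_tokens` value, and the helper names `buildCompletionRequest` and `complete` are assumptions for illustration.

```typescript
// Shape of the request body for OpenAI's legacy completions endpoint.
interface CompletionRequest {
  model: string;
  prompt: string;
  max_tokens: number;
}

// Hypothetical helper: builds the request body for a given prompt.
// The default model name is an assumption; use a completion model you have access to.
function buildCompletionRequest(
  prompt: string,
  model: string = "gpt-3.5-turbo-instruct",
): CompletionRequest {
  return { model, prompt, max_tokens: 64 };
}

// Hypothetical helper: sends the prompt and returns the text of the first completion.
async function complete(prompt: string, apiKey: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildCompletionRequest(prompt)),
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = (await response.json()) as { choices: { text: string }[] };
  return data.choices[0].text;
}
```

Calling `complete("Roses are red, violets are blue,", apiKey)` would return whatever continuation the model produces. The output varies from run to run, which is exactly the testing problem this talk is about.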

So, with that said, so if we rewind back to November of 2022, ChatGPT came out. It was a big rush. Everyone's like, so excited. Oh, ChatGPT, it's going to revolutionize the world, all that good stuff. And I think that for the most part they were right. And so, one downside though, with that announcement was there was no ChatGPT API for codebases to actually use cool AI stuff, at least not yet. And so, that was a big bummer. And so, you know, thinking about like this playground where you could type in any prompt and generate whatever the heck you want, we didn't have that in the codebase, right? And so, we had this cool idea. So, anyone ever use Docker in here? So, Docker is kind of a cool thought, right? And this is one of my favorite memes about Docker. It's like, well, it works on my machine. What if we figure out a way to make it work on not my machine? And so, let's actually take this. So, this is something that we tried. So, we have essentially the browser window.
