So let's go to application iteration two, and this pip install you do not need to run anymore, because we already installed all of the dependencies via requirements.txt. Now let's add input fields on top of just drawing text. We need an input field in order to ask questions, and we also need output for our question. We achieve that still with Streamlit only: we leverage the chat input function, which allows us to ask a question, and then we draw the question and we also draw an answer.
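As a rough sketch, this step could look something like the following; the file name and the placeholder answer are assumptions, since at this point there is no LLM connected yet:

```python
# app_2.py -- hypothetical file name; minimal sketch of the UI-only chat step
import streamlit as st

st.title("Chatbot")

# Draw a chat input field at the bottom of the page
question = st.chat_input("What's up?")

if question:
    # Draw the question as the "user" message ...
    with st.chat_message("user"):
        st.markdown(question)
    # ... and draw a placeholder answer, since no LLM is wired up yet
    with st.chat_message("assistant"):
        st.markdown(f"You asked: {question}")
```

You would start it the same way as the first step, with the streamlit run command pointed at this file.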
So right now there is no connection to any LLM, so it is still just UI, but we will extend the application with each of these steps. All we need to do is copy the Streamlit command line again, stop what we have up and running, and start application two. Now you can see we have an input field that asks "What's up?", and we can ask a question. But the only thing the application is doing is printing out what the question was. And it doesn't have any history yet, but we come to that later.
So let's stop that application again and go back to our instructions. In order to remember the chatbot interactions, we use Streamlit's session state component. A Streamlit application works like this: it is a Python script, and it is executed from top to bottom. But the next time it is executed, no state is stored by default. In order to store state, we leverage components like the session state component, where we keep the questions and the answers that we got so far. We will have multiple questions, and it is to our benefit to have the whole history of the conversation. So let's copy the Streamlit command for app three. We can also have a look at the code in GitHub. This is what we have already spoken about: the input field for the question, and then the session state component, where we append our messages so they stay accessible across each execution of the Python script. So let's execute our app three. Now I add a test one, add a test two, add a test three. Okay, so we have the functionality to store the state implemented.
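A minimal sketch of that session state pattern, again with a hypothetical file name, could look like this:

```python
# app_3.py -- hypothetical file name; sketch of persisting the chat history
import streamlit as st

st.title("Chatbot")

# st.session_state survives across reruns of the script,
# so we keep the whole conversation in a "messages" list
if "messages" not in st.session_state:
    st.session_state.messages = []

# Re-draw every stored message on each run of the script
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

question = st.chat_input("What's up?")

if question:
    # Append the new question so it is still there on the next rerun
    st.session_state.messages.append({"role": "user", "content": question})
    with st.chat_message("user"):
        st.markdown(question)
```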
Let's go to the instructions and integrate our application now with OpenAI, with the chat models available from OpenAI. What we use here is cache_data, which is also a Streamlit component, in order to cache the prompt template that we leverage in our chatbot application. The prompt consists of a template, and in a prompt you define what the large language model should do: what the role of the large language model is and how the LLM should behave. In this case it is: you are a helpful AI assistant tasked to answer the user's question; you are friendly and you answer extensively with multiple sentences; you prefer to use bullet points. Then within the prompt there is a placeholder, and this is where the question goes in. The whole prompt is what we send over to the LLM. So we send over the question, we send over instructions, and as you will see later, we also send over context about our private data.
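As a sketch of what that cached prompt template might look like: the exact template wording, the function name, and the use of LangChain's ChatPromptTemplate are assumptions based on the description above.

```python
import streamlit as st
from langchain.prompts import ChatPromptTemplate

@st.cache_data
def load_prompt():
    # The instructions tell the model what role to take and how to behave;
    # the {question} placeholder is filled in with the user's input later
    template = """You are a helpful AI assistant tasked to answer the user's question.
You are friendly and you answer extensively with multiple sentences.
You prefer to use bullet points.

QUESTION:
{question}

YOUR ANSWER:"""
    return ChatPromptTemplate.from_messages([("system", template)])

prompt = load_prompt()
```

Caching the template means it is built once and reused across the script reruns we saw earlier, instead of being recreated on every interaction.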
Okay, so with this we have our chat prompt template. Another thing we need to do is load the chat model, and for that we use the ChatOpenAI component from LangChain, the framework. This component requires us to provide some properties.

The first important property is the model that we want to use from OpenAI. You might know that OpenAI offers a number of models; we use GPT-3.5 Turbo. There is also GPT-4 and other large language models that we could use, but for this workshop we use this one.

Then there is a temperature, and you might wonder: what is this temperature about? You give the temperature as a number between zero and one. Zero means that you do not give the LLM any freedom to come up with a response: the response should be really on point, and the LLM shouldn't hallucinate too much or generate a random response. The higher the number, so if it's one, the more you allow the LLM to give you a more random response, and it might be the case that the response you get from the LLM is wrong or not to the point. So the lower the number, the more accurate the response will be.

And then we enable streaming, due to the fact that the LLM is a large language model, a machine learning model, and what an LLM does is a prediction of tokens. We send over a prompt, and with this prompt the LLM generates a response, but it predicts the response token by token. In order to see the response as it is being generated, we want to stream it in our UI, and that is why we set streaming to true. We will leverage that later when we implement the streaming functionality in our chatbot.
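A sketch of loading the chat model with these properties could look like this; it assumes the langchain-openai package is installed, that OPENAI_API_KEY is set in the environment, and the exact temperature value is illustrative:

```python
from langchain_openai import ChatOpenAI

chat_model = ChatOpenAI(
    model="gpt-3.5-turbo",  # the OpenAI model used in this workshop
    temperature=0.3,        # low value = less random, more on-point answers
    streaming=True,         # stream the answer token by token into the UI
)
```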