So what we want from the AI is formatted as JSON. This is really cool. Sometimes it takes a little bit of coercing, but you can tell these AI models, ChatGPT and whatever, "Okay, I want this data, but give it to me formatted like X, Y, and Z." So you can say, "Give me the output formatted as JSON." And I've got it to a point where I can confidently JSON.parse the output from the AI. I've still wrapped a try/catch around it, just in case. But after running it probably 300 times, it's never not given me straight-up JSON.
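As a rough illustration of the try/catch idea, here is a minimal sketch. The helper name `parseModelReply` and the sample replies are made up for this example; they are not from the actual project.

```javascript
// Hypothetical helper: parse a model reply that is expected to be JSON.
// Returns a result object instead of throwing, so the caller can retry or log.
function parseModelReply(replyText) {
  try {
    return { ok: true, data: JSON.parse(replyText) };
  } catch (error) {
    // JSON.parse throws a SyntaxError when the text is not valid JSON.
    return { ok: false, error: error.message, raw: replyText };
  }
}

// Made-up sample replies: one clean JSON reply, one with chatty text around it.
const goodReply = parseModelReply('{"summary": ["AI returns JSON"]}');
const badReply = parseModelReply("Sure! Here is your JSON: {...}");

console.log(goodReply.ok, badReply.ok);
```

Returning a result object rather than letting the error propagate makes it easy to retry the request on the rare occasion the reply is not clean JSON.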
So we have the transcript. Then what we do is we say, "Summarize the podcast transcript into very succinct bullet points, each containing a few words." The actual prompt is much longer and takes a little bit of time to explain, but basically you give it the timestamp, you give it the speaker, and you give it what was said. They're called utterances.
However, there's a bit of a problem in that these models are only allowed to take a certain amount of input, measured in tokens. A token is kind of like a word, but a little bit different: periods and quotation marks are tokens as well. A one-hour podcast is about 15,000 tokens, and that's beyond the limit of most accessible AI models. GPT-3.5 has a 4,000-token limit, and GPT-4 is about 8,000 tokens. Then there are a couple more, but these last three are not accessible to most mortals right now. Anthropic is saying they're going to allow you to use 100,000 tokens, which is going to be wild, because you could literally send it your entire codebase. Well, it depends on how big your codebase is, but you can send it quite a bit of context for it to actually understand how it works.
But here we are. Even if you've got the money for GPT-4, you only have 8,000 tokens, and that includes the reply it's sending you. So in reality you can only send it about 6,000, and we have 15,000.
So the answer to AI not being able to fit it is AI. It's kind of scary that the answer to a lot of AI problems is also AI, but the way it works is you condense, or summarize, what you have. We take the input transcript, which is how it came out of my mouth, and we say, "Please condense this to be about 80% shorter or 50% shorter or whatever, but do not give up any details." Surprisingly, I say a lot of filler words, and it can do a really good job of bringing it down to 50 or 30% of the actual input without getting rid of any details.
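To make the budget arithmetic concrete, here is a small sketch. It assumes a crude estimate of about four characters per token, which is only a rule of thumb and not a real tokenizer. The function names and sample utterances are hypothetical.

```javascript
// Rough heuristic (an assumption, not a real tokenizer): ~4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Format one utterance as "timestamp speaker: text" for the prompt.
function formatUtterance({ timestamp, speaker, text }) {
  return `${timestamp} ${speaker}: ${text}`;
}

// Check whether a prompt fits once room is reserved for the model's reply.
function fitsBudget(promptText, contextLimit, replyReserve) {
  return estimateTokens(promptText) <= contextLimit - replyReserve;
}

// Made-up sample utterances, only for illustration.
const utterances = [
  { timestamp: "00:00", speaker: "Host", text: "Welcome to the show." },
  { timestamp: "00:05", speaker: "Guest", text: "Thanks for having me." },
];
const transcript = utterances.map(formatUtterance).join("\n");

// Numbers from the talk: an 8,000-token limit with about 2,000 kept for the reply.
console.log(transcript);
console.log(fitsBudget(transcript, 8000, 2000));
```

With those numbers the usable input is about 6,000 tokens, which is why a transcript of roughly 15,000 tokens has to be condensed before it can be sent.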
I just kept reading through them and I'm like, yeah, it didn't really leave anything out. At that point you have the transcript that's been condensed: every utterance is smaller without leaving out any information. Then we write this massive prompt that says, "Summarize the provided transcript into succinct," blah, blah, blah. "Additionally, please create the following for the episode": a one-to-two-sentence description, tweets, blah, blah, blah, all kinds of information. "Return each of them in JSON so that it looks like that."
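A prompt like that could be assembled along these lines. This is only a sketch: the field names in `exampleShape` and the prompt wording are placeholders, not the actual schema or prompt used for the show.

```javascript
// Hypothetical output shape; these field names are placeholders.
const exampleShape = {
  summary: ["succinct bullet point"],
  description: "One to two sentences about the episode.",
  tweets: ["tweet text"],
};

// Build the prompt: instructions, the JSON shape to imitate, then the transcript.
function buildPrompt(condensedTranscript) {
  return [
    "Summarize the provided transcript into succinct bullet points.",
    "Additionally, create a one-to-two-sentence description and tweets for the episode.",
    "Return everything as JSON shaped like this:",
    JSON.stringify(exampleShape, null, 2),
    "Transcript:",
    condensedTranscript,
  ].join("\n\n");
}

// List any expected keys that are missing from a parsed reply.
function missingKeys(reply) {
  return Object.keys(exampleShape).filter((key) => !(key in reply));
}

const episodePrompt = buildPrompt("00:00 Host: Welcome to the show.");
console.log(episodePrompt);
```

Showing the model a literal example of the JSON you want is what lets you check the reply afterwards, for instance with `missingKeys` on the parsed object.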