Video Summary and Transcription
This talk explores the ways AI is being used to shape the future of applications. It emphasizes the importance of an AI-first approach and the potential for AI to enhance various industries, such as aviation. The talk also contrasts the limitations of the AI-on-top approach with the continuous learning and user-centric focus of the AI-first approach. It discusses the importance of building trust through safety, transparency, and browser-based processing, and highlights the potential of AI to address user experience issues and improve accessibility.
1. Building AI Applications for the Future
I'm Evan Seaward, head of engineering at Handsontable, and today I want to talk about exploring the ways we are building with AI and creating AI applications that can shape the future. Currently, many AI applications are limited to chatbots, but we can do so much more. By prioritizing AI and adopting an AI-first approach, we can redefine the future of applications and push boundaries.
I'm Evan Seaward, head of engineering at Handsontable. We had a great introduction to what we are, but in short, we're a data editor with a really cool spreadsheet UI and UX. That's not what I'm here to talk to you about today, though. What I want to talk about is this: I think that through a change of mindset, we can rethink the ways we are building with AI, and together build AI applications that create the future. That's what I'm trying to talk about today, but it's a bit difficult to explain in the abstract, so let's instead go through it together.
Actually, I'm on the wrong slide; this is slightly broken. I'm not really happy with the current state of the future, and I believe many of you might share this sentiment with me. I remember recently watching Back to the Future, and I was quite sad when I realised that we were meant to have flying cars by the year 2015, and it's been nearly ten years since then. I was wondering to myself, where is my hover car? Personally, I'm really into planes, and flying, and space, and things like this, and I also remembered that it's been more than 50 years since the last crewed flight to the moon. What is this all tying into? It is a good question, but we're going to get there.
One funny thing about this AI image that I generated is that it made all the kids in the picture hold the steering wheel. I don't own a hover car, and I don't have a house on the moon, nor do I vacation on Alpha Centauri, and sometimes it feels like the best you might achieve is building a supercomputer that calculates the answer to the meaning of life, the universe, and everything. So where is AI taking us today? Most people building with AI, I would say, are just building chatbots, which has very limited potential: if all you do is put a chatbot into your application, it's still just a chatbot. ChatGPT was probably the fastest-growing consumer app ever; it reached 100 million users in about two months, which is crazy. That's why there's so much traction behind AI, why everything is growing so rapidly, and why everybody is doing this. But I think we can do much more than chatbots. Is a chatbot really the pinnacle of what we can do in 2024?
This is why I was talking about hover cars and space before: I believe we can really push the boundaries, and when building with AI, we should be asking how we can genuinely change the future, because if all we're doing is adding a chatbot to our app, we're not doing that. So aim higher: use AI to enhance our lives and solve real-world problems, and reignite our push for innovation. A really good precedent to look back at here is the mobile-first shift. It's a little bit similar, even if the parallel is hard to draw at first: applications that adopted mobile first reshaped the industry and were able to move much quicker, while applications that bolted mobile on top as an afterthought didn't work very well. This is the shift we are starting to see again. As we develop applications now, we like to think we're somewhat mobile first, but most of the time mobile is still an afterthought; the difference is that we now know the UX patterns and design patterns that work well. What we need to figure out is how to do the same with AI, and how to build applications that really take advantage of all of it. Another way of framing it is that we need an AI-first approach, and if we figure this out, we can define a new future for applications and AI in general. We need to prioritise AI to lead us into this new future of applications and not fall behind like a lot of the companies that never really adopted mobile and stayed in their old ways. The simplest way of boiling it down is that there are two ways. There's building on top, which is how most people might have been introduced to ChatGPT.
2. Exploring AI Interfaces and Enhancing Industries
An AI-first approach involves continuously evolving and assuming that what we build will change. Similar to the mobile-first shift, we need to prioritize interfaces with AI and explore ways to improve various industries, such as aviation. By embedding models and using voice transcription, we can enhance pilots' jobs and push the boundaries of AI applications.
I remember when ChatGPT came out and got really popular, the joke was: let's put it in our application, and now we have an AI start-up. But that doesn't involve much imagination about how we can actually go ahead and build an application, and I think it's very difficult to figure this out yet because nobody's really doing it much; we just need to find our way. An AI-first approach basically means we need to assume the AI is going to continuously evolve and change. It's hard to predict the future, but we need to assume that what we're building is going to change. I think it's a little bit like the mobile-first shift from before.
Then there's building on top, adding things in half-baked, even though we've probably all successfully shipped a website with mobile-first added in last. This photo is me many years ago, at age 16: before I even decided to drive a car, I decided to fly a plane. It's fascinating to think that there could be 16-year-olds above us right now flying planes; most of us don't really know that, and it probably scares a few people. What's interesting is that we wouldn't really trust a 16-year-old kid to go and fly a plane, yet it's happening. So the question is: would we trust an AI to completely fly our planes and all those kinds of things? I personally would, but it probably terrifies a lot of people. What I'm going to do now is show a little thing I was building.
What's interesting about this is many things. What I originally wanted to do for this talk was show you how to take my voice, split it up into little two-second chunks, send it off into the cloud to get it transcribed, and then talk to, say, ChatGPT or whatever LLM we might want. But I was afraid of the connectivity here and whether it would fail, so instead, what I ended up doing, as you can probably see in this little browser tab with voice recording going on, is that I have a model actually embedded in my browser, and it's going to transcribe my voice as I'm talking. I might need to duck down a tiny bit to actually talk to the computer as I do this. So what is this? The idea behind it was: I like flying planes, and I was curious what a pilot would need from a co-pilot. I know as a programmer what I would want from GitHub Copilot, but how could we improve a pilot's job while flying? This is just a hypothetical. It's a little bit hard to imagine what a pilot might be doing in a plane, but usually they have a tablet strapped to their leg with all the data on their flight, so imagine this is strapped to the pilot's leg. Hopefully this works fine. "Could you please tell me the flight plan today?" Okay, the transcription was bad, so I'll start again. "Could you please tell me the flight plan today?" Yeah, I need to get a bit closer. "Could you please tell me the flight plan today?" Okay, so that's simple: I'm just talking to an LLM. But that's not really what I'm trying to talk about; I'm saying get away from chats, and so far I've just shown you a chat. Let's go a little bit further. What I'm trying to think about is how we can use interfaces with AI. But one more note: what's really cool about this is that the transcription is happening purely in your browser, so in theory you could do this completely offline if you also had an LLM embedded in the browser or even on your computer. Okay, so the next thing I'm going to do is... I'm looking at two screens. Okay. "Could you run me through the pre..." I'm clicking the wrong button; this is my debugging chat. "Could you please run me through the preflight..." I missed the button. There we go. No, I didn't.
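For reference, here is a minimal sketch of how in-browser transcription like this can be wired up with Transformers.js and a Whisper model. This is a reconstruction under stated assumptions, not the demo's actual code: the package, the Xenova/whisper-base model name, and the 16 kHz resampling are assumptions based on how Whisper and Transformers.js are documented to work.

```ts
// Minimal in-browser speech-to-text sketch using Transformers.js.
// Assumes the @xenova/transformers package and the Xenova/whisper-base model;
// the talk's actual demo may differ in details.
import { pipeline } from '@xenova/transformers';

async function transcribeFromMic(seconds = 2): Promise<string> {
  // Load a Whisper model once; the weights are downloaded and cached in the browser.
  const transcriber = await pipeline(
    'automatic-speech-recognition',
    'Xenova/whisper-base'
  );

  // Record a short clip from the microphone.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);

  recorder.start();
  await new Promise((r) => setTimeout(r, seconds * 1000));
  const stopped = new Promise((r) => (recorder.onstop = r));
  recorder.stop();
  await stopped;
  stream.getTracks().forEach((t) => t.stop()); // release the mic

  // Decode the recording and resample to the 16 kHz mono PCM Whisper expects.
  const audioCtx = new AudioContext({ sampleRate: 16000 });
  const buffer = await audioCtx.decodeAudioData(
    await new Blob(chunks).arrayBuffer()
  );
  const pcm = buffer.getChannelData(0);

  // Run the model entirely client-side; the audio never leaves the machine.
  const output = await transcriber(pcm);
  return (output as { text: string }).text;
}

transcribeFromMic().then((text) => console.log('Heard:', text));
```

Because the model weights are cached by the browser after the first load, later sessions can run without a network connection at all, which is what makes the offline scenario the speaker mentions plausible.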
3. Expanding AI Capabilities in Aviation
The AI acts as a copilot, assisting the pilot in various tasks such as preflight checklists, communication with air traffic control, and in-flight monitoring. In case of emergencies, the AI can take control, allowing the pilot to focus on the emergency checklist. This demonstration showcases the potential of AI in aviation.
Oh, God. "Could you please run me through the preflight checklist?" It's going to be good. There we go. So now you can imagine the pilot wants to go through his checklist, and the AI is responding. I've tried to keep the UI simple so people can follow along, but anyway.
So now I go ahead with the flight. "Could you please announce departure to ATC?" Good enough. I have a fallback command in case the chat isn't working perfectly up here on stage. So now you can imagine that if we were actually all flying together in a plane, the AI could be communicating with air traffic control for us. But we're not in a plane; there are a few too many people here today to do that. The next step would be: "Could you do in-flight monitoring for me?" I keep forgetting that I need to press this button; detecting when the user stops talking was a bit out of scope for my presentation. So now you can imagine we're in the sky flying a plane and the AI is monitoring all our systems. That's the idea: this AI is going to be like a copilot; it's going to assist you. So let's stage a little disaster. You can imagine at this point something has gone very wrong in the plane. Hopefully it's not foreshadowing for my trip back to Wroclaw in a few days. Now the AI would take control of a lot of things, like talking to air traffic control for me, so I as a pilot would be able to just run through my emergency checklist and not really have to worry about too many other things. Okay, that's the quick little cool thing I wanted to show. That's me again.
4. AI-First vs. On-Top Approach
The AI-on-top approach has limited capabilities, fixed responses, and lacks continuous learning. It often results in outdated, ineffective applications. In contrast, the AI-first approach continuously learns from data, improves accuracy and decision-making, and aims to augment the user's abilities. Applications built with an AI-first mindset actively take the initiative to assist users, going beyond simple chatbot functionality. This approach paves the way for fully autonomous AI in the future.
Okay. There we go.
So, earlier I mentioned that there are two strategies: building on top, and AI first. What I was wondering is: what would this demo look like with a built-on-top strategy, and what would it look like with an AI-first mindset? In the building-AI-on-top mindset, AI is added as an afterthought. What I was demoing isn't quite the on-top part, but anyway, it's a thought experiment. An AI built with the on-top approach would have limited capabilities, fixed responses, and a lack of continuous learning.
It often results in outdated, ineffective applications. Think of a lot of start-ups right now that are adding AI in: all they really do is add a prompt, and they get back whatever the pre-trained model responds with. This on-top approach doesn't think ahead to the future or use the full potential of the capabilities. A lot of the time it's just a glorified chatbot, or a cobbling-together of many different chatbots, image generators, and other such things.
So what would the AI-first approach to this idea look like? I tried to build the demo in an AI-first way, and personally my next step is to take this and actually hook it up to a flight simulator to make it work the way I want. But the whole point is that it augments the pilot's abilities, and I think "augment" is the key word we need to think about when building AI in the future: we're trying to augment the user and push their abilities. With this AI-first approach, the system continuously learns from data, continuously improving its accuracy and decision-making over time as it retrains on itself. It's not a chatbot; it goes beyond that. The AI tries to take the initiative, and we need to build applications that take the initiative. It shouldn't be: I ask the AI a question and it responds to me. Instead, based on parameters and the scenario, it should be actively trying to help me. This is what I mean by augmentation, and one day, in a lot of cases, AI will be able to be fully autonomous, which is what I'm very excited about.
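To make "taking the initiative" concrete, here is a hedged sketch of what such a loop could look like. Everything domain-specific is invented for illustration: the telemetry shape, the thresholds, and the readTelemetry() and notifyPilot() helpers are hypothetical; only the generateText() call follows the AI SDK's documented API.

```ts
// Hypothetical sketch of initiative-taking: instead of waiting for a
// question, the app watches the scenario and speaks up on its own.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

interface Telemetry {
  oilPressurePsi: number;
  fuelRemainingGal: number;
}

declare function readTelemetry(): Promise<Telemetry>; // assumed data source
declare function notifyPilot(message: string): void;  // assumed UI hook

async function monitorLoop(): Promise<never> {
  while (true) {
    const t = await readTelemetry();

    // The application, not the user, decides when the AI gets involved.
    if (t.oilPressurePsi < 25 || t.fuelRemainingGal < 10) {
      const { text } = await generateText({
        model: openai('gpt-4o'),
        system:
          'You are a co-pilot assistant. Given abnormal telemetry, tell the ' +
          'pilot what to check first, in one short sentence.',
        prompt: JSON.stringify(t),
      });
      notifyPilot(text); // surfaced proactively; no chat box involved
    }
    await new Promise((r) => setTimeout(r, 5000)); // poll every 5 seconds
  }
}
```

The design point is that the LLM call sits behind an application-owned trigger rather than a text box, which is the difference between a chatbot and an augmenting co-pilot.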
5. Building Trust and Embracing Innovation
The AI can continuously learn and be augmented with multimodal capabilities. We should focus on enhancing user experiences and reducing UI complexity. Building AI applications with safety, transparency, and browser-based processing can ensure trust and avoid burdening users. Rather than relying on horizontal interfaces, we should adopt vertical solutions for specific user problems. Embracing bold innovation and transformative AI applications will shape the future.
So how does that set us up for the idea I was showing? The AI is going to continuously learn, and you could even set up a multimodal AI with vision embedded. Imagine it in the plane with the pilot: it could be viewing the cockpit and reading the instruments, so if there were any disparity between the readings, it could make decisions based on that. Really try to think outside the box. Taken far enough, this kind of system starts to sound like Skynet, which we don't really want. Well, it depends on the person, I guess.
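As a hedged sketch of what that multimodal cross-check could look like: the camera hook grabCockpitFrame() and the prompt are assumptions for illustration, while the mixed text-and-image message shape follows the AI SDK's documented multimodal API.

```ts
// Sketch: asking a vision-capable model to cross-check cockpit instruments
// against the avionics data feed. grabCockpitFrame() is hypothetical.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

declare function grabCockpitFrame(): Promise<Uint8Array>; // assumed camera feed

async function crossCheckInstruments(reportedAltitudeFt: number) {
  const frame = await grabCockpitFrame();

  const { text } = await generateText({
    model: openai('gpt-4o'), // any vision-capable model
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text:
              `Avionics report ${reportedAltitudeFt} ft. Read the altimeter ` +
              'in this cockpit photo and flag any disparity.',
          },
          { type: 'image', image: frame }, // raw image bytes
        ],
      },
    ],
  });

  return text; // e.g. a warning if the photo disagrees with the data feed
}
```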
Anyway, so what are some of the ways we can start building AI first? We really need to think about augmenting, not building AI chatbots, and about gradually enhancing. A really good example of this augmentation approach is something like GitHub Copilot: while you're writing code, it predicts what you're going to write next. That's a simple way for programmers to understand what augmentation means. We need to start asking what our users are doing in our applications and how we can do it for them ahead of time, reducing what's in the UI and getting rid of a lot of things.
I don't have time to build something live, so I'll quickly show this other version, but the question to ask is: how can I start building something right now for my users? A use case I'm finding in a lot of big companies is that they want to generate a lot of reports, so I think this is one way you can move beyond just building a chatbot right now. This example here is insanely simple: it's a software-request form for a company where someone's job is to automate things. For example, I can just type in "VS Code", it responds, and then it goes ahead and builds out a table, quite a cool table. You can imagine this as a report inside a company. It's a very simple way to start building something slightly more future-facing, and it's more realistic for now than the co-pilot demo I was showing before.
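A minimal sketch of how a report like that could come back as structured, typed data rather than chat text, using the AI SDK's generateObject with a Zod schema. The row fields and prompt here are assumptions for illustration; only the generateObject and Zod usage follow their documented APIs.

```ts
// Sketch: turning a free-text software request into table rows for a report.
// The row schema is invented for illustration.
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const reportSchema = z.object({
  rows: z.array(
    z.object({
      software: z.string(),        // e.g. "VS Code"
      vendor: z.string(),
      license: z.string(),         // e.g. "MIT", "proprietary"
      estimatedMonthlyCost: z.number(),
    })
  ),
});

export async function buildRequestReport(userInput: string) {
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    schema: reportSchema,
    prompt: `Create a software-request report row set for: ${userInput}`,
  });

  // `object.rows` is schema-validated and typed, ready to render as a table.
  return object.rows;
}
```

Constraining the model to a schema is what lets the UI render a real table instead of parsing prose out of a chat reply.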
One thing I want to talk about is the ethics behind this, and how we can really trust the AI, the plane being piloted by AI, because I think the idea I was describing will happen eventually in that domain. And how will developer tools move forward? Will we be replaced? I don't think so. We're going to have a new AI friend, a co-worker we can start yelling at, maybe a rubber-duck friend, we could call it. One way of building safety and transparency is doing a lot of this in the browser: an LLM in your browser, which is essentially what I showed before, so the transcription isn't going to a cloud; it's purely client-side and safe for you.

One thing we need to remember is that the medium is the message. Many AI interfaces currently are overly open-ended, offering a text box that leaves users to figure out what they can do. This approach is misguided and likely a temporary phase, a little blip on our radar right now. In a world dominated by chatbots, we place the burden on users to explore capabilities. What we currently have is what I've heard called a horizontal interface: it does many things, it's a burden on the user to explore it, and they end up doing nothing with it. Instead, we need to look at more vertical solutions, a specific way of solving a user's problem, like making a report, or this copilot idea.

So I think we really need to explore these things and figure out for ourselves: what is the future, and how can we get there? Embrace some bold innovation, build some form of transformative AI application, and really talk to each other about how we can solve this problem. The world I think we are going towards is maybe one where we can do a lot of things purely by voice rather than by using computers, and it's difficult to imagine, because of how we currently interact; it's hard to imagine what the future will be like. I hope you liked the talk. I'm Evan, and I really had a fun time talking in front of all of you. Cheers!
6. Addressing LLM Issues and Co-pilot Integration
To work around issues with current LLMs, it's important to question what you're building and focus on the user experience. Bringing back more context and playing with parameters can help address problems like hallucination. Using the AI SDK and function calling allows the co-pilot application to navigate interfaces and execute APIs.
Let me see, which one do I want to start with? Maybe this one, because there are three questions about it: which model did you use for the speech-to-text? For this, I was using Whisper; I think it's called Whisper base, and it's running in the browser. It's slightly experimental at the moment, so you have to get quite a whacky npm package, but I've posted in Discord what I was using in particular. There's actually not much code needed to get set up right now, and it's really usable. I think that would be useful; you have other social accounts too, so if you could share that, I think it would be really, really helpful.
Let me see. Let's put this one at the top. All right: you described your AI-first approach as something more like a product vision. Is there anything specific that can be done from the developer's point of view? I would say it's questioning what you're building all the time, because I think it's bad as a developer to just build what you're told to design. You should really be thinking about the user experience, and through that, you'll be able to question what's being proposed to you by management and really build something the user can enjoy. Perfect. Thank you.
Let's see. Oh, right, probably a question on many people's minds: how can we work around issues with current LLMs, like hallucination? I mean, in your opening slide with the children's hands, that was kind of creepy. Yeah, it was; I didn't even notice that until today. I thought it was a very funny point. One way is to bring more context back into your AI. You can play around with a lot of different parameters, but I would say provide more context, for example from a vector database, and pull it into your prompt. Perfect. Thank you. All right, let's go for this one, maybe: how does the co-pilot application intercept the LLM response to know which interface to navigate to and what API to execute? Essentially, I'm using what's called the AI SDK on npm. I'm using function calling, so the LLM knows which function to call, and we're able to route to the correct user interface on the screen. All right.
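For context, here is a hedged sketch of what that function-calling routing might look like. The screen names and the navigate() helper are hypothetical stand-ins for the demo's internals; the tool() and generateText() usage follows the AI SDK's documented API.

```ts
// Sketch: letting the LLM choose which screen to show via function calling.
// The screen list and navigate() are invented for illustration.
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

declare function navigate(screen: string): void; // assumed router hook

export async function handlePilotCommand(transcript: string) {
  return generateText({
    model: openai('gpt-4o'),
    tools: {
      showScreen: tool({
        description: 'Navigate the cockpit UI to a specific screen',
        parameters: z.object({
          screen: z.enum(['flightPlan', 'preflightChecklist', 'atc', 'monitoring']),
        }),
        execute: async ({ screen }) => {
          navigate(screen); // the LLM chose the destination; we route the UI
          return `Now showing ${screen}`;
        },
      }),
    },
    // e.g. "Could you run me through the preflight checklist?"
    prompt: transcript,
  });
}
```

Because the tool defines an execute handler, the SDK runs it automatically when the model calls it, so the transcribed voice command turns directly into a UI navigation.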
7. AI-generated UI and Accessibility
The AI can generate the UI or pick from predefined components. AI can help make applications more accessible, especially for those who can't use touchscreens or keyboards. While AI won't solve every accessibility problem, it can make a significant difference. AI's ability to learn quickly raises questions about its role in replacing human skills.
All right, I'm trying to find questions that have nothing to do with the model you used, because you already answered that. Maybe this one: does the AI generate the UI, or does it pick from predefined components? Currently, my demo is a bit of both: part of it picks from predefined components, and part is generated. I was wondering to myself how different that really is from just having an if/else, but I think a really cool area to explore is streaming a component from the server side, a component that you don't have installed locally.
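That server-streamed-component idea roughly corresponds to the AI SDK's React Server Components support. Here is a hedged sketch: the FlightTable component and the tool are hypothetical, and streamUI is the SDK's documented 'ai/rsc' helper, not necessarily what the demo used.

```tsx
// Sketch (React Server Component): the model decides which component to
// stream to the client. FlightTable and its props are invented.
import { streamUI } from 'ai/rsc';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { FlightTable } from './flight-table'; // hypothetical component

export async function renderForPrompt(prompt: string) {
  const result = await streamUI({
    model: openai('gpt-4o'),
    prompt,
    // Plain text answers render as a paragraph.
    text: ({ content }) => <p>{content}</p>,
    tools: {
      showFlightTable: {
        description: 'Render the flight plan as a table',
        parameters: z.object({ flightId: z.string() }),
        generate: async function* ({ flightId }) {
          yield <p>Loading flight {flightId}...</p>; // streamed placeholder
          return <FlightTable flightId={flightId} />; // final streamed UI
        },
      },
    },
  });
  return result.value; // a streamable React node for the client
}
```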
All right, we're back to the accessibility topic: could AI help make applications more accessible, for instance for people who can't use touchscreens or keyboards? Yes, this is actually exactly where I think it could be used. If we could have a world where everything is controllable by voice, we could at least solve the problem for a large majority of people. It won't solve every accessibility problem, because some people can't talk, but it moves the situation forward. Bruce, was this your question secretly? Are you secretly popping in questions? You might be. Do we have anything cool?
All right, maybe a fun one to finish: AI can learn faster than I ever could, so why should I teach it what I'm best at? It's probably going to replace you either way, I would say. Currently, a lot of people are saying that in the next few years everybody will be replaced. I'm unsure what's going to happen, but I guess we just have to have fun, I would say. Right, I love that. Let's give it up for Evan one more time. Thanks.