We see, for example, that scrolling down on the results, we had some massive performance issues, with the JS thread averaging, you know, lots of CPU usage. So here the tool is just telling us, oh, you should consider using React DevTools to debug. Sure. Also, since it's a scrollable view, it's just telling us, well, you know, in React Native, you should probably use FlashList instead of FlatList. So all in all, we got the AI to automatically explore. It kind of messed up on liking the image, but we get, at the end, an automated performance audit of our app.
So going back to the slides, I had some videos prepared just in case everything was going wrong. There you go. So I just want to talk about how it works. Basically, you know, you've probably used ChatGPT, Copilot, well, you're probably using it. Come on. But you can also use, of course, the OpenAI APIs or any other model, but I've used OpenAI for this. And the way OpenAI works is you send basically an array of messages to the API and it replies with something. What I actually didn't know about is you can get an answer from the AI asking you to call a JS function that you write, which is pretty cool. Basically, you pass a JSON schema describing the functions you want the AI to call and you can basically enhance the AI with capabilities.
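To make the JSON schema part concrete, here is a minimal sketch of what such function descriptions might look like, using the `tools` shape from the OpenAI Chat Completions API. The function names are the ones mentioned in the talk; the parameter names (`x`, `y`, `direction`, `text`) and the descriptions are assumptions made up for illustration, not the speaker's actual definitions.

```typescript
// Shape of one entry in the Chat Completions "tools" parameter.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>; // a JSON Schema object
  };
}

// Hypothetical schemas for the three phone-interaction functions.
const tools: ToolDefinition[] = [
  {
    type: "function",
    function: {
      name: "tap",
      description: "Tap the screen at the given coordinates.",
      parameters: {
        type: "object",
        properties: {
          x: { type: "number", description: "Horizontal position in pixels" },
          y: { type: "number", description: "Vertical position in pixels" },
        },
        required: ["x", "y"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "scroll",
      description: "Scroll the current view.",
      parameters: {
        type: "object",
        properties: {
          direction: { type: "string", enum: ["up", "down"] },
        },
        required: ["direction"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "type",
      description: "Type text into the focused input.",
      parameters: {
        type: "object",
        properties: {
          text: { type: "string" },
        },
        required: ["text"],
      },
    },
  },
];
```

An array like this would be sent alongside the messages, and the model can then reply with the name of one of these functions plus arguments matching its schema, instead of plain text.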
So for example, I have a function called tap, a function called scroll, a function called type that I implemented myself to interact with the phone, and I ask the AI, hey, explore the app, call one of those. And so in the beginning, essentially, if we have our login screen, for example, with the two inputs filled, I print out a hierarchy of the view, which you can do quite easily on Android. I could also take screenshots, but I chose to do that because it was a bit simpler. And I have the bounds for the view, for example, for the login button, I know that it's clickable. I know the accessibility label, which is login, because, well, the text is login. And so I send that to the AI. I say, well, these are your goals. So what do you want to do? And so the AI just says, OK, I need to click the login button. So I should just tap on those coordinates. And then we repeat until the AI has accomplished its goal. So fairly simple in itself. And it just shows that it's actually really powerful and really easy to use, like, with the OpenAI APIs, you can do so much really cool stuff.
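The loop described here (dump the view hierarchy, send it with the goal, perform the action that comes back, repeat) can be sketched roughly as follows. Everything in this block is hypothetical: the `ViewNode` shape, the `done` action used to stop, and the `readScreen`, `decide`, and `perform` callbacks that stand in for the Android dump, the model call, and the phone interaction. It is kept synchronous for brevity; a real model call would be asynchronous.

```typescript
// Hypothetical shape of one node in the dumped view hierarchy.
interface ViewNode {
  label: string; // accessibility label or visible text
  clickable: boolean;
  bounds: { left: number; top: number; right: number; bottom: number };
}

// Actions mirror the tap/scroll/type functions; "done" is an assumed stop signal.
type Action =
  | { name: "tap"; x: number; y: number }
  | { name: "scroll"; direction: "up" | "down" }
  | { name: "type"; text: string }
  | { name: "done" };

// Center of a view's bounds: a reasonable point to tap.
function centerOf(node: ViewNode): { x: number; y: number } {
  const { left, top, right, bottom } = node.bounds;
  return { x: Math.round((left + right) / 2), y: Math.round((top + bottom) / 2) };
}

// Flatten the hierarchy into text that can be sent as a message.
function describeScreen(nodes: ViewNode[]): string {
  return nodes
    .map((n) => {
      const b = n.bounds;
      return `${n.label} clickable=${n.clickable} bounds=[${b.left},${b.top}][${b.right},${b.bottom}]`;
    })
    .join("\n");
}

// Observe, ask, act, and repeat until the model says it is done
// or the step budget runs out.
function explore(
  readScreen: () => ViewNode[],
  decide: (screen: string, goal: string) => Action,
  perform: (action: Action) => void,
  goal: string,
  maxSteps: number = 20,
): Action[] {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(describeScreen(readScreen()), goal);
    history.push(action);
    if (action.name === "done") break;
    perform(action);
  }
  return history;
}
```

The `maxSteps` cap is a design choice of this sketch, so that a model that never reports being done cannot keep the loop running forever.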
Well, it's hard to talk about AI without talking about hallucinations, though. So I'm just going to talk about some tricky issues that I had to deal with. For example, it happened to me that, you know, I was telling the AI, you know, call either tap, scroll or type. And the AI was like, I'll call the function called input text, which, of course, does not exist. But, you know, since you're talking with an AI just like ChatGPT, basically, you're managing the array of messages yourself, you can just add a message saying, you know, that doesn't exist.
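As a rough sketch of that correction step: before running anything, check the requested function name against the ones that really exist, and if it is unknown, push a message onto the array so the next request tells the model about its mistake. The `Message` type is simplified and the `checkFunctionCall` helper is hypothetical. With the real API, the reply to a function call would more typically be sent as a tool result tied to the call's id; a plain user message is used here only to keep the idea visible.

```typescript
// Simplified chat message; real API messages carry more fields.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// The only functions that are actually implemented.
const KNOWN_FUNCTIONS: string[] = ["tap", "scroll", "type"];

// Returns true if the call can go ahead. Otherwise appends a corrective
// message to the conversation and returns false so the caller can retry.
function checkFunctionCall(messages: Message[], requestedName: string): boolean {
  if (KNOWN_FUNCTIONS.indexOf(requestedName) >= 0) {
    return true;
  }
  messages.push({
    role: "user",
    content:
      `The function "${requestedName}" does not exist. ` +
      `Call one of: ${KNOWN_FUNCTIONS.join(", ")}.`,
  });
  return false;
}
```

A caller would send the updated array back to the model and loop until `checkFunctionCall` returns true, ideally with a retry limit.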