OK, now there's a question that has so many votes, so let's go for this one. How do you make sure that a Chrome-only API also lands in other browsers? We don't want browser lock-in for APIs. Yes, neither do I. So first, as I mentioned before, it is now also starting to land in Edge. We have a lot of conversations with the Mozilla team and the Safari team, asking them to join the conversation early on: what do they think about those APIs, and how can we put them on a standards track? Everything happens in the open, so there's a paper trail of all of those steps, where we ask in the standards-positions repositories what the other browser vendors' opinions are. And in the end, even within Chrome there are gaps: on Chrome for Android, you can't use those APIs today because Android doesn't support running the model yet. Our answer so far is the Firebase AI Logic SDK, where essentially, if the model is supported locally, it uses the local model, and otherwise it runs on the server, which obviously comes with different privacy implications. If your pitch before was that everything is processed locally, then going to the server can of course affect whether you can still run your application that way. Our objective in the end is to make the APIs run on all of our platforms, including Android and ChromeOS; today they work on macOS, Linux, and Windows. And as I said, that's our objective. My personal objective, as I always tell people, is that I'm not doing Chrome DevRel so much as web DevRel. So I hope this becomes an interoperable API relatively soon. Nice. That's nice to hear.
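To make the hybrid local/cloud idea concrete, here is a minimal sketch of feature-detecting the built-in Prompt API and falling back to a server call. The `LanguageModel` shape reflects the API at the time of writing and may change; `promptCloudModel()` is a hypothetical placeholder standing in for whatever server-side path your app uses (for example, via the Firebase AI Logic SDK).

```ts
// Hypothetical helper standing in for a server-side call (e.g. through Firebase AI Logic).
declare function promptCloudModel(prompt: string): Promise<string>;

async function promptHybrid(prompt: string): Promise<string> {
  // Feature-detect the built-in Prompt API; the exact shape may differ by browser version.
  const LanguageModel = (globalThis as any).LanguageModel;
  if (LanguageModel) {
    const availability = await LanguageModel.availability();
    if (availability === 'available') {
      // The on-device model is ready: prompt it locally, keeping the data on the device.
      const session = await LanguageModel.create();
      try {
        return await session.prompt(prompt);
      } finally {
        session.destroy();
      }
    }
    // 'downloadable' / 'downloading' / 'unavailable': fall through to the cloud path.
  }
  // No local model: run the prompt on the server, with the privacy trade-off that implies.
  return promptCloudModel(prompt);
}
```

The design point of the fallback is that the app keeps working on devices that can't run the model, while the local path stays the privacy-preserving default wherever it is available.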
OK, so the next question is: what kind of sacrifices have to be made in terms of performance with these on-device models? Can we expect to be able to run these on mobile devices without issues at some point? So yeah, the models are relatively big: 4.39 gigabytes on disk, which means a minimum of 6 gigabytes of GPU RAM is required to run them. There is an approach called early exit, where essentially you have the different layers of the LLM, and if the model determines that it already has a good enough response, it can exit early, which means it's less computationally expensive. So there's some work to make this happen on mobile as well, where you exit early and accept a little less quality compared to running through all of the layers. But eventually the pipelines get better, the models get better, and devices slowly get better. I know Alex, and he's fully right: it will be a long, long time until this is runnable on a, what was the price tag, $250, I think, average-priced mobile phone. So it will be a long time until we reach those devices. But for those devices, the Firebase SDK is there, and you can still use the cloud. OK, OK. Well, at least at some point, we will have something to run there.
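As a rough illustration of the early-exit idea described above, here is a schematic sketch, not the actual model pipeline: each layer produces a hidden state plus a confidence estimate, and decoding stops as soon as the confidence clears a threshold. The `Layer` interface and its `exitHead()` method are purely illustrative assumptions.

```ts
// Schematic early-exit decoding: purely illustrative, not any browser's real pipeline.
type Hidden = Float32Array;

interface Layer {
  forward(h: Hidden): Hidden; // one transformer layer
  exitHead(h: Hidden): { token: number; confidence: number }; // intermediate prediction head
}

function decodeWithEarlyExit(layers: Layer[], input: Hidden, threshold = 0.9): number {
  let h = input;
  for (const layer of layers) {
    h = layer.forward(h);
    // After each layer, check whether the intermediate prediction is already good enough.
    const { token, confidence } = layer.exitHead(h);
    if (confidence >= threshold) {
      // Good enough: skip the remaining layers, trading a little quality for less compute.
      return token;
    }
  }
  // Fell through all layers: use the final prediction.
  return layers[layers.length - 1].exitHead(h).token;
}
```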