And it's very similar to something along the lines of ChatGPT. It's all centralized and cloud-based, with lots of horsepower in the infrastructure to build those models and to run them, but ultimately it's all in one place. It gets no closer to the user than the nearest piece of that infrastructure.
So client devices may be waiting a while for that workload to be performed. In the case of ChatGPT, for example, you're put in a queue and have to wait your turn to interact with it, because there are limits on what that cloud architecture can accomplish.
The next iteration of that is to build a big model using the high-horsepower infrastructure of the cloud, then push that model out and retrain or augment it with data that comes from more local or regional clients. Take, for example, a store, but think of a massive multinational corporation that sells lots of products at thousands of outlets around the world. They know a lot about the general purchasing behaviors of their audience: when it's cold, people buy jackets; when it's hot, people buy flip-flops. One would hope that's universal, except it's not, because there are certainly regions where having your feet exposed in a flip-flop is considered rude, and so those outlets don't sell many flip-flops. A machine learning model trained on the global data set wouldn't be the best source of recommendations for a population where cultural, weather, or lots of other factors are in play, and the people building that one central model don't have access to that local context.
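To make that central-plus-regional pattern a little more concrete, here is a rough sketch only: it assumes the cloud tier publishes its trained recommendation model as a TensorFlow.js artifact, and an edge tier fine-tunes a small head on regional purchase data. The URL, the layer layout, the category count, and the xsRegional/ysRegional tensors are all hypothetical placeholders, not anything from the talk.

```javascript
import * as tf from '@tensorflow/tfjs';

async function buildRegionalModel(xsRegional, ysRegional) {
  // Load the globally trained model that the cloud tier published (hypothetical URL).
  const globalModel = await tf.loadLayersModel('https://cdn.example.com/global-recs/model.json');

  // Freeze the cloud-trained layers so the regional data only shapes the new head.
  for (const layer of globalModel.layers) layer.trainable = false;

  // Use the penultimate layer's output as a feature embedding.
  const featureLayer = globalModel.layers[globalModel.layers.length - 2];
  const featureExtractor = tf.model({inputs: globalModel.inputs, outputs: featureLayer.output});

  // Small regional head: maps the global features to locally relevant product categories.
  const head = tf.sequential({
    layers: [
      tf.layers.dense({inputShape: [featureLayer.outputShape[1]], units: 32, activation: 'relu'}),
      tf.layers.dense({units: 10, activation: 'softmax'}), // 10 regional categories, purely illustrative
    ],
  });
  head.compile({optimizer: tf.train.adam(1e-3), loss: 'categoricalCrossentropy', metrics: ['accuracy']});

  // Fine-tune on the regional data; the heavy lifting already happened in the cloud.
  const features = featureExtractor.predict(xsRegional);
  await head.fit(features, ysRegional, {epochs: 5, batchSize: 32});

  return {featureExtractor, head};
}
```

A recommendation call at the edge then runs featureExtractor.predict followed by head.predict, with no round trip back to the cloud.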
So you take that model, you ship it out to the edge, and then you retrain it or augment it with local or regional data, and that offers a more customized experience. You did most of the work in the cloud on that high-horsepower engine, and now at the edge you can use more discrete resources to retrain and still deliver results to clients in a timely manner.

But we could build this out even further, because knowledge, much like the human brain, is a hierarchical process. The human brain will take in an image through your eyes and immediately try to classify the shape. I see an outline in the darkness if I'm in the jungle, and the shape looks like it might be a tiger. All I see, though, is the profile of that tiger, the silhouette. The brain looks at the edges of that profile, rotates it around, and asks whether it has ever seen that silhouette before. Then it adds sound as an input: is that what I would consider the growl of a tiger? Is it moving in a way I traditionally associate with a tiger? Is it getting closer to me? The interesting thing about the human brain as it goes through this process is that there are many levels at which it takes in different sensory data, but any single one of them can trigger the chemical reaction that says run, or reach for that gun against the tree, or make a loud noise, or give up because it's over. Hierarchical knowledge, hierarchical classification, and hierarchical recommendation are how our bodies work.

If what's on the screen is the edge-based, two-tier version, you can imagine an iterative solution that adds layer upon layer of knowledge on top of a model that is generated centrally, distributed out, and continues to get more and more refined the closer it gets to the edge. We call this ensemble learning. When we talk about the flow of training a model in the cloud, it is more performant to use, say, TensorFlow written in Python. But the further out you get, the lower the available resources, the closer to the edge, and the more restrictive or sandboxed the environment, the more places you find where you can use JavaScript to continually refine those models and ultimately deploy them. So JavaScript becomes an excellent tool as you get out across the far edge, on-prem, or even running on clients. I can train something using the camera on my phone; I can retrain that model and tell it that the thing it's looking at is in fact a tiger. That model will then live on my phone, and any time it sees a silhouette like that, it will be classified as a tiger, and all of that will happen locally.
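As a minimal sketch of that phone-camera scenario, this is one way on-device retraining can look in the browser with TensorFlow.js: a cloud-trained MobileNet provides the features, and a lightweight KNN head is taught locally. The 'camera' element id and the 'tiger' label are illustrative assumptions, not anything prescribed by the talk.

```javascript
import * as tf from '@tensorflow/tfjs';
import * as mobilenet from '@tensorflow-models/mobilenet';
import * as knnClassifier from '@tensorflow-models/knn-classifier';

// Cloud-trained feature extractor; the small classifier head lives and learns on the device.
const net = await mobilenet.load();
const classifier = knnClassifier.create();
const video = document.getElementById('camera'); // <video> element fed by the phone camera

// "Teach" the local model: grab a frame and attach the label I choose.
function addExample(label) {
  const embedding = net.infer(tf.browser.fromPixels(video), true); // true => embedding, not logits
  classifier.addExample(embedding, label);
}

// Classify a new frame entirely on the device, with no round trip to the cloud.
async function classifyFrame() {
  const embedding = net.infer(tf.browser.fromPixels(video), true);
  const result = await classifier.predictClass(embedding);
  console.log(result.label, result.confidences[result.label]);
}

// e.g. call addExample('tiger') a few times while pointing the camera at the silhouette,
// then classifyFrame() reports 'tiger' with a confidence score, all locally.
```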
However, you can see that this is a lot of moving parts. It looks like it's synchronous, like it's effectively copy and paste, and when you're creating a PowerPoint presentation you are absolutely copying and pasting, but there are refinements that need to be made at every level of this to ensure that the model is both relevant and performant enough at the edge. The devices calling in may number in the hundreds of thousands, or millions, or billions, and they all want a recommendation. How do you have a model that's localized enough and performant enough to deliver those results out at the edge?

So when we look at the systems and methods by which we would implement a solution like that, there are a lot of considerations. The ML tooling for JavaScript has some limitations in what it can perform, and we're trying to design this iterative system to make as much use of the most appropriate client as we can. On a server you have to look at host resources and at complexity: how many factors am I considering when I put in data that's going to produce a recommendation or a classification? Then there's sheer data volume: how much data am I using to train my model and to test it? Do I have the capacity to store that out at the edge, on a phone or in a browser, or somewhere easily accessed, or am I leveraging terabytes of data in the cloud and pushing it out to the edge? And once I get to the edge, what is my phone capable of? What is the browser capable of? What's that sandbox, and what do those restrictions mean for me? Finally, it comes down to user experience. Is it fast enough? Is it good enough? Is the accuracy high enough? And as I built that model in the cloud with lots of resources, how interoperable is it out at the edge? There's, for example, TensorFlow.
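That interoperability question is concrete in practice: a model trained with TensorFlow in Python has to be converted before TensorFlow.js can load it at the edge. A rough sketch, assuming a Keras model saved as model.h5, the tensorflowjs converter, and a hypothetical URL where the edge tier serves the converted files:

```javascript
// In the cloud pipeline (shell, not browser):
//   pip install tensorflowjs
//   tensorflowjs_converter --input_format=keras model.h5 ./web_model
// This produces model.json plus weight shards that a browser or phone can fetch.

import * as tf from '@tensorflow/tfjs';

// Pick the most capable backend the sandbox allows; fall back to CPU if WebGL is unavailable.
if (!(await tf.setBackend('webgl'))) {
  await tf.setBackend('cpu');
}
console.log('TensorFlow.js backend:', tf.getBackend());

// Load the converted model from the edge tier (hypothetical URL).
const model = await tf.loadLayersModel('https://edge.example.com/web_model/model.json');

// One recommendation/classification request, sized to what the device can handle.
const input = tf.tensor2d([[0.2, 0.7, 0.1]]); // placeholder feature vector; shape must match the model
const prediction = model.predict(input);
prediction.print();
```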