This is also something that we need to think about and standardize. Let's take a few examples of where these challenges pop up. Think about fetching users along with the total number of orders that each user has. That's an aggregate property that is added onto the user. So how do we want to think about this? Do we want to add it as an attribute of the user type, which is what you would do conventionally? But what if we wanted that aggregate to take arguments, so that you only fetch it under a particular condition, say counting only some of the orders that were created? What if you want to layer on other aggregate properties? What if the order service is a totally different service from the user service? How do we want to expose aggregations from one service being, quote unquote, federated into a model definition that comes from a different service altogether? So we need a way to think about the design here.
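As a rough illustration of the aggregate question above, here is a minimal Python sketch of an order count exposed on the user that accepts an optional filter argument. The `Order` class, the sample data, and the `orders_count` function are all hypothetical names invented for this sketch, and the in-memory list stands in for what would be a separate order service.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class Order:
    id: int
    user_id: int
    created_at: date


# Hypothetical sample data standing in for a separate order service.
ORDERS = [
    Order(1, user_id=1, created_at=date(2023, 1, 5)),
    Order(2, user_id=1, created_at=date(2023, 6, 1)),
    Order(3, user_id=2, created_at=date(2023, 6, 2)),
]


def orders_count(user_id: int,
                 where: Optional[Callable[[Order], bool]] = None) -> int:
    """Aggregate on the user: count that user's orders, optionally filtered."""
    return sum(
        1
        for order in ORDERS
        if order.user_id == user_id and (where is None or where(order))
    )


# The plain aggregate, and the same aggregate with a condition as an argument.
total = orders_count(1)
recent = orders_count(1, where=lambda o: o.created_at >= date(2023, 3, 1))
```

The point of the sketch is only that the aggregate takes arguments like any other field; how such a field is federated across services is exactly the open design question.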
Then think about querying a parent by a property of the child. The filter is not on a property of the parent, but on a property of the child. Again, in situations where this is federated, it becomes a complicated thing to design, and you want a standardized design for dealing with these kinds of workloads as well.
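To make "querying a parent by a property of the child" concrete, here is a small Python sketch under assumed data: users are the parents, orders are the children, and the filter is expressed on the order. The field names and the `users_where_any_order` helper are hypothetical.

```python
# Hypothetical parent (user) and child (order) records.
USERS = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Lin"},
    {"id": 3, "name": "Sam"},
]
ORDERS = [
    {"id": 10, "user_id": 1, "status": "shipped"},
    {"id": 11, "user_id": 1, "status": "pending"},
    {"id": 12, "user_id": 2, "status": "pending"},
]


def users_where_any_order(predicate):
    """Return users that have at least one order matching the predicate."""
    matching_user_ids = {o["user_id"] for o in ORDERS if predicate(o)}
    return [u for u in USERS if u["id"] in matching_user_ids]


# The condition is on the child (order status), the result is the parent (user).
shipped_users = users_where_any_order(lambda o: o["status"] == "shipped")
```

When users and orders live in one store this is a simple join; when they are in different services, something has to decide where the filter runs, which is the design problem described above.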
When you think about mutations, you have two broad types. You can do a CRUD-ish style of mutations, and that's nice because it composes well: you can say that I'm inserting an object and a bunch of related objects together, or updating them, or deleting them, and so on. Or you can think about a non-CRUD-ish style, more of a CQRS style of mutation, where you have a unique mutation for each type of action that you want to perform. Although this is nice, it does make it hard to standardize. So you want to be able to handle both of these flavors well.
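The two mutation flavors can be contrasted with a small Python sketch. This is only an illustration under assumptions: the in-memory `DB`, the generic `insert` function, and the `cancel_order` command are hypothetical names, not part of any particular product's API.

```python
from typing import Any, Dict, List

# Hypothetical in-memory store keyed by model name.
DB: Dict[str, List[Dict[str, Any]]] = {"orders": [], "items": []}


def insert(model: str, obj: Dict[str, Any]) -> Dict[str, Any]:
    """CRUD-style: one generic insert that composes across related models.

    Any list-valued key is treated as related child objects, which are
    inserted into the model named by that key and linked via parent_id.
    """
    row = {k: v for k, v in obj.items() if not isinstance(v, list)}
    DB[model].append(row)
    for child_model, children in obj.items():
        if isinstance(children, list):
            for child in children:
                insert(child_model, {**child, "parent_id": row["id"]})
    return row


def cancel_order(order_id: int) -> Dict[str, Any]:
    """CQRS-style: a dedicated, named mutation for one specific action."""
    for order in DB["orders"]:
        if order["id"] == order_id:
            order["status"] = "cancelled"
            return order
    raise KeyError(order_id)


# One composed insert creates the order and its related items together.
insert("orders", {
    "id": 1,
    "status": "placed",
    "items": [{"id": 100, "sku": "A"}, {"id": 101, "sku": "B"}],
})
cancelled = cancel_order(1)
```

The generic `insert` has the same shape for every model, which is what makes it easy to standardize; `cancel_order` has a shape of its own, which is why the command style is harder to standardize.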
So if we want to address these challenges, given that we're going to have to do it anyway when we think about a data API, let's take a look at what benefits we would get from a more rigorous technical foundation. First, let's look at what would happen if we had better query planning.
For this set of examples I'm going to keep using the e-commerce example, where I have users, orders, and logistics: a user service, an order service, and a logistics service. The user model has orders and items, and track order is an API call that performs some business logic to interact with, say, a FedEx API or a UPS API to fetch the order status.
If I look at a simple query like fetching orders and the user for each order, then depending on how that data is laid out, I might have three different types of query plan. I might have the naive plan where I do N+1 calls, with a serialization and deserialization overhead. I might do batching where it's possible to batch, which is like a data loader, where I have to make at least two I/O calls and then perform JSON serialization and deserialization. Or, in the best case, where the data is laid out in a way that lets me make a single call, the JSON serialization and deserialization happens in just one place and I stream that back.
So what does the benefit of this query planning look like in practice? Here we're benchmarking P99s at about 1,000 RPS while increasing the depth of the data being fetched, which comes from two different services. If we let better query planning happen, and don't just default to a data loader style or N+1, there is a massive performance gain that we can get. You can see that the blue is where there's query planning.
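The three query plans described above can be sketched in Python by counting I/O calls. This is a toy model under assumptions: the in-memory `USERS` and `ORDERS`, the `CallCounter`, and every `fetch_*` and `plan_*` function are hypothetical, and the counter only models round trips, not serialization cost.

```python
# Hypothetical data for the user service and the order service.
USERS = {1: {"id": 1, "name": "Ada"}, 2: {"id": 2, "name": "Lin"}}
ORDERS = [
    {"id": 10, "user_id": 1},
    {"id": 11, "user_id": 2},
    {"id": 12, "user_id": 1},
]


class CallCounter:
    """Counts simulated I/O round trips."""

    def __init__(self):
        self.calls = 0


def fetch_orders(counter):
    counter.calls += 1
    return [dict(o) for o in ORDERS]


def fetch_user(counter, user_id):
    counter.calls += 1
    return dict(USERS[user_id])


def fetch_users(counter, user_ids):
    counter.calls += 1
    return {uid: dict(USERS[uid]) for uid in user_ids}


def fetch_orders_with_users(counter):
    # Only possible when the data layout allows one joined call.
    counter.calls += 1
    return [{**o, "user": dict(USERS[o["user_id"]])} for o in ORDERS]


def plan_n_plus_one(counter):
    """Naive plan: one call for orders, then one call per order for its user."""
    orders = fetch_orders(counter)
    return [{**o, "user": fetch_user(counter, o["user_id"])} for o in orders]


def plan_batched(counter):
    """Data-loader style: one call for orders, one batched call for users."""
    orders = fetch_orders(counter)
    users = fetch_users(counter, {o["user_id"] for o in orders})
    return [{**o, "user": users[o["user_id"]]} for o in orders]


def plan_single_call(counter):
    """Best case: a single joined call."""
    return fetch_orders_with_users(counter)


results = {}
call_counts = {}
for plan in (plan_n_plus_one, plan_batched, plan_single_call):
    counter = CallCounter()
    results[plan.__name__] = plan(counter)
    call_counts[plan.__name__] = counter.calls
```

All three plans return the same shape of data; they differ only in how many round trips it takes to get there, which is the cost a query planner tries to minimize.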
In that benchmark, the red is where I just have a GraphQL gateway federating out to two different GraphQL services that fetch from those underlying sources. The benefits add up: I'm fetching just the right slice of data from the underlying database, I'm performing a minimal amount of repeated JSON serialization, and then I'm finally getting to the shape of data that I want to have.
Now think about the benefit that we want on the authorization side. Let's take an example where we have this track order API, where I'm fetching information about the delivery status or the order status for a particular order ID. Typically, if you wanted authorization logic that guaranteed that when I'm tracking an order I'm only looking at an order that I'm allowed to look at as a user, that is, the order belongs to me, we would have to push that authorization logic down and ask the folks who implemented the track order function to implement it. So what I would do today is that, in the controller that does the track order, I would fetch the order information, fetch the user information related to that order, and then check a condition about the current session and how it relates to the order and that order's user. Depending on that condition, I would validate the request, and only then make the API call to the actual logistics service and do whatever business logic needs to be done to return that order status.
This authorization logic, in orange, is what I would like to be able to externalize. We're really doing a graph traversal here, where we're saying something like: if input.order.user is the current user. I would like to evaluate a condition like that, and then let the business logic happen.
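One way to picture externalizing that condition is the following Python sketch, where the rule is declared outside the handler and checked before the business logic runs. Everything here is an assumption for illustration: the `ORDERS` data, the `authorized` decorator, the `Forbidden` exception, and the stubbed `track_order` handler, which returns a fixed status instead of calling a real carrier API.

```python
from functools import wraps

# Hypothetical order data: which user each order belongs to.
ORDERS = {10: {"id": 10, "user_id": 1}, 11: {"id": 11, "user_id": 2}}


class Forbidden(Exception):
    """Raised when the authorization rule rejects a request."""


def order_belongs_to_current_user(inputs, session):
    """Externalized rule: input.order.user is the current user."""
    order = ORDERS.get(inputs["order_id"])
    return order is not None and order["user_id"] == session["user_id"]


def authorized(rule):
    """Evaluate the rule before the wrapped business logic is allowed to run."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(inputs, session):
            if not rule(inputs, session):
                raise Forbidden("not allowed to call " + handler.__name__)
            return handler(inputs, session)
        return wrapper
    return decorator


@authorized(order_belongs_to_current_user)
def track_order(inputs, session):
    # Business logic only. A real handler would call the logistics
    # service here; this stub returns a fixed status.
    return {"order_id": inputs["order_id"], "status": "in_transit"}


own_order = track_order({"order_id": 10}, {"user_id": 1})
```

With this shape, the people implementing `track_order` write no authorization code at all; the traversal from the input order to its user lives in the rule.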