Taz Singh - Moderator
Panelists:
Tanmai Gopal
Ankita Masand
Jonny Green
Sam Scott
The 90-90 rule, as Taz Singh applies it to GraphQL, says that you spend the first 90% of your time building the server, and the last 10% of the work, writing the authorization rules, takes up the other 90% of the time.
Treebo approaches GraphQL authorization with role-based and attribute-based access checks at the GraphQL layer. Users must be authenticated and hold the required role on a specific hotel before they can access sensitive data such as bookings or financial information.
Role-based access control (RBAC) assigns permissions based on user roles within an organization, allowing access to resources based on those roles. Attribute-based access control (ABAC) grants permissions based on policies that combine multiple attributes of users and resources, offering more granular access control.
Common challenges include managing the complexity of permissions as schemas evolve, ensuring data security while maintaining efficient data access and handling, and integrating authorization seamlessly with existing data systems without impacting performance.
Hasura handles GraphQL authorization using a built-in policy engine inspired by database systems with Row Level Security (RLS). This model allows specifying authorization policies at the model level, ensuring consistent application of security rules across all queries that touch the model.
The panel discusses several strategies including returning null for unauthorized fields, using GraphQL errors to indicate unauthorized access, and implementing role-based schema visibility where users only see parts of the schema they are authorized to access.
Organizations ensure consistency by using centralized roles and privilege management systems to define what actions a user can perform, which are validated both on the frontend for user experience and on the backend for security.
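The strategies for unauthorized fields listed above (returning null, returning an error, and role-based schema visibility) can be sketched in a few lines. This is a minimal illustration, not any panelist's implementation: the `FIELD_ROLES` table, the `Result` shape, and the `on_deny` switch are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Result:
    data: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

# Hypothetical field-level rule: which roles may read which fields.
FIELD_ROLES = {"name": {"guest", "manager"}, "revenue": {"manager"}}

def resolve_hotel(hotel: dict, requested: list, role: str, on_deny: str = "null") -> Result:
    """Resolve the requested fields, handling unauthorized ones by strategy.

    on_deny="null":  return None for the field and stay silent.
    on_deny="error": return None and also record an error entry.
    """
    result = Result()
    for name in requested:
        if role in FIELD_ROLES.get(name, set()):
            result.data[name] = hotel[name]
        else:
            result.data[name] = None
            if on_deny == "error":
                result.errors.append({"path": [name], "message": "not authorized"})
    return result

def visible_fields(role: str) -> list:
    """Role-based schema visibility: list only the fields this role may see."""
    return sorted(f for f, roles in FIELD_ROLES.items() if role in roles)
```

Returning null keeps the rest of the response usable, while the error variant tells the client why a field is missing. Schema visibility avoids the question by never advertising the field to that role.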
Hey, everyone! I'm Taz Singh, the founder of Guild.IO, and I'm excited to be joined by a group of experts for this authorization panel. Building a GraphQL server follows the 90-90 rule: you spend the first 90% of the time on the server itself, and the last 10%, writing the authorization rules, takes the other 90% of the time. Today, we will deep dive into this topic with Ankita, Jonny, Sam, and Tanmai. Join us in the discussion room afterwards for more interaction.
Founder introduces GraphQL server authorization panel with experts. Ankita details role-based, attribute-based, and data-based access checks in GraphQL authorization.
Awesome. Hey, everyone. I'm Taz Singh, and I'm the founder of Guild.IO. I'm happy that you're joining us today for this authorization panel. It's probably one of my favorite parts of building a GraphQL server. I feel like it's one of those parts that follows the 90-90 rule of software development. If you aren't familiar with that, it's where you spend the first 90% of your time building a GraphQL server, and the last 10%, writing the authorization rules, accounts for the other 90% of the time. So I couldn't be more thrilled to deep dive into this topic with you today, and I couldn't be happier to be joined by this group of experts.
I'd like to start with Ankita from Treebo, Jonny from Unity Technologies, Sam, the co-founder of Oso, and Tanmai, the CEO of Hasura. After this, we're going to have a discussion room on this topic. So if you have any questions for today's panelists, or just want to chat or hang out, feel free to join us in the discussion room. We'd love to talk to you more there. But without further ado, let's dive into our first question, because I'm sure you'd rather hear from the panelists than from me. Let's go around and talk about how each of you has worked with GraphQL authorization, just to set the stage and provide some topics of discussion for the panel. So let's start with Ankita.
We have two kinds of data: publicly accessible and authenticated access. We check role-based and attribute-based access on the GraphQL layer. In our system, we also have data-based access checks. We ensure consistency across services by handling authorization at the gateway level.
So Ankita, can you describe how you work with GraphQL authorization?
Yeah, sure. Hi, everyone. So to start with, we have two kinds of data. One is publicly accessible: for example, hotel name, location, email, contact. This is just a small piece of the graph, and you can access it even if you're not authenticated. The other kind is data like bookings made at a hotel, creating a booking yourself, or finance-related information about a hotel. To view this information, you should be authenticated, and you should have the required role on that particular hotel, right? So we first do all these role-based access checks, and they sit at the GraphQL layer. We check whether you have a valid role in the system, and only then do we let you through to the GraphQL layer. Otherwise, we error out the entire operation in the context object itself, saying that you're not authorized.
We also have to solve for insecure direct object references: you are a valid authenticated user and you are authorized to access the details of hotel A, but you're not authorized to access the details of hotel B. That means you have to check the role of this user on a resource, right? These are all role-based access checks. Next comes attribute-based access, which means a user has certain permissions. To take an example, say you don't have access to create a booking on a hotel. These kinds of checks we have on GraphQL fields. The third type of authorization check that we have in our system is data-based access checks, which means that you can refund an amount, but how much you can refund depends on some particular business logic. Now, this business logic is in the backend services. On the GraphQL side, we handle role-based access checks and attribute-based access checks. That's the high-level system that we have for authorization.
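The layers Ankita describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Treebo's code: the `ROLES` and `PERMISSIONS` tables, the role names, and all function names are hypothetical. Public fields would simply skip these checks, and data-based checks such as refund limits would stay in the backend services.

```python
class NotAuthorized(Exception):
    pass

# Hypothetical data: the role each user holds on each hotel.
ROLES = {("alice", "hotel-a"): "manager", ("bob", "hotel-a"): "front_desk"}
# Hypothetical mapping from role to field-level permissions.
PERMISSIONS = {
    "manager": {"view_bookings", "create_booking"},
    "front_desk": {"view_bookings"},
}

def build_context(user_id):
    """Context-level check: reject the whole operation if the user holds no role at all."""
    if user_id is None or not any(u == user_id for (u, _hotel) in ROLES):
        raise NotAuthorized("no valid role in the system")
    return {"user_id": user_id}

def role_on(context, hotel_id):
    """Resource-level check: guards against insecure direct object references."""
    role = ROLES.get((context["user_id"], hotel_id))
    if role is None:
        raise NotAuthorized(f"no role on {hotel_id}")
    return role

def create_booking(context, hotel_id, guest):
    """Field-level check: the role on this hotel must carry the permission."""
    role = role_on(context, hotel_id)
    if "create_booking" not in PERMISSIONS[role]:
        raise NotAuthorized("cannot create bookings")
    return {"hotel": hotel_id, "guest": guest}
```

With this data, `alice` can create a booking on `hotel-a` but is rejected on `hotel-b`, and `bob` is rejected on `hotel-a` because his role lacks the permission.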
Awesome. Thank you so much, Ankita. And Jonny?
Absolutely. We deal with authorization from a federation perspective as well. We've approached authorization from both the service level and the gateway level within a federated system. Within Unity, we have a system where we've got the gateway, and behind that sit services, some of which are controlled by us and some of which are controlled by other teams. So, for us, the main focus with authorization is consistency across all the services. We've gone on a journey from doing authorization at the service level, allowing services to control the authorization themselves, to doing it at the gateway level. That gives us a lot more control over the mechanism and how we update the auth mechanism, and it ensures consistency across services. In terms of what services need to worry about, they just need to worry about configuration, and that's pretty much it. As long as that configuration complies with the configuration definition, that's all they need to worry about. They don't need to worry about defining the mechanism at their level.
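One way to picture the split Jonny describes, where services supply only configuration and the gateway owns the mechanism, is the sketch below. The configuration format (a `requires` key per field), the `Gateway` class, and the permission strings are all invented for illustration. The panel does not describe Unity's actual format.

```python
# Hypothetical shared definition: each field's rule has exactly one key, "requires".
ALLOWED_KEYS = {"requires"}

def validate_config(config: dict) -> None:
    """Reject service configuration that does not match the shared definition."""
    for field_name, rule in config.items():
        if set(rule) != ALLOWED_KEYS or not isinstance(rule["requires"], str):
            raise ValueError(f"invalid auth config for {field_name}")

class Gateway:
    """Owns the enforcement mechanism; services only register configuration."""

    def __init__(self):
        self.rules = {}

    def register(self, service: str, config: dict) -> None:
        validate_config(config)
        for field_name, rule in config.items():
            self.rules[field_name] = (service, rule["requires"])

    def authorize(self, field_name: str, permissions: set) -> bool:
        """A single check applied uniformly to every service's fields."""
        _service, required = self.rules[field_name]
        return required in permissions
```

Because the check lives in one place, changing the mechanism means changing the gateway only, and every registered service picks up the new behavior.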
Ankita and Jonny discuss authorization approaches in federated systems. Sam shares insights on working with GraphQL authorization at Oso.
It's all handled at the gateway. We work on an open source framework for authorization. We've worked with Wayfair closely. The flexibility of GraphQL requires rigorous authorization. At Hasura, we provide GraphQL as a service with an authorization policy engine.
It's all handled at the gateway. So that's the angle we come from, and I'm happy to discuss that further.
Awesome. Thanks so much, Jonny. And I almost forgot to mention that Jonny actually has a talk on this later today, so I'm sure he'll do much more of a deep dive during his talk. But thanks so much.
I guess moving on to Sam. Would you like to describe how you've worked with GraphQL authorization?
Yeah, so at Oso, we work on an open source framework for authorization. It's a library you add to your app, it's got a policy language, things like that. We work with many different people on authorization, and more and more we're seeing people who are trying to solve the intersection of authorization and GraphQL. Wayfair is one example: we've worked with them closely on their authorization system, and they're using GraphQL. We've done a lot of writing on best practices for authorization in general, like the different places you can apply it and the different kinds of models you have. So it's been really interesting seeing how that stuff slots into the GraphQL ecosystem, and the best ways to take those concepts and think about them in GraphQL land.
I think, for me, one of the most interesting parts of authorization in GraphQL is that, because you have this incredibly flexible ability to query any piece of data, it really requires you to be quite rigorous with your authorization. In the past you could maybe be a little bit sloppier and just protect an endpoint. Now you're like, actually, I need to make sure that every piece of data that a user could possibly request has been secured. And that really forces you to step up your game, which is, I think, fun.
That's awesome. And last but certainly not least, Tanmai.
Hey, thanks. Thanks, everybody. Great to be here. At Hasura, because we provide GraphQL as a service, as a managed component or a managed API, we have an authorization policy engine that's built into Hasura. And the model that we followed here, or the inspiration that we've taken, has been...
Handling authorization at the gateway. Oso works on a framework for authorization, collaborating with various entities, including Wayfair, on GraphQL authorization practices.
Our databases have had a similar authorization problem for decades. We follow a policy engine at the model level, attaching authorization policies to models. This minimizes redundancy and duplication, keeping policies in one place and allowing for easy changes.
It's fairly similar to another well-known system that exists in the world, which has to deal with a very similar problem: there are lots of different models and you can arbitrarily access anything, right? You can access any shape, any combination, any particular piece of data, or even any computation on that piece of data. And what I'm referring to is a database. Databases have had a very similar authorization problem for decades. And mature databases have a system of authorization called RLS, which is row level security. It's a style of authorization that you can use to do everything from RBAC to ABAC to entity-level access control, and even the almost business-logic type of authorization that Ankita was referring to earlier. So that's the model, or the policy engine, that we follow. What we've done is embed that kind of policy engine at the Hasura level itself, to be able to specify rules like that at the model level.
From a GraphQL point of view, the way that we like to think about authorization, or encourage users to think about authorization, is to think of authorization policies attached to a model, right? So you have users and transactions, or users and orders, or whatever. You're thinking about authorization attached to a model: not at the resolver level, not at the schema level, but at the model level. So now what happens is that whenever that model is touched by a GraphQL query, so you do query user, or query article.user, or query order.user, however you end up accessing that user model, the same authorization policies are applied. So in a way, you're keeping the least amount of redundancy, the least amount of duplication. You're making sure that the policies are specified in one place. And if they change, they change in only one place for that model.
So that's roughly it at a very high level, and we'll get into the details, I guess, over the next few minutes, but that's the approach.
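The idea of attaching a policy to a model rather than to a resolver can be illustrated with a small in-memory sketch. It is not Hasura's engine or API: the tables, the `POLICIES` registry, and the `fetch` helper are assumptions made for the example. The point is only that `query user`, `query article.user`, and `query order.user` all reach the user model through the same policy.

```python
# Hypothetical in-memory "models".
USERS = {1: {"id": 1, "name": "Ada"}, 2: {"id": 2, "name": "Bo"}}
ARTICLES = {10: {"id": 10, "title": "Intro", "user_id": 2}}
ORDERS = {20: {"id": 20, "total": 30, "user_id": 1}}
TABLES = {"user": USERS, "article": ARTICLES, "order": ORDERS}

# One policy per model, defined in exactly one place.
POLICIES = {
    # A session may only read its own user row.
    "user": lambda row, session: row["id"] == session["user_id"],
    # Articles and orders are left open to keep the sketch small.
    "article": lambda row, session: True,
    "order": lambda row, session: True,
}

def fetch(model: str, key: int, session: dict):
    """Every access path goes through here, so the model's policy always runs."""
    row = TABLES[model].get(key)
    if row is None or not POLICIES[model](row, session):
        return None
    return row

def query_user(key, session):
    return fetch("user", key, session)

def query_article_user(article_id, session):
    article = fetch("article", article_id, session)
    return fetch("user", article["user_id"], session) if article else None

def query_order_user(order_id, session):
    order = fetch("order", order_id, session)
    return fetch("user", order["user_id"], session) if order else None
```

Changing the `"user"` entry in `POLICIES` changes the behavior of all three access paths at once, which is the redundancy argument made in the talk.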
Exploring the challenges of GraphQL authorization and the model of authorization policy engine in Hasura, inspired by database security measures like RLS, offering a centralized approach to authorization policies at the model level.
There are various acronyms and methods used in GraphQL authorization, such as RBAC, ABAC, PBAC, and GraphQL Federation. The complexity of applying authorization on top of GraphQL servers is a significant challenge.
Awesome. Thanks so much, Tanmai. And just right off the rip there, I mean, there are so many different acronyms, so many different methods being thrown out. We've got RBAC, which is Role-Based Access Control. We've got ABAC, which is Attribute-Based Access Control. PBAC, which is Policy-Based Access Control. There's GraphQL Federation, where you're trying to align authorization across the services. There are more provider-style models. There's the hotel model that Ankita mentioned. I mean, you can kind of see why I think it's the 90-90 rule of GraphQL servers, where that last 10% takes up 90% of your time, just applying authorization on top. This is awesome.
Exploring GraphQL model-level authorization policies and the complexity of aligning authorization in GraphQL servers efficiently, including various access control models like RBAC, ABAC, PBAC, and challenges of the N plus one problem in data retrieval.
In the GraphQL realm, providing efficiencies is often a challenge, especially when it comes to the N plus one problem. Layering authorization on top of data resolution can further impact efficiency. Let's discuss how experts have tackled this issue and gained efficiency in their GraphQL servers with or without authorization.
Well, anyway, kicking off our next question: something that is often talked about in the GraphQL realm is how to be efficient. In the context of GraphQL servers, that's often framed as the N plus one challenge. Essentially, what people mean by that is that for each level your GraphQL query or operation goes down, that could be one more operation getting stacked on top of another when trying to resolve it. Arguably, layering authorization on top of that data resolution makes it even less efficient, if you're just using conventional logic. I'd love to hear how you've tackled this and how you've tried to gain efficiency when writing your GraphQL servers, with authorization or not.
Exploring challenges in fetching lists of items efficiently in GraphQL, requiring authorization checks to be pushed down to the persistence layer for scalability and improved performance.
When fetching lists of items, it's important to apply a common policy across the list. Unlike an access control ACL model, where access to a resource is determined by a tuple, the challenge arises when accessing a list. To efficiently fetch a list, the authorization check should be pushed down to the persistence layer, ensuring scalability for both single objects and lists.
And I guess let's go in reverse, so let's start with Tanmai on this one.
When we think about the efficiency challenge or the N plus one challenge, it's exactly what I was talking about, right? This problem, I think, shows up in a few specific scenarios, so I'll just take two of them. The first scenario is when you're fetching lists of items, right? You're fetching a list and you want a common policy to be applied across the list of items. It's a little bit different from an ACL-type access control model, where you have a tuple that defines whether you have access to a resource or not. So you have an ID, like the hotel ID that Ankita was talking about, right? Hotel ID 1 or article ID 1. And you know whether a particular user with some role can access that particular entity. So you know very specifically, for a particular resource, whether this user can access it or not. But this problem becomes more challenging when you're accessing a list, right? You want to fetch a list of items. So how are you going to fetch a list of items? You can't fetch one billion elements of a list and then filter those one billion elements at the application level, at the GraphQL level, to decide what data to send back, right? That's not going to be feasible. You're going to have to push down that authorization check to make sure that you're fetching the list efficiently, directly from your persistence layer, from your databases. So understanding that constraint has been important for us when we're thinking about efficiency. Because fetching a single object and fetching a list of items are different cases, and you want to make sure that your authorization system scales well to both of those, right? And, you know, this is obvious in hindsight, but it was just an interesting point there.
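The difference between filtering a list at the application level and pushing the check down to the persistence layer can be shown with SQLite from the Python standard library. The schema (`hotels`, `hotel_roles`) and the function names are assumptions made for this sketch. A real system would generate the filter from its policy definition rather than hand-write it.

```python
import sqlite3

def setup() -> sqlite3.Connection:
    """Create a tiny in-memory database with hotels and per-user roles."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotels (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE hotel_roles (user_id TEXT, hotel_id INTEGER, role TEXT)")
    conn.executemany("INSERT INTO hotels VALUES (?, ?)",
                     [(1, "Hotel A"), (2, "Hotel B"), (3, "Hotel C")])
    conn.executemany("INSERT INTO hotel_roles VALUES (?, ?, ?)",
                     [("alice", 1, "manager"), ("alice", 3, "manager"),
                      ("bob", 2, "manager")])
    return conn

def list_hotels_filter_in_app(conn, user_id):
    """Fetch every row, then check each one: one extra query per row."""
    allowed = []
    rows = conn.execute("SELECT id, name FROM hotels ORDER BY id").fetchall()
    for hotel_id, name in rows:
        ok = conn.execute(
            "SELECT 1 FROM hotel_roles WHERE user_id = ? AND hotel_id = ?",
            (user_id, hotel_id)).fetchone()
        if ok:
            allowed.append(name)
    return allowed

def list_hotels_pushed_down(conn, user_id):
    """Make the policy part of the query so only permitted rows leave the database."""
    rows = conn.execute(
        "SELECT h.name FROM hotels h "
        "WHERE EXISTS (SELECT 1 FROM hotel_roles r "
        "              WHERE r.hotel_id = h.id AND r.user_id = ?) "
        "ORDER BY h.id", (user_id,)).fetchall()
    return [name for (name,) in rows]
```

Both functions return the same hotels, but the first reads the whole table and issues a query per row, while the second issues a single query whose result is already authorized.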
Efficiency gains in GraphQL servers by evaluating authorization rules at the database layer and pushing policies into SQL filters for performance and data safety.
I think when we think about N plus one style situations, what really helps is that kind of model level specification of the authorization rules. So, when you're fetching a parent and a child and applying authorization, you have authorization policy for both objects in a query. Specifying authorization rules at the model level makes it easier to reason about efficiency. For instance, by compiling authorization rules into a single SQL query, redundant rules can be identified and removed, simplifying the fetch process. This integrated approach of model-level policies and predicate pushdown enhances performance by closely coupling data fetching and authorization.
Efficiency gains in GraphQL servers with authorization come from evaluating authorization rules at the database layer for list endpoints. Pushing authorization policies down into SQL filters works whether you use GraphQL or another technology. Applying these pre-filters at the data layer improves performance and also acts as a safety net for the data.
When the authorization rules are specified at the model level, it becomes easy to reason about efficiency. Fetching data becomes the act of authorization itself, eliminating redundancy. This solves the N plus one problem and improves authorization. The close integration of model-level policies and data fetching enhances performance.
When we think about N+1-style situations, what really helps is that model-level specification of the authorization rules. When you're fetching a parent and a child and applying authorization, you have an authorization policy for the parent object and for the child object in the query. When the rules are specified at the model level, it becomes quite easy to reason about efficiency. For example, when speaking to a data system where you can compile the request into a single SQL query that fetches the parent and the child, we're able to take the authorization policy rules and make them part of the data fetch itself, and even do the analysis to remove redundant rules. I'll take a very simple example. You're fetching articles and authors: article.title and article.id, and then also article.author.id, article.author.name, and article.author.email. Now, let's say the constraints for accessing these objects are dependent, meaning I can only access an article if I'm the author. But then you're also accessing the article's author, which is the same check. It's redundant, because the act of fetching that data is itself the act of authorization. So we're able to do static analysis, remove the redundant check, and make it one fetch. That ends up solving the N+1 problem anyway, and it also makes the authorization better: you don't need N+1 calls, or even two. It becomes the same check.
So that model-level specification of policies, integrated with predicate pushdown and integrated deeply with the data fetch, is what matters. The data fetching and the authorization are, in my mind, quite closely coupled, and you want them to be close for performance reasons. That's been the thing that's helped.
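The articles-and-authors example can be sketched in a few lines. This is a minimal illustration, not how any particular engine is implemented: the `POLICIES` table, the `JOIN_EQUIVALENTS` map, and the string-rewriting approach to spotting the redundant check are all assumptions made for the sketch, and a real policy engine would analyse a proper expression tree rather than strings.

```python
import sqlite3

# Hypothetical model-level policies: one row filter per model, written
# against the table name and the current user's id.
POLICIES = {
    "article": "article.author_id = :user_id",
    "author": "author.id = :user_id",
}

# The join condition makes these two column references interchangeable.
JOIN_EQUIVALENTS = {"author.id": "article.author_id"}


def build_query() -> str:
    """Compile the fetch and both policies into a single SQL statement."""
    predicates = []
    for model in ("article", "author"):
        predicate = POLICIES[model]
        # Rewrite through the join so equivalent checks become textually equal.
        for column, same_as in JOIN_EQUIVALENTS.items():
            predicate = predicate.replace(column, same_as)
        if predicate not in predicates:  # drop the redundant check
            predicates.append(predicate)
    return (
        "SELECT article.id, article.title, author.name, author.email "
        "FROM article JOIN author ON author.id = article.author_id "
        "WHERE " + " AND ".join(predicates) + " ORDER BY article.id"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
INSERT INTO author VALUES (1, 'Ada', 'ada@example.com'), (2, 'Bob', 'bob@example.com');
INSERT INTO article VALUES (10, 'Mine', 1), (11, 'Also mine', 1), (12, 'Not mine', 2);
""")

sql = build_query()
rows = conn.execute(sql, {"user_id": 1}).fetchall()  # one fetch, one check
```

Both policies collapse into a single predicate, so the parent, the child, and the authorization are all resolved by one query.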
Integrating model-level policies with predicate pushdown couples data fetching and authorization closely, which is where the efficiency gains come from.
Tanmai and Sam discuss gaining efficiencies in GraphQL servers by evaluating authorization rules at the database layer and pushing them down into SQL filters. This approach provides both a performance win and a safety blanket for apps. Jonny describes a different situation: his team's federated system has no direct access to the databases behind it, so they apply authorization at the gateway, backed by an authorization endpoint shared across the organization.
Awesome. I feel like we could talk for ages on this alone. Tanmai, every time I speak to you, on this panel and on last year's panel, I just want to talk to you so much more. But anyway, I'd love to hear from Sam. Sam, can you talk about how you've gained efficiencies in building GraphQL servers with authorization?
Yeah. I think Tanmai hit on the main thing. It's all about that ability to evaluate the authorization rules at the database layer for list endpoints; that's the one where it comes up all the time. We have a similar concept: we can take your authorization policies and push them down into SQL filters. So whether you're using GraphQL or anything else, you get to leverage those benefits, and that is where the biggest gain comes from. Beyond just being an efficiency thing, it's also a wonderful safety blanket for your apps. By doing it at the data layer, you've effectively already applied a read-only filter to all the data. Anything a user is going to pull from the database has that pre-filter applied, so it's going to be authorized. That gives you both the performance win of, as you say, not fetching a billion rows, and the comfort of knowing that the data you pull is going to be the data the user can see.
That makes a lot of sense. You're attaining an efficiency gain, but you're also minimizing the risk of basic data leakage across different operations. So it makes a ton of sense. Awesome. Jonny?
Yes. For me this is a really interesting chat, because it highlights that different situations, in my opinion, require different solutions. For us, one of the key differences is that we don't have a database. We don't have access to the database, and the services that our federated system talks to are not necessarily controlled by us. So it presents a whole different authorization problem: how do we get that overall, catch-all sort of authorization that everyone can coherently use across the organization? We really thought about this, and we decided that for a catch-all on a federated system, it made a lot of sense to go to the gateway. So kind of the opposite of what Tanmai was saying. We did that for two reasons. One is that we have an endpoint which does authorization, and it lives right next to our gateway. So for performance reasons, it made sense to talk to that directly.
Applying read-only filters at the data layer yields efficiency gains, ensures that only authorized data is retrieved, and prevents data leakage.
Because the authorization endpoint sits next to the gateway, the gateway can call it directly and services don't need to reach across the world to do so. Consistency is the other key factor: teams only define configuration and don't need to worry about the underlying mechanisms.
And basically, services don't need to reach out across the world to talk to this endpoint, so there's a performance gain immediately.
The other reason is consistency. I alluded to it earlier: all services need to care about is configuration, and that's it. They don't need to worry about the mechanisms. Teams onboarding with us don't need to worry about having an SDK for Rust or Node.js or whatever they're using. They just define the configuration in their GraphQL schema, and our gateway understands that and is able to apply the auth mechanism as the catch-all blanket. So it's pretty much the opposite of what Tanmai was saying. But for me, it highlights that what the solution is may be situational.
Tanmai asks Jonny whether authorization at the gateway is mainly about schema visibility or whether it also covers which specific entities a user can access. Jonny explains that data fetching is handled by the services and that their RBAC system ties permissions to the GraphQL type (the entity) and the field group being protected, which keeps authorization coherent and consistent. The same RBAC system is used by others as well. Taz notes that a convention-first approach is the most efficient, followed by configuration, with customization the least efficient. Ankita will share more about achieving efficiencies with GraphQL authorization.
If I may ask Jonny a question, because I'm very curious. The authorization that you're doing at the gateway: is that more about authorization of the schema, about which entities of the schema you can fetch? What I mean is what we call role-based schemas. It's schema visibility, in a sense: certain parts of the schema are visible and certain parts are not. Is that what the authorization covers? And does it also cover things like which specific entities belong to a user, so user ID one is available, user ID two is available? I'm guessing that kind of data fetching is done by the service and the gateway is doing more of a schema visibility thing; that's what it sounded like. What does your authorization look like at the GraphQL gateway?
You're absolutely right, the data fetching is handled by the service. We use an RBAC system. Our authorization is based on the GraphQL type and the field group that we're trying to protect, where the entity is the type. We strongly correlate the two because we have a convention-based approach, and that allows us to define really coherent and consistent authorization: you know the entity, as you alluded to, and you know the field you want to protect. From there we can apply that to our RBAC system.
Makes sense.
And it's not just us using this RBAC system. Other people use it as well, so it's super flexible in that way.
Yeah, that's very interesting. I'm looking forward to learning more in your talk later as well, Jonny. We'll deep dive on that more, because I also have a lot of questions about how it all manifests. But something you touched on there is configuration. The most efficient thing you can do is convention. Failing that, configuration. Failing that, the least efficient thing you can do is arguably customization, so a custom approach everywhere is arguably the least efficient. So I like your convention-first approach, and I'm looking forward to learning more about that.
Ankita, would you like to tell us more about how you're achieving efficiencies with GraphQL authorization? Yeah. Hi.
Ankita's team moves authorization checks from the resolver level to the data source level, so a check runs once per batch rather than once per user, which also helps prevent the N+1 problem. The panel then turns to how unauthorized access manifests on the server: whether the schema is entirely nullable or whether GraphQL errors are thrown. Jonny provides more insight into his team's approach.
Tanmai has already covered most of this, so I'd like to give an example to make it concrete. Let's say there's a front desk manager at a hotel who is trying to fetch details of the users in a booking. But he's not authorized to view, let's say, the preferences of a user, or other details about a user. And one booking has, let's say, 25 users. So you will get a list of 25 users, but he's not authorized to view some particular fields in the query.
Now, one way to go about it is to have authorization checks at the resolver level, which means you check whether this particular role has the privilege to view this particular data. If you do it at the resolver level, you are going to run these checks 25 times, once for every user. What we have done instead is move this check to the data source level, meaning the place where you actually fetch information about these users from downstream services. That means you are going to check only once, if you have batching in place. I also agree with Tanmai that at the persistence layer, on the database side, you should not be fetching a million rows that you don't have access to. We are yet to implement this properly, but we do have checks at the batching level that prevent this N+1 problem.
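A minimal sketch of the difference Ankita describes, using her hotel example. The `UserDataSource` class, the privilege names, and the `checks` counter are hypothetical, and the downstream service call is replaced by generated data so the example stays self-contained.

```python
from dataclasses import dataclass

# Hypothetical privilege table; role and privilege names are illustrative.
PRIVILEGES = {"front_desk_manager": {"user:read_basic"}}


@dataclass
class UserDataSource:
    role: str
    checks: int = 0  # counts how many privilege lookups were performed

    def _allowed(self, privilege):
        self.checks += 1
        return privilege in PRIVILEGES.get(self.role, set())

    def load_many(self, user_ids):
        # One privilege check for the whole batch, not one per user.
        can_see_preferences = self._allowed("user:read_preferences")
        # Stand-in for a single batched call to a downstream service.
        users = [
            {"id": i, "name": f"Guest {i}", "preferences": "late checkout"}
            for i in user_ids
        ]
        if not can_see_preferences:
            for user in users:
                user["preferences"] = None
        return users


source = UserDataSource(role="front_desk_manager")
guests = source.load_many(list(range(1, 26)))  # a booking with 25 users
```

With a per-resolver check the counter would reach 25 for this booking; here it stays at one because the decision is made where the batch is loaded.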
Cool. We've talked a little about how we went about implementing authorization and how you work with it. I'm curious: how does unauthorized access manifest in your GraphQL server? Essentially, how do you signal that someone is not authorized to do something? Is your schema entirely nullable, for example, so that if you don't have access to something it just returns null? Or do you throw a GraphQL error if something is unauthorized? Do you then error out the entire operation, or do you return an error with a path to where the unauthorized access occurred? I'd love to learn more about how you're handling this at the server level. Why not start with Jonny?
If the user doesn't have full permissions to access all the data in a request, the request fails. We expose a top-level GraphQL error with specific codes to indicate an authorization error.
Absolutely. Just a bit of context: our team has been going for about two years now, and I joined about a year and a bit ago. So we're in the process of scaling up and developing all our conventions. One of the approaches we started off with is keeping it simple. If the user making a request doesn't have full permissions to access all the data, the request will fail. That's an early decision we've taken, just to keep things simple. We expose a top-level GraphQL error, and this error has a really strong convention for the data it exposes. People can look for particular codes that indicate there's an authorization error and so on, and make decisions in the UI based on that.
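As a rough sketch of the fail-the-whole-request convention: the `REQUIRED` map, the permission names, and the `FORBIDDEN` code are assumptions for illustration, not Jonny's actual conventions. The `extensions` key is the place the GraphQL specification reserves for additional error information such as codes.

```python
# Hypothetical map from schema field to the permission it requires.
REQUIRED = {
    "booking.id": "booking:read",
    "booking.payment": "payment:read",
}


def execute(requested_fields, granted, resolve):
    """Fail the whole request if any requested field lacks its permission."""
    if any(REQUIRED[f] not in granted for f in requested_fields):
        return {
            "data": None,
            "errors": [
                {
                    "message": "Not authorized",
                    # A stable code the UI can branch on.
                    "extensions": {"code": "FORBIDDEN"},
                }
            ],
        }
    return {"data": {f: resolve(f) for f in requested_fields}}


def resolve(field):
    return f"value of {field}"  # stand-in for real resolvers


denied = execute(["booking.id", "booking.payment"], {"booking:read"}, resolve)
allowed = execute(["booking.id"], {"booking:read"}, resolve)
```

A client only has to look at `errors[0]["extensions"]["code"]` to decide what to show, which is the kind of UI decision the convention is meant to support.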
We appreciate the nullable approach and would like to hear the thoughts of others on this. Different kinds of authorization can be applied at the data and schema layers. The user experience and developer experience are key considerations. Filtering out data that users can't see, without revealing its existence, is important. However, when users attempt actions they are not permitted to do, returning an error may not be the best approach. There are different paths to consider in handling authorization scenarios.
We do appreciate that you can take a more partial, nullable approach, so I'd love to hear what the others think about that. I can see Sam nodding a lot!
It's because I love this topic. It's a great one. I think there's an interesting mix, because there are so many different kinds of authorization you can do. We've heard a couple: authorization at the data layer, which filters out the data, and authorization at the schema layer, which decides which pieces you are allowed to see. A lot of it, for me, revolves around the user experience at the end of the day, or even the developer experience. If you're applying that row-level, read-filter sort of authorization, that's great. It filters out the data you can't see, and you don't even know that it exists, which can be really important when even knowing that something exists is potentially an authorization hole. You don't want someone to probe for data and find out it exists, whether they get an error or not. So applying the rule that if you can't read it, you don't even get to know it exists, is great. But as soon as you're into the world where a user has tried to do something they're not permitted to do, have never been permitted to do, where they can never see this field on the schema, then it really doesn't make sense to return that as an error. You have to ask how you're going to handle it. There's potentially a bug in the app that requested it in the first place. Could we avoid even letting the user go down that path? I think we might touch on that later.
So there are a lot of different branching paths you get into in this world, and you really need to think about what it means if the user cannot do that thing. Do we want to return a nice, healthy, good error: hey, you just tried to delete this thing through a mutation, you're not allowed to do that, maybe you need to talk to your system admin to get that permission? That's one scenario you might want to achieve. On the other hand, you might just want to say that for this user, that piece of data should never exist and they should never see it, and treat it that way.
We use role-based schemas to hide parts of the schema and data from users who don't have access. Users have an application role that determines their level of access, and they can only see the subset of the schema that they can access. On the data side, we use scoping to ensure that users can only access the data they have permission for. This prevents leaking information about the existence of data. We also use nullable fields and error responses to handle authorization at the field level. Role-based schemas are specified at the model level, where each model can have multiple roles that define accessibility and field permissions.
Just adding to what Sam was saying: for us, when we think about how we hide things in the schema and in the data, we've been thinking of it as role-based schemas. You create an application role or an application scope, and that role has a schema that only it can see. It's a subset of the entire schema: they only get to see the part of the schema that they can access.
So in an application, you have an application role, which represents somebody logged in as a user, or somebody logged in as a manager. In that context, this user of the application is a user or a manager or an admin or a front desk manager, and they have a particular subset of the full GraphQL schema that they're able to access. The rest of the schema doesn't even exist for them. Similarly, on the data side, business-logic authorization errors are part of the schema itself, or have nice error codes so that people understand what's happening. And at the data level, it's like the 404 thing that Sam was talking about: you don't even want to tell them that this data exists. So when you fetch a list of items, that list is itself scoped to only what you can access. If you can't access anything, you get an empty list, or you get a null if you're accessing a single element. That way you're not leaking information about whether this data exists or whether you just don't have access to it. It's like going to a GitHub repo that you don't have access to: you'll see a 404. You won't get any information about whether the repository actually exists or not. So that's what has worked on the data side, and that's how we've been thinking about the visibility aspect.
Just one little subtlety I want to add to that. You go and visit the page, you try to read the repository, and it doesn't exist. But you might also try to make a mutation directly against it, to push to it. In that scenario you also need to say: I don't know, you're pushing to something that doesn't exist. So it's not as simple as looking at the read result. When somebody's mutation comes in, the answer is also: I don't know what you're talking about. That thing never existed. What are you talking about?
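The GitHub-style behaviour can be sketched as follows. The `REPOS` data, the function names, and the `NotFound` exception are hypothetical; the point is only that "missing" and "not yours" are indistinguishable for reads and for mutations alike.

```python
# Hypothetical in-memory data: repository id -> name and members.
REPOS = {
    "r1": {"name": "public-site", "members": {"alice", "bob"}},
    "r2": {"name": "secret-plans", "members": {"alice"}},
}


class NotFound(Exception):
    """Raised for both missing and inaccessible records."""


def list_repos(user):
    # The list itself is scoped; inaccessible rows are simply absent.
    return sorted(r["name"] for r in REPOS.values() if user in r["members"])


def get_repo(user, repo_id):
    repo = REPOS.get(repo_id)
    if repo is None or user not in repo["members"]:
        return None  # same answer whether it is missing or not yours
    return repo["name"]


def push(user, repo_id):
    # Mutations reuse the read scope, so they also answer "not found".
    if get_repo(user, repo_id) is None:
        raise NotFound(f"repository {repo_id} not found")
    return "pushed"
```

For the user `bob`, `get_repo("bob", "r2")` and `get_repo("bob", "r9")` both return `None`, and `push` raises the same `NotFound` for both ids, so neither call reveals that `r2` exists.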
Ankita, we'd love to hear from you.
Yeah. We follow the approach of having nullable fields by default. We make a field non-null only if we cannot think of any probable use case in which it would lack a value; only then do we add the exclamation mark. Say we have four or five front-end clients talking to the GraphQL layer, and for some reason your database or something else fails: the entire GraphQL response will start giving you errors just because of that exclamation mark. So we design it so that a field is non-null only when we really cannot think of such a case. For authorization, we send null for that particular field and give the front-end client an error response saying that you're not authorized, along with the path. As Tanmai mentioned, it's not that we don't have this value in the database; it's that we cannot give it to you because you don't have access to this field.
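A small sketch of the nullable-field approach: the unauthorized field comes back as null and an entry in `errors` carries the `path` to it, which is the shape the GraphQL specification defines for field errors. The `resolve_user` helper and the `FORBIDDEN` code are assumptions for illustration.

```python
def resolve_user(user, allowed_fields, path):
    """Return the user with unauthorized fields nulled, plus one error per field."""
    data, errors = {}, []
    for name, value in user.items():
        if name in allowed_fields:
            data[name] = value
        else:
            # Only valid because the field is declared nullable in the schema.
            data[name] = None
            errors.append(
                {
                    "message": "Not authorized",
                    "path": path + [name],
                    "extensions": {"code": "FORBIDDEN"},
                }
            )
    return data, errors


data, errors = resolve_user(
    {"id": 7, "name": "Ana", "preferences": "vegan"},
    allowed_fields={"id", "name"},
    path=["booking", "users", 0],
)
response = {"data": {"booking": {"users": [data]}}, "errors": errors}
```

The rest of the response is still usable, and the client can tell from the path that `preferences` is withheld rather than empty.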
Role-based schemas determine the accessibility and fields accessible for each model. For example, a front desk manager can see user.id and user.name but not user.email or user.address. The GraphQL schema is derived from this model information, and introspection queries with the role of front desk manager will only return user ID and name.
Actually, I have one question for Tanmai, if I may. You were mentioning role-based schemas. What are role-based schemas? How do you do that?
It's again specified at the model level. Each model might have multiple roles, and a role says whether this model is accessible and which fields are accessible. For example, take a user model. In your example, you have the front desk manager. The front desk manager can see user.id and user.name, but not user.email, not user.address, none of that. Now that we have this information, we compose the GraphQL schema from it. The user never specifies a GraphQL schema explicitly; we derive the GraphQL schema from this model information. So when you run an introspection query with the role front desk manager, meaning your authorization claim, your JWT, contains a role called front desk manager, or there's a header that says I'm a front desk manager, that introspection query itself will show you that any resolver returning a user is only going to return ID and name. It's not going to return email. And whether you're fetching the user from the top level or through booking.user, you're only going to get ID and name. booking.user will not give you email or anything else either. That's how we think about role-based or scope-based schemas, whatever you want to call them.
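Deriving a per-role schema from model metadata might look roughly like this. The `MODEL` structure and the `schema_for` function are hypothetical and far simpler than a real engine, which would also handle relationships, arguments, and mutations.

```python
# Hypothetical model metadata: field types plus the fields each role may see.
MODEL = {
    "name": "User",
    "fields": {"id": "ID!", "name": "String", "email": "String", "address": "String"},
    "roles": {
        "admin": ["id", "name", "email", "address"],
        "front_desk_manager": ["id", "name"],
    },
}


def schema_for(model, role):
    """Derive the SDL this role would see, or None if the model is hidden."""
    visible = model["roles"].get(role)
    if not visible:
        return None  # for this role, the type does not exist at all
    lines = [f"  {name}: {model['fields'][name]}" for name in visible]
    return "type " + model["name"] + " {\n" + "\n".join(lines) + "\n}"


front_desk_sdl = schema_for(MODEL, "front_desk_manager")
guest_sdl = schema_for(MODEL, "guest")
```

Because every path to a `User` is built from the same derived type, a nested access such as booking.user exposes exactly the same fields as a top-level query.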
How do GraphQL clients work with authorization and unauthorized states? Our favorite pattern is to extend data models to include permissions for each record, allowing easy customization of the UI based on user permissions. We provide APIs that return all the things users are allowed to do on a specific piece of data. We make a call to our roles and privileges microservice from the front-end to retrieve the privileges for a role and enable or disable buttons accordingly.
Good. Awesome. Thanks. I'm just happy you didn't call it optical lenses for GraphQL schemas, Tanmai. I think a Haskell developer understood that joke. Anyway, moving on to another question. We've talked a lot about how these errors manifest on the server side, but of course one of the main benefits of using GraphQL is for the client. I'm curious how GraphQL clients work with authorization and with unauthorized states. For example, if you have nullable fields, do you offer any advice to your clients for dealing with that? If you throw an error, do you offer any advice for dealing with that? And along the same lines, when certain actions can only be performed by an admin, how is your front end or your client aware that a non-admin cannot perform that action, so that it doesn't show the button? Are you duplicating authorization at that point? Essentially that realm of questions. The only person we haven't started with yet is Sam. So, Sam, would you like to take it away?
Sure. Our favorite pattern for this kind of thing is to extend your data models to include a field that contains the permissions the user has on that particular record, that particular piece of data. The reason that's really nice is, as you said, depending on the role you have, maybe you have read permission on a particular thing, or delete permission, or edit permission, whatever it is. You can return that list of permissions back to the front end, and given that, it's very easy to customize your UI around what the user can do. You fetch your list of things for the dashboard, you display the dashboard, and maybe you have the delete button grayed out if it's not something the user can do. You can wrap the logic up in a React component really simply: if the delete permission is in that list, show the button, otherwise don't. What that allows you to do, which is really nice, is avoid duplicating any logic anywhere. You just have that one simple check. You add a new role, you change permissions, whatever you do, the front end is going to be pretty happy with it. So that's our favorite pattern to apply here. And just to do a little bit of a pitch for us: we have APIs that let you ask what all the things are that a user is allowed to do on this particular piece of data, so the work is basically done for you.
Awesome. Ankita?
Yeah. We have one microservice for roles and privileges. From the front end, we make one call to that service, send the role, and ask it to send back all the privileges that this role has. Then on the front end, if you have the privilege to create a booking, we enable that button for you so you can click it. If you don't have that particular privilege, we gray it out.
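Sam's pattern can be sketched like this. The role names, the `ROLE_PERMISSIONS` table, and the `can` helper are hypothetical; in practice the permissions list would be computed by the authorization system and exposed as an extra field on the record, and the check would live in a UI component.

```python
# Hypothetical mapping from a user's role on a record to allowed actions.
ROLE_PERMISSIONS = {
    "viewer": ["read"],
    "editor": ["read", "edit"],
    "owner": ["read", "edit", "delete"],
}


def with_permissions(record, role_on_record):
    """Server side: attach the caller's permissions to the record itself."""
    return {**record, "permissions": ROLE_PERMISSIONS.get(role_on_record, [])}


def can(record, action):
    """Client side: the only check the UI needs, e.g. to grey out a button."""
    return action in record["permissions"]


dashboard = [
    with_permissions({"id": 1, "title": "Q3 report"}, "owner"),
    with_permissions({"id": 2, "title": "Roadmap"}, "viewer"),
]
delete_enabled = [can(item, "delete") for item in dashboard]
```

Adding a role or changing what a role may do only touches the server-side table; the front-end check stays the same.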
Client-side checks are never sufficient for authorization; you always have to do the backend check as well. Role-based schemas automatically give you a sense of which actions are allowed in the application. Metadata and declarative configuration can be used to understand privileges and scope, and embedding authorization rules in front-end code helps achieve consistency between the front end and the back end. Different approaches, such as an authorization system or a central database of roles and entitlements, can be used to enable or disable UI components based on privileges and actions.
Similarly for other things as well. But this is on the UI side. Now suppose you go to Postman and hit create booking. Even though your role isn't authorized, you make a request straight to GraphQL: you have bypassed the front-end access checks and you're accessing GraphQL directly. So these privilege checks should be on your GraphQL server as well. As I mentioned earlier, they should be at the data sources, where we check again that you are actually authorized to create a booking. So on the front-end side we make one API call, but on the back end, on the GraphQL side, we also check again that you really have permission to do this, in case you're coming in through a curl command or something else. And on the second question: let's say you're doing a mutation and you're not authorized to do it. We show that to the user straightforwardly: you're not authorized to perform this action. But if there is some data in the list that we don't even want to tell the user exists, we don't display it; we don't display that section.
Yeah, thanks for bringing that one up, because I really want to emphasize how important that point is. Client-side checks are never sufficient for authorization. They can make your user experience nicer by informing the user what they can do, but you always have to do that backend check as well. So it's a really important point.
And I also personally enjoyed how you focused on consistency there. You'd hate for the front end to show you that an action is allowed and then have the back end reject it, or vice versa. So I really like that you focus on consistency as well. Tanmai?
I think, to a degree, one of the things we talked about earlier, the idea of role-based schemas, gives you this automatically. When you introspect the schema and it only gives you a certain subset of mutations or queries that you can run, you automatically have a sense of what's even allowed. My GraphQL API is an accurate reflection of the set of actions I can perform in the application itself. So that has a little bit of information built in. But to both Sam's and Ankita's points, there is also metadata available that any application can use and embed, so that it understands what privileges it has and what the scope is for a given model or role: what is allowed, what is disallowed, and how. The nice thing is that because these policies are declarative configuration, you can make an API call and fetch them dynamically, but you can also embed them as a static asset and use that in your front-end code to see what the authorization rules for accessing things are. You have to be a little careful with it, because you don't want to leak information, so how you embed it and what information you extract from it is important. But it helps you get that consistency between the front end and the back end. And like I said, there are many different ways of doing it. It could be something provided by your authorization system, or it could be an entitlements model that you maintain somewhere: a central database of roles and entitlements and the privileges and actions they have. You use that as the source of truth to enable or disable buttons or other UI components.
And of course you use that in the actual authorization check on the back end itself, not only on the front end. So those sound like really good ideas to me too.
We use directives to define what permissions are required for a field. Schema evolution should be carefully considered when adding or removing roles. Working at the model level simplifies the process, as the GraphQL schema is automatically composed based on the models and their relationships. Thank you all for being part of the panel and sharing your insights. Join the discussion room and attend Johnny's talk to learn more.
Awesome. And Johnny?
Yes, I thought of answering that question as well. One approach we've also taken: we really like the idea of lots of documentation-driven stuff. Much like some cloud providers, which tell you straight away, from a documentation perspective, what permissions you need in order to access something, we want that for a field. So one approach we've taken is actually using directives. We define directives that state what permissions are required for a field.
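As a rough illustration of the directive idea described here, the sketch below annotates fields in a schema string and reads the annotations back out, so the same text serves as documentation and as input to a check. The directive name `@requires`, the `Booking` type, and the permission strings are all hypothetical; the panel does not show the actual directive, and a real server would use its GraphQL library's schema parser rather than a regular expression.

```python
import re

# Hypothetical SDL. The @requires directive and the permission names are
# assumptions made for this sketch, not the panelists' actual schema.
SCHEMA_SDL = '''
directive @requires(permission: String!) on FIELD_DEFINITION

type Booking {
  id: ID!
  guestName: String @requires(permission: "booking:read")
  paymentDetails: String @requires(permission: "booking:read_payment")
}
'''

# Matches lines of the form: fieldName: Type @requires(permission: "...")
FIELD_WITH_DIRECTIVE = re.compile(
    r'^\s*(\w+)\s*:\s*[\w!\[\]]+\s*@requires\(permission:\s*"([^"]+)"\)',
    re.MULTILINE,
)


def required_permissions(sdl: str) -> dict[str, str]:
    """Map each annotated field name to the permission its directive names.

    Keyed by field name only, which is enough for a single-type sketch.
    """
    return {name: perm for name, perm in FIELD_WITH_DIRECTIVE.findall(sdl)}


def can_access(field_name: str, granted: set[str], sdl: str = SCHEMA_SDL) -> bool:
    """Fields without a directive are treated as public in this sketch."""
    needed = required_permissions(sdl).get(field_name)
    return needed is None or needed in granted
```

Because the requirement lives next to the field definition, anyone reading the schema can see what is needed to fetch `paymentDetails` without consulting a separate policy file.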
And on that note of permission evolution and role evolution, I want to turn to Tanmay quickly, actually. With schema evolution, does that coincide with role-based evolution as well? So if you're adding or removing roles in the schema, and that sort of thing.
Yeah. You'd have to factor that in. How you evolve the schema has to be done carefully, right? Because it's going to be role-based schema evolution as well. Generally, the idea that you're working at the model level helps, because you don't actually care about what the GraphQL schema looks like. You just care about a model and whether that model is exposed, true or false, or partially exposed. And then the schema is composed automatically by putting the models and any related models together anyway. So that's a layer of simplicity, but yeah, you have to think about that too.
Awesome. And actually that ends the time that we have. So thank you all so much for being a part of the panel. I learned a lot. Now I won't spend 90% of my time writing authorization rules anymore because of everything I learned in this panel. And for all of you out there watching, if you enjoyed this, we have a discussion room to follow where you have all of these lovely panelists there as well. And make sure that you attend Johnny's talk and check out everyone else's lovely projects as well. Until next time, we'll see you then.
Efficiency through configuration-focused approach in GraphQL authorization ensures coherent and consistent protection of schema entities using RBAC system.
So there are these performance gains immediately. The other one is just consistency. I kind of alluded to it earlier: all that services need to care about is configuration, and that's it. They don't need to worry about the mechanisms. So teams onboarding with us don't need to worry about having an SDK for REST or for whatever they're using. They just define the configuration in their GraphQL schema, and our gateway understands that and is able to apply the auth mechanism as the catch-all blanket. So it's kind of completely opposite to what Tanmay was saying, but for me it highlights that the solution you choose is maybe situational. So...
Got it.
If I may ask a question to Johnny, I'm very curious. Could I?
Yes, go for it.
Okay. So, Johnny, the authorization that you're doing at the gateway, is that more about authorization of the schema, about what entities of the schema you can fetch? What I mean is what we call role-based schemas, for example: schema visibility, in a sense. Certain aspects of the schema are visible to you and certain aspects are not. Is that what the authorization covers? And does the authorization also cover things like what specific entities a user can access? Like, user ID 1 is available, user ID 2 is available. I'm guessing that kind of data check is done by the service itself, and the gateway here is doing more of a schema visibility thing. That's what it sounded like. But what does your authorization look like at the GraphQL level?
Yeah, absolutely. You're absolutely right: data fetching is handled by the service. And we use an RBAC system. So authorization is basically on the GraphQL type: the entity is the type, and the thing we're trying to protect is a field in that type, and we strongly correlate the two. We have a convention-based approach, and that allows us to define really coherent and consistent authorization, because you know the entity that you alluded to and you know the field you want to protect. So from there we can apply that to our RBAC system.
Makes sense.
And it's not just us using this RBAC system. Other people are using it as well, so it's super flexible in the way that it's done.
Yeah, that's very interesting, and I'm looking forward to learning more in your talk later as well, Johnny. We are going to do a deep dive on that, because I have a lot of questions as well in terms of how it all manifests. But I guess something you touched on there is configuration: the most efficient thing you can do is convention; failing that, configuration; and failing that, the least efficient thing you can do is arguably customization, right? So a custom approach everywhere is arguably the least efficient. So I like your convention-first approach, and I'm looking forward to learning more about that. Ankita, would you like to tell us more about how you're achieving efficiencies with GraphQL authorization?
Yeah, hi.
Efficient data authorization through batch checks at the data source level to prevent repetitive resolver-level validations for improved efficiency in GraphQL authorization.
So, building on what Tanmay already covered, I'd like to give an example to understand it better. Let's say there's a front desk manager of a hotel, and he's trying to fetch details of the users in a booking, but he's not authorized to view, let's say, the preferences of a user. He's not authorized to view more details about a user, but he's still trying to. And one booking has, let's say, 25 users. So you will get a list of 25 users, but he's not authorized to view some particular fields in this query.
Now, one way to go about this is to have authorization checks at the resolver level, which means you check whether this particular role has the privilege, the permission, to view this particular data. If you do it at the resolver level, you are going to do these checks 25 times, once for every user. But what we have done is move this check to the data source level. Data source level meaning where you are actually fetching information about this user from downstream services. So that means you are going to call this only once. You're going to check only once for this user, right, if you have batching in place.
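The difference between the two placements can be sketched as follows. This is a minimal illustration, not the panelists' code: the role names, the `user:read_preferences` privilege, and the in-memory `USERS` table standing in for a downstream service are all assumptions. The `calls` counter exists only to make the two strategies comparable.

```python
from dataclasses import dataclass

# Hypothetical role -> privilege table; names are illustrative only.
ROLE_PRIVILEGES = {
    "front_desk_manager": {"user:read_basic"},
    "admin": {"user:read_basic", "user:read_preferences"},
}

# Stand-in for the downstream user service: 25 users on one booking.
USERS = {
    i: {"id": i, "name": f"Guest {i}", "preferences": "late checkout"}
    for i in range(25)
}


@dataclass
class PrivilegeChecker:
    calls: int = 0  # counts privilege lookups

    def has(self, role: str, privilege: str) -> bool:
        self.calls += 1
        return privilege in ROLE_PRIVILEGES.get(role, set())


def resolve_per_user(role, user_ids, checker):
    """Resolver-level check: one privilege lookup for every user."""
    result = []
    for uid in user_ids:
        user = dict(USERS[uid])
        if not checker.has(role, "user:read_preferences"):
            user["preferences"] = None
        result.append(user)
    return result


def load_users_batched(role, user_ids, checker):
    """Data-source-level check: one privilege lookup for the whole batch."""
    may_read_preferences = checker.has(role, "user:read_preferences")
    result = []
    for uid in user_ids:
        user = dict(USERS[uid])
        if not may_read_preferences:
            user["preferences"] = None
        result.append(user)
    return result
```

Both functions return the same data for the same role; the batched version simply asks the privilege question once instead of 25 times.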
Now again, even for downstream services, I agree with Tanmay that on the persistence layer, on the database side, you should not be fetching 1 million rows that you don't have access to. We are yet to implement this properly, but we do have checks at the batching level, which prevents this n plus 1 problem.
Okay, awesome, very cool. There's a myriad of different approaches. I'm really enjoying learning from all of you, this is so awesome. Well, I guess we talked a little bit about how we went about implementing authorization and how you work with authorization.
Unauthorized access management in GraphQL servers involves determining how to handle unauthorized actions, whether through nullable schemas, GraphQL errors, or failing the operation. Different strategies exist, from straightforward error reporting to nuanced error conventions for UI decisions and user permissions. The approach to authorization error handling varies, with considerations for simplicity, data exposure conventions, and user experience implications.
I'm curious, how does some sort of unauthorized access manifest in your GraphQL server? Essentially, how do you signal that someone is unauthorized to do something? Is your schema entirely nullable, for example, where if you don't have access to something it just returns a null? Or do you throw a GraphQL error if something is unauthorized? Do you then error out the entire operation, or do you error out with a path to where you have unauthorized access? I'd love to learn more about how you're handling this at the server level, and I guess, why not start with Johnny?
Yeah, absolutely. So just a bit of context, actually. Our team has been going for about two years now, and I joined about a year and a bit ago, so we're in the process of scaling up and developing all our conventions. One of the approaches we started off with is just keeping it simple: if the user making a request doesn't have full permissions to access all the data, the request will fail. That's an early decision we've taken, just to keep things simple. And basically we expose a top-level GraphQL error. This GraphQL error has a really strong convention for the data it's exposing, so people can look for particular codes to indicate that it's an authorization error and so on, and make decisions in the UI based on that. We do appreciate that you can take a more partial approach, a nullable approach. So actually I'd love to hear from the others on what they think about that approach. I can see Sam nodding a lot. What's that approach? Let's go with Sam.
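The "fail the whole request and expose a coded top-level error" convention that Johnny describes might look roughly like the sketch below. The `FORBIDDEN` code, the `REQUIRED` permission map, and the helper names are assumptions for illustration; the panel does not specify the actual codes. Only the general response shape (a `data` key plus an `errors` list whose entries may carry `extensions`) follows the GraphQL response format.

```python
# Hypothetical map from field name to the permission it requires.
REQUIRED = {"paymentDetails": "booking:read_payment"}

# Stand-in for the data a resolver would return.
DATA = {"id": "b1", "paymentDetails": "visa ending 4242"}


def execute(requested_fields, granted_permissions, required=REQUIRED, data=DATA):
    """Fail the entire request if any requested field is not permitted."""
    denied = [
        name
        for name in requested_fields
        if name in required and required[name] not in granted_permissions
    ]
    if denied:
        return {
            "data": None,
            "errors": [
                {
                    "message": "Not authorized",
                    # The code value is an assumed convention clients can match on.
                    "extensions": {"code": "FORBIDDEN", "fields": denied},
                }
            ],
        }
    return {"data": {name: data[name] for name in requested_fields}}


def is_authorization_error(response) -> bool:
    """Client-side helper: detect the coded error to drive UI decisions."""
    return any(
        err.get("extensions", {}).get("code") == "FORBIDDEN"
        for err in response.get("errors", [])
    )
```

A client can branch on `is_authorization_error(response)` rather than parsing message strings, which is the benefit of keeping the error shape strongly conventional.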
Different levels of authorization in GraphQL, from data to schema, impact user and developer experiences. Implementing row-level read filters can hide inaccessible data entirely, preventing potential authorization vulnerabilities. Handling unauthorized actions may involve returning errors for user guidance and system integrity. Role-based schemas restrict data visibility to authorized users, enhancing data security and user privacy. Business logic errors at the schema and data levels communicate authorization status clearly without leaking sensitive information such as whether data exists.
It's because I love this topic. It's a great one. I think there's an interesting mix, because there are so many different kinds of authorization you can do. We've heard a couple, right: you do authorization at the data layer and it filters out the data, and there's authorization at the schema layer, where the question is which pieces you are allowed to see. And I think there are a couple of really key elements, and a lot of it, for me, revolves around the user experience at the end of the day, or even the developer experience. So if you're applying that kind of row-level read filter authorization, then that's great. It filters out the data you can't see and you don't even know what exists, which I think can be really important when even knowing that something exists is potentially an authorization hole. You don't want someone to probe for data and find out if it exists based on whether they get an error or not. So applying that, if you can't read it, you don't even get to know it exists. It's great. But as soon as you're into the world of users trying to do something that they'd never be permitted to do, where they can never see this field on the schema, then it really does make sense to return that as an error, because you need to get into that world of, how am I going to handle that? This is potentially even a bug in the app that was requesting it in the first place. Could we avoid even letting the user go down that path in the first place? I think we might touch on that later. So there are a lot of different branching paths you get into in this world. And you really need to think about what it means if the user can or cannot do that thing. Do we want to be returning a nice, helpful error?
It's like, hey, you just tried to go and delete this thing through a mutation. You're not allowed to do that. Maybe you should go and talk to your system admin or something to get that permission. That's one scenario you might want to try and achieve. On the other hand, you might just want to say, for this user, that piece of data should never exist and they should never see it, and let's just treat it that way.
I think just adding to what Sam was saying, right? When we think about how we hide things in the schema and in the data, we've been thinking of it as role-based schemas. You create an application role or an application scope, and it has a schema that only it can see: a subset of the entire schema. They only get to see the part of the schema that they can access, right? So if you're in an application, you have an application role, which represents the role of somebody logged in as a user, or somebody logged in as a manager. And that context now, of this user and the application, is a user or a manager or an admin or a front desk manager, right? And they have a particular subset of the full GraphQL schema that they're able to access. So the rest of the schema doesn't even exist for them. And then, similar to that on the data side, a business logic kind of authorization error is part of the schema itself, or has nice error codes so that people understand what's happening. And at the data level, it's kind of like the 404 thing that Sam was talking about: you don't even want to tell them that this data exists. So you fetch a list of items. That list is scoped to only what you can access. So if you can't access anything, you get an empty list, or you get a null if you're accessing a single element. So that way you're not leaking information about, does this data exist or not?
Or do you not have access to it? It's like you're going to a GitHub repo that you don't have access to, right? You'll see a 404. You won't get any information about whether this repository even actually exists or not.
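The "scoped list, or null for a single element" behaviour described above can be sketched like this. It is only an illustration under assumed names: the `BOOKINGS` rows, the `hotel_ids` attribute on the user, and the visibility rule are invented for the example and are not any panelist's policy engine.

```python
# Hypothetical rows; in practice the filter would be pushed into the query.
BOOKINGS = [
    {"id": 1, "hotel_id": "h1"},
    {"id": 2, "hotel_id": "h2"},
    {"id": 3, "hotel_id": "h1"},
]


def visible(booking, user) -> bool:
    """Assumed row-level rule: a user sees bookings for their own hotels."""
    return booking["hotel_id"] in user["hotel_ids"]


def list_bookings(user):
    """The list is scoped to what the caller can access; it may be empty."""
    return [b for b in BOOKINGS if visible(b, user)]


def get_booking(booking_id, user):
    """Return None both when the row is missing and when it is hidden.

    The caller cannot distinguish "does not exist" from "not allowed",
    which is the 404-style behaviour discussed by the panel.
    """
    for booking in BOOKINGS:
        if booking["id"] == booking_id and visible(booking, user):
            return booking
    return None
```

Because a hidden row and a missing row produce the same `None`, probing IDs reveals nothing about what exists.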
Data-side visibility concerns include handling 404 errors and unauthorized mutation attempts. Defaulting to nullable fields and enforcing non-null values only for essential cases ensure data integrity and error handling. Role-based schemas restrict data access based on user roles, enhancing security and privacy without exposing unauthorized data access.
You're just seeing a 404. So that's on the data side, something that's worked. But yeah, that's how we've been thinking about the visibility aspect.
Just one little subtlety I want to add to that as well. If you think about it, you go and visit the page, you try and read the repository, and it doesn't exist. But you might even just try to make a mutation directly to it, try to push to it. In that scenario you also need to say, I don't know, you're pushing to something that doesn't exist. It's not even just as simple as looking at the reads; it's the mutations as well, all that stuff. I don't know what you're talking about. That thing never existed. What are you on about?
Yeah. Ankita, we'd love to hear from you.
Yeah. So we follow this approach of having nullable fields by default. We make a field non-null only if we cannot think of any probable use case where it would be null, that is, it will always have a value; only then do we put that exclamation mark. Let's say we have four to five front-end clients that are talking to your GraphQL layer. If for some reason your database is not returning something, the entire GraphQL response will start giving you errors just because of that exclamation mark. So we designed it so that a field is non-null only if you cannot think of such a use case. For authorization, a field you can't access is sent as null, and you get an error response saying that you're not authorized, along with the path, for the front-end client to handle.
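A partial response of the kind Ankita describes, with the unauthorized field set to null and an error entry carrying its path, might be assembled as in the sketch below. The `FIELD_ROLES` table and the error message are assumptions; the `data`/`errors`/`path` keys follow the standard GraphQL response shape.

```python
# Hypothetical field-level rule: only listed roles may read the field.
# Fields that do not appear here are readable by every role.
FIELD_ROLES = {"preferences": {"admin"}}

USER = {"id": 1, "name": "Guest", "preferences": "late checkout"}


def resolve_user(user, role, field_roles=FIELD_ROLES):
    """Null out fields the role cannot read and report each one by path."""
    data, errors = {}, []
    for name, value in user.items():
        allowed_roles = field_roles.get(name)
        if allowed_roles is None or role in allowed_roles:
            data[name] = value
        else:
            data[name] = None  # only possible because the field is nullable
            errors.append(
                {
                    "message": "You are not authorized to read this field",
                    "path": ["user", name],
                }
            )
    response = {"data": {"user": data}}
    if errors:
        response["errors"] = errors
    return response
```

This only works if `preferences` is declared nullable in the schema; with a non-null field, the null would propagate upward and take the parent object with it, which is the reason given for defaulting to nullable fields.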
So, like Tanmay mentioned, it's not that we don't have this value in the database, it's just that we cannot give it to you because you don't have access to this field, right? It's not that this particular field is not there in the database. Actually, I have one question for Tanmay, if I may. You were mentioning role-based schemas. What are role-based schemas? How do you do that?
So, it's again specified at the model level, right? Each model might have multiple roles, and each role says whether this model is accessible and what fields are accessible. For example, I create a user model. Now, for the user model, in your example, take the front desk manager. The front desk manager can see user.id and user.name, and not user.email, right? So, user.id, user.name, not user.email, not user.address, none of that.
The GraphQL schema is composed based on user roles, restricting access to specific fields. Introspection queries for different roles ensure selective data exposure in the schema.
So, now that we have this information, we compose the GraphQL schema from it. We derive the GraphQL schema from this model information. When we derive the schema, we look at the user model and see that the role is front desk manager. So when you run an introspection query with the role front desk manager, that is, if your authorization claim, your JWT token, contains a role called front desk manager, or there's a header that says I'm a front desk manager, that introspection query itself will show you only the permitted fields: any resolver that returns a user is only going to return user ID and name, right? It's not going to return email. And that holds whether you're fetching the user from the top level or whether you're doing booking.user. Even if you do booking.user, you're only going to get ID and name. booking.user will not give you ID, name, email, whatever, right? So, that's how we think about role-based or scope-based schemas, whatever you want to call it.
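To make the "derive the schema from model-level permissions" idea concrete, here is a small sketch that generates a per-role type definition from a permission table. The `MODEL_PERMISSIONS` structure and the generated SDL are assumptions made for illustration and should not be read as Hasura's actual metadata format or implementation.

```python
# Hypothetical model metadata: field types plus the fields each role may see.
MODEL_PERMISSIONS = {
    "User": {
        "fields": {
            "id": "ID!",
            "name": "String",
            "email": "String",
            "address": "String",
        },
        "roles": {
            "front_desk_manager": ["id", "name"],
            "admin": ["id", "name", "email", "address"],
        },
    },
}


def schema_for_role(role: str) -> str:
    """Compose the SDL that a given role would see on introspection."""
    blocks = []
    for model, spec in MODEL_PERMISSIONS.items():
        visible_fields = spec["roles"].get(role, [])
        if not visible_fields:
            continue  # the model does not exist at all for this role
        lines = [f"  {name}: {spec['fields'][name]}" for name in visible_fields]
        blocks.append("type " + model + " {\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)
```

For `front_desk_manager` this yields a `User` type with only `id` and `name`, so any path that reaches a user, including a nested one such as `booking.user`, exposes the same reduced type.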
GraphQL clients handling authorization and unauthorized states. Suggestions for dealing with nullable fields and error handling. Front end logic for role-based actions and avoiding duplication.
Go ahead. Awesome. Thanks. I'm just happy you didn't call it Optical Lenses for GraphQL schemas, Tanmay. I think a Haskell developer understood that joke. Anyway, I guess moving on to another question. We talked a lot about how these errors manifest on the server side. But of course, one of the main benefits of using GraphQL is for the client. Now, I'm curious how GraphQL clients work with authorization and also unauthorized states. For example, in the case of nullable fields, do you offer any advice to your clients for dealing with that? In the case of throwing an error, do you offer any advice for the clients dealing with that? And along with the same question, in the circumstance where only an admin can do certain admin actions, how is your front end or your client aware that a non-admin cannot perform that admin action, in order to not show that button? Are you duplicating authorization now? Yeah, essentially that realm of questions. I guess the only person we haven't started with yet is Sam. So, Sam, would you like to take it away?
Sure. Yeah, so my favorite pattern for this kind of thing is to actually extend your data models to include a field that contains, say, the permissions that the user has on that particular record, that particular piece of data. And the reason that's really nice is, as you said, depending on the role you have, maybe you have read permission on a particular thing, maybe you have delete permission, edit permission, whatever it is. You can return that list of permissions back to, say, the front end. And given that, it's very easy to customize your UI around what the user can do. So you go and fetch your list of things for the dashboard. You display the dashboard. And you have maybe the delete button grayed out if it's not something the user can do. You can wrap the logic for that up in a React component really simply.
If the delete permission's in that list, then show it. And what that allows you to do, which is really nice, is you don't end up duplicating any logic anywhere. You just have that one simple check. You go and add a new rule. You go and change permissions. Whatever you do, the front end's going to be pretty happy with that. So yeah, that's our favorite pattern to apply here. Just a little bit of a pitch for Oso here.
APIs for user permissions and roles. Microservice roles and privileges for front end authorization. Importance of backend checks in GraphQL authorization.
We have APIs that allow you to ask, what are all the things that a user is allowed to do on this particular piece of data? So that work is done for you, basically.
Awesome.
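Sam's pattern of returning a permissions list alongside each record can be sketched as below. The role-to-action table, the `members` map on the record, and the function names are hypothetical, and the one-line `show_delete_button` check stands in for what would be a conditional inside a React component on a real front end.

```python
# Hypothetical mapping from a user's role on a record to allowed actions.
ROLE_ACTIONS = {
    "viewer": ["read"],
    "editor": ["read", "edit"],
    "owner": ["read", "edit", "delete"],
}

DOC = {"id": 7, "title": "Q3 report", "members": {"ana": "owner", "bo": "viewer"}}


def with_permissions(record, user):
    """Server side: attach the caller's permissions to the returned record."""
    role = record["members"].get(user)
    return {
        "id": record["id"],
        "title": record["title"],
        "permissions": ROLE_ACTIONS.get(role, []),
    }


def show_delete_button(item) -> bool:
    """Client side: the only check the UI needs; no policy logic is copied."""
    return "delete" in item["permissions"]
```

If the policy changes on the server, for example editors gain `delete`, the UI check stays the same because it only reads the list it was given. The server must still enforce the rule when the mutation arrives.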
Ankita?
Yeah. So, what we have is, let's say, one microservice for roles and privileges. From the front end, we make one call to that service and send the role: please send me all the privileges that this role has. And on the front end, if you have the privilege to create a booking, we'll enable that button for you so you can click it. And if you don't have that particular privilege, we'll gray it out, right? Similarly for other things as well, but this is on the UI side.
But now you go to Postman and hit create booking even though your role is not authorized: you do a curl request on GraphQL. So you have bypassed the front-end access checks and you're accessing GraphQL directly. So these privilege checks should be on your GraphQL server as well. Like I mentioned earlier, these should be on the data sources, right? We check again that you are actually authorized to create a booking. So on the front-end side we make one API call, but on the back end, on the GraphQL side, we also check again that you really have permission to do this, in case you're coming in with a curl command or something else. Yeah, this is how it is.
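The two checks Ankita describes, one in the UI for experience and one on the server for enforcement, can both read from the same privilege table, as in this sketch. The role names, the `booking:create` privilege, and the in-memory `store` are assumptions for illustration, not the panelists' service.

```python
# Hypothetical central table, standing in for a roles-and-privileges service.
ROLE_PRIVILEGES = {
    "front_desk_manager": {"booking:create"},
    "housekeeping": set(),
}


class NotAuthorized(Exception):
    """Raised by the server when the caller lacks the required privilege."""


def privileges_for(role):
    return ROLE_PRIVILEGES.get(role, set())


def ui_can_create_booking(role) -> bool:
    """Front end: decides whether the button is enabled. UX only."""
    return "booking:create" in privileges_for(role)


def create_booking(role, payload, store):
    """Server: enforced on every call, whatever the UI showed.

    A request sent with curl or Postman never passes through the UI check,
    so this is the check that actually protects the data.
    """
    if "booking:create" not in privileges_for(role):
        raise NotAuthorized("You are not authorized to perform this action")
    booking = {"id": len(store) + 1, **payload}
    store.append(booking)
    return booking
```

Using one table for both checks is also what keeps the front end and back end consistent: the button is enabled exactly when the mutation would succeed.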
Importance of backend checks for authorization. Role-based schemas for GraphQL API accuracy. Embedding metadata for understanding privileges and scopes.
And the second question was, let's say you're doing some mutation and you're not authorized to do it. We straightforwardly show it to the user: you are not authorized to perform this action. But if there is some data in the list that we don't even want to tell the user exists, we don't display it. We don't display that section.
Yeah. Thanks for bringing that one up, because I really want to emphasize how important that point is. Client-side checks are never sufficient to do authorization. It's a thing which can make your user experience nicer, to inform the user what they can do. But you always have to do that back-end check as well. So yeah, it's a really important point. Thanks for bringing that one up.
And I also personally enjoyed how you focus on consistency there. I mean, you'd hate for the front end to show you that you have an action allowed and then the back end to reject it, or vice versa in some way. So I really like that you focus on consistency there as well. Tanmay?
I think to a degree, one of the things that we talked about earlier, this idea of role-based schemas, kind of does this automatically: when you introspect the schema and it only gives you a certain subset of mutations or queries that you can do, you automatically have a sense of what's even allowed. My GraphQL API is an accurate reflection of what set of actions I can do on the application itself. So that has a little bit of information built in. But I think, to both Sam and Ankita's points, there is a sort of metadata available that any application can also use and embed so that it understands what privileges it has and what the scope is, per model or per role: what is allowed, what is disallowed, and how it is allowed or disallowed, right.
Metadata for understanding privileges and consistency. Using entitlements model for UI components. Documentation-driven approach with schema and role evolution. Simplifying schema evolution for role-based permissions.
But the nice thing is that because these policies are declarative configuration, you can make an API call to fetch them dynamically, but you can also embed them as a static asset and use that in your front-end code to see what the authorization rules for accessing things are. You'd have to be a little bit careful with it because you don't want to leak information, so how you embed that, and what information you extract from it, is important, but it helps you get that consistency between the front end and the back end.
And like I said, there are many different ways of doing it, right. It could be something provided by your authorization system, or it could even be an entitlements model that you maintain somewhere: you maintain a central database of roles and entitlements and what privileges and actions they have, and you use that as a source of truth to enable or disable buttons or UI components as well, right. Of course, you use that in the actual authorization check on the back end itself, but also on the front end.
So those sound like really good ideas to me.
Yes, I thought of answering her question as well. One approach we've also taken: we really like the idea of lots of documentation-driven stuff. Much like some cloud providers, which tell you straight away, from a documentation perspective, what permissions you need in order to access something, we want that for a field. So one approach we've taken is actually using directives. We define directives that state what permissions are required for a field.
And on that note of permission evolution and role evolution, I want to turn to Tanmay quickly, actually. With schema evolution, does that coincide with role-based evolution as well? So if you're adding or removing roles in the schema, and that sort of thing?
Yeah, you have to factor that in. How you evolve the schema has to be done carefully, right? Because it's going to be role-based schema evolution as well. So generally, the idea that you're working at the model level helps, because you don't actually care about what the GraphQL schema looks like. You just care about a model, and whether that model is exposed, true or false, or partially exposed, right? And then the schema is composed automatically by putting the models and any related models together anyway. So that's a layer of simplicity, but yeah, you have to think about that too.
Awesome, and actually that ends the time that we have. So thank you all so much for being a part of the panel. I learned a lot. Now I won't spend 90% of my time writing authorization rules anymore because of everything I learned from this panel.
And for all of you out there watching, if you enjoyed this, we have a discussion room to follow, where you have all of these lovely panelists there as well, and make sure that you attend Johnny's talk and check out everyone else's lovely projects as well. Until next time, we'll see you then. Bye.