Video Summary and Transcription
The video discusses how to secure GraphQL endpoints in five minutes using Tyk. The talk highlights key problems like authorization, schema security, and denial of service attacks. Tyk offers built-in features such as query depth limiting, field-based permissions, and various authentication modes including OAuth 2.0 and mutual TLS. The video explains how to use Tyk's dashboard to proxy existing GraphQL APIs and enforce security measures like authentication tokens. Additionally, it covers the advantages of using native GraphQL as a database query language and the benefits of StackHawk for application security testing. StackHawk's tool simplifies finding and fixing security bugs in GraphQL endpoints, ensuring every pull request is tested for vulnerabilities before production. The talk also mentions common vulnerabilities in GraphQL applications like SQL injection and information disclosure.
1. Securing GraphQL Endpoints with Tyk
Hello everyone and welcome to this lightning talk about how to secure your GraphQL endpoints in five minutes. We're gonna be doing that using Tyk. So let's look at a few problems that we're gonna solve within securing GraphQL. First one is adding authorization. Securing the schema, making sure that only specific users have access to specific fields. And then protecting against denial of service attacks. We have batteries included security, which means everything within our gateway is included. We'll add field-based permissions to secure the schema and query depth limiting for denial of service attacks. Let's get right to it. I'm in the Tyk dashboard. I'm gonna show you what I wanna secure. There's this Trevor Blades countries API, GraphQL API, that right now is completely open. I'm gonna proxy to that through Tyk and then secure it using Tyk.
How to Secure GraphQL Endpoints in 5 Minutes
Hello everyone and welcome to this lightning talk about how to secure your GraphQL endpoints in five minutes. And we're gonna be doing that using Tyk. So my name is Matt Tanner. I am a product evangelist here at Tyk. I'm gonna be walking you through this.
So getting right down to it, since we have a limited amount of time, let's look at a few problems that we're gonna solve within securing GraphQL. First one is adding authorization. So authorization, authentication, adding in those mechanisms quickly. Securing the schema, so making sure that only specific users have access to specific fields. And then also looking at protecting us against denial of service attacks.
How do we do that? Well, we have batteries included security, which is a phrase that we like to use at Tyk to say everything that's within our gateway is included. There's no plugins or anything like that that you need to add. And for that, we're gonna add that right in. Then we're going to, as part of that, put in some field-based permissions to secure the schema. And then we're gonna add some query depth limiting to it as well for those denial of service attacks.
So let's see how it works. Let's just get right to it. I'm gonna jump out of this. And here I am in the Tyk dashboard. What I'm gonna do is, first I'm gonna show you what I wanna secure. There's this Trevor Blades countries API, GraphQL API, that right now is completely open and I can hit it. There's no security, no type of security at all. What I'm gonna do is proxy to that through Tyk and then secure it using Tyk. So I'm gonna grab this. This is as if it was your API. You come over into Tyk and we come over to APIs, add new API. I'm gonna call it countries. It is a GraphQL API. We're going to proxy to an existing GraphQL service. And you'll see that I have the Trevor Blades countries URL in there.
2. Authorization and Authentication Modes
At this point, we already have some authorization built in. We're enforcing an authentication token, specified in our setup. We support various authentication modes like authentication tokens, mutual TLS, OAuth 2.0, and JWTs.
Now, at this point, believe it or not, we already have some authorization built in. We've now proxied to it. If I come over to the playground, which is built into Tyk, and I run, if I just hide this here, hide meeting controls. If I come over here and grab this query, and I come over back to here and run this query, you'll see that it says authorization field is missing. That's great. That means we're already enforcing an authentication token. Where is that specified? Well, in our setup right down here, we support quite a few different things, but today we're gonna use authentication tokens just for brevity. We also support mutual TLS, OAuth 2.0, JWTs, all of those good types of authentication modes.
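As a rough illustration of what the playground does once a request header is added, here is a minimal Python sketch that builds (but does not send) a GraphQL POST request with an authorization header. The gateway URL and the key are placeholders made up for this example, and passing the key as the raw header value is an assumption for illustration rather than a statement of how every Tyk setup expects it.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical values: placeholder gateway URL and key, not real endpoints or credentials.
GATEWAY_URL = "https://gateway.example.com/countries/"
API_KEY = "replace-with-generated-key"


def build_graphql_request(query: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries a GraphQL query."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Without this header, the proxied API in the talk rejects the query.
        headers["Authorization"] = api_key
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(GATEWAY_URL, data=body, headers=headers, method="POST")


request = build_graphql_request("{ countries { name } }", API_KEY)
print(request.get_method(), request.get_header("Authorization"))
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here because the URL is not a real service.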
3. Generating Keys and Creating Policies
To access the API, we need to generate keys and create a policy. We set per API limits and quotas, and enforce query depth limiting and field-based permissions. After creating the policy and adding a key, we can test the API by adding an authorization header and running the query.
So in order to access this now, I need to generate some keys. And in order to have some keys, I need to have a policy created. So let's save this, jump over to policies, which is down here in the corner, add policy, and I'm going to cover my countries API, and come over to configurations here. I'm just gonna call this countries policy. My keys that I generate are never going to expire. And then I'm gonna hop back over here to access rights.
And there's a few things that we're going to do. So set per limits, set per API limits and quota. I'm gonna turn this on. And this here would allow us to enforce rate limiting, throttling, usage and quota, sorry, usage quotas, all that stuff. We won't worry about that today. What we are gonna worry about here is this query depth limiting. And what I'm gonna do is I'm gonna make my maximum query depth five. And I'll demonstrate that to you in a moment here. And just with that, now that'll be enforced. And what I'm also gonna do under field-based permissions, I'm not gonna allow any of my users of this policy to access, as you can see, you can see all the types available through this API, as well as all the fields individually. I don't want them to have access to continent code or country code. Then I'm gonna create policy. There we go. The policy has been created. And now I'm gonna hop over to keys. I'm gonna add a key for this policy, create key. And with that, my key is created. Now, if I come back over to APIs, I'm gonna open a new tab. Come over here and come to countries, which is our created GraphQL proxy playground. Remember, again, that we weren't able to issue that query. I'm gonna add a request header with an authorization header that includes our key. I'm gonna come back and grab our query that we had. And I'll paste it in here. And now I'm just gonna take out code because we blocked that field and I want this to work.
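The field-based permissions described above can be pictured as a walk over the requested fields. The following Python sketch is an illustration only: the policy data, the tiny hand-written schema slice, and the nested-dict query representation are all assumptions made for this example and are not Tyk's implementation or message format.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical policy data: type name -> fields that this policy blocks.
RESTRICTED_FIELDS: Dict[str, Set[str]] = {"Continent": {"code"}, "Country": {"code"}}

# Hypothetical slice of the schema: (parent type, field) -> type of that field's value.
FIELD_TYPES: Dict[Tuple[str, str], str] = {
    ("Query", "countries"): "Country",
    ("Country", "continent"): "Continent",
}


def find_violations(selection: dict, parent_type: str = "Query") -> List[str]:
    """Walk a selection set given as nested dicts and collect restricted fields."""
    violations: List[str] = []
    for field, sub_selection in selection.items():
        if field in RESTRICTED_FIELDS.get(parent_type, set()):
            violations.append(f"{parent_type}.{field} is restricted")
        child_type = FIELD_TYPES.get((parent_type, field))
        if child_type and sub_selection:
            # Descend into the nested selection using the child's type.
            violations.extend(find_violations(sub_selection, child_type))
    return violations


# Selection set equivalent to: { countries { name code continent { name code } } }
query = {"countries": {"name": {}, "code": {}, "continent": {"name": {}, "code": {}}}}
for message in find_violations(query):
    print(message)
```

A gateway doing this kind of check can reject the request before it reaches the upstream service, which is why removing the blocked fields from the query is enough to make it succeed.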
4. Securing APIs with Tyk and GraphQL Features
Now we have access to the API using the authentication token. Let's add our code, which initially we don't have access to. After removing the restrictions, we can run the query and get the desired data. The query depth limiting is also enforced, preventing excessive nesting. This showcases how easily we can secure APIs with Tyk and our GraphQL features. Thank you!
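The depth check mentioned in this section can be sketched in a few lines of Python. This is a simplified illustration, not Tyk's algorithm: it measures depth by counting nested braces, so it ignores fragments and would miscount braces that appear inside arguments. The limit of five matches the policy set in the talk; how the gateway itself counts levels is not specified in the talk.

```python
MAX_QUERY_DEPTH = 5  # the limit chosen in the talk's policy


def query_depth(query: str) -> int:
    """Return the deepest level of brace nesting in a GraphQL query string.

    Simplification: fragments are not expanded and braces inside
    arguments or strings are counted like selection sets.
    """
    depth = 0
    deepest = 0
    for char in query:
        if char == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "}":
            depth -= 1
    return deepest


def within_limit(query: str, limit: int = MAX_QUERY_DEPTH) -> bool:
    """True when the query may be forwarded to the upstream service."""
    return query_depth(query) <= limit


allowed = "{ continents { countries { continent { countries { name } } } } }"
too_deep = "{ continents { countries { continent { countries { continent { name } } } } } }"
print(query_depth(allowed), within_limit(allowed))
print(query_depth(too_deep), within_limit(too_deep))
```

Because the check only needs the query text, it can run at the gateway, so an over-deep query never reaches the backend.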
And I'm gonna run this. As you can see, now we have access to the API. I'm using that authentication token in order to access it. Now let's add in our code, which we don't have access to. So countries code and continent, we don't have access to these fields. What happens if I try and hit them? Code is restricted on type continent. So if I get rid of that, next, my code is restricted on type country. And I can take that out and away we go. Now I'll be able to do that. And lastly, what I want to show you is I have a query here that is nested. And I'm gonna demonstrate that query depth limiting that we put here to enforce as well. Paste this in, as you can see, I've got some redundancies in the query. I do that, oh, I need to come back here and run this. Am I missing another bracket? I must be. There we go. Okay, so as you can see, field code is restricted on type continent. So let's just get rid of those quickly. Code, code, code. Now we run this. And we get some data back. Now, what if I go one more here and I say countries, and I do name and run this. Depth limit exceeded. So now you can see that at the gateway level. So without even going to that backend service, things are getting cut off. And that is how easy it is to secure APIs with Tyk and our GraphQL features. That's all, thank you very much.
Hi everyone, I'm Brecht. I'm super excited today to talk at the GraphQL Galaxy Conference. I'm databrecht on Twitter. I work for the Fauna database.
5. Native GraphQL: Fauna's Approach and Advantages
Today I'm going to talk about native GraphQL or GraphQL as a database query language. Native GraphQL means that the GraphQL query translates into one Fauna query, providing significant advantages. To understand these advantages, we need to explore how GraphQL resolvers work. Resolvers form a chain (really a tree) of function calls, and the n plus one problem arises when each resolver triggers a separate database call. To solve this, batching and in-memory caching are used, but they introduce complexity and potential data inconsistencies. Fauna's approach avoids these problems altogether.
So today I'm going to talk about native GraphQL or GraphQL as a database query language. Now, if we talk about native GraphQL at Fauna, what does it mean? Well, first of all, we have a Fauna query language which we call FQL. And basically native GraphQL means that the GraphQL query is going to translate into one FQL query. That one-to-one translation has huge advantages. So first of all, you might wonder which advantages. We'll look into that. And question two, why doesn't everyone do this if there are such advantages? To answer these questions, we actually have to answer other questions like how do GraphQL resolvers work? So let's take a detour.
How do GraphQL resolvers work? Well, typically if you have a query like this with getList, todos, and title, every field in here, like getList, and todos, and title, will map onto a function. So getList will be a function and that will delegate to the todos function, that will delegate again to the title function, for example, to the title attributes. This is a resolver chain, which is a chain of functions, but it's actually more of a resolver tree of function calls because here there's one function that calls n functions. And if we turn this around, we get n plus one, and it's basically a problem. And this is actually called the n plus one problem, that's why I turned it around. And when is this a problem? Well, basically, if you're going to call the database for each of these resolvers, because then you get n plus one database calls, which is not efficient. So question four, how can we solve the n plus one problem? Well, there are multiple solutions. Solution one is batching and/or in-memory caching. So in that approach, we're going to hook into these functions, for example, todo.title, and just wait until all the todo.titles are called, and then combine these. So instead of going to do n calls for these todo.titles, we're going to do one call, so in total, two calls. That's batching, and that's often combined with caching. So if a similar call comes in, then instead of going to the database, we can go to an in-memory cache, so we don't hit the database at all. A very popular implementation is Facebook's DataLoader, which you can just plug in on top of your resolvers, and it will handle it for you. However, there's a problem with this solution as well. It should, in fact, be a last resort. Well, why? Your data is no longer live, it's no longer consistent, you can't apply it on everything, you can't batch everything, so you will still have multiple calls. What about cache invalidation, memory pressure that you have to deal with suddenly?
So it introduces complexity. So the first question, which advantages does Fauna's approach provide? Well, it doesn't deal with these problems because it doesn't have these problems.
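To make the n plus one arithmetic concrete, here is a small Python sketch that counts database round trips for naive resolvers and for a batched version. Everything in it is hypothetical: the in-memory data, the `FakeDatabase` class, and the resolver functions are stand-ins for illustration and are not DataLoader's or Fauna's API.

```python
from typing import Dict, List

# Hypothetical in-memory data; each method call below stands in for one network round trip.
TODOS_BY_LIST: Dict[int, List[int]] = {1: [10, 11, 12]}
TITLES: Dict[int, str] = {10: "write talk", 11: "record demo", 12: "answer questions"}


class FakeDatabase:
    def __init__(self) -> None:
        self.calls = 0

    def todo_ids(self, list_id: int) -> List[int]:
        self.calls += 1
        return TODOS_BY_LIST[list_id]

    def title(self, todo_id: int) -> str:
        self.calls += 1
        return TITLES[todo_id]

    def titles(self, todo_ids: List[int]) -> Dict[int, str]:
        self.calls += 1  # one round trip for the whole batch
        return {todo_id: TITLES[todo_id] for todo_id in todo_ids}


def resolve_naive(db: FakeDatabase, list_id: int) -> List[str]:
    """One call for the list, then one call per todo: n + 1 calls."""
    return [db.title(todo_id) for todo_id in db.todo_ids(list_id)]


def resolve_batched(db: FakeDatabase, list_id: int) -> List[str]:
    """One call for the list, then one combined call for all titles: 2 calls."""
    ids = db.todo_ids(list_id)
    titles = db.titles(ids)
    return [titles[todo_id] for todo_id in ids]


naive_db, batched_db = FakeDatabase(), FakeDatabase()
naive_result = resolve_naive(naive_db, 1)
batched_result = resolve_batched(batched_db, 1)
print(naive_db.calls, batched_db.calls)
```

With three todos the naive version makes four calls and the batched version makes two. The sketch leaves out caching, which is where the consistency and invalidation concerns raised in the talk come in.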
6. Advantages of Native GraphQL
It is live by default, consistent, requires no extra work, and has no memory constraint problem. Solution two is to generate one query, but this approach can be complex and inefficient. JoinMonster is an impressive implementation that solves the N plus one problem. The query language and execution plan may not fit the problem, but FQL does due to its graph-like properties. FQL allows for graph-like traversal and easy translation from GraphQL.
It is live by default, it is consistent, it requires no extra work, and there is no memory constraint problem. It just works out of the box, so you don't have to do this.
Solution two, generate one query, which is what Fauna does behind the scenes, but why doesn't everyone do that? If we would look at SQL, for example, and let's say we would select a star from lists where ID is equal to something, then we would go to the to-do calls and do the same and try to concatenate that query. Of course, we'll have to do it for multiple to-dos, so we'll end up with a join.
And the problem is if we go deeper like that in a GraphQL traversal, we might end up with a lot of joins. Now, not only is this super complex to analyze this query and then generate SQL from it and then transform the results back to a GraphQL format, it might also be inefficient depending on the joins. You might overfetch a lot and then have to throw away things. And then how are we going to paginate this? Limit 100 might not be exactly what you're looking for.
So the problem here is that what joins solve, which is a join between two tables, is a different problem than what the actual problem is, which is more like a tree traversal problem or a graph-like problem. So joins are maybe the wrong tool for the job. So there is an implementation, a very impressive implementation called JoinMonster, which actually comes from the problem they're trying to solve, a monster join that might be the result of a GraphQL query. If you look at the work involved, you can see that it's a complex problem to solve. That's question four. How can we solve the N plus one problem? So the two solutions.
That brings us back to question two. Why doesn't everyone do this? Well, we just showed it. The query language might not fit the problem or the execution plan might not fit the problem. Then, of course, why does FQL fit the problem? Well, we do it quite differently because it's a different query language and has quite graph-like properties. So if we would look at the same query, we would start by getting a list with match index and then the list ID. We would immediately wrap it in paginate. So we actually will have pagination on every level and very sane pagination with an after and before cursor that is always correct. Then we just map over the results of these lists and we would call a function. That's actually just like a normal programming language where you would just map over something and then call the function. In that function, we can do whatever we want. And if we look at the get to dos there, well, what is this? That's just a JavaScript function because I'm using the JavaScript driver for FQL where we just throw out more FQL, pure function composition. Then we see the same pattern, paginate and map. So we have the second level of pagination immediately and map and again, a function that will be called. This is actually a graph-like traversal that we're just implementing in FQL. Because that's possible, it was super easy for Fauna to implement that one-to-one translation from GraphQL to FQL. So what is actually happening here? If we look at the query execution, we map and get over all the lists, then we paginate that immediately.
7. Native GraphQL Advantages and Fauna.com
We continue mapping, getting, and paginating on every level without the monster join problem. FQL fits the problem and brings advantages such as combining with FQL for flexibility and power, while using GraphQL for ease of use. Native GraphQL offers multi-region, scalability, 100% ACID, and transactionality. Try it out for free at fauna.com.
And then we just continue mapping, getting, and paginating on every level. There is no monster join problem because we do it completely differently. So we don't have to solve that problem.
So question five, that's why FQL actually fits the problem. Back to question one, which advantages does that bring? Because we have mentioned the advantages but there are others. Because we have the same advantages as the rest of the normal native FQL language, we can actually combine that with FQL and use FQL for the flexibility and power and GraphQL for the ease of use. We have multi-region out of the box, scalability out of the box. We have 100% ACID and transactionality out of the box. So that's what native GraphQL is. I hope you like that idea. And if you want, try it out for free at fauna.com.
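The paginate-then-map pattern described in the talk can be mimicked in plain Python to show its shape. This is not FQL and not Fauna's driver API: the `paginate` and `map_page` helpers, the page size, the cursor convention, and the sample data are all assumptions invented for this sketch.

```python
from typing import Callable, Dict, List, Optional

PAGE_SIZE = 2  # hypothetical page size

# Hypothetical data: one list that holds three todos.
LISTS: Dict[str, List[str]] = {"list-1": ["todo-1", "todo-2", "todo-3"]}
TODOS: Dict[str, str] = {"todo-1": "write talk", "todo-2": "record demo", "todo-3": "answer questions"}


def paginate(items: List[str], after: Optional[str] = None, size: int = PAGE_SIZE) -> dict:
    """Return one page plus an 'after' cursor naming the first item of the next page."""
    start = items.index(after) if after is not None else 0
    page = items[start:start + size]
    next_cursor = items[start + size] if start + size < len(items) else None
    return {"data": page, "after": next_cursor}


def map_page(page: dict, fn: Callable[[str], dict]) -> dict:
    """Apply fn to every item of a page and keep the cursor."""
    return {"data": [fn(item) for item in page["data"]], "after": page["after"]}


def get_todo(todo_id: str) -> dict:
    return {"title": TODOS[todo_id]}


def get_list(list_id: str) -> dict:
    # Second level: paginate the todos of this list, then map over that page.
    return {"todos": map_page(paginate(LISTS[list_id]), get_todo)}


# First level: paginate the lists, then map over that page.
result = map_page(paginate(["list-1"]), get_list)
print(result)
```

Each level of the tree gets its own page and cursor, so there is no single flattened join result to over-fetch and then trim.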
8. Introduction to StackHawk and GraphQL Testing
Hi there GraphQL Galaxy. I'm Ryan Severns, one of the founders and COO of StackHawk. We are an application security testing tool that makes it easy for developers to find and fix security bugs. We specialize in testing GraphQL endpoints and actively look for potential security vulnerabilities. Our belief is in automation in CI/CD, ensuring that every pull request is tested for vulnerabilities before going into production. It all starts with a YAML configuration file where you describe what to scan, including server-side HTML, single page apps, REST APIs, and GraphQL.
Hi there GraphQL Galaxy. I'm Ryan Severns, one of the founders and COO of StackHawk. I'm here to tell you a little bit about what we do at StackHawk. We are an application security testing tool. We make it easy for developers to find and fix security bugs. And in particular, we have some really cool things around GraphQL. So I'll run you through that.
So like I said, we do application security testing. We do testing of the underlying APIs as well. And part of that is GraphQL. If you're not familiar with application security testing, there's really three main types. One is software composition analysis. So it's looking at the open source components, looking for vulnerabilities there. Another is static code analysis. So it's looking at the code, looking for known error types within whatever language you're using. And what we do here at StackHawk is called dynamic application security testing. So we're running active tests against your application, against a running version of your application. And we test server-side HTML, REST APIs, single page applications, and we test GraphQL as well. We are the only product that does active automated testing of GraphQL. There's a handful that do some best practices checking, making sure you're doing certain things that are known to be best practices from a security standpoint. But we're the only one to actively run a test against your GraphQL endpoints and look for potential security vulnerabilities.
Big belief for StackHawk is automation in CI/CD. We believe that every time you open a pull request, an application security test should run. Make sure that you're not introducing any new vulnerabilities before it passes the build and goes on to production. And ultimately we make finding and fixing the security vulnerabilities very simple. Let me tell you a little bit about how it works. So it all starts with a YAML configuration file. Like I said, we describe what to scan. We have server-side HTML, single page apps, REST APIs, GraphQL. You describe what to scan.
9. Configuring and Running StackHawk for GraphQL
If you have authentication for your application, you can configure that here. The beauty about GraphQL is that the configuration is really simple. You kick off a scan with this docker run hawkscan command. It's super fast app sec testing. You jump into the StackHawk app, where we make it really easy to figure out the context of the bug, provide links to fix documentation, and all the information needed to recreate the issue.
If you have authentication for your application, you can configure that here. We also have all kinds of other customization in terms of how the scanner runs. The beauty about GraphQL is that the configuration is really simple. You can see in the image here, you mark GraphQL enabled equals true and point it to the schema path of your introspection endpoint. Can also control certain things around operation, which operations you're testing, the depth of recursion. There's a lot that you can customize there.
Then you kick off a scan with this docker run hawkscan command. And so this GIF will cycle through and show us a preview of it. The beauty of it running in Docker is it can run anywhere. It can run locally on your machine as you're developing. Super easy to implement in CI/CD. And you can even point it at a production application. I would say use caution because this is running an active security test and trying to find input validation errors among other things. So it does try to input data into your database, which is why we always advise testing this pre-production in a CI/CD environment. It's super fast app sec testing. You can see results in the terminal, which is great for CI/CD logs. And then it always has a link out to the findings within the StackHawk web app, which I'll show you next.
And that helps for where you go when you actually need to fix a bug. So you jump into the StackHawk app. First thing I say is we are big believers in integrating with developer tools. We integrate with Slack, with Jira. Do the alerting in Slack, manage issues in Jira and really only land in StackHawk when there is a vulnerability that you need to fix. And when you do end up there, we make it really easy to jump in, figure out the context of what that bug is. We have a description of what the vulnerability is. We have links to fix documentation so you know how to fix it. And then we provide all of the information, the request that was sent to the application, the response and a simple curl command to go recreate that as you step through the code in debug mode to figure out where you're mishandling the data. And then one nice thing is there's finding triage for CI/CD instrumentation. So it might break the build if you've introduced a new vulnerability. And if it ultimately is low risk, maybe you're not going to fix it, or maybe it's something that will be prioritized in an upcoming sprint, you can send it to Jira. You can mark it as risk accepted.
10. Introduction to StackHawk
This is a quick overview of StackHawk. You can sign up for a free single user account to test your own applications at stackhawk.com. There are also free trials for the team product. Don't forget to visit our booth at GraphQL Galaxy for a chance to win prizes.
The scanner will still find it, but it won't break the build every time. So that's quick overview of StackHawk in a nutshell. We would love for you to come test us out. So you can sign up for a free single user account if you want to test your own applications at stackhawk.com. We also have free trials for our team product. Same product, you just have extra users and collaborate with teammates. And be sure to swing by our booth here at GraphQL Galaxy. We are giving away t-shirts, entries to win a Nintendo Switch, and we would love to chat with you more. All right, thanks so much.
Q&A Lightning Talks
Q&A with all three lightning talks at once. Let's start off with a question for Matt. Is it possible to add more than one GraphQL service? And if so, how do you resolve type conflicts? Yeah, so, okay. Our latest release is actually going to fix that. So there is a way to resolve those type conflicts. One of the things regarding security is query depth. Query depth limiting is one thing that we found when we were doing the research about building GraphQL into our API management product. That's one thing that a lot of people requested. We're able to set a query depth and that applies across. We do have some more features that are coming in the next little bit that are definitely going to enhance our offering in terms of query depth and a couple other metrics that may factor into those types of attacks.
Q&A with all three lightning talks at once. One of the best bits here is of course that if someone has a question, someone else can answer it. I'm not going to moderate this, that's for someone else to do. But let's start off with a question for Matt. Is it possible to add more than one GraphQL service? And if so, how do you resolve type conflicts? Yeah, so, okay. If we're talking about from the fact of the stuff that I went through, so I went through how to add security to it and pull in a proxy that is existing. So you may have some naming conflicts. Our latest release is actually going to fix that. So there is a way to resolve those. It takes some manual workarounds for it, but there are ways for us to resolve those type conflicts. Amazing. Well, it's good to know that you're forward facing. And a second question for yourself. One of the things, this is a question from Bastian. One of the things regarding security is query depth. Usually we do a fixed limit, but I found some insightful scientific insights in GraphQL language at this thing. And he's asking if there's any research on query depth prevention or attacks using query depth. Yeah, so from our side, we added in the query depth limiter, and we just, at Tyk, we just introduced GraphQL functionality in July. And query depth limiting is one thing that we found when we were doing the research about building GraphQL into our API management product. That's one thing that a lot of people requested. Now, it is pretty simplistic when you think about it right now. We're able to set a query depth and that applies across. Now, the nice part is if you have multiple policies, so let's say I have group A, group B, and group C, I would be able to, for group A, say, okay, you get a query depth of five, B, six, and then seven, sorry, and C might be unlimited query depth.
So there are ways to configure that within Tyk so that even if there's no way to dynamically set that query depth, there is a way to do it based on what policy you set. Now, we do have some more features that are coming in the next little bit that are definitely going to enhance our offering in terms of query depth and a couple other metrics that may factor into those types of attacks. So the query depth, the denial of service attacks. I think you may be on mute. Oh, wow, okay, well, that was embarrassing. This was one of those times when the MC muted himself. A question for Brecht from Fauna.
Future Features and GraphQL Security
The upcoming features for the application include complex conditionals, range queries, and streaming. Streaming will initially be available for FQL and may eventually be used for GraphQL subscriptions. In terms of security, common vulnerabilities in GraphQL applications include SQL injection, information disclosure, and remote OS command injections. The newness of GraphQL contributes to its vulnerability, as there is a lack of robust tooling and automated testing compared to other web applications and API frameworks.
What's next on the feature list for your application? It looks really interesting.
That's a very good question, which actually, maybe we should go to another question and I can answer you in a few seconds because I would have to look it up myself. But one of the things that is requested often is the ability to do complex conditionals, range queries from GraphQL, and the other one is streaming. These are both things that we are considering, whether one will take priority over the other, I'm not certain.
Oh, okay, that's great. It's always hard to queue up the most important features. That's one of the- And that's, of course, only for GraphQL because we're a database and the next big thing that is coming up is actually streaming, that you get push-based streaming, but that will initially only be for FQL. Is that going to incorporate some of the subscription grammar? Not directly, I think, but eventually it will also be- I realize I'm really digging into your future flash. But eventually it will also be used for the subscriptions of GraphQL once it gets to the GraphQL endpoint as well. That's really interesting.
All right, question for Ryan, since you're here from StackHawk and you do security testing, what do you find is the most prominent security, I don't want to use the failure, unpreparedness you find in GraphQL applications on the backend? To be honest, I don't know the answer to the most common. We haven't actually looked into any data. The ones that, more anecdotally, the things that I see, SQL injection, information disclosure, and remote OS command injections are the things that we've seen pop up as we're testing with our customers. So those are probably, again, can't back it up with data, but those are the three that we've seen most frequently.
Do you think there are aspects of GraphQL that makes it weak to any particular kind of attack that would be less prominent in RESTful APIs? I guess the way that I think about that is purely because it's such a new thing that there's not this robust tooling built around the security testing in the way that there are with other web applications and API frameworks. So many of these things are simple mistakes that developers will make and simple fixes, but there's just not automated testing that's been widely adopted to catch those things. And so you're waiting until a quarterly pen test where you hope your pen test firm actually knows GraphQL and is able to dig in to find any potential issues. So just because there isn't this blessed and validated library stack, and there isn't a set of conventions that everyone knows, you find more, I don't want to call them rookie errors, just oversights.
Yeah, absolutely, absolutely. Amazing. Okay, I think we have exhausted our question stack for now. Do you have any questions for each other? This is something we can't do in the other Q&As. That's A-okay. None for me, yeah. All right, great. Well, then we're going to go to a quick break, but put your hands together, send those claps to the GraphQL Q&A chats for our lovely lightning talkers, and we will see you soon. Awesome, thank you. Thank you. See you. Thank you. Thank you very much.