Video Summary and Transcription
The talk discusses putting the graph in GraphQL with the Neo4j GraphQL Library. It explores building GraphQL APIs backed by a graph database and compares Cypher with GraphQL. The Neo4j GraphQL library provides powerful features such as CRUD functionality, authorization, and pagination. The talk also covers database integration, hiring for the GraphQL team, and deploying a GraphQL API with the Neo4j GraphQL library.
1. Introduction to GraphQL and Neo4j
The talk is about putting the graph in GraphQL with the Neo4j GraphQL Library. Will, who works for Neo4j, introduces himself and mentions his online presence on Twitter and his personal site. He also mentions hosting the graphstuff.fm podcast.
Hi everyone. The title of this talk is Putting the Graph in GraphQL with the Neo4j GraphQL Library. You can find the slides for this at dev.neo4j.com slash GraphQL dash galaxy. So my name is Will. I work for a company called Neo4j, which is a graph database. We'll talk about what that is in just a minute. The best way to get ahold of me online is probably on Twitter, which I've linked there as well as my personal site with a blog and a newsletter that I publish. I also host the graphstuff.fm podcast. So if you like podcasts and graph technology, definitely check out graphstuff.fm.
2. Introduction to Neo4j and Graph Databases
Neo4j is a graph database that uses a query language called Cypher. We will discuss building GraphQL APIs backed by a graph database and the graph part of GraphQL. A graph is a data structure made up of nodes and the relationships that connect them. We use a data model called the property graph to work with data in the graph. Knowledge graphs are an implementation of a property graph for a specific domain.
So Neo4j is a graph database that is similar to other databases that you may be familiar with, like relational databases or document databases. But the biggest difference is that the data model is a graph. So nodes are the entities and relationships connect them. With Neo4j we use a query language called Cypher, which we'll take a little bit of a look at today as well.
There are lots of interesting things that we can do with graph databases, and the graph model has implications for performance optimizations that differ from other databases. But of course, what we're going to talk about today is building GraphQL APIs backed by a graph database. So really what I want to talk about today is the graph part of GraphQL.
So fundamentally a graph is a data structure where nodes are the entities in the graph and relationships connect nodes. To work with data in the graph, we often use a data model called the property graph, where we add labels to nodes that describe the type of thing that they are and key-value pair attributes that describe the actual data that we're working with. You also might hear about knowledge graphs. So knowledge graphs, I think, are really an implementation of a property graph for a specific domain that puts things in context, is how I like to think about it. Google, when they announced the Google Knowledge Graph API, published this blog post called Things Not Strings that I think did a really good job of explaining what a knowledge graph is and how we can work with the data in a knowledge graph.
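To make the property graph model concrete, here is a minimal in-memory sketch in JavaScript. It is not how Neo4j stores data; the labels (`Article`, `Topic`, `Geo`), relationship types (`HAS_TOPIC`, `ABOUT_GEO`), and values are assumptions chosen for illustration.

```javascript
// A minimal in-memory property graph: labeled nodes with key-value
// properties, connected by typed, directed relationships.
// All labels, relationship types, and values are illustrative assumptions.
const nodes = [
  { id: 1, labels: ["Article"], properties: { title: "Example article" } },
  { id: 2, labels: ["Topic"], properties: { name: "Labor" } },
  { id: 3, labels: ["Geo"], properties: { name: "United States" } },
];

const relationships = [
  { type: "HAS_TOPIC", from: 1, to: 2 },
  { type: "ABOUT_GEO", from: 1, to: 3 },
];

// Follow outgoing relationships of one type from a node and return
// the nodes on the other end.
function neighbors(nodeId, type) {
  return relationships
    .filter((rel) => rel.from === nodeId && rel.type === type)
    .map((rel) => nodes.find((node) => node.id === rel.to));
}

const topicNames = neighbors(1, "HAS_TOPIC").map((n) => n.properties.name);
console.log(topicNames);
```

The article node carries its own attributes, and its context (topic, geo region) is reachable by following relationships rather than by joining on keys.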
3. Querying News Graph with Cypher and GraphQL
In this example, we have a graph of news articles and their topics and geo regions. We can query this news graph using Cypher, a graph database query language focused on declarative pattern matching. Cypher has functionality like aggregations, math functions, and graph-specific operations. GraphQL, on the other hand, is a query language designed for working with APIs, with a type system that describes the available data and a selection set to define traversals through the data graph.
So in this example we have a graph of news articles and their topics and the geo regions that are mentioned in these articles. So when we're looking at an article node, we know what it is, we know the attributes of it, but we also have the context around it. We know what geo region it's referring to. We know what people and topic the article is about as well.
So I pulled some data down from the New York Times API and built a knowledge graph of this in Neo4j, so we have information about articles, topics, people mentioned in the articles, this sort of thing. And I thought this was an interesting dataset for a number of different reasons and a number of different applications. But what I want to use this for today is to talk about different ways to query this news graph if I'm interested in building an application.
4. Comparing Cypher and GraphQL
Cypher is a graph database query language focused on declarative pattern matching. It has functionality like working with aggregations, math functions, database operations, creating indexes, importing data from CSV format, and graph-specific operations like variable-length paths, node and relationship functions.
Let's zero in on comparing Cypher and GraphQL in the context of this news graph data. So, this is a question that comes up a lot. What's the difference between Cypher and GraphQL? They both seem to do something with graphs. Well, fundamentally Cypher is a graph database query language, as I said earlier, very much focused on declarative pattern matching. We draw these ASCII art representations of the graph to describe the pattern we want to work with. Cypher has lots of functionality that we would expect in a database query language. So, things like working with aggregations, math functions, database operations, creating indexes, importing data from CSV format. And then lots of graph-specific operations, like the concept of variable-length paths, node and relationship functions, these sorts of things.
5. Comparison of GraphQL and Cypher
If we compare that with GraphQL, GraphQL is very much a query language designed for working with APIs. We have a type system that describes the data available to the client and how it's connected. In Cypher, we write an ASCII art-like pattern to find and return article nodes. Cypher has functionality for ordering, limiting, and skipping. In Cypher, we can add a more complex graph pattern to find articles and their connected topics. Cypher also has shortest path functionality. Recommended articles can be expressed in Cypher by looking at other articles that similar users are viewing.
If we compare that with GraphQL, GraphQL is very much a query language designed for working with APIs. So we have a type system that describes exactly the data that's available to the client, how it's connected. This is the data graph. And then to describe traversals through the data graph, we define a selection set in GraphQL.
So let's look at some examples. So let's say I want to see all of the articles in the news graph. In Cypher, I write an ASCII art-like pattern, where parentheses represent nodes, in this case, find all the article nodes and return them. With GraphQL, in my selection set, I start with the articles query field, and then describe the fields of the articles that I want to return.
What if I want to see the ten most recent articles? Well, Cypher has the functionality for ordering, limiting, skipping, for basic pagination. This isn't built into GraphQL, but we can work with these things as field arguments. So perhaps our articles query field has a sort order argument and a limit argument that allows us to accomplish the same thing.
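As a rough sketch of pagination exposed through field arguments, the JavaScript below stands in for an `articles` query field that accepts `sort` and `limit`. The argument names, the `published` property, and the sample data are assumptions; the talk only says such arguments could exist.

```javascript
// Hypothetical stand-in for an `articles` query field with `sort` and
// `limit` arguments. A Cypher query with the same intent might read:
//   MATCH (a:Article) RETURN a ORDER BY a.published DESC LIMIT 10
// (the label and property name are assumptions).
const allArticles = [
  { title: "A", published: "2021-11-01" },
  { title: "B", published: "2021-12-03" },
  { title: "C", published: "2021-10-15" },
];

function articles({ sort = "DESC", limit = 10 } = {}) {
  const direction = sort === "ASC" ? 1 : -1;
  return [...allArticles]
    // Compare publication dates as timestamps; flip the sign for DESC.
    .sort((a, b) => direction * (Date.parse(a.published) - Date.parse(b.published)))
    .slice(0, limit);
}

const latestTwo = articles({ sort: "DESC", limit: 2 }).map((a) => a.title);
console.log(latestTwo);
```

The client controls ordering and page size through arguments, while the server decides how those arguments translate into a database query.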
What if I want to see the ten most recent articles and their topics? Well in Cypher, I add a more complex graph pattern. So you can see here first we're matching on all of the articles and returning the first ten by date published. Then we have another graph pattern where we're traversing out from this article node along this has topic relationship to the topic nodes and returning both of those. And now we see the ten most recent articles and their connected topics. In GraphQL, we would just add to our selection set to describe this traversal now from the articles to the topics. So we're starting to create a nested selection set here.
What if I also want, not only the ten most recent articles, their topics, but also what are other articles in those topics? Well, in Cypher I just add on to my graph pattern. So now I want to traverse along this has topic relationship again to find articles that share similar topics. And in GraphQL, I add to my nested selection set now going from the topics to the articles and in this case, returning the title of those articles.
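The idea that a nested selection set describes a traversal can be sketched in plain JavaScript. This is a toy, not a GraphQL implementation: the `select` helper, the selection object shape, and the sample data are all hypothetical.

```javascript
// Illustrative data: articles linked to topics, topics linked back to articles.
const topics = {
  labor: { name: "Labor", articles: [] },
  parks: { name: "Parks", articles: [] },
};
const articles = [
  { title: "Staffing at parks", topics: [topics.labor, topics.parks] },
  { title: "Controller shortage", topics: [topics.labor] },
];
for (const article of articles) {
  for (const topic of article.topics) topic.articles.push(article);
}

// A selection is a plain object: `true` selects a scalar field, a nested
// object selects a relationship and says which fields to follow next.
function select(value, selection) {
  if (Array.isArray(value)) return value.map((item) => select(item, selection));
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? value[field] : select(value[field], sub);
  }
  return result;
}

// Mirrors the shape: { articles { title topics { name articles { title } } } }
const result = select(articles, {
  title: true,
  topics: { name: true, articles: { title: true } },
});
console.log(JSON.stringify(result, null, 2));
```

Each extra level of nesting is one more hop through the graph, and the traversal stops exactly where the selection stops, even though the underlying data is cyclic.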
But what if I want something like finding the shortest path in the graph between two nodes, in this case, the National Park Service and the FAA? Well, Cypher has shortest path functionality and variable length path functionality built into Cypher so I can say, find the shortest path connecting these two organizations. In this case, following any relationships. That's what this asterisk in brackets there, it's saying sort of follow any number of relationships to find the shortest path. And I can find it through a couple of articles about labor shortages that both of these organizations are facing. This functionality isn't really built into GraphQL, so GraphQL doesn't have a sort of a native way to express this idea of a shortest path, although we could certainly implement this functionality and expose it through certain fields in GraphQL. But it's not something built in.
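For readers unfamiliar with shortest path search, the sketch below implements it as a breadth-first search over an undirected edge list in JavaScript. It illustrates the concept only and is not how Neo4j executes the query; the node names and edges are invented placeholders.

```javascript
// Breadth-first search for the shortest path in an undirected graph.
// In Cypher the same intent might be written roughly as:
//   MATCH p = shortestPath((a:Organization {name: "National Park Service"})
//                          -[*]-(b:Organization {name: "FAA"})) RETURN p
// (the label and property name are assumptions).
function shortestPath(edges, start, goal) {
  const adjacency = new Map();
  for (const [a, b] of edges) {
    if (!adjacency.has(a)) adjacency.set(a, []);
    if (!adjacency.has(b)) adjacency.set(b, []);
    adjacency.get(a).push(b);
    adjacency.get(b).push(a);
  }
  // `previous` doubles as the visited set and the trail back to `start`.
  const previous = new Map([[start, null]]);
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === goal) {
      const path = [];
      for (let node = goal; node !== null; node = previous.get(node)) path.unshift(node);
      return path;
    }
    for (const next of adjacency.get(current) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null; // no path connects the two nodes
}

// Placeholder data: one short route and one longer detour.
const edges = [
  ["National Park Service", "Article 1"],
  ["Article 1", "Labor shortages"],
  ["Labor shortages", "Article 2"],
  ["Article 2", "FAA"],
  ["National Park Service", "Article 3"],
  ["Article 3", "Tourism"],
  ["Tourism", "Article 4"],
  ["Article 4", "Airlines"],
  ["Airlines", "Article 5"],
  ["Article 5", "FAA"],
];

const path = shortestPath(edges, "National Park Service", "FAA");
console.log(path.join(" > "));
```

A GraphQL API could expose something like this behind a dedicated field, but the query language itself has no syntax for "follow any number of relationships".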
What about recommended articles? So a lot of news sites, as I'm reading an article, they show me something like, here are other articles you may be interested in. This sort of thing. Well, in Cypher, there are lots of ways to express those sorts of things. I could look at other articles that similar users are viewing.
6. Graph Traversal and Neo4j GraphQL Library
We discuss traversing the graph from an article to related articles based on author, topics, or geo-regions. GraphQL doesn't expose these concepts, but we can implement a recommended field in our GraphQL API. When building a React app for a news organization, we use GraphQL to provide benefits like authorization and caching. The architecture involves a React app querying a GraphQL API, which then communicates with the database. Neo4j has released the Neo4j GraphQL library, which enables GraphQL first development and generates a full CRUD GraphQL API based on type definitions.
I could look at an overlap of topics. I could look at the overlap of geo-regions based on my reading history, this sort of thing. So here we're describing a traversal through the graph from an article that I'm reading to articles that have either the same author or similar topics or are about similar geo-regions. GraphQL doesn't really expose these concepts. And again, we could implement a sort of a recommended field on the article type in our GraphQL API to expose this. But again, it's not something that is inherently built in.
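One way to quantify "overlap of topics" is Jaccard similarity, the measure the talk later names for its similar-articles field. The JavaScript below is a sketch of the formula only (size of the intersection divided by size of the union); it is not the Graph Data Science implementation, and the article and topic names are invented.

```javascript
// Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  const intersection = [...setA].filter((x) => setB.has(x)).length;
  const union = new Set([...setA, ...setB]).size;
  return union === 0 ? 0 : intersection / union;
}

// Hypothetical article the user is reading, and candidate recommendations.
const current = { title: "Current", topics: ["Labor", "Aviation", "Travel"] };
const candidates = [
  { title: "Parks and travel", topics: ["Parks", "Travel"] },
  { title: "Airline staffing", topics: ["Labor", "Aviation"] },
  { title: "Match report", topics: ["Sports"] },
];

// Score every candidate against the current article, highest first.
const ranked = candidates
  .map((c) => ({ title: c.title, score: jaccard(current.topics, c.topics) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked);
```

The same scoring could be applied to authors or geo regions; which signals to combine is a design choice the talk leaves open.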
So graphs, I think, are really everywhere around us in different technologies. We see it in GraphQL. We see it in graph databases. But I think a question comes up for developers is knowing when to leverage the right graph technology at the right time.
So let's say, for example, we're building a React application that is going to be the front end for our news organization. So we want to show articles, we want users to be able to log in to save articles, to view recommended articles, this sort of thing. We've talked about how to query this news graph in Cypher directly from the database or GraphQL. How should we sort of structure the basic architecture of our app to do this? Well, looking at these examples that we saw, we don't really want to expose the database to our clients' applications and have them free to sort of query whatever they want in the database. Instead, that's where GraphQL really shines, that we can give all of the benefits of GraphQL to our client application, but also have this layer that sits between the client and the database where we're able to add things like authorization, caching, custom logic, these sorts of things.
So our architecture looks something more like this, where our React application is querying a GraphQL API. Maybe this is deployed as serverless functions or edge workers, something like that. And then our API layer is the layer actually going out to the database. So to make this type of application architecture easier to build, Neo4j has released the Neo4j GraphQL library. So this is a JavaScript library for building Node.js GraphQL APIs backed by Neo4j. There are a lot of really powerful features in the GraphQL library. Let's go through a couple of those. So one is this idea of GraphQL first development. So we start with our GraphQL type definitions. That defines the data model that we're working with. Then the Neo4j GraphQL library will use that to drive the data model for the database and the API. So I don't need to maintain two separate schemas, one for the database, one for the API. Everything is driven from these GraphQL type definitions. The Neo4j GraphQL library will take those type definitions and then generate a full CRUD GraphQL API with create, read, update, and delete operations for each type declared in the schema.
7. Neo4j GraphQL Library Features
The Neo4j GraphQL library generates a single database query for any GraphQL request, eliminating the need to implement resolvers. It provides CRUD functionality and allows custom logic through the Cypher GraphQL schema directive. The library offers powerful features such as an authorization model, relay cursor pagination, and working with unions and interfaces.
Of course, there's a lot that can be configured in what is generated. But by default, we get query and mutation fields for each one of our types, ordering, pagination, Relay connection pagination, complex filtering, as well as the geo and date types that are supported natively in the database. And this is how we built that news graph GraphQL API that I linked earlier.
Now, one of the really powerful features of the Neo4j GraphQL library is generating database queries. So what this means is that for any arbitrary GraphQL request, a single database query is generated by the library. So as a developer, I don't need to implement resolvers. I simply need to define my type definitions and then the library will generate a single database query at query time. And this is great for developer productivity because I don't have to build these resolvers, but also for performance, this basically solves the N plus one query problem where I need to think about batching or caching in my GraphQL implementation so that I'm not making multiple round trips to the database. Well, instead, I can just rely that a single database query is generated, sent to the database, and the database is gonna optimize how to handle that query.
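The N plus one problem can be shown with a fake database that counts round trips. Everything in this JavaScript sketch is hypothetical (the `createDb` helper, its methods, and the data); it only contrasts per-field fetching with a single combined query.

```javascript
// Fake database that counts how many queries it receives.
function createDb() {
  const articles = [
    { id: 1, title: "A" },
    { id: 2, title: "B" },
    { id: 3, title: "C" },
  ];
  const topicsByArticle = { 1: ["Labor"], 2: ["Parks"], 3: ["Labor", "Travel"] };
  let queryCount = 0;
  return {
    getArticles() {
      queryCount += 1;
      return articles;
    },
    getTopicsFor(articleId) {
      queryCount += 1;
      return topicsByArticle[articleId];
    },
    // Stand-in for one generated query that returns the nested result.
    getArticlesWithTopics() {
      queryCount += 1;
      return articles.map((a) => ({ ...a, topics: topicsByArticle[a.id] }));
    },
    get queryCount() {
      return queryCount;
    },
  };
}

// Resolver-per-field style: one query for the list, then one per article.
const naiveDb = createDb();
const naive = naiveDb
  .getArticles()
  .map((a) => ({ ...a, topics: naiveDb.getTopicsFor(a.id) }));

// Generated-query style: the whole nested request is one round trip.
const singleDb = createDb();
const single = singleDb.getArticlesWithTopics();

console.log(naiveDb.queryCount, singleDb.queryCount); // prints: 4 1
```

Both approaches return the same data; the difference is that the naive count grows with the number of articles, while the single-query count stays at one.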
So we've talked about the CRUD functionality that is generated for us. What about custom logic? How do we add that? Well, this is probably my favorite feature of the Neo4j GraphQL library. And this is the Cypher GraphQL schema directive. So schema directives are GraphQL's built-in extension mechanism. So with directives, I can indicate that there's some custom logic that should happen on the server. And there are a lot of different directives that are available for configuring our schema with the Neo4j GraphQL library, but the Cypher GraphQL schema directive I think is the most powerful because it allows us to basically annotate fields in our GraphQL API with Cypher queries. So here we've added a similar field to the article type. This is kind of like that recommendation query we saw earlier. If you're reading this article, what are similar articles you might be interested in? And in this case, we're using Graph Data Science Jaccard similarity to find similar articles based on topics. So this is super powerful. We can basically expose any of the functionality of Cypher through GraphQL using the Cypher schema directive. So that's the basic functionality of the Neo4j GraphQL Library. There's lots of other interesting things in there. There's a super powerful authorization model. I mentioned Relay cursor pagination as well as working with unions and interfaces. So lots of interesting, powerful things in the Neo4j GraphQL Library. Let's take a quick look at some code. So this is a link to CodeSandbox. It's just pulling from this GitHub repo. You can find all the code on GitHub or in this CodeSandbox. But let's take a quick look to see what's going on here.
8. Index.js and GraphQL Type Definitions
The index.js file pulls in dependencies and reads the GraphQL type definitions from the schema.graphql file. The Neo4j GraphQL Library is used to pass these type definitions and create a connection to the database. The type definitions define the nodes in the news graph and their connections. The generated database query is logged to the console, allowing for flexibility in querying specific data. The resolveInfo object is used by GraphQL database integrations to generate database queries from GraphQL requests. Resources for learning more about Neo4j GraphQL are available on the Neo4j GraphQL landing page and the Neo4j Sandbox provides a platform for trying out Neo4j with preloaded data sets.
So this is our index.js file. Just pulling in some dependencies and reading from this schema.graphql file, which is our GraphQL type definitions. We're passing those type definitions to the Neo4j GraphQL Library. We create a Neo4j driver instance just to create a connection to the database. And then we pass that schema that we created with the Neo4j GraphQL Library to Apollo Server, which is serving our GraphQL API.
If we look at our type definitions, this is basically where the interesting bits are. So we've defined types for article, author, topic. So all those nodes that we saw in our news graph and how they are connected. So we didn't write any resolvers. We've just basically defined our type definitions. And here's GraphQL Playground. Here's a GraphQL query that I'm going to run. It's searching for the hundred most recent articles and their authors, photos, geo. So basically everything connected to those articles, and you can see the results that come in.
One thing I want to point out here is we can see the generated database query that's logged to the console here. So that's generated at query time for any arbitrary GraphQL request. So as we modify the GraphQL query, maybe if we're only querying for articles, we are only going to see in the generated database query that we're only fetching article nodes. So super powerful for developer productivity to get a GraphQL API up and running without writing any resolvers and we're sort of leveraging all the power of the graph model with the Neo4j GraphQL library. And again, all the code for that is linked on GitHub.
As a bit of an aside, you may be wondering how these GraphQL database integrations work under the hood, how they are able to generate database queries from a GraphQL request. And the answer is inside every resolver, one of the arguments passed is the resolveInfo object that contains a lot of information about the GraphQL schema and the currently resolving GraphQL operation. So here you can see sort of all the things that are in the resolveInfo object. And so basically what these database integrations do is inspect this resolveInfo object, look at the nested selection set for the query and essentially iterate through that and generate a database query at the root resolver, which is a super powerful pattern. I gave a talk at GraphQL Summit about this a while ago. So the recording is there if you're interested in digging into that in a bit more detail.
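The pattern of walking a nested selection set to build one query can be sketched without any GraphQL dependency. In the JavaScript below, `articlesField` is a hand-built stand-in for the field node a root resolver would find on the resolveInfo object, and `relationshipFields`, `projection`, and `generateCypher` are hypothetical helpers. This is a simplified illustration, not the Neo4j GraphQL Library's actual generator.

```javascript
// Build a minimal object shaped like a GraphQL field node. Only the
// properties used below (name.value, selectionSet.selections) are included.
const field = (name, selections) => ({
  kind: "Field",
  name: { kind: "Name", value: name },
  selectionSet: selections ? { kind: "SelectionSet", selections } : undefined,
});

// Stand-in for the query: { articles { title topics { name } } }
const articlesField = field("articles", [
  field("title"),
  field("topics", [field("name")]),
]);

// Hypothetical mapping from relationship fields to graph patterns.
const relationshipFields = {
  topics: { type: "HAS_TOPIC", label: "Topic" },
};

// Build a Cypher map projection for one level of the selection set,
// recursing into nested selections with pattern comprehensions.
function projection(variable, selections) {
  const parts = selections.map((selection) => {
    const name = selection.name.value;
    if (!selection.selectionSet) return `.${name}`;
    const { type, label } = relationshipFields[name];
    const child = `${variable}_${name}`;
    const nested = projection(child, selection.selectionSet.selections);
    return `${name}: [(${variable})-[:${type}]->(${child}:${label}) | ${nested}]`;
  });
  return `${variable} { ${parts.join(", ")} }`;
}

function generateCypher(rootField, label) {
  const body = projection("this", rootField.selectionSet.selections);
  return `MATCH (this:${label}) RETURN ${body} AS this`;
}

const cypher = generateCypher(articlesField, "Article");
console.log(cypher);
```

Dropping `topics` from the selection drops the relationship pattern from the generated query as well, which matches the behavior described for the logged queries in the demo.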
Great, well, I think that's all the time we have for today. I wanna end on just talking about a few resources if you're interested in learning more. So one place that's good to start is the Neo4j GraphQL landing page that has links to documentation, examples, as well as Graph Academy, which is self-paced online training that goes into a lot more detail, all focused on building GraphQL APIs with Neo4j. For trying out Neo4j, the best place I think to go is the Neo4j Sandbox. It allows us to spin up Neo4j instances with preloaded data sets.
9. Neo4j GraphQL Integration and Q&A
We have GraphQL exposed through links to prebuilt code sandbox examples in Neo4j Sandbox. Neo4j is hiring specifically for the GraphQL team, looking to grow. If interested, please reach out. 60% of the audience is using GraphQL in production. The GraphQL community is maturing, addressing scaling and advanced problems. Surprisingly, 20% are not using GraphQL but attended the conference. Let's jump into the Q&A. The first question is about the position at Neo4j for GraphQL integration work.
We also have GraphQL exposed through links to prebuilt code sandbox examples as well in Neo4j Sandbox. Also wanna mention that Neo4j is hiring, but specifically the GraphQL team at Neo4j is hiring and looking to grow. So if this sounds like interesting things to work on, definitely please reach out. You can find the postings on our job site or just email graphql.neo4j.com. So thanks so much for joining us today. And again, please reach out to me on Twitter if you'd like to follow up. Cheers.
Are you using GraphQL in production was the question and 60% of our audience is saying, yes, that's amazing. But were you expecting so many people that already are using GraphQL at a conference or just not what you were expecting at all?
Yeah, I guess that makes sense. I mean, since, we're at a GraphQL focused conference, I guess there's kind of like two personas of people that are interested in a GraphQL conference. It's like the, I'm using GraphQL in production and I'm ready to kind of like level up and think about like scale and these sorts of things and the advanced things and the other kinds of like, well, I know about GraphQL, it's something we're thinking about. So I kind of want to learn the more introductory type things, so. Yeah, I guess for a GraphQL focused conference, I guess maybe I was thinking there would be a little more than that, but I guess that shows there's a good mix of like both of those kinds of personas, right? Like people that are ready to scale up and in the introductory sort of persona. So yeah, I get that. I guess one thing I've really noticed, I think in the GraphQL community overall, looking at this in the last, I don't know, few years that I've been kind of involved in working with GraphQL is it seems like the GraphQL community is really maturing. If you look at the types of tooling and I guess maybe like best practices and trends that you're seeing, I think people are kind of hitting that issue of, okay, I'm using GraphQL in production and now I need to think about how do I scale? How do I address more advanced problems? So yeah, it makes sense to me.
Yeah, I was also quite surprised because there's 20% that just said no and not the no but planning to. So you're not using GraphQL and you're not planning to, but you're here. So, well, still happy to have you, of course. But yeah, that was surprising for me. So enough talk about the poll questions. Let's jump into the Q&A. And I lost my window. Where are you, question window?
Here it is. So first question is from a friend. That's a nice name. You mentioned that Neo4j is hiring for the Neo4j GraphQL integration work, but can you describe a bit what the position is like and what kind of background experience is necessary. So what profile are you actually looking for? Oh yeah, that's a great question. Yeah, so in my talk I talked about this Neo4j GraphQL library, which is a Node.js library that makes it easier to build GraphQL APIs backed by Neo4j. We talked about some of the features in that.
10. Neo4j GraphQL Library Hiring and Next Steps
The team working on the Neo4j GraphQL library is hiring engineers in Europe. They are looking for familiarity with TypeScript, the Node.js ecosystem, and GraphQL ecosystem. The team is also considering the next steps for the library, such as advanced use cases, scaling, and performance. If you have experience in scaling GraphQL in production, the team would be interested in adding you. Neo4j is also hiring for various roles and skills.
So the team that works on that library is hiring engineers in Europe to work on the library. It's written in TypeScript, so some familiarity with TypeScript and kind of the Node.js ecosystem and GraphQL ecosystem as well. I think there's also, you know, thinking about we have this library that you can use to build GraphQL APIs, but kind of what are the next steps, right? Like looking at some of those more advanced GraphQL use cases we were talking about earlier, right? Like pushing scale, pushing performance. So those I think are the kind of things that that team is thinking about next.
So certainly if you've scaled GraphQL in production or that sort of thing, having that kind of experience is something I think that team would really be looking to add. But yeah, certainly TypeScript is kind of what that team does on a day to day basis. And if that doesn't quite fit, Neo4j is hiring for a lot of different roles and engineering talents and skills. So I would certainly check out the careers page that I linked in the slides. You'll probably see something that might match your skillset for sure. There's something for everyone. That's true.
11. Database Company and Diverse Competencies
As a database company, there are various pieces to work with, including core Java engineers, Scala for Cypher query language, and desktop tooling for graph visualization. It's interesting to see the diverse competencies required for this tool.
Well I mean, when you're a database company, like a database is so like central to your infrastructure application that there's really so many different pieces that you have to work with. I mean, we have like core Java engineers who work on optimizing the database. The Cypher query language is written in Scala. We have like desktop tooling, graph visualization tooling, where you're sort of working on high-performance WebGL kind of things. So yeah, there's lots of different skills and tool sets out there. Yeah, that's always funny. I always find it nice to hear like how many different types of competencies you need for such a tool. Like you say, that you have Java and Scala developers working with you at your company. Yeah, it's nice to see so many people coming together from different backgrounds building more products.
12. Authorization in Neo4j GraphQL Library
There are multiple options for handling authorization when using the Neo4j GraphQL library. You can implement your own authorization layer or use the built-in authorization feature that uses GraphQL schema directives. The Auth GraphQL schema directive allows you to define authorization rules in your schema, such as specifying that only authors of a blog post can edit it. This feature works with JSON Web Tokens (JWT) and supports various identity providers. The flexibility of the Neo4j GraphQL library allows developers to customize authorization and authentication based on their specific needs.
Next question is from user. I guess anonymous user. What about authorization? How would we handle application authorization when using the Neo4j GraphQL library? Is it just up to the developer to implement it themselves? Yeah, good question. So there's a few options here. I guess one of the Neo4j GraphQL library's design principles is to be as flexible as possible. So you can certainly implement your own authorization layer as you would building any other GraphQL service. There's lots of different options there, but there is an authorization feature that's built in to the core of the Neo4j GraphQL library that uses GraphQL schema directives. So in my talk, I showed a few examples of using GraphQL schema directives to kind of configure the API a little bit. We looked at the relationship directive for defining relationships in the schema and also the Cypher GraphQL schema directive for adding custom logic to your API. So there's also an auth GraphQL schema directive that you can use to define authorization rules in your schema. So for example, you can create a rule that says only authors of a blog post should be able to edit the blog post, or maybe if you have the role admin you can also edit that, these sorts of things. And it works with a JSON Web Token, JWT. So you can use any sort of identity provider as long as that's generating a JSON Web Token. So again, meant to be kind of as flexible as possible but still have these features focused on developer productivity. Cause it's quite nice, I think, to be able to define these sorts of things in your GraphQL schema that are quite powerful. So I guess that would be my first approach, to look at the features supported by this auth GraphQL schema directive with the rules we can create. Does that match your needs for adding authorization and authentication? If so, that can be a super powerful feature that's built in. Nice, thanks William.
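The example rule from the answer above (the author of a post may edit it, and so may anyone with the admin role) can be written out as a plain JavaScript check. This only spells out what the rule means; it is not how the library's auth directive is implemented. The `canEdit` helper, the `authorId` property, and the `sub` and `roles` claims in the decoded token are assumptions.

```javascript
// Decide whether the holder of an already verified and decoded JWT may
// edit a post. The `sub` claim (user id) and `roles` claim are assumed.
function canEdit(post, jwt) {
  const isAuthor = post.authorId === jwt.sub;
  const isAdmin = (jwt.roles ?? []).includes("admin");
  return isAuthor || isAdmin;
}

const post = { id: "post-1", authorId: "user-1" };

console.log(canEdit(post, { sub: "user-1", roles: [] }));        // author
console.log(canEdit(post, { sub: "user-2", roles: ["admin"] })); // admin
console.log(canEdit(post, { sub: "user-3" }));                   // neither
```

Declaring such rules in the schema, as the answer describes, keeps the check next to the type it protects instead of scattering it across resolvers.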
13. Deploying GraphQL API with Neo4j GraphQL Library
You can deploy the GraphQL API layer using any JavaScript GraphQL implementation with the Neo4j GraphQL library. Next.js is a good framework for building full stack applications, with its API routes feature allowing you to define an endpoint for your GraphQL API. Deploying a Next.js app can be done in different ways, and Vercel can deploy your API endpoints as serverless functions. This combination of Next.js and Vercel is great for building full stack GraphQL applications. If there are any more questions, feel free to ask. Otherwise, we'll take a short break and be back soon. William will be available in his speaker room on spatial chat to discuss anything about Neo4j. Thank you and have a nice day!
Next question is from Sam. How could we deploy the GraphQL API layer? You said that the Neo4j GraphQL library is for building Node.js GraphQL APIs, but what if I want to deploy it as a serverless function? Yeah, so again, going back to this idea of trying to design the library to be focused on flexibility, really you can use any JavaScript GraphQL implementation with the Neo4j GraphQL library to take advantage of the things like the database query generation, the GraphQL schema augmentation process. Basically what you get is a GraphQL executable schema object that you can then use with Apollo Server or really any JavaScript GraphQL implementation. So, it's easy to use as a Lambda function or deploy as a serverless function.
I like to use Next.js for building full stack applications. Next.js is this framework that's built on top of React. So I can build my front ends with React and Next.js, but Next.js also has this really cool feature called API routes. So in the same code base, in the same framework, I can define an endpoint for my GraphQL API. I can take advantage of the Neo4j GraphQL library with that. And then when I go to deploy that there are different ways to deploy a Next.js app. But Vercel, who kind of works on Next.js, will deploy your API endpoints as serverless functions without you kind of having to think about that. So that's a good combination that I like to use for sort of my full stack GraphQL applications, Next.js and Vercel. Again, I think it's something that focuses a lot on developer productivity, which is super nice if you're building full stack apps.
Yeah, that's a really nice combination, really nice way to work. Also a big fan here. Great, thanks. We have time for some more questions, yeah. Next question is from Daria, what about authorization? How would we handle application authorization? Wait. I think we got that question, right? It's the same question, differently asked. Oh, no, wait. No, I'm just reading double. I'm sorry, those were the questions that we have from our audience. So if there's anyone still that wants to know anything from William, now is the time to speak up or forever hold your silence. And otherwise, I'm going to let you go William. We'll have a short break and be back in five minutes. And if you want to talk to William, William is gonna be going to his speaker room on spatial chat to discuss anything you want on Neo4j. William, it's been lovely talking to you and have a nice day. Bye-bye. Great, thanks a lot.