Exploring the Data Mesh Powered by GraphQL

Different approaches are being explored for building an operational data lake with an easy data access API, as well as a federated data access API, and GraphQL opens up opportunities for enabling these architectures by laying the foundation for a data mesh.

This talk was presented at GraphQL Galaxy 2022.

FAQ

GraphQL is a query language for APIs that enables clients to request exactly the data they need in a single API call. This can reduce the load on underlying data systems, increase performance, and help in managing complex data structures efficiently.

A data API layer is crucial for standardizing API interfaces, ensuring performance quality, and providing security compliance across various data sources and client types. It facilitates efficient data management and faster application development.

While GraphQL offers flexibility and efficient data fetching, it also presents challenges such as increased complexity in schema design, authorization, security, and optimal query planning across varied data sources.

GraphQL integrates data fetching with authorization logic, allowing for complex security rules that are tailored to user permissions and data access requirements. This integration helps prevent unauthorized data access and ensures compliance with security policies.

GraphQL APIs can serve a wide range of clients including internal and external clients, clients operating on cloud or on-premises, and those across different teams or regions. This flexibility makes GraphQL suitable for diverse operational environments.

Serverless architectures can significantly enhance GraphQL performance by allowing precise data fetching that reduces unnecessary operations and costs. This architecture adapts well to the scalable and efficient nature of GraphQL.

Yes, GraphQL effectively prevents over-fetching and under-fetching by allowing clients to specify exactly what data they need. This not only optimizes data retrieval but also improves overall application performance and user experience.

Tanmai Gopal
34 min
08 Dec, 2022

Video Summary and Transcription

This Talk discusses the challenges of working with data APIs and GraphQL, including standardization, performance, and security. It emphasizes the need to optimize data fetches and push down authorization logic. The concept of externalizing authorization and using a GraphQL engine is explored. The Talk also covers the generation of GraphQL schemas and APIs, as well as the implementation of node-level security. Overall, the focus is on designing and standardizing GraphQL for data APIs while addressing authorization challenges.

1. Introduction to Data APIs and GraphQL

Short description:

In this part, Tanmai discusses the need for a data API layer to address the challenges of working with different data sources and clients. He highlights the benefits of GraphQL in selecting and structuring data, but also acknowledges the challenges of standardization, performance, and security. Tanmai explains how performance optimization can vary depending on the data sources and shares examples of query plans. He also mentions the discussion around the N+1 problem in GraphQL.

Hi, folks. I'm Tanmai, co-founder and CEO at Hasura, and I'm going to talk to you a little bit about data APIs powered by GraphQL today. Increasingly, platform teams across various organizations are setting up a data API layer to deal with the problem that you have so many different data sources and so many different types of clients. And you need to solve problems of performance, standardization, and security to allow these clients to move quickly.

We have to deal with the fact that the domain data is coming from different sources: databases and services. Clients can be internal or external, they can be at the edge, they can be on the cloud, they can be on-prem, they can be within the same team, they can be across different teams. And we need a data API layer that can absorb and solve for standardizing the API, providing a certain performance quality of service, and providing security and compliance guarantees. As a data API, GraphQL can be a great fit, and we'll see some of the benefits of GraphQL in addressing some of these challenges as well.

So, GraphQL is a nice API because, as we all know, it allows us to select exactly the data that we need in a single API call. This has a pretty large impact if we're fetching models that have hundreds of attributes, where we can drastically reduce the load on the underlying data system. Increasingly, as we move to serverless architectures and serverless data systems, there's also a massive cost saving when we're able to select exactly the data that we need.

We all know that GraphQL has a really nice schema: you have a typed graph that allows us to structure the way that we get our output, but it also allows us to structure our input and parameters fairly easily. Right? And that has an impact on our ability to handle increasing complexity. Think about this query here: I'm fetching orders where the user is greater than a particular value, ordered by the user ID in ascending order. Providing these input parameters and arguments is much easier with GraphQL compared to trying to do this with a REST API, for example. Right? And so being able to layer on this complexity becomes much easier.

When we think about taking these niceties of GraphQL that we're all aware of, and we think about standardizing and scaling them, we run into some challenges. At its core, that's because the cost of providing this increased flexibility is that we need to do a little more work in solving for standardization or schema design, guaranteeing performance, and solving for authorization and security, right?
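The two ideas above, selecting only the fields you need and passing structured filter and ordering arguments, can be illustrated with a minimal Python sketch. This is not how any particular GraphQL server works; the `query_orders` function, the `_gt` operator name, and the sample fields (`user_id`, `amount`, `shipping_notes`) are all assumptions made for illustration.

```python
# Hypothetical in-memory "orders" model with more attributes than a client needs.
ORDERS = [
    {"id": 1, "user_id": 30, "amount": 40, "shipping_notes": "leave at door"},
    {"id": 2, "user_id": 5, "amount": 15, "shipping_notes": ""},
    {"id": 3, "user_id": 12, "amount": 99, "shipping_notes": "fragile"},
]


def query_orders(rows, fields, where=None, order_by=None):
    """Return only the requested fields, filtered and ordered by structured arguments.

    where:    {"column": {"_gt": value}}  (assumed operator name, greater-than only)
    order_by: {"column": "asc" | "desc"}
    """
    where = where or {}
    order_by = order_by or {}

    # Keep rows that satisfy every greater-than condition.
    matched = [
        row for row in rows
        if all(row[col] > cond["_gt"] for col, cond in where.items())
    ]

    # Apply sort keys from last to first so the first key takes precedence
    # (Python's sort is stable).
    for col, direction in reversed(list(order_by.items())):
        matched.sort(key=lambda row: row[col], reverse=(direction == "desc"))

    # Project only the fields the client asked for.
    return [{field: row[field] for field in fields} for row in matched]


result = query_orders(
    ORDERS,
    fields=["id", "amount"],
    where={"user_id": {"_gt": 10}},
    order_by={"user_id": "asc"},
)
```

The client names two fields and gets back exactly those two, even though each record carries more; the filter and ordering travel as structured data rather than as ad hoc query-string parameters.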

Let's take a look at performance, for example. If we think about the types of data sources that we have and the way that we execute a query across those data sources, the optimal data fetching that we do can be very contextual. Take a simple example of fetching orders and, for each order, the user's name. Depending on the topology of this data, we might have varying query plans.

For example, if it came from the same data source and that source supported JSON aggregation, and I had to implement a controller that would respond with just this data, I could make a single query that would perform the JSON aggregation at the data source itself. That means that I'm not even doing a join that fetches a Cartesian product. I'm making a more efficient query that is fetching just the order and the user, constructing the shape of the JSON, and then sending that back to the client. Let's say it's coming from two different databases, in which case I would use something like an IN query and perform memoization, so that I'm not fetching duplicate entities in this cross-database join. If this was coming from two different services, then I'd have to make multiple API calls, but again, I would do a form of memoization to prevent duplicate entities being fetched within the same request. It's a variation of the DataLoader pattern. But the idea is that this query plan depends on the kind of data that we have, and the same kind of query plan will not work across these different data sources.

There's an interesting thread that popped up on Twitter a few weeks ago, where we talked about the N+1 problem in GraphQL, and Nick, one of the creators of GraphQL, chimed in saying that, well, GraphQL doesn't create the N+1 problem. But because of the way that we typically think about executing a GraphQL query, it does make it harder to address that problem in a systematic way.
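The cross-database case described above can be sketched in Python with two separate in-memory SQLite databases standing in for two independent data sources. This is a minimal illustration under assumptions, not the talk's implementation: the table layouts, sample rows, and the `fetch_orders_with_users` function are hypothetical. It collects the distinct user IDs from the orders, fetches them with one IN query, and reuses each user for every order that references it.

```python
import sqlite3

# Two separate in-memory databases stand in for two independent data sources.
orders_db = sqlite3.connect(":memory:")
users_db = sqlite3.connect(":memory:")

orders_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 11), (3, 10), (4, 10)]
)
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany("INSERT INTO users VALUES (?, ?)", [(10, "Ada"), (11, "Grace")])


def fetch_orders_with_users():
    """Resolve orders and their users with two queries instead of N+1."""
    orders = orders_db.execute(
        "SELECT id, user_id FROM orders ORDER BY id"
    ).fetchall()
    if not orders:
        return [], 0

    # Deduplicate user ids so each user is fetched at most once per request.
    user_ids = sorted({user_id for _, user_id in orders})

    # One IN query against the second source, with bound parameters.
    placeholders = ",".join("?" for _ in user_ids)
    rows = users_db.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall()
    users_by_id = dict(rows)

    # Stitch the two result sets together in application code.
    shaped = [
        {"id": order_id, "user": {"name": users_by_id[user_id]}}
        for order_id, user_id in orders
    ]
    return shaped, len(user_ids)


orders_with_users, users_fetched = fetch_orders_with_users()
```

Four orders reference only two distinct users, so the second source receives one query for two rows rather than one query per order. A resolver that fetched the user inside each order's field resolution would issue four user lookups for the same data.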

2. Challenges of Data Fetching and Authorization

Short description:

In this part, Tanmai discusses the challenges of integrating predicate pushdown with data fetching and the need to push down authorization logic. He emphasizes the importance of optimizing data fetches and explains the challenges of doing this across data sources.

And that's kind of what we'll look at, to see how we can address those kinds of challenges. Now let's think about authorization. A very common challenge is that we have to integrate predicate pushdown with our data fetching. Again, look at the same query where we're fetching orders and users, and let's say this query is being made by a regional manager who can only fetch orders that were placed within their region. If we made a naive request where we selected all of this data and then, after selecting the data, started filtering by region, that would be terrible. We obviously can't do this when we have millions or billions of rows. What you'd want to do instead is select from the orders model or table or whatever, where the region is equal to the current region. This is the predicate, and again, we're pushing down that predicate into our data fetch, right? We'd want to be able to push down our authorization logic with our data fetch as much as we possibly can. Doing this across data sources can become challenging, right?
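The difference between filtering after the fetch and pushing the predicate down can be sketched in Python against an in-memory SQLite table. The schema, the sample rows, and the function names `orders_naive` and `orders_pushdown` are assumptions for illustration only. Both functions return the same visible orders for a regional manager, but they differ in how many rows leave the data source.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "emea", 50), (2, "apac", 75), (3, "emea", 20), (4, "amer", 90)],
)


def orders_naive(session_region):
    """Fetch every row, then apply the authorization rule in application code."""
    rows = db.execute("SELECT id, region, amount FROM orders ORDER BY id").fetchall()
    visible = [row for row in rows if row[1] == session_region]
    # Second value: how many rows were pulled out of the data source.
    return visible, len(rows)


def orders_pushdown(session_region):
    """Push the authorization predicate into the query itself."""
    rows = db.execute(
        "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
        (session_region,),
    ).fetchall()
    return rows, len(rows)


naive_rows, naive_fetched = orders_naive("emea")
pushed_rows, pushed_fetched = orders_pushdown("emea")
```

With four rows the difference is two rows versus four; with millions or billions of rows, the naive version transfers the whole table to enforce a rule the data source could have applied itself.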
