Schemas Everywhere: Understanding GraphQL, Databases & Prisma

In a world run by data, we as developers have turned to schemas to help describe and organize that data. But what happens when you have a ton of schemas to keep track of? In this talk you will learn the role of all the different schemas in a GraphQL API.

This talk was presented at GraphQL Galaxy 2022.

FAQ

A database schema defines the structure of data stored in a database, while a GraphQL schema specifies the structure of data that an API exposes, which may include additional or restricted data not present in the database for security reasons.

Developers are responsible for managing data by understanding its structure and purpose, transforming it, and ensuring it interacts correctly with different parts of an application.

Data modeling is challenging because it involves understanding complex data flows, adapting models to evolving requirements, and often dealing with data that was not originally modeled by the developer.

A schema is a structured representation of a data model that helps in clearly defining how data is organized within a database or system. It acts as a source of truth for the data's structure.

The Prisma schema language simplifies modeling database schemas by using a syntax similar to GraphQL, making it more accessible for developers and aiding in generating type-safe client interactions with the database.

Developers can achieve type safety by using tools like Prisma for database-API interactions and GraphQL Code Generator for front-end types, ensuring consistency across the different layers of the application.

Multiple schemas can lead to confusion about the source of truth and the specific role of each schema, especially when they describe data differently or overlap in their data definitions.

Prisma bridges the gap between the database and API by generating migrations to update the database schema and creating mappings that ensure type-safe interactions between the database and the API.

Sabin Adams
9 min
08 Dec, 2022

Video Summary and Transcription

Welcome to the talk! As developers, we manage and understand the data that the world runs on. Each individual schema in your infrastructure defines your data in the context of its own domain. The Prisma schema is used to generate migrations and create a mapping between the database and API, enabling type-safe interactions. The GraphQL schema allows clients to safely query the database via the API. By using Prisma and GraphQL Code Generator, you can achieve an end-to-end type-safe environment.

1. Introduction to Data Management and Modeling

Short description:

Welcome to the talk! As developers, we manage and understand the data that the world runs on. Data modeling can be challenging, especially when dealing with data from various sources. Schemas provide a way to represent data models, but the proliferation of schemas has led to the question of the source of truth.

Welcome, everybody. Thank you so much for joining me for this talk. I am very excited to be giving this in a lightning talk format. I've given the same talk before in a full-length format, and I've condensed it down into just the necessary pieces. So, I'm looking forward to seeing how this will go. If you have any questions about the talk after I've given it, feel free to shoot me a message on Twitter and I'll be happy to answer them.

But before we get into the meat and bones of this talk, let's talk about the bigger-picture concept: the world itself runs on data. Whether it's your cell phone, whether you're using Facebook, or maybe your refrigerator or whatever you have that's connected to the internet, everything runs on data, and as developers, it's actually our job to manage this. So it's a heavy load to put on our shoulders, but that's what we signed up for when we became developers: to take this data, do something with it, and spit it out in a format that other pieces of software can use.

So the TLDR of all this is that in order to manage a set of data, you have to have some sort of knowledge about its structure and its purpose. You have to know why you're dealing with your data and why you're doing what you're doing in your application's code to your data. So to revise this original statement, not only is it our job to manage this data that the world runs on, but it's also our job to at least to some degree understand it.

And this is hard because, in general, data is hard to model. As technical people, we have a lot going on. We're doing a lot of technical things. We're developing applications. We have a lot of this knowledge to keep in our heads. There's not a whole lot of room to understand the whole data domain of whatever industry you're working in at the time. There are a couple of other reasons why your data is hard to model, though. As your data flows through different areas of your application, you have to know how to interact with it, so it needs to be modeled in a way that works with the different pieces of your application. Your data model may also change as your application evolves. As new requirements come up in your industry, you may have to evolve your model a bit, and doing that in a way that's safe for your application can be difficult at times. Another one, and this is a big one, is that your data may not have been modeled by you. And I would revise this to say that your data probably wasn't modeled by you. You're probably consuming data from someone else and using it within your own application.

So for all of these reasons, we as developers came up with this idea of schemas, which are a way to clearly and concisely represent your data model. But there's still a problem, even with schemas. Schemas are now everywhere. We've solved this problem of being able to model our data in a way that makes sense, but now that we've found a good solution, we're using it everywhere, and the original intent of the schema is lost. What the schema is supposed to be is a source of truth for what your data looks like. But as you start adding different schemas everywhere, it raises the question: now what is the source of truth? You now have multiple perceived sources of truth, and each schema may describe your data a little bit differently, which is probably what prompts that question in the first place. And also, each schema has a different role.

2. Data Management and Schema Definition

Short description:

Each individual schema in your infrastructure defines your data in the context of its own domain. We'll look at the database schema, which is written in a data description language. Prisma has its own language, the Prisma schema language, that allows for easier modeling of the database schema. The GraphQL schema is different from the database schema as it defines what the API exposes, not the data itself.

And if the schemas look similar, it's kind of hard to determine what the role of each individual schema is. So, to put it briefly, each individual schema in your infrastructure defines your data in the context of its own domain. Whether it's your database, your API, or something else, the schema is describing the data for that individual piece of your application.

So we're going to look at a stack that has a database, a GraphQL API, and Prisma thrown in there as well, so that we can look at the individual schemas and how they relate.

So to start off, we'll look at the database schema. What you see on the right is the code that you would need to write to create a basic database schema. This is written in DDL, or a data description language, and it's typically a database-specific language. So this is a little bit complicated. These languages tend to be a little bit harder to learn than something like JavaScript, or something a little bit easier on the eyes.

And for this reason, we at Prisma created our own language, called the Prisma schema language. This allows you to model your database schema within a Prisma schema file. The only difference is that it's written in a language that's a little bit more like GraphQL and a little bit easier to read. This is important because now, as new developers come into your application, or maybe someone non-technical comes in, they can actually look at this model and see what's going on. It's also important because the model is simple enough that we can use it within Prisma to generate a type-safe Prisma Client that interacts with your database in a way that's safe and ensures that the data you're accessing is actually available.
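To make the idea of deriving type safety from a simple model a bit more concrete, here is a minimal TypeScript sketch. It is not Prisma's API and not how Prisma Client is actually generated; the names `userModel`, `RowOf`, `ModelUser`, and `findUserByEmail` are hypothetical, and an in-memory array stands in for the database. It only illustrates the principle that once a model is described in one place, the types used to access the data can be derived from it.

```typescript
// Hypothetical model description, standing in for a schema file.
// In this sketch each field maps to the name of a primitive type.
const userModel = {
  id: "number",
  email: "string",
  name: "string",
} as const;

// Lookup table from the type names used above to real TypeScript types.
type FieldTypes = { number: number; string: string };

// Derive a row type from any model description of this shape.
type RowOf<M extends Record<string, keyof FieldTypes>> = {
  -readonly [K in keyof M]: FieldTypes[M[K]];
};

// Equivalent to { id: number; email: string; name: string }.
type ModelUser = RowOf<typeof userModel>;

// An in-memory lookup standing in for a database query (assumption).
function findUserByEmail(
  rows: ModelUser[],
  email: string
): ModelUser | undefined {
  return rows.find((r) => r.email === email);
}

const sampleUsers: ModelUser[] = [
  { id: 1, email: "ada@example.com", name: "Ada" },
  { id: 2, email: "alan@example.com", name: "Alan" },
];

const foundUser = findUserByEmail(sampleUsers, "ada@example.com");
// foundUser?.name is typed as string | undefined.
// Accessing foundUser?.age would fail to compile, because "age"
// is not a field in the model.
```

If a field is added to or removed from `userModel`, every place that reads or writes a `ModelUser` is re-checked by the compiler, which is the kind of guarantee the talk attributes to a generated, type-safe client.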

And then finally we have the GraphQL schema. This looks similar to the Prisma schema. However, there is a big difference, and this is where a lot of people end up getting confused with schemas. The GraphQL schema tends to look very similar to your database schema because it's exposing data from your database. But it should probably not look exactly like your database schema. It isn't defining your data. It's defining what your API exposes. To elaborate, you may be exposing data from your GraphQL API that's not even in your database. Or you may not be exposing certain fields from your database for security reasons. Because of these things, your GraphQL schema is a completely separate thing from your database schema. It may feel a little bit like code duplication as you're writing them, but if you understand this difference between the two schemas, it starts to make a little bit more sense. So, just to recap, we've got our database schema, which is the schema running on your database server that defines your data shape.
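The difference between the shape of the stored data and the shape the API exposes can be sketched in TypeScript as follows. This is an illustration under assumptions, not code from the talk: the `UserRow` and `ApiUser` types, their fields, and the `toApiUser` function are all hypothetical. It shows one field that is stored but deliberately not exposed, and one exposed field that is not stored at all.

```typescript
// Shape of a row as stored in the database (hypothetical columns).
interface UserRow {
  id: number;
  email: string;
  firstName: string;
  lastName: string;
  passwordHash: string; // stored, but should never leave the API
}

// Shape the API exposes: passwordHash is omitted for security, and
// fullName is a computed field that does not exist as a column.
interface ApiUser {
  id: number;
  email: string;
  fullName: string;
}

// Map a database row to the API shape, copying only what is exposed.
function toApiUser(row: UserRow): ApiUser {
  return {
    id: row.id,
    email: row.email,
    fullName: `${row.firstName} ${row.lastName}`,
  };
}

const storedRow: UserRow = {
  id: 1,
  email: "ada@example.com",
  firstName: "Ada",
  lastName: "Lovelace",
  passwordHash: "not-a-real-hash",
};

const exposedUser = toApiUser(storedRow);
```

The two interfaces overlap heavily, which is why writing both can feel like duplication, but each one answers a different question: what is stored, versus what clients are allowed to see.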