Video Summary and Transcription
This Talk compares RESTful APIs, event-driven architectures, and low latency performance APIs. It discusses the limitations of RESTful APIs and the need for newer technologies like GraphQL. The Talk explores event-driven architecture using webhooks and web sockets, as well as the benefits of gRPC as a performant alternative. It also highlights the integration of gRPC with front-end development and the use of protocol buffers for improved performance. Lastly, it emphasizes the importance of considering team familiarity and infrastructure when choosing an API architecture.
1. Introduction
Hi, everyone. I'm Ian Douglas, a senior developer advocate at Postman. Today, I'll be comparing RESTful APIs, event-driven architectures, and low latency performance APIs. Let's dive in!
Hi, everyone. I'm Ian Douglas. I'm a senior developer advocate at Postman, and thanks for having me at TestJS. I'm going to do a talk today comparing RESTful APIs, event-driven architectures, and low latency performance APIs. I've been in the tech industry a really long time. Most of my gray hair comes from working at over a dozen start-ups and having two teenagers. I've spent over eight years in the advocacy space, four years of which were in education, teaching people about software. On the side, I'm also really into dog training, 3D printing, career coaching, and I really like really awful jokes, so you'll see some of those on my slides as well.
2. RESTful Standards and Event-Driven APIs
In this part, we'll discuss RESTful standards, event-driven APIs like webhooks and WebSockets, and low latency performance APIs like gRPC. RESTful APIs have become popular, but they need to keep up with newer technology. HTTP 1 has limitations, such as having no way to interrupt a client, and RESTful designs often underfetch or overfetch data. Technologies like GraphQL provide more efficient ways of sending data. We'll also explore event-driven architecture, including WebSockets and webhooks, which allow for asynchronous, event-driven processing.
So this is the quick agenda that we're going to go over. I'm going to give you some background on RESTful standards. We're going to talk about event-driven APIs like web hooks and web sockets, and then we're going to talk about low latency performance APIs like gRPC.
So REST was defined in 2000. A lot of people that are working in software development nowadays know what REST services are. The term RESTful services came shortly after, within a few years, although there's no easy way to determine when the term itself came to be. It happened because most developers didn't like following such strict rules. They wanted some flexibility in what they built. Most schools nowadays that are teaching API design are teaching RESTful API design, or they're teaching API consumption with RESTful APIs primarily. Postman actually has a student program where we work with schools to get students using Postman and learning about other kinds of API architectures as well. As popular as they've been, RESTful APIs seem a little bit stuck in HTTP version 1.1 territory and kind of need to keep up with newer technology. Companies like Google started introducing new protocols like SPDY, which was the basis for HTTP 2.0.
For example, HTTP 1 is a little bit like a phone call where I call someone and I ask them a single question, I get a single answer and then we hang up the phone. The next time I need to call somebody and get more information, I have to call, identify myself, ask them my question, get my response, and hang up. The other problem with HTTP 1 is there's no way to interrupt a client. If a client connects to a server and says, I'd like to upload some data, the server can say okay, and the client can just blast gigabytes or terabytes of data before the server is allowed to interrupt and say, wait, I can't process all the data that you just sent. We also have a problem in RESTful technologies that we call underfetching or overfetching, which is where we don't have efficient ways of sending data. We're either sending too much data and expecting our front ends to hide that data, which can lead to security issues, or we don't send enough data, which means I've got to make more and more of these connections again. So technologies like GraphQL came on the scene, allowing the client to specify which fields to return as part of the request. This could happen in REST with parameters, but it's a little more difficult to do that and puts a little more work on the developer to implement which fields to send back based on a query parameter, for example.
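To make the query-parameter idea concrete, here is a minimal Python sketch of field selection on a REST-style endpoint. It is only an illustration: the `fields` parameter name, the example URL, and the `USER` record are assumptions, not part of any particular API.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical record, standing in for what the API would normally return.
USER = {"id": 7, "name": "Ada", "email": "ada@example.com", "internal_notes": "vip"}

def select_fields(url: str, record: dict) -> dict:
    """Return only the fields named in ?fields=a,b (the whole record if absent)."""
    query = parse_qs(urlparse(url).query)
    if "fields" not in query:
        return dict(record)  # overfetching: the client gets everything
    wanted = [name.strip() for name in query["fields"][0].split(",") if name.strip()]
    # Unknown field names are silently ignored in this sketch.
    return {name: record[name] for name in wanted if name in record}

print(select_fields("https://api.example.com/users/7?fields=id,name", USER))
```

Without the parameter the full record comes back, including fields the front end would have to hide, which is the overfetching problem described above.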
If you've been working with front-end development and front-end technologies, you've probably worked with WebSockets or some sort of collaborative feature, like a chat message system, or event-driven applications like games. So, I'm going to give you a quick lesson on WebSockets and webhooks as we look at event-driven architecture for a moment. The idea of asynchronous APIs is that they're event-driven. There are two classic methods, webhooks and WebSockets, as I mentioned. They're still HTTP-based, and so they make a connection, they transfer data, and then they disconnect. WebSockets, though, unlike RESTful APIs and webhooks, can keep a connection open for a long period of time, and either side can send data back and forth at any point. And so it allows for event-driven processing: when something happens, you can transmit that over a connection that's already open. Developers don't necessarily need to pause and wait for a response before continuing work. So I can send a request, and then I can continue my event loop, waiting for other user interaction while I wait for that response to come back. So, event loops, along with mechanisms like promises and async and await, allow developers to choose when to pause execution to wait for a response or to just process it in the background.
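The "send a request, keep working, pick up the response later" pattern can be sketched with Python's asyncio, which plays the same role here as promises and async/await in JavaScript. The `fetch_profile` function is a hypothetical stand-in for a network call; the sleep only simulates latency.

```python
import asyncio

async def fetch_profile() -> str:
    """Hypothetical stand-in for a network request."""
    await asyncio.sleep(0.05)  # simulated latency
    return "profile loaded"

async def main() -> list:
    log = []
    # Start the request without waiting for it.
    pending = asyncio.create_task(fetch_profile())
    # Keep handling other work while the request is in flight.
    for i in range(3):
        log.append(f"handled UI event {i}")
        await asyncio.sleep(0)  # hand control back to the event loop
    # The developer chooses this point to pause until the response is ready.
    log.append(await pending)
    return log

result = asyncio.run(main())
print(result)
```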
3. Webhooks, Web Sockets, GraphQL, and gRPC
Webhooks and web sockets are commonly used for real-time applications, while GraphQL introduces streaming data and server-sent events. Asynchronous APIs have design complexities and challenges with data reconciliation, rate-limiting, and throttling. Combining the best of RESTful APIs and asynchronous APIs, gRPC offers long-lived connections, binary data approach, and improved performance. gRPC is built on HTTP version 2 and has four different data transfer mechanisms.
Most typically, we see webhooks used in a response mechanism. For example, someone pays an invoice with something like PayPal. PayPal can call a webhook that you specify that can take some other action like updating your CRM system. As mentioned earlier, web sockets are commonly used for more real-time applications where you need to hold that connection open and have a conversation. There may need to be some amount of state exchanged on connecting or reconnecting to resume where you last left off, such as, in a chat mechanism, which message did we see last time for real-time operations?
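As a rough sketch of the receiving side of a webhook, the function below takes the raw body of an incoming notification and updates a record. Everything specific here is an assumption made for illustration: the `invoice.paid` event type, the payload shape, and the in-memory `crm` dictionary do not describe PayPal's actual webhook format.

```python
import json

# Hypothetical in-memory stand-in for a CRM system.
crm = {"inv-1001": {"status": "unpaid"}}

def handle_webhook(body: bytes) -> int:
    """Process a payment notification and return an HTTP-style status code."""
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400
    if not isinstance(event, dict) or event.get("type") != "invoice.paid":
        return 204  # acknowledge events this handler does not act on
    invoice = crm.get(event.get("invoice_id"))
    if invoice is None:
        return 404
    invoice["status"] = "paid"
    return 200

status = handle_webhook(b'{"type": "invoice.paid", "invoice_id": "inv-1001"}')
print(status, crm["inv-1001"])
```

A real receiver would also verify that the call genuinely came from the sender (for example with a signature check), which this sketch leaves out.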
GraphQL also has something now for streaming data that they call subscriptions. It's a little bit like a publish-subscribe pattern, which we sometimes call PubSub. There are other choices here, including newer mechanisms called server-sent events, where servers can push data to a client, again because these asynchronous channels are held open for a long period of time. Asynchronous still has some problems, though. It can be harder to design and plan since you don't know which events are happening in any particular order. So you might need to reconcile your data on either end of the connection to ensure that everything was completed. Webhook mechanisms need to ensure that whatever system is calling it can actually route to the server, which is tricky when you're dealing with internal network mechanisms or API gateways and so on. There are other issues that you can run into with asynchronous, including rate-limiting and throttling data, which can be more challenging.
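One common way to handle the reconciliation problem is to number events and check for gaps on the receiving end. The sketch below assumes each event carries a `seq` field (an assumption for this example; the talk does not prescribe a scheme). It drops anything already seen, removes duplicates, restores order, and reports which sequence numbers are still missing.

```python
def reconcile(events: list, last_seen: int) -> tuple:
    """Order events by sequence number and report gaps after last_seen."""
    # Keying by seq drops duplicates; the filter drops events already processed.
    fresh = {e["seq"]: e for e in events if e["seq"] > last_seen}
    ordered = [fresh[s] for s in sorted(fresh)]
    highest = max(fresh, default=last_seen)
    missing = [s for s in range(last_seen + 1, highest + 1) if s not in fresh]
    return ordered, missing

# Events arrive out of order, with one duplicate and one gap (seq 4).
arrived = [
    {"seq": 5, "msg": "c"},
    {"seq": 3, "msg": "a"},
    {"seq": 6, "msg": "d"},
    {"seq": 3, "msg": "a"},
]
ordered, missing = reconcile(arrived, last_seen=2)
print([e["seq"] for e in ordered], missing)
```

The same `last_seen` value is the kind of state a chat client could send when reconnecting to resume where it left off.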
So I'm going to pause on this screen for a moment if you want to look at some primary differences between RESTful APIs and async APIs as far as data transfer mechanisms and design complexity and so on. Async APIs, traditionally WebSockets, are going to be mostly built on HTTP version two. Webhooks may still be acting like RESTful APIs and still using HTTP 1.1. But in asynchronous APIs, when it comes to getting that response, the client provides a means for the server to respond later, or for the client to get interrupted later, when an event has happened. But that does increase your design complexity. And the order that your requests and responses are coming back in is not always predictable, and there might need to be some reconciliation later on.
So let's talk about API performance and some things to take into consideration if we want to take the best things out of RESTful APIs and asynchronous APIs. We could, of course, just scale our hardware and add more systems to handle load. And that will help a little bit on the performance side, but it raises other concerns and problems, especially with the cost of cloud computing providers. So how can we combine the best of the API architectures that we've mentioned so far, like controlling our data, but also handling things like streamed data or having long-lived connections? What if there was an API type that gave us the flexibility of a single request and response like REST, or the ability to stream data in either direction, client to server, server to client, or both, and also allowed those connections to stay open for long periods of time, while giving you a major speed boost and minimizing the amount of data you send? So taking the best of all of these things together, we can look at gRPC. So gRPC was developed by Google and it's built on top of HTTP version 2. This allows for those long-lived connections, similar to WebSockets, and it utilizes a binary-only data approach by default. It's still a work in progress. There aren't too many public-facing gRPC APIs out there. But we are seeing a lot of growth. Another area of growth that we're seeing is a JavaScript library called gRPC web, which will allow you to connect to gRPC servers from web applications, where gRPC was primarily introduced for server-to-server microservice communication. gRPC has four different data transfer mechanisms.
4. gRPC Performance and Integration
gRPC is a performant alternative to RESTful APIs and WebSockets. It offers built-in compression and allows for streaming data in both directions. The gRPC web library has been available since 2018 and provides examples for implementing gRPC in various frameworks. With statically typed data structures and automatic data validation, gRPC simplifies development. While it has traditionally been used for mobile apps and server-to-server communication, the gRPC web library allows for front-end integration. By using protocol buffers, gRPC achieves a major performance boost and enforces data types. Implementing a RESTful API on top of HTTP2 is possible, but GraphQL offers a combination of REST and gRPC with JSON payloads and a PubSub pattern for asynchronous communication.
The first is very much like a RESTful API: a single request, a single response, and then a disconnect. The others are a combination of either the client wanting to stream a whole bunch of data to the server, or the server streaming a bunch of data to the client, or streaming in both directions. The idea of streaming here is that you can send multiple requests or responses at any time as events occur on either end of the connection.
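The four shapes can be modeled with plain Python functions and generators. This sketch deliberately does not use the gRPC library and makes no network calls; it only shows how the request and response sides differ in each mode, and the function names are made up for this illustration.

```python
from typing import Iterable, Iterator

def unary(request: int) -> int:
    """One request, one response (the REST-like case)."""
    return request * 2

def server_streaming(request: int) -> Iterator[int]:
    """One request, a stream of responses."""
    for i in range(request):
        yield i

def client_streaming(requests: Iterable[int]) -> int:
    """A stream of requests, one response once the stream ends."""
    return sum(requests)

def bidirectional(requests: Iterable[int]) -> Iterator[int]:
    """A stream of requests and a stream of responses, interleaved."""
    for request in requests:
        yield request * 2

print(unary(21))
print(list(server_streaming(3)))
print(client_streaming(iter([1, 2, 3])))
print(list(bidirectional(iter([1, 2, 3]))))
```

In an actual gRPC service these shapes are declared in the service definition, and the streams travel over a single long-lived connection rather than in-process iterators.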
Because of all of this, along with some built-in compression, gRPC is typically seven to ten times more performant than RESTful APIs, mostly because of HTTP version 2, which allows a lot of communication back and forth between the client and server before disconnecting.
So you might be saying, well, gRPC is only for system level or from a mobile device to a server. But again, this gRPC web library was actually introduced a little while ago; it was put into general availability five years ago, in 2018. The library's had a lot of work on it, it's quite robust, and there are several examples out there for implementing gRPC in frameworks like React and Vue, WebAssembly and Blazor, and more. Collectively, gRPC has everything we need to potentially replace REST and WebSockets on front-end applications with a single architecture, and the data structures are statically typed. So if the client library unpacks it, you can trust that the data validation has already happened; you don't need to validate, is this attribute actually a string, is this attribute actually a number. If the gRPC client is able to unpack that data payload, it's already done that validation for you. And if it can't, it sends an error back to whatever sent that message saying, I can't process that.
This is all pretty new though. And we need more people to start using this technology and help with documentation and examples and start making real applications for this gRPC web process to really kind of take hold. So again, I'm going to pause here for a moment and you can see some of the differences between RESTful APIs and gRPC APIs. gRPC APIs have a lot of commonality with async APIs, but they do give you this extra performance boost. So again, it's built on top of HTTP version two. Getting a response, you can make it synchronous like REST. You can make it asynchronous like webhooks. But the design complexity is usually much higher. It doesn't allow much flexibility in what you send in the message. It's a very rigid message format. And as we saw between REST and RESTful, some developers want more of that flexibility. Protocol buffers, the binary format that you send these messages in, do have some data types that make provision for this and allow different types of data to be sent, or you can just send things as long strings and do your own encoding and decoding.
The typical use case of gRPC has traditionally been mobile apps and server-to-server communication for things like event logging and so on, or for mobile games, but with the gRPC web library, there's a lot of flexibility here to now introduce this as something to work with in the front end as well. Again, because that data structure is a binary format using protocol buffers, or protobufs as you might hear them referred to, you get a major performance boost and that data validation is already taken care of, which simplifies your code.
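To give a feel for why the binary format is compact, here is a hand-rolled Python sketch of how the protocol buffers wire format encodes an integer field and a string field, compared with the same data as JSON. It is for illustration only: real code would use generated protobuf classes, and the message layout (field 1 as an integer id, field 2 as a string name) is an assumption for this example.

```python
import json

def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag, then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, byte length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data

# Hypothetical message: field 1 is an integer id, field 2 is a string name.
binary = encode_int_field(1, 150) + encode_string_field(2, "testing")
as_json = json.dumps({"id": 150, "name": "testing"}).encode("utf-8")
print(binary.hex(), len(binary), len(as_json))
```

The field names never appear on the wire, only field numbers and types, which is where much of the size saving comes from and also why both sides need the same message definition.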
The question here then is, well, couldn't we just implement a RESTful API on top of HTTP2 and gain some of those benefits that we saw with WebSockets, gRPC, and webhooks? It's like, yeah, you could if you want to build that. GraphQL is a bit of REST and gRPC with JSON payloads, where you can tell GraphQL, I want to go call this instruction and I want these fields to come back, and with their new subscribe process, it's a little bit asynchronous as well, with that PubSub pattern where you can subscribe to different channels and get events back from the server as they happen. Could we use a binary format in a RESTful API that enforces data types? Absolutely. You can actually use protocol buffers inside a RESTful API if you want to. You just have to make sure that every client that connects knows how to unpack the protobuf message format and can also deal with the rigidity of every message having to be sent in that format.
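The PubSub pattern mentioned here can be sketched in a few lines of Python. This is an in-process toy, not GraphQL subscriptions themselves: the `Broker` class and the channel names are made up for illustration, and a real system would deliver events over an open connection rather than through direct callbacks.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """In-process publish-subscribe: each channel maps to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, event: dict) -> int:
        """Deliver the event to every subscriber and return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)

broker = Broker()
received = []
broker.subscribe("comments", received.append)
delivered = broker.publish("comments", {"id": 1, "text": "first"})
ignored = broker.publish("likes", {"id": 2})  # nobody subscribed to this channel
print(received, delivered, ignored)
```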
5. API Architecture Considerations
In the protobuf, you can specify optional fields. gRPC has a built-in discovery mechanism for API message payloads and instructions. Implementing event-driven architecture using REST is possible but limited. HTTP version 3, based on QUIC, offers improved performance. Consider team familiarity and infrastructure when choosing an API architecture.
Now in the protobuf, you can specify that some fields are optional. So they don't have to be there. And again, you can do your own encoding and decoding, but that might have other implications on the efficiency and the performance of your software.
The nice thing about gRPC is that it actually has a discovery mechanism built in that allows clients to learn about the API message payloads as well as the different kinds of instructions that can be called. Now typically this introspection is actually turned off in production, but while you're in development mode or in staging mode, you could turn on this introspection and be able to collect data from the server as far as what's possible, which methods can I call? What are these data formats? How do I send this data?
Another question is could we just implement something like event driven architecture using REST? Kind of, but it would still be a single request response cycle because of HTTP version one and mimicking a stream of incoming data is technically possible. That's how we've managed streaming content for a few decades now, but it tends to be built more on UDP for streaming mechanisms for things like video and audio. Part of HTTP version two is that your data is encrypted by default. With HTTP one, you kind of have to build in your own encryption or rely on things like TLS. So you could add other layers of encryption on that data as well. In fact, we encourage everybody, please encrypt your data.
By now, though, building all of these things and taking all these things into consideration, you've just increased the complexity of your API design quite a lot. You've made your deployment and your management way more complex, and you've likely made your infrastructure more complex. So is that really worth it? Choosing an API architecture is a tricky balance when it comes to performance. But what does performance actually mean? Think about the evolution between HTTP version 1 and version 2 and the improvements that were made there, with the long-lived connections and the binary formats and so on. Looking ahead to what HTTP version 3 performance is going to be is actually going to be pretty interesting.
So let's talk about HTTP version 3 for a moment. It's actually based on a new protocol called QUIC, Q-U-I-C, and it's only built on UDP. It's not going to be built on TCP. This is going to reduce some of the communication chatter, because for every packet sent, you don't have to wait for a response to come back. It's going to have some new session management built into it to mimic stateless calls while still retaining familiar HTTP methods, headers, and status code systems that we've been using for quite a long time now. It's actually been supported now since 2020 and 2021 in most major browsers. In fact, most client libraries are fully supporting HTTP 3 already. The last I checked, the only one that wasn't supporting it was the Curl library, which is still a pretty major library.
So when it comes to performance, the other thing to think about is how quickly can the team perform on this? So I've got some notes on this slide that basically say I'm a fan of teams learning something new, but if your company needs to get something out quickly, are these technologies something that your team already knows? Or is your team going to have to go learn this? Does your DevOps team already know how to build out this kind of infrastructure to support HTTP 2 or HTTP 3? So when we think about performance, we often think of the software performance. But what about the performance of the team and the amount of time it takes the team to build what they need to build? Is that infrastructure already in place? Do they already know how to set this stuff up? Or is there an easier migration path for performance rather than a total rewrite into some new architecture? So these are all very interesting things to think about when we think about different kinds of API architectures: the history of REST, moving into asynchronous and event-driven APIs when we talk about things like WebSockets and webhooks, but then also considering the performance of things like gRPC.
So I hope this background was helpful. I'm happy to answer any questions that you have. My username on most social platforms, including LinkedIn and Twitter, is iandouglas736. If you'd like to reach out and ask any further questions, the QR code that you see on this slide will also get you to all of my contact information. So thanks again very much to everyone at Test.js for having me today, and feel free to reach out with questions.