Learn how running your TS in Temporal's runtime gets you extreme reliability
This talk has been presented at TypeScript Congress 2022, check out the latest edition of this JavaScript Conference.
Lauren's talk focuses on building distributed systems in TypeScript and improving the reliability of an e-commerce store's order processing using Temporal.
Temporal is a durable code execution framework that runs your code in workers and ensures it can be resumed in the exact same state if anything goes wrong. It allows developers to write code that is oblivious to faults.
The key steps are: 1) Reserving inventory with the inventory service, 2) Charging the payment service, 3) Sending the package via the fulfillment service. If any step fails, appropriate compensation actions like unreserving inventory or issuing refunds are taken.
Temporal allows you to define retry policies with parameters like max attempts and timeout intervals. If a service call fails, Temporal retries it with exponential backoff. It also persists the state of each step so that if a process crashes, it can resume from the last successful step.
Workers in Temporal's architecture are responsible for executing tasks. They pull tasks from the Temporal server and report back upon completion. If a worker fails, another worker can pick up where it left off, ensuring durability and fault tolerance.
Lauren aims to simplify the process of making an e-commerce order system reliable by handling retries, state persistence, and failure compensation automatically using Temporal, rather than manually coding these aspects.
A durable store ensures that the state of each step in a process is saved persistently. This allows the system to resume operations from the last successful state in case of crashes or other failures, ensuring data consistency and reliability.
Temporal abstracts away the complexity of handling retries, state persistence, and failures. This allows developers to focus on business logic rather than fault tolerance and reliability concerns, making the development process simpler and more efficient.
You can learn more about Temporal's TypeScript SDK by visiting temporal.io/ts. For further questions, you can email Lauren at [email protected] or reach out on Twitter at @laurenDSR.
Idempotency ensures that retrying a failed operation does not result in unintended side effects, like double charging a user. It allows the system to handle retries gracefully without causing inconsistencies.
This is how to build distributed systems in TypeScript. We will work on an e-commerce store and implement a create order endpoint. We will go through all the failure modes and try to make it more reliable. We reserve the inventory, charge the payment, and send the package. If any step fails, we handle it accordingly and consider retrying.
Hi, folks. This is how to build distributed systems in TypeScript. My name is Lauren, and I wrote a book on GraphQL called the GraphQL Guide, and currently I am working as a language runtime engineer at Temporal working on our TypeScript SDK.
So when working in a monolith, we might not have very many distributed systems concerns to worry about. If we just have a single app server layer and a database that does transactions, then we might just be getting a request from a user, doing a single operation, and if that fails, we say to our user, try again. If we're talking to external APIs and they're an essential part of the business application, like I talk to Stripe and say charge this, and then I update my database, then I have some more problems. I might not be able to reach Stripe, so I need to retry that. When I retry, I need to do so idempotently, so that I'm not double charging the user. And there are cases in which my database might be out of sync with Stripe's understanding of the world. If Stripe returns success and I can't reach the database and that operation fails, or my process dies before I get to that step, then my data is inconsistent.
In this talk, we will be working on an e-commerce store and implementing a create order endpoint. We will go through all the failure modes and try to make it more reliable. We won't get to full reliability, but we'll see how much simpler we can make it if we write it in Temporal. To get started, we have a serverless function hosted on Vercel, and we get out of the body the item ID, the quantity, and the address to ship to, and we get the user ID out of the jar. And then we take three steps whenever we have a new order. We reserve the inventory with the inventory service. Then we talk to the payment service to charge, and then we talk to the fulfillment service to send the package. And we respond success.
So in a successful case, this logic works fine, but there are a number of things that can go wrong. If we fail to reserve, then we don't want to continue on to charging and sending packages. If we successfully reserve and then fail to charge, then we not only don't want to send the package, but we also want to unreserve from the inventory so that someone else can buy those items. And lastly, if reserving and charging are successful and we fail to send the package, then we want to refund the charge and unreserve. So let's see what that logic looks like. Going down to here, reserve: if the reservation failed, then we say status 400. Now, when we charge and it fails, we want to talk to the inventory service again and say unreserve this amount of this item. And then finally, send package: if that fails, then we want to first refund, then unreserve. Right now, when we have a failure, we send status 400 to the client. But ideally, we would be retrying each of these steps in case the failure was transient, like a network error or the service being temporarily down. If we retry a second time, maybe it'll go through. So let's add retrying to each of these steps.
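The flow described above could look roughly like the following. This is a minimal sketch, not the code from the talk: the `OrderServices` interface, the method signatures, and the `createOrder` helper that returns a status code are all assumptions made for illustration. The only detail taken from the talk is that a failed call is reported through a failure property on the response.

```typescript
// Hypothetical shape of a service response: a failure is signaled by a
// `failure` property rather than by a thrown error.
type StepResult = { failure?: string };

// Hypothetical service clients; the real signatures are not shown in the talk.
interface OrderServices {
  reserve(itemId: string, quantity: number): Promise<StepResult>;
  unreserve(itemId: string, quantity: number): Promise<StepResult>;
  charge(userId: string, itemId: string, quantity: number): Promise<StepResult>;
  refund(userId: string, itemId: string, quantity: number): Promise<StepResult>;
  sendPackage(address: string, itemId: string, quantity: number): Promise<StepResult>;
}

interface Order {
  userId: string;
  itemId: string;
  quantity: number;
  address: string;
}

// Returns the HTTP status the endpoint would respond with.
async function createOrder(services: OrderServices, order: Order): Promise<number> {
  const { userId, itemId, quantity, address } = order;

  // Step 1: nothing to undo if the reservation itself fails.
  const reserved = await services.reserve(itemId, quantity);
  if (reserved.failure) return 400;

  // Step 2: a failed charge means the reserved inventory must be released.
  const charged = await services.charge(userId, itemId, quantity);
  if (charged.failure) {
    await services.unreserve(itemId, quantity);
    return 400;
  }

  // Step 3: a failed shipment means refunding first, then releasing inventory.
  const sent = await services.sendPackage(address, itemId, quantity);
  if (sent.failure) {
    await services.refund(userId, itemId, quantity);
    await services.unreserve(itemId, quantity);
    return 400;
  }

  return 200;
}
```

Note that the compensation calls here are not checked or retried themselves, which is one of the gaps the rest of the talk goes on to address.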
We have a retry function that wraps service calls and retries them with exponential backoff. We need an idempotency token to prevent duplicate calls. We also retry the failure-handling calls and catch thrown errors instead of only checking the response object.
We have a retry function that takes parameters: max attempts, a timeout, and an interval. For each of these attempts, it tries calling the service, racing that call against the timeout. So if 30 seconds pass and we haven't heard back from the service, then we retry. And each time we retry, we are doing exponential backoff. And here we're wrapping each of these service calls in the retry function.
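A retry helper along these lines might be sketched as follows. The option names (`maxAttempts`, `timeoutMs`, `initialIntervalMs`) and the doubling factor are assumptions for illustration; the talk only says that each attempt races against a timeout and that the wait between attempts grows exponentially.

```typescript
// Hypothetical option names; the talk mentions max attempts, a timeout,
// and an interval but does not show the exact signature.
interface RetryOptions {
  maxAttempts: number;
  timeoutMs: number;
  initialIntervalMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retry<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  let lastError: unknown;

  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    // A promise that rejects once the per-attempt timeout elapses.
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("attempt timed out")), opts.timeoutMs);
    });

    try {
      // Whichever settles first wins: the service call or the timeout.
      return await Promise.race([fn(), timeout]);
    } catch (err) {
      lastError = err;
    } finally {
      clearTimeout(timer);
    }

    // Exponential backoff: wait 1x, 2x, 4x, ... the initial interval,
    // but do not wait after the final attempt.
    if (attempt < opts.maxAttempts - 1) {
      await sleep(opts.initialIntervalMs * 2 ** attempt);
    }
  }

  throw lastError;
}
```

A call such as `retry(() => services.charge(userId, itemId, quantity), { maxAttempts: 5, timeoutMs: 30_000, initialIntervalMs: 1_000 })` would then wrap a single service call, where `services.charge` stands in for whatever client the endpoint uses. One limitation of this sketch is that a timed-out call is abandoned rather than cancelled, so it may still complete on the service side.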
One issue we have now that we are retrying is that we might have multiple successful calls, so we need some kind of idempotency token so that the service knows not to perform the operation a second time. And the best way to do this is to get it from the client. So we'll add that: get a request ID out of the body and pass it to each of the services.
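To illustrate what the receiving service does with such a token, here is a minimal in-memory sketch. The `PaymentService` class, its `charge` signature, and the charge ID format are hypothetical; a real payment service would keep the request IDs in durable storage rather than in a `Map`.

```typescript
interface ChargeReceipt {
  chargeId: string;
  userId: string;
  amountCents: number;
}

// Hypothetical service that deduplicates charges by request ID.
class PaymentService {
  // Maps each request ID to the receipt produced the first time it was seen.
  private readonly receiptsByRequestId = new Map<string, ChargeReceipt>();

  charge(requestId: string, userId: string, amountCents: number): ChargeReceipt {
    const existing = this.receiptsByRequestId.get(requestId);
    if (existing) {
      // A retry of a request that already succeeded: return the original
      // receipt instead of charging again.
      return existing;
    }

    const receipt: ChargeReceipt = {
      chargeId: `ch_${this.receiptsByRequestId.size + 1}`,
      userId,
      amountCents,
    };
    this.receiptsByRequestId.set(requestId, receipt);
    return receipt;
  }

  // Number of charges actually performed.
  get totalCharges(): number {
    return this.receiptsByRequestId.size;
  }
}

// The same request ID is sent twice, as would happen when a retry follows
// a response that was lost on the way back to the caller.
const payments = new PaymentService();
const first = payments.charge("req-123", "user-1", 2500);
const second = payments.charge("req-123", "user-1", 2500);
```

Here `second` is the same receipt as `first`, and the user is charged only once. Generating the request ID on the client matters because a token created inside the endpoint would change if the client itself retried the whole request.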
We also want to retry these failure calls. For example, if the refund throws, then we'll neither refund nor unreserve, so the customer will wind up still charged and the inventory still taken. So let's add those: we're wrapping unreserve with the retry, and also refund. Also, speaking of throwing, we're not actually catching thrown errors; we're just checking whether the response object had a failure property. So let's also catch these errors.
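One way to handle both kinds of failure uniformly is to turn a failure property into a thrown error, so that a retry wrapper and a single `try`/`catch` cover both cases. The sketch below shows this for the charge step only. The `orThrow` and `chargeOrRelease` names, the `Retry` type, and the choice to return 500 when the compensation itself fails are all assumptions for illustration, not the talk's code.

```typescript
type ServiceResult = { failure?: string };

// Any retry wrapper with this shape can be passed in (hypothetical type).
type Retry = <T>(fn: () => Promise<T>) => Promise<T>;

// Converts a `failure` property into a thrown error so that thrown errors
// and failure responses are handled the same way.
async function orThrow(step: string, call: () => Promise<ServiceResult>): Promise<void> {
  const result = await call();
  if (result.failure) {
    throw new Error(`${step} failed: ${result.failure}`);
  }
}

// Charges the customer; if charging fails after all retries, releases the
// reserved inventory, retrying that compensation as well.
async function chargeOrRelease(
  retry: Retry,
  charge: () => Promise<ServiceResult>,
  unreserve: () => Promise<ServiceResult>,
): Promise<number> {
  try {
    await retry(() => orThrow("charge", charge));
    return 200;
  } catch {
    try {
      await retry(() => orThrow("unreserve", unreserve));
    } catch {
      // Assumption: if the compensation also fails, report a server error
      // so the inconsistency is at least visible to the caller.
      return 500;
    }
    return 400;
  }
}
```

Even with this in place, the inventory stays reserved if the process dies between the failed charge and the unreserve call, which is the kind of gap that persisting the state of each step is meant to close.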