Push Notifications: Can’t Live With Em, Can’t Live Without Em

Consider how many notifications you get per day... and now consider the millions of other people who are also receiving notifications. 16 million notifications a day that have places to be and people to see, in a race against time, load, and latency.

So what happens behind the scenes to ensure that all those notifications get where they need to go, and quickly? A combination of auto-scaling, RabbitMQ, scrupulous monitoring, and a tireless dev team. In this talk I’ll discuss the functionality of the message bus at the core of Vonage’s communications platform, the Node scripts that power the entire operation, and how you can use similar solutions for a number of different challenges.

This talk was presented at Node Congress 2021.

FAQ

The Vonage Business Communications platform offers capabilities for voice, messaging, video, and voicemail on both desktop and mobile devices. It serves 150,000 users and handles 800 notifications per second, ensuring that all notifications reach their destination in 15 milliseconds.

The Vonage messaging infrastructure, referred to as the bus station, connects your client (e.g., laptop or mobile app) to the messaging system. It uses an API called Frizzle to manage connections, store identifying information, and communicate with the message broker to route messages to the correct queue.

Frizzle is an API within the Vonage messaging infrastructure. It handles new connections, stores identifying information, and works with the message broker to create queues for users and route messages to the appropriate destinations.

When you use the Vonage mobile app and allow notifications, your phone connects to the bus station and registers your device along with a push token. The notification is then sent to the bus HTTP service, which forwards it to PushMe, a service that maps push tokens to users and sends the notification to the appropriate device.

CLSHooked is a continuation local storage package that uses async hooks to maintain local storage for each request session. In the Vonage infrastructure, it is used to save and retrieve trace IDs during the flow of a service, ensuring that logs and requests can be accurately traced throughout the system.

Vonage ensures notifications are delivered within 15 milliseconds by using a well-coordinated messaging infrastructure. The system includes APIs like Frizzle, message brokers, WebSocket connections, and services like PushMe and the bus HTTP service, all working together to route and deliver notifications quickly and efficiently.

The trace ID is used to track notifications throughout the Vonage system. It is added to logs and requests at every step of the notification journey, ensuring that each notification can be traced accurately from the sender to the receiver, thus helping in troubleshooting and ensuring reliable delivery.

If a notification sent by Vonage is not received on a user's device, the trace ID in the logs can be used to track the notification's journey. If all steps are verified and the notification was sent to the app store, the issue is likely with the app store or the device's operating system, such as Apple's notification delivery system.

PushMe is a service within the Vonage notification system responsible for managing push notifications. It maps push tokens to users and sends notifications to the appropriate device based on the user's operating system and device information.

For desktop apps, Vonage uses WebSocket connections to maintain a continuous link for notifications. For mobile apps, where continuous WebSocket connections are not feasible, push notifications are used. The mobile app registers with a push token, and notifications are routed through the bus HTTP service and PushMe to ensure delivery.

Avital Tzubeli
9 min
24 Jun, 2021

Video Summary and Transcription
The Talk explores the journey of a notification in a communications platform, highlighting the challenges of infrastructure engineering. Trace IDs and local storage play a crucial role in ensuring the arrival of notifications, allowing for easy debugging if they don't reach the device. The logs demonstrate the journey of a notification, reaching the app store in just 4 milliseconds.

1. The Journey of a Notification

Short description:

When I was growing up, I loved a children's book about a louse who travels the world and finds a perfect match. This story reminded me of the journey of a notification in our communications platform, which must arrive in milliseconds. Let's explore the real-life challenges of infrastructure engineering and the steps involved in delivering notifications to different devices, including desktop and mobile apps.

When I was growing up in the USA to Israeli parents, most of my books and movies from an early age were actually in Hebrew, of course, to teach me the language before the English took over my brain.

One tape that I particularly loved was called Hakina Nekhama, Nekhama the Louse, which if you're not familiar with this word, is the singular version of lice. Gross, I know.

The story told of this louse who decided that she doesn't want to stay in one head forever. She wants to get out there and travel the world, see different heads and different cities. But obviously everybody hates her and wants her gone, and she journeys tirelessly from one head to another until she accidentally lands on the head of a bald man, who is actually very excited to have her because now he's got the same problem as people with hair. They become friends and live happily ever after.

Now this is obviously a ridiculous story. But when I joined Vonage last year and learned about our communications platform and how many messages it handles per day, I considered the one-in-a-million little notification making its rapid-fire journey of 5, 10, just a few milliseconds to arrive where it needs to be. It suddenly made me think of that little pink louse I loved as a kid, one in a million, who with purpose and ambition eventually made it to the head of the bald man who wanted her, her perfect match, precisely where she needed to be. Miraculous, isn't it?

Here's the thing, though: in the world of children's books, it's happily ever after, but in real life, things don't go as planned. Objects get lost, and notifications never make it to their destination. So let's talk about the real life of infrastructure engineering, of tracing our steps and ensuring that we know what's going on at every point of the dangerous journey. But first, a little context.

The Vonage Business Communications platform, the system I'll talk about today, offers capabilities for voice, messaging, video, and voicemail on both desktop and mobile. The platform is home to 150,000 users who through all these functionalities produce 800 notifications per second. And the kicker? All those notifications get to where they need to be in 15 milliseconds, because that's the kind of standards we're used to nowadays. So what does this 15 millisecond journey of a cute little notification look like? Let's start at the beginning.

Say you're on your computer using the desktop app. Your laptop, through the client, connects to the messaging infrastructure that we call the bus station and makes an introduction: "Hey, I belong to John's computer. I'd like to sign up for notifications." The bus station pings its API, which we named Frizzle because the children's book references never end, to tell it about the new connection. All the identifying information gets stored. Frizzle communicates with the message broker, which creates a new queue for your particular user and, with the help of the message protocol, determines which queue to put your messages in. The protocol returns a URL, and thus a WebSocket connection is formed between your client and this entire thing that we call the bus.

Now on the other end, a message gets sent to you. It hits the Frizzle API belonging to the bus, Frizzle sends the notification to RabbitMQ, and the message protocol sends it to your device. Cool.
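
The desktop flow above can be modelled in a few lines. This is only an in-memory sketch to make the moving parts concrete: the `BusStation` class and its `register`, `publish`, and `flush` methods are hypothetical names, a plain array stands in for the per-user broker queue, and an object with a `send` function stands in for the WebSocket connection.

```javascript
// Minimal in-memory model of the desktop flow. Every name here is
// hypothetical; a real system would use a message broker and WebSockets.
class BusStation {
  constructor() {
    // userId -> { queue: pending messages, sockets: live connections }
    this.users = new Map();
  }

  // Called when a client introduces itself and asks for notifications.
  register(userId, socket) {
    const entry = this.users.get(userId) ?? { queue: [], sockets: new Set() };
    entry.sockets.add(socket);
    this.users.set(userId, entry);
    this.flush(userId); // deliver anything queued before the client connected
    return () => entry.sockets.delete(socket); // handle for disconnecting
  }

  // Called when a message addressed to userId reaches the API.
  publish(userId, message) {
    const entry = this.users.get(userId) ?? { queue: [], sockets: new Set() };
    this.users.set(userId, entry);
    entry.queue.push(message);
    this.flush(userId);
  }

  // Drain the user's queue to every connected socket, if there is one.
  flush(userId) {
    const entry = this.users.get(userId);
    if (!entry || entry.sockets.size === 0) return;
    while (entry.queue.length > 0) {
      const message = entry.queue.shift();
      for (const socket of entry.sockets) socket.send(message);
    }
  }
}

// Usage: a fake socket that records what it receives.
const received = [];
const bus = new BusStation();
bus.publish("john", { type: "chat", text: "sent before John connected" });
bus.register("john", { send: (message) => received.push(message) });
bus.publish("john", { type: "chat", text: "hello" });
console.log(received.map((message) => message.text));
```

Keeping one queue per user means a message that arrives before the client connects is held rather than dropped, which is the main reason to put a queue between the API and the socket.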

But you're wondering now, what about cell phones? The mobile app? I can't exactly maintain a WebSocket connection with the phone at all times, can I? So what about those notorious push notifications from the title of my talk? And more importantly, how am I following the progress of those notifications? Because with 800 notifications per second, it can be very easy to lose track.

Well, when you fire up your mobile app, a similar flow happens. Your phone connects to the bus station and introduces itself with your ID, information about your device, and something called a push token, assuming you clicked Allow Notifications in the app. The API handles and stores away this information, and you are now signed up to receive notifications. Now, when Frizzle sends the notification to your queue, that notification also gets sent in parallel to what we call the bus HTTP service. Written in Node, the service knows whether you've signed up to receive push notifications.
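
As a rough illustration of the mobile flow, the sketch below registers a device with a push token and then sends one notification down two paths at once. The function names (`registerMobile`, `isSignedUpForPush`, `dispatch`) and the data shapes are assumptions made for this example, and the two transports are injected fakes rather than real network calls.

```javascript
// Hypothetical store of users who allowed push notifications on some device.
const pushDevices = new Map(); // userId -> array of { deviceId, pushToken }

// Called when the mobile app introduces itself to the bus station.
function registerMobile(userId, { deviceId, pushToken }) {
  if (!pushToken) return false; // the user did not allow notifications
  const devices = pushDevices.get(userId) ?? [];
  devices.push({ deviceId, pushToken });
  pushDevices.set(userId, devices);
  return true;
}

// Stand-in for the HTTP service's check: is this user signed up for push?
function isSignedUpForPush(userId) {
  return (pushDevices.get(userId) ?? []).length > 0;
}

// Send one notification down both paths at the same time. `toQueue` and
// `toHttpService` are injected async functions, so no network is needed here.
function dispatch(notification, { toQueue, toHttpService }) {
  return Promise.allSettled([
    toQueue(notification),
    toHttpService(notification),
  ]);
}

// Usage with fake transports that record what they were given.
registerMobile("john", { deviceId: "phone-1", pushToken: "tok-123" });
const pushed = [];
const outcome = dispatch(
  { userId: "john", text: "Incoming call" },
  {
    toQueue: async () => "queued",
    toHttpService: async (notification) => {
      if (isSignedUpForPush(notification.userId)) pushed.push(notification);
      return "handled";
    },
  }
);
outcome.then((results) => console.log(results.map((result) => result.status)));
```

`Promise.allSettled` is used so that a failure on one path does not hide the result of the other; whether the real system treats the two paths that way is not stated in the talk.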

2. The Role of Trace IDs and Local Storage

Short description:

If you have a notification and want to guarantee its arrival, trace IDs in the logs play a crucial role. By using a continuation local storage package such as cls-hooked, you can save the trace ID in local storage, making it accessible for any log or request at any point in the flow. In the HTTP service, middleware grabs the trace ID from the headers and adds it to the session's local storage. The trace ID is then included in logs and requests, so the logger doesn't need to know about it. Finally, when the notification is sent to PushMe, the trace ID is added to the request headers, allowing for easy debugging if the notification doesn't reach the device. The logs demonstrate the journey of a notification, reaching the app store in just 4 milliseconds.

If you have, the notification and your information get sent to PushMe, another Node service for push notifications. This service has a database of those push tokens mapped to each user, which tells it which operating system and which device to send that push notification to. Now with an infrastructure in place, how can we guarantee arrival for 16 million daily notifications? The obvious answer is the trace IDs in the logs, ensuring that the same ID is always used for the specific notification in each service, right? But how do we do that while separating business logic from infrastructure? We don't want a trace ID showing up all over our code now, do we? A handy npm package comes to the rescue.
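
The token lookup described for PushMe might look something like the following. This is a sketch under stated assumptions, not PushMe's actual code: the `tokensByUser` map, the `providers` table, and `pushToUser` are hypothetical, and the provider functions only record what they would send instead of calling a platform push gateway.

```javascript
// Hypothetical token store: each user maps to the devices that registered.
const tokensByUser = new Map([
  ["john", [
    { os: "ios", pushToken: "token-ios-1" },
    { os: "android", pushToken: "token-android-1" },
  ]],
]);

// Stand-ins for the platform push gateways; they record instead of sending.
const sent = [];
const providers = {
  ios: (token, payload) => sent.push({ via: "ios", token, payload }),
  android: (token, payload) => sent.push({ via: "android", token, payload }),
};

// Look up every device for the user and hand the payload to the provider
// that matches the device's operating system.
function pushToUser(userId, payload) {
  const devices = tokensByUser.get(userId) ?? [];
  let handedOff = 0;
  for (const { os, pushToken } of devices) {
    const send = providers[os];
    if (!send) continue; // unknown platform: skip rather than guess
    send(pushToken, payload);
    handedOff += 1;
  }
  return handedOff;
}

// Usage: one notification fans out to both of John's devices.
const count = pushToUser("john", { title: "Incoming call" });
console.log(count, sent.map((entry) => entry.via));
```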

And this part's important even if you're not trying to build a massive communications platform. I'm talking about a continuation local storage package, cls-hooked, which uses async hooks to maintain something like local storage for each request session. If you're already using Node 14, though, this functionality is built in as an experimental native API. It allows the trace ID to be saved in local storage so that it can be retrieved for any log and any request at any point of the flow.
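
To make the idea concrete, here is a minimal sketch using `AsyncLocalStorage` from Node's built-in `async_hooks` module, which is the native API available in Node 14. It is swapped in for cls-hooked here only so the example runs without installing anything; the helper names (`traceStore`, `currentTraceId`, `withTrace`, `handleNotification`) are made up for illustration.

```javascript
const { AsyncLocalStorage } = require("async_hooks");

// One storage instance shared by the whole service.
const traceStore = new AsyncLocalStorage();

// Read the trace ID of whichever request is currently executing.
function currentTraceId() {
  const store = traceStore.getStore();
  return store ? store.traceId : undefined;
}

// Business logic: note that no trace ID is passed in as an argument.
async function handleNotification() {
  await new Promise((resolve) => setImmediate(resolve)); // simulate async work
  return currentTraceId();
}

// Run each "request" inside its own storage context.
function withTrace(traceId, fn) {
  return traceStore.run({ traceId }, fn);
}

// Two overlapping requests; each one sees only its own trace ID, even
// after the asynchronous hop inside handleNotification.
const results = Promise.all([
  withTrace("trace-a", handleNotification),
  withTrace("trace-b", handleNotification),
]);
results.then((ids) => console.log(ids));
```

The point of the pattern is that `handleNotification` never receives the trace ID as a parameter, so the ID stays out of business logic while remaining available to anything that runs on behalf of that request.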

So, let's consider this part of the diagram. Imagine a push notification headed for your phone telling you that you got a message from Jonathan. As we know, Frizzle sends this notification as a request to the bus HTTP service. Upon arriving at the HTTP service, middleware grabs the trace ID from its headers and places it in the session's local storage. Various logs are written as different things happen, like asserting a rifle, for example, and finding information about the user's device. And then each time, that trace ID is added to those logs before they're written. Assuming you're accepting push notifications, the interceptor grabs the notification right before it heads to PushMe and tacks the trace ID onto the headers of the request.

What's cool is that cls-hooked can be used for other things, not just logs, like in a situation where you need the user's details in every step throughout the flow of your service. But let's see what it looks like in our code, which has all been taken from the HTTP service. We first create an instance of the middleware like this, and we retrieve it later like this.

When a request arrives at the service, it is intercepted by the tracing middleware, which checks whether it has a trace ID (assuming it arrived from Frizzle, it should) and then saves that ID to the local storage instance. And at this point we're handling business logic, right? We're checking if we should send this guy a push notification. At each point of the way where a log is to be written, and before it gets sent, it will be caught by the middleware here, which will add the trace ID by taking it from the local storage and then writing the log.

Now the same interceptor sits on Axios, the HTTP client, so that every time a request is about to go to another service, we catch that HTTP request right before and add the trace ID to its header. And all of this ensures that the logger doesn't need to know anything about the trace ID. Cool, right?

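
The slides are not part of this transcript, so the following is a sketch of how the three pieces could fit together rather than the code shown in the talk. It makes several assumptions: Node's built-in `AsyncLocalStorage` stands in for cls-hooked, the header name `x-trace-id` is invented, the middleware uses an Express-style `(req, res, next)` signature, generating a fresh ID when the header is missing is a choice made for this example, and the outgoing hook is a plain function with the config-in, config-out shape that an Axios request interceptor uses.

```javascript
const { AsyncLocalStorage } = require("async_hooks");
const { randomUUID } = require("crypto");

const TRACE_HEADER = "x-trace-id"; // assumed header name
const traceStore = new AsyncLocalStorage();

// 1. Incoming: take the trace ID from the request headers (or create one,
//    which is an assumption) and run the rest of the request inside the store.
function tracingMiddleware(req, res, next) {
  const traceId = req.headers[TRACE_HEADER] || randomUUID();
  traceStore.run({ traceId }, next);
}

// 2. Logging: wrap the log sink so callers never pass the trace ID themselves.
function createLogger(write) {
  return (message, fields = {}) => {
    const store = traceStore.getStore();
    write({ ...fields, message, traceId: store ? store.traceId : undefined });
  };
}

// 3. Outgoing: add the trace ID to the headers of a request config just
//    before it is sent to the next service.
function traceRequestInterceptor(config) {
  const store = traceStore.getStore();
  if (!store) return config;
  return {
    ...config,
    headers: { ...config.headers, [TRACE_HEADER]: store.traceId },
  };
}

// Usage with a fake request; the business logic never mentions the trace ID.
const lines = [];
const log = createLogger((entry) => lines.push(entry));
let outgoing;
tracingMiddleware({ headers: { "x-trace-id": "abc-123" } }, {}, () => {
  log("checking push registration", { userId: "john" });
  outgoing = traceRequestInterceptor({ url: "/push", headers: {} });
});
console.log(lines[0], outgoing.headers);
```

With Axios itself, a function of this shape would typically be registered through `axios.interceptors.request.use(...)`, which is left out here to keep the sketch free of dependencies.
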
When PushMe finally sends that notification to your device, we'll have that final log with the trace ID in question, and we know we sent that notification to the app store. That way, if a notification was sent to your phone and you didn't get it, we can pretty much blame Apple, because don't we love blaming Apple? And this is what a series of logs looks like for a notification of an inbound call. If you look at the timestamps, you'll notice that it's going from Frizzle to the HTTP service to PushMe in a matter of 4 milliseconds. So this may not be the heroic journey of a cute little parasite around the world, but hey, for all you know, these could very well be the logs of a little notification that made it around the world in under 15 milliseconds.
