
Comprehensive Observability via Distributed Tracing on Node.js

Chinmay Gaikwad

Epsagon


The benefits of Node.js to develop real-time applications at scale are very well known. As Node.js architectures get more and more complex, visualization of your microservice-based architecture is crucial. However, the visualization of microservices is incredibly complex given the scale and the transactions across them. You not only need to visualize your Node.js applications but also analyze the health, flow, and performance of applications to have a complete observability solution. In this talk, we'll go over the challenges of scaling your Node.js applications and tools (such as distributed tracing) available to you to scale with confidence.

This talk was presented at Node Congress 2021.

FAQ

Microservices bring challenges such as difficulty in observing and monitoring due to their distributed nature. Traditional monitoring systems often fail to provide clear visibility under the hood of microservices.

The three pillars of observability are logs, metrics, and traces. Metrics help identify issues, logs explain why they occurred, and traces provide detailed information about request paths through services.

Distributed tracing provides the ability to track a request's path through various services, helping identify where failures or bottlenecks occur. This can significantly reduce the time to detect and resolve issues compared to traditional methods.

Important metrics to monitor include CPU usage, memory usage, as well as business-level metrics like bounce rates, revenue, and click-through rates.

In a highly distributed microservices environment, logs can be voluminous and scattered, making it difficult to pinpoint issues without a significant amount of time and effort.

Correlation in observability involves linking metrics, logs, and traces across various services to provide a comprehensive view of system performance and issues, facilitating quicker problem identification and resolution.

Distributed tracing helps narrow down the scope of services involved in an issue, reduces guesswork, pinpoints where time is spent in the code, and provides actionable data through visualizations of service interactions.

A sustainable observability strategy should include clarity on business goals, a choice between DIY or managed solutions, implementation of lightweight observability tools, and scalability to handle fast-growing microservice architectures.

Being proactive in observability allows organizations to prevent issues before they impact the system significantly, reducing downtime and maintaining high performance and reliability.

node.js observability


8 min

24 Jun, 2021


Video Summary and Transcription

Welcome to the session on comprehensive observability via distributed tracing on Node.js. We'll explore the challenges of microservices and troubleshoot distributed applications using an example. Correlation is the missing piece in troubleshooting distributed applications. Distributed tracing helps pinpoint issues that logging or metrics may miss, reducing mean time to resolution. It provides visualization of microservices architecture, actionable data, and enables code optimization.


1. Introduction to Observability

Short description:

Welcome to the session on comprehensive observability via distributed tracing on Node.js. In this session, we'll look at the new challenges in microservices, troubleshoot distributed applications using an example, and build a sustainable observability strategy for your company. Microservices have great benefits but also bring new challenges such as observability. Traditional monitoring systems make it hard to know what's happening under the hood.

Hello, everyone. Welcome to the session on comprehensive observability via distributed tracing on Node.js. I'm the host for the session. I'm Chinmay Gaikwad. I'm a technical evangelist at Epsagon.

Let's get started with the session. In this session, we'll look at the new challenges in microservices, specifically focusing on observability. We'll also look at how to troubleshoot distributed applications using an example, and finally we'll look at how to build a sustainable observability strategy for your company.

So let's start with the challenges of microservices. We know microservices have great benefits, including scalability, speed of development, and decreased system administration time, but they have also brought about new challenges, such as observability. Using traditional monitoring systems, it can be nearly impossible to know what is going on under the hood. We'll explore this in much more detail in the upcoming slides.

2. Troubleshooting Distributed Applications

Short description:

Let's start with metrics, which are a great way to identify issues. Logs tell us why something went wrong, but they are not sufficient in a microservices-based environment. The traditional way of debugging involves looking at metrics, then logs, but it lacks context. Correlation is the missing piece in troubleshooting distributed applications.

First, let's see how to troubleshoot distributed applications. So we know the three pillars of observability are metrics, logs, and traces. We'll deep dive into tracing a bit later. Let's start with metrics. Metrics are a great way for ops to figure out if something has gone wrong. Some examples of metrics include CPU usage and memory usage. We also have business-level metrics such as bounce rate, revenue, click-through rate, etc.
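To make the two kinds of metrics concrete, here is a minimal sketch in TypeScript. It samples host-level numbers with Node's built-in `process.memoryUsage()` and `process.cpuUsage()`, and computes one business-level metric. The function names `sampleHostMetrics` and `clickThroughRate` are hypothetical and not taken from the talk; a real setup would ship these values to a metrics backend rather than return them.

```typescript
interface HostMetrics {
  rssMb: number;       // resident set size of the process
  heapUsedMb: number;  // V8 heap currently in use
  cpuUserMs: number;   // user CPU time consumed by the process so far
  cpuSystemMs: number; // system CPU time consumed by the process so far
}

// Take one sample of host-level metrics for the current Node.js process.
function sampleHostMetrics(): HostMetrics {
  const mem = process.memoryUsage(); // values are in bytes
  const cpu = process.cpuUsage();    // values are in microseconds
  const bytesPerMb = 1024 * 1024;
  return {
    rssMb: mem.rss / bytesPerMb,
    heapUsedMb: mem.heapUsed / bytesPerMb,
    cpuUserMs: cpu.user / 1000,
    cpuSystemMs: cpu.system / 1000,
  };
}

// Business-level metric: clicks divided by impressions (0 when there is no traffic).
function clickThroughRate(clicks: number, impressions: number): number {
  return impressions === 0 ? 0 : clicks / impressions;
}
```

Calling `sampleHostMetrics()` on a timer gives a simple time series; it shows that something changed, but not which request caused it.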

Logs, on the other hand, tell us why something went wrong. So for this session, let us consider an example of a virtual shop. As you can see, the app server authenticates requests using Auth0, and then pushes them onto the Kafka stream. A Java container pulls from the stream and updates a DynamoDB table. Let's say there was a situation where users complained about an order that was sent but not handled. Where would you start?

Traditional monitoring solutions come at the expense of higher resource utilization because they have multiple heavyweight agents. They can also collect only host metrics, or are purely metric-driven. Metrics, as we have seen, really only let us know that something is broken, but not when or why. Context is absolutely critical in today's environments. Using the traditional way, first you look at Kafka metrics. You don't see anything abnormal here, so maybe look at the DynamoDB metrics next. We see some spikes here, so that's pretty interesting. So for debugging this, you need more data. And more data means logs. But are logs really sufficient in a microservices-based environment? Let's look into it.

We all know what logs look like. Personally, I have a love-hate relationship with logs. I love the fact that they are available, but I hate digging through them. I've found myself digging through hundreds or even thousands of lines of logs, hoping to spot that outlier. What if I knew the exact path that a request is taking through individual services and components? Logs are good for debugging, but they don't really work as a starting point in a highly distributed system. So in our virtual shop example, if you're very lucky, you'll be able to spot the problem, but it might take a very long time. So let's recap what is missing here. It essentially boils down to correlation.
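One common way to make logs easier to correlate is to stamp every line with an identifier for the request that produced it. The sketch below is an illustration, not the speaker's implementation. It uses Node's built-in `AsyncLocalStorage` so the identifier follows the request across async calls. The names `withRequest`, `formatLog`, `handleRequest`, and the `shop-api` service label are hypothetical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

interface RequestContext {
  requestId: string;
}

// Holds the context of whichever request the currently running code belongs to.
const requestContext = new AsyncLocalStorage<RequestContext>();

// Build a JSON log line stamped with the current request id (null outside a request).
function formatLog(service: string, message: string): string {
  const ctx = requestContext.getStore();
  return JSON.stringify({ service, requestId: ctx?.requestId ?? null, message });
}

// Run `work` with a request id bound to everything it calls, including async continuations.
function withRequest<T>(requestId: string, work: () => T): T {
  return requestContext.run({ requestId }, work);
}

// Generate a fresh id when the caller did not supply one.
function newRequestId(): string {
  return randomUUID();
}

// Hypothetical handler: both log lines carry the same id without passing it explicitly.
async function handleRequest(): Promise<string[]> {
  const lines: string[] = [];
  lines.push(formatLog("shop-api", "request received"));
  await Promise.resolve(); // the context survives this async hop
  lines.push(formatLog("shop-api", "request published to stream"));
  return lines;
}
```

A caller would wrap each incoming request, for example `withRequest(newRequestId(), handleRequest)`. Searching the logs for one id then returns only the lines for that request, although this still covers a single process rather than the whole system.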

3. Correlation and Benefits of Distributed Tracing

Short description:

Correlation between metrics and logs and between different services is crucial for finding the exact problem. Distributed tracing helps shine a light on the needle in the haystack, revealing issues that logging or metrics may miss. By using distributed tracing in the virtual shop example, we can pinpoint the problem, such as a missing key ID. By focusing on specific services like the Kafka stream and Auth0 microservice, we can identify the root cause, such as an expired token. This approach significantly reduces mean time to resolution and detection compared to traditional monitoring solutions.

We need correlation between metrics and logs, and between different services. That correlation will help us find the exact problem. So how do we correlate these pieces? That is where distributed tracing comes into the picture. I'm sure most of you have heard of tracing lately. Many vendors offer some form of distributed tracing. Even service meshes are now building support for it. Tracing essentially helps shine a light on the needle in the haystack that logging or metrics can miss. Just because your application has 15 or 20,000 services doesn't mean a request will travel through every single one of them. At most it will travel through a fraction of them. So applying distributed tracing to our virtual shop example, you can now see where the problem is. The problem is the missing key ID. Digging further, once you focus on the Kafka stream, you see that the user name is missing. And then, when you focus on the Auth0 microservice, you can see exactly why it is missing. It is because of an expired token. So specifically for Auth0, you now know that you should be using refresh tokens instead of access tokens. In short, this has reduced your mean time to detection and resolution quite a lot compared to traditional monitoring solutions.
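The correlation described above depends on every service passing along a shared trace identifier. The talk does not say which wire format is used, so the sketch below assumes the W3C Trace Context `traceparent` header (`00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`), a widely used standard. The function names are hypothetical. In a setup like the virtual shop, the same string could travel as an HTTP header or as a Kafka message header; that part is also an assumption.

```typescript
import { randomBytes } from "node:crypto";

interface TraceContext {
  traceId: string;  // 32 hex chars, shared by every span in one request
  spanId: string;   // 16 hex chars, identifies the current span
  sampled: boolean; // whether this trace should be recorded
}

const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Parse an incoming traceparent value; returns null when it is missing or malformed.
function parseTraceparent(header: string | undefined): TraceContext | null {
  const match = header ? TRACEPARENT.exec(header) : null;
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero ids are invalid in the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Start a span: reuse the caller's trace id if there is one, otherwise begin a new trace.
function startSpan(parent: TraceContext | null): TraceContext {
  return {
    traceId: parent?.traceId ?? randomBytes(16).toString("hex"),
    spanId: randomBytes(8).toString("hex"),
    sampled: parent?.sampled ?? true,
  };
}

// Serialize the context so it can be attached to the next outgoing call or message.
function toTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}
```

Each service would call `parseTraceparent` on what it receives, `startSpan` for its own work, and `toTraceparent` on what it sends. Because the trace id stays constant along the way, a backend can stitch the spans from different services into one request path.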

4. Benefits of Distributed Tracing

Short description:

Distributed tracing provides several benefits, including visualization of microservices architecture, actionable data, and narrowing the scope of services. It also helps pinpoint where time is being spent in the code, enabling optimization. At Epsagon, we have designed our product with lightweight agents, supporting different environments and providing rich context across metrics, events, logs, and traces. Building an observability strategy requires planning and clarity on business goals and architecture, choosing between DIY and managed approaches, implementing the solution, and ensuring scalability. Choose a proactive strategy. Thank you for attending!

Distributed tracing has a number of benefits. Let's look at a few of them. A typical architecture has a number of microservices involved, and one of the most important features of an observability solution is visualization. Users should also expect actionable data within these complex visualizations and service maps. For example, in these visualizations, you should be able to see the latency between the components as well as areas where thresholds have been crossed.
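As a rough illustration of the data behind such a service map, the sketch below aggregates individual calls into edges with an average latency and a flag for edges that exceed a threshold. The `Call` and `Edge` shapes and the `buildServiceMap` name are hypothetical; a real tracing product would derive this from collected spans.

```typescript
interface Call {
  from: string;      // calling component
  to: string;        // called component
  latencyMs: number; // observed latency of this one call
}

interface Edge {
  from: string;
  to: string;
  calls: number;
  avgLatencyMs: number;
  overThreshold: boolean; // true when the average latency exceeds the threshold
}

// Aggregate raw calls into one edge per (from, to) pair.
function buildServiceMap(calls: Call[], thresholdMs: number): Edge[] {
  const totals = new Map<string, { from: string; to: string; calls: number; totalMs: number }>();
  for (const call of calls) {
    const key = `${call.from}->${call.to}`;
    const entry = totals.get(key) ?? { from: call.from, to: call.to, calls: 0, totalMs: 0 };
    entry.calls += 1;
    entry.totalMs += call.latencyMs;
    totals.set(key, entry);
  }
  return Array.from(totals.values()).map((entry) => {
    const avgLatencyMs = entry.totalMs / entry.calls;
    return {
      from: entry.from,
      to: entry.to,
      calls: entry.calls,
      avgLatencyMs,
      overThreshold: avgLatencyMs > thresholdMs,
    };
  });
}
```

A visualization layer could then draw one arrow per edge and highlight those where `overThreshold` is true.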

As you have seen in the previous example of a virtual shop, a distributed-tracing-based solution can also help narrow the scope of services. That takes the guesswork out of determining what has gone wrong. Without such smart filtering abilities, the architecture map becomes nothing more than an exercise in chaos theory.
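Narrowing the scope can be as simple as filtering collected spans by trace id. The sketch below assumes each span record carries a trace id, a service name, and a start time; the `SpanRecord` shape and `servicesOnTrace` name are hypothetical.

```typescript
interface SpanRecord {
  traceId: string; // id shared by all spans of one request
  service: string; // service that emitted the span
  startMs: number; // start timestamp in milliseconds
}

// Return the distinct services one request touched, in the order it first reached them.
function servicesOnTrace(spans: SpanRecord[], traceId: string): string[] {
  const ordered = spans
    .filter((span) => span.traceId === traceId)
    .sort((a, b) => a.startMs - b.startMs);
  const seen = new Set<string>();
  const path: string[] = [];
  for (const span of ordered) {
    if (!seen.has(span.service)) {
      seen.add(span.service);
      path.push(span.service);
    }
  }
  return path;
}
```

However many services the system has, the result lists only the few that the failing request actually passed through.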

Another great benefit of a distributed tracing solution is pinpointing where time is being spent in the code. Here is an example of spans, which make up a trace. These can tell you if a significant portion of the time is spent waiting for an external API call, or perhaps there is an inefficient database call that needs refactoring. At Epsagon, we have designed our product around the best practices for observability by talking to industry experts and our customers.
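To show how spans reveal where time goes, the sketch below computes each span's self time: its own duration minus the duration of its direct children. It then picks the span with the largest self time. This assumes child spans of the same parent do not overlap in time, which real traces do not guarantee. The `Span` shape and function names are hypothetical.

```typescript
interface Span {
  id: string;
  parentId: string | null; // null for the root span of the trace
  name: string;
  startMs: number;
  endMs: number;
}

// Self time = own duration minus the durations of direct children.
function selfTimes(spans: Span[]): Map<string, number> {
  const self = new Map<string, number>();
  for (const span of spans) {
    self.set(span.id, span.endMs - span.startMs);
  }
  for (const span of spans) {
    if (span.parentId !== null && self.has(span.parentId)) {
      const parentSelf = self.get(span.parentId) ?? 0;
      self.set(span.parentId, parentSelf - (span.endMs - span.startMs));
    }
  }
  return self;
}

// Find the span where the most time was spent in its own work.
function slowestSpan(spans: Span[]): Span | null {
  const self = selfTimes(spans);
  let worst: Span | null = null;
  for (const span of spans) {
    if (worst === null || (self.get(span.id) ?? 0) > (self.get(worst.id) ?? 0)) {
      worst = span;
    }
  }
  return worst;
}
```

If the slowest span turns out to be a database query or an external API call, that is the place to start optimizing.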

For example, we should have an automated approach, which can consist of lightweight agents that won't consume a lot of your resources. Also, these agents should support different environments such as virtual machines, containers, or serverless. We should be able to do this with rich context across metrics, events, logs, and traces that allows you to search full payloads or custom tags. Observability should not only tell you if something has gone wrong, but pinpoint exactly where and why, to help reduce your mean time to detection and resolution.

And finally, if you have to build an observability strategy, you have to plan well in advance. First of all, have clarity on your business goals and architecture model, and determine your approach: DIY or managed. Both have their pros and cons. For example, using DIY, you can use one of the open source solutions, but it will require a significant development effort to get it right, if you get it right. Then implement the observability solution, and finally ensure scalability, because microservices can scale really fast. Scaling of the observability solution is super critical. Finally, I would like to end this session by saying that you should choose a strategy which enables you to be proactive and not reactive. Thank you for attending the session. Please visit the URL on the slide for a special offer.
