Performance Monitoring of a Heterogeneous GraphQL Mesh App

Today it is fairly easy to integrate GraphQL on both the client and the server side and to get it all up and running quickly with a cloud service of your choice, such as Netlify or Vercel. With this setup, how can we monitor performance, and how can we observe all parts together to find the root cause when problems occur?

This talk was presented at GraphQL Galaxy 2021.

FAQ

Robert Hostlowsky works at Instana, an IBM company. He has experience with GraphQL, has given talks, and has published a course on related subjects.

A service mesh in the context of GraphQL refers to an infrastructure where multiple services communicate with each other, often monitored and managed to ensure efficient and reliable operations.

Performance monitoring is crucial in service meshes to ensure that services meet expected timings and performance standards, as delays can lead to user dissatisfaction and potential business problems.

Robert Hostlowsky used ApolloEngine to track metrics and diagnose performance issues in his GraphQL application. Later, he also used Instana for more comprehensive monitoring.

Instana provides detailed traces of service communications and infrastructure metrics, combined with end-user monitoring (EUM), which helps in efficiently identifying and resolving performance issues in GraphQL applications.

The specific issue in Robert's live demo was inconsistent response times in the GraphQL service backend, which varied dramatically, sometimes taking up to 13 seconds for a response.

Developers can enhance observability by using tools like Apollo Studio for schema management and Instana for monitoring, which help in identifying issues early and providing a comprehensive view of application performance in production.

OpenTelemetry is a set of APIs, libraries, and agents that collect telemetry data (metrics, logs, and traces) from applications, which is essential for observing and managing the performance of service meshes.

Robert Hostlowsky

8 min

10 Dec, 2021

Video Summary and Transcription

Performance monitoring is crucial for businesses because users don't like to wait. The ApolloEngine tool helps track and analyze metrics, revealing response time variances and other information. Instana combines traces of service communication with infrastructure metrics and end-user monitoring, and it implements OpenTelemetry. Apollo Studio is great for managing the GraphQL schema, while Instana provides full observability, enabling efficient root cause analysis.

1. Performance Monitoring and Issue Investigation

Short description:

I'm Robert Hostlowsky, a software engineer at Instana, an IBM company. I have experience with GraphQL and have encountered performance issues in a live demo application. Performance monitoring is necessary because users don't like to wait, and APIs are crucial for businesses. Investigating a real performance issue, I found that the communication with the database was sometimes very slow. The ApolloEngine tool helped track and analyze metrics, revealing response time variances and other information.

Hi everybody! I'm very happy to be here and to have the opportunity to share my thoughts and learnings about performance with GraphQL, specifically in a service mesh. Let me quickly introduce myself. I'm Robert Hostlowsky, working at Instana, an IBM company. In 2016 I gave a talk about GraphQL and Relay. Later, in 2018, I published a video course about a full-stack Trello clone on top of GraphQL. Then, in 2019, I found a subtle performance issue in this live demo application, which got all of this rolling.

But let's first dive in and see what we mean by a distributed mesh. We don't have only one service; typically our landscape, from an infrastructure point of view, looks like this. Of course, one or two machines can go down, and so on, but this is typically handled. But what is happening on the service level? This is typically how a service mesh looks when you look into it and have a representation of the traffic and the communication. Here, too, there are of course many communications running, and this is typically not very visible if you don't have such a tool.

But first, let's ask the question: why is performance monitoring necessary? It's quite simple. Users don't like to wait. Today we typically have a service mesh, or at least some services in use. Maybe one of them is a payment service or something like that, and typically other services depend on it. This needs to be tracked somehow, and in case of a failure, the cause should of course be easy to find and fix. Why is this important? Today, when APIs are the center of a business, it's very important that timings are as expected. Nobody wants to wait for something and later find out it was not their fault but somebody else's. There might even be a contract, a so-called SLA, where you define that a specific service needs to react within a certain time. If it does not, somebody has a problem, and in the end the business has a problem.

But let's come to investigating a real performance issue. As I mentioned, I had a problem with my live demo at the time. It's a simple Kanban board with a backend where the data is stored, and at that time the communication with the database went through GraphQL. For some reason, it was sometimes very slow and at other times very fast. I couldn't say where the problem was, but sometimes it was really, really slow. The only tool out there at the time was called ApolloEngine. It was quite simple: you just add an API key to the Apollo server when using the Apollo Server library, and then it automatically tracks these metrics and shows them here in the board. So you can see here the variance, let's say, or the spectrum of the response times, up to 13 seconds for a call, which of course is not acceptable. There is some more information, like on the right, the number of queries and so on.
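
To make the idea of a response-time spectrum concrete, here is a minimal sketch of how a handful of measured response times could be summarized. The helper names (`percentile`, `summarizeLatencies`) and the sample values are made up for illustration; this is not how ApolloEngine computes its charts.

```typescript
// Hypothetical helper types and names, used only for this illustration.
interface LatencySummary {
  count: number;
  min: number;
  median: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile on an array that is already sorted ascending.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarizes response times given in milliseconds.
function summarizeLatencies(samplesMs: number[]): LatencySummary {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return {
    count: sorted.length,
    min: percentile(sorted, 0),
    median: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    max: percentile(sorted, 100),
  };
}

// Made-up samples: mostly fast calls plus one 13-second outlier.
const samples = [120, 95, 110, 13000, 105, 130, 98, 101, 115, 125];
const summary = summarizeLatencies(samples);
console.log(summary);
```

With samples like these, the median stays low while the maximum exposes the outlier, which is why looking at the whole spread is more useful than looking at an average.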

2. Instana and Apollo Studio

Short description:

A year ago, I had the chance to use Instana, which combines traces of service communication with infrastructure metrics and end-user monitoring. It implements OpenTelemetry. To collect user data, inject the EUM snippet into the website. Tracking down backend traces and analyzing query counts is easy. I also monitor my application running on Netlify Functions using the Instana wrapper. The real problem was using a GraphQL service backend with a premium plan. Apollo Studio is great for managing the GraphQL schema, while Instana provides full observability, enabling efficient root cause analysis.

This was a year ago. In the meantime, they have improved their service and now have some tracing built in, which can be enabled very easily; for specific freemium services it is also quite easy and doesn't cost anything.

At that time, this already gave me a little bit of information. I also had the chance to use Instana. Instana combines these traces of the communication between services with infrastructure metrics and also with EUM, end-user monitoring. And by the way, it implements OpenTelemetry, the latest standard in this area.
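
As a rough illustration of why traces help, the sketch below models a trace as a list of spans with parent links and looks for the span that spends the most time on its own work. The `Span` shape, the function names, and the numbers are assumptions made for this example; real tracing data models such as the one in OpenTelemetry carry many more fields.

```typescript
// Simplified span model; an assumption for illustration, not a real tracing API.
interface Span {
  id: string;
  parentId: string | null;
  name: string;
  startMs: number;
  endMs: number;
}

// Self time = own duration minus the time spent in direct children.
// Assumes the children of a span do not overlap each other.
function selfTimes(spans: Span[]): Map<string, number> {
  const result = new Map<string, number>();
  for (const span of spans) {
    result.set(span.id, span.endMs - span.startMs);
  }
  for (const span of spans) {
    if (span.parentId !== null && result.has(span.parentId)) {
      const parentSelf = result.get(span.parentId)!;
      result.set(span.parentId, parentSelf - (span.endMs - span.startMs));
    }
  }
  return result;
}

// Returns the span with the largest self time.
function slowestSpan(spans: Span[]): Span {
  if (spans.length === 0) throw new Error("empty trace");
  const self = selfTimes(spans);
  return spans.reduce((worst, s) =>
    self.get(s.id)! > self.get(worst.id)! ? s : worst
  );
}

// Made-up trace: browser request -> GraphQL server -> database backend.
const trace: Span[] = [
  { id: "a", parentId: null, name: "POST /graphql (browser)", startMs: 0, endMs: 13050 },
  { id: "b", parentId: "a", name: "query Board (GraphQL server)", startMs: 20, endMs: 13030 },
  { id: "c", parentId: "b", name: "database backend call", startMs: 40, endMs: 13000 },
];
const culprit = slowestSpan(trace);
console.log(culprit.name, selfTimes(trace).get(culprit.id));
```

In this made-up trace, the outer spans are slow only because they wait for the innermost call, so the self time points to the backend call as the place to investigate.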

So how do we get there? It's quite simple in the end. To get all the information about the user and what the user is doing, you just inject the EUM snippet into the website; then it can collect all the data for the GraphQL queries, and even JavaScript errors and so on. You can even find specific requests here, and then, tracking down, we find a view to the backend trace, where at the end the GraphQL query also shows up. On the right side you see some meta information about the operation and so on. We can also do some more analytics on the counts of queries and so on. But nowadays, my application also runs in Netlify Functions, which in the end run on AWS Lambda. So how can we track that? It's quite easy, just by using this Instana wrapper here. With this, I was able to monitor the Apollo application server, as we saw in the slide before.
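
The general idea behind wrapping a serverless function for monitoring can be sketched as follows. This is only a conceptual illustration: `withTiming`, `Handler`, and `Measurement` are hypothetical names, the handler signature is simplified, and none of this is the Instana wrapper's actual API.

```typescript
// Hypothetical types and names; this is not the Instana wrapper's API.
type Handler<E, R> = (event: E) => Promise<R>;

interface Measurement {
  name: string;
  durationMs: number;
  ok: boolean;
}

// Wraps a handler so every invocation is timed and reported, even when it throws.
function withTiming<E, R>(
  name: string,
  handler: Handler<E, R>,
  report: (m: Measurement) => void
): Handler<E, R> {
  return async (event: E): Promise<R> => {
    const start = Date.now();
    let ok = false;
    try {
      const result = await handler(event);
      ok = true;
      return result;
    } finally {
      report({ name, durationMs: Date.now() - start, ok });
    }
  };
}

// Usage with a made-up function that answers a GraphQL-style request.
const measurements: Measurement[] = [];
const graphqlHandler = withTiming(
  "graphql",
  async (event: { query: string }) => ({ statusCode: 200, body: event.query }),
  (m) => measurements.push(m)
);

const done = graphqlHandler({ query: "{ board { name } }" }).then((response) => {
  console.log(response.statusCode, measurements);
  return response;
});
```

The function body stays unchanged; only the exported handler is wrapped, which is what makes this kind of instrumentation cheap to add to an existing function.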

So finally, what was the real problem? In the end, I figured out that it was one small thing: at that time I used a GraphQL service backend on this premium plan. That was the only problem. The summary is quite simple. Apollo Studio is great for managing the GraphQL schema, and Instana comes as full-blown observability with all these extra features, and it enables the shift left by giving developers the full context of their running application in production. This also makes it very efficient to find any root cause. Let me say thank you very much for listening. For any questions, please reach me on Twitter or by email. And, of course, I hope to see you and meet you in the conference chat.
