Creating an innovation engine with observability


How Baselime created a culture where it's possible to move fast, break as little as possible, and recover from failures gracefully. The culture is technically underpinned by Node.js, Event-Driven Architectures (EDAs), and Observability (o11y).

This talk was presented at Node Congress 2023.

FAQ

Baselime focuses on observability for serverless architectures, enabling teams to run and maintain their serverless code efficiently.

Baselime is a company that provides observability solutions for serverless architectures, helping teams run and maintain their serverless code over time.

The key metrics are deployment frequency, lead time for code changes, change failure rate, recovery time from defects, and mess around lead time (the time from customer insight to production).

Baselime allows code deployments whenever needed, without red tape. Developers push changes, and the CI/CD pipeline handles the deployment, which typically takes around two minutes.

'Mess around lead time' is the time it takes from gaining customer insights to deploying features based on those insights to production. At Baselime, this typically takes about half a day.

Baselime does not do code reviews. Instead, the team relies on pair programming and trust in its engineers to maintain code quality.

The 'ship skateboards' philosophy involves building the smallest possible version of a feature, deploying it to production, gathering user feedback, and iterating on it until it evolves into a fully-fledged product.

Observability-driven development at Baselime means integrating observability into the development lifecycle, actively instrumenting applications, and testing in production to detect and address issues quickly.

Baselime achieves high deployment frequency and fast recovery times through streamlined testing, minimal red tape around deployments, and robust observability to quickly detect and fix defects.

Baselime minimizes tech debt and flaky tests by focusing on critical path testing, employing streamlined CI/CD pipelines, and leveraging observability to detect and resolve issues quickly.

Boris Tane
27 min
14 Apr, 2023


Video Summary and Transcription

Baselime provides observability for serverless architectures and has created an innovation engine within its team. The team measures its performance using the DORA metrics from the Accelerate book. Baselime emphasizes the importance of foundations, streamlined testing, and fast deployment. The team practices observability-driven development and incorporates observability into its development lifecycle. Baselime believes in building a culture that fosters ownership and democratizes production.

1. Introduction to Baselime and the Innovation Engine

Short description:

My name is Boris. I'm the founder and CEO of Baselime. We provide observability for serverless architectures. Today, I want to share how we have created an innovation engine within our team. We ship fast and I will discuss the methods we apply. The question of how well a team is performing can now be answered using the DORA metrics and the Accelerate book. We measure deployment frequency, time to go live, deployment failures, outage recovery time, and mess around lead time.

My name is Boris. I'm the founder and CEO of Baselime. What we do is observability for serverless architectures. So I'm sure most of you guys have heard the word serverless multiple times today, from the talk earlier in the morning to all the demos that have happened since then, and a lot of emphasis is put on how to deploy code onto the cloud, et cetera, but very little effort is actually put into how we run and maintain this code over time. And that's the sort of solution that we provide for people that are adopting serverless architectures.

But what I want to talk about today, actually, is something completely different: how internally within the Baselime team, we have been able to create what I like to call an innovation engine thanks to the observability that we have. So compared to other startups at very similar stages of life, we are at this point where we ship really, really, really fast. And I want to share with you the, I wouldn't say tricks, but the methods that we apply so that we can ship so fast. So, the first thing is, is anybody here in the room dealing with tech debt at their job right now? I see one hand, two... oh, wow. Almost all the room. Is anybody dealing with flaky tests? Is anybody dealing with CI/CD pipelines that never work when you need them to work? Again, almost everybody. And that's what I don't like. When we signed up to be software engineers, and for a lot of us cloud engineers, what we wanted is to create things and put them in the hands of people and make that innovation happen and see how people are interacting with those things that we create and we put on the web. But we are left day to day dealing with tech debt, flaky tests, and all of that, which is basically just slowing us down and preventing us from innovating every single day.

And there's this question that comes up a lot in conferences and tech conversations: how well is your team performing? And what this question actually means is, how much innovation is your team shipping every day? And up until very recently, there was no real way of actually answering this question, honestly. People will say, oh, we're doing well, but there was no way of quantifying it. Up until the DORA metrics and the Accelerate book. I hope everybody here has read it. If you haven't, please get a copy. And it gave us a scientific framework that we can use to actually be able to say, okay, we're in the top 10% performing teams. We're in the top 20% or we're in the bottom 10%, and we need to do a lot of work to get out of there. And to be able to answer that question, there are a few metrics that we need to measure. The first one is, how often do you deploy? That's your deployment frequency. Second one is, how long does it take for code to go live? So from a developer writing code in their code editor locally to that code being live in production used by real users, how long does that take? How many of your deployments fail? So when you deploy, you most definitely sometimes introduce defects into production. How often does that happen? And how long does it take to recover from an outage? So when someone introduces a defect to production, how long does it take for your team to detect that defect happened and ultimately fix it, either roll forward or roll back? And at Baselime, we have another one. It's a bonus one. We call it mess around lead time. And it's how long does it take from customer insight to production? So I'm here at a conference. I've spoken with a lot of people, a lot of experts in serverless really. And I've learned a lot, I've gotten a lot of insights.
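The four measurements described above can be computed from a simple log of deployments. What follows is a minimal sketch in TypeScript, not Baselime's tooling: the `Deployment` record shape, the `summarize` function, and the sample data are all hypothetical, and recovery time is assumed to be measured from the faulty deployment going live to the fix going live.

```typescript
// Hypothetical record shape; none of these names come from the talk.
interface Deployment {
  commitAt: Date;        // when the change was committed
  deployedAt: Date;      // when the change went live in production
  causedDefect: boolean; // whether this deployment introduced a defect
  recoveredAt?: Date;    // when the defect was fixed (roll forward or roll back)
}

interface DeliverySummary {
  deploymentsPerDay: number;
  meanLeadTimeMinutes: number;
  changeFailureRate: number;          // fraction of deployments that caused a defect
  meanRecoveryMinutes: number | null; // null when no defect has been recovered yet
}

const MS_PER_MINUTE = 60 * 1000;

function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function summarize(deployments: Deployment[], windowDays: number): DeliverySummary {
  if (deployments.length === 0 || windowDays <= 0) {
    throw new Error("need at least one deployment and a positive window");
  }
  // Lead time: commit to live in production.
  const leadTimes = deployments.map(
    (d) => (d.deployedAt.getTime() - d.commitAt.getTime()) / MS_PER_MINUTE
  );
  // Recovery time: faulty deployment live to fix live (an assumption of this sketch).
  const recoveries: number[] = [];
  let failures = 0;
  for (const d of deployments) {
    if (!d.causedDefect) continue;
    failures += 1;
    if (d.recoveredAt !== undefined) {
      recoveries.push((d.recoveredAt.getTime() - d.deployedAt.getTime()) / MS_PER_MINUTE);
    }
  }
  return {
    deploymentsPerDay: deployments.length / windowDays,
    meanLeadTimeMinutes: mean(leadTimes),
    changeFailureRate: failures / deployments.length,
    meanRecoveryMinutes: recoveries.length > 0 ? mean(recoveries) : null,
  };
}

// Made-up sample data covering a two-day window.
const sampleDeployments: Deployment[] = [
  { commitAt: new Date("2023-04-10T09:00:00Z"), deployedAt: new Date("2023-04-10T09:02:00Z"), causedDefect: false },
  { commitAt: new Date("2023-04-10T10:00:00Z"), deployedAt: new Date("2023-04-10T10:04:00Z"), causedDefect: true,
    recoveredAt: new Date("2023-04-10T10:34:00Z") },
  { commitAt: new Date("2023-04-11T09:00:00Z"), deployedAt: new Date("2023-04-11T09:03:00Z"), causedDefect: false },
  { commitAt: new Date("2023-04-11T15:00:00Z"), deployedAt: new Date("2023-04-11T15:03:00Z"), causedDefect: false },
];

const sampleSummary = summarize(sampleDeployments, 2);
console.log(sampleSummary);
```

The bonus metric, mess around lead time, could be tracked the same way by recording when a customer insight was captured and subtracting that from the deployment time of the feature it led to.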

2. Insights to Production and Deployment Process

Short description:

Innovative teams have their foundations laid: no flaky tests and no bad CI/CD pipelines. Low performing teams experience chaos and spend their time fixing instead of shipping. Smaller deployments lead to faster detection and recovery times. Baselime has no red tape around deployment and keeps testing streamlined. The bottleneck is deploying to the cloud. The team doesn't test everything and doesn't do code reviews.

How long does it take from those insights that I have now to being in production at some point in the future? What's that time gap? And this is what innovative teams look like. Every single day, they have the foundations laid, they don't have flaky tests, they don't have bad CI/CD pipelines, and they keep innovating, adding blocks on top of what they already have, such that innovation happens every single day.

And for low performing teams, this is what it looks like. Complete chaos, nobody knows what's happening, and every single day, instead of shipping stuff to production that is actually helpful to your users and customers, you are fixing stuff. You are fighting CI/CD pipelines. That's not what we signed up for, and we want to move away from this.

And when you start moving away from it, it's a self-fulfilling prophecy. That's the expression. Smaller deployments lead to faster deployments. Faster deployments lead to faster detection time. Faster detection time leads to faster recovery time. And if you know that you can recover from outages quickly, you will deploy more often, you will innovate more often.

So, what does all of this look like at Baselime? So, our deployment frequency is whenever you want. You make a typo change in the frontend, git push, gets deployed. We spend the whole day working on this huge migration, and blah, blah, blah. Git push, gets deployed. There is no red tape around deployment. And that's something that we need to introduce more into our development cycles. Because the red tape that we introduce seems like it's helping us be more productive, but actually it's just slowing everybody down.

What's our mean lead time for changes? So how long does it take from somebody writing code in their code editor to that code being in production? That is actually however long infrastructure-as-code takes. We use infrastructure-as-code to manage all our infrastructure, and every time you git push, our CI/CD pipeline picks it up, builds the artifacts, and deploys them to the cloud. And the bottleneck in our process is actually that deploy-to-the-cloud piece. It can take maybe two minutes or so. And the reason we are able to achieve this is controversial. We have very streamlined testing. So we don't test for the sake of testing. We don't have testing suites that take 15 minutes to test buttons, etc. We test the critical path in our software, and for the rest, we are going to discover if there is a problem thanks to the observability that we have. And the second thing, probably even more controversial, we don't do code reviews. I know a lot of people are not, I hear a smile there.
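The idea of testing only the critical path and relying on observability for everything else assumes that non-critical code reports what happens to it in production. Below is a minimal TypeScript sketch of that kind of instrumentation. It is not Baselime's API: the `observed` wrapper, the `TelemetryEvent` shape, and the `Sink` type are hypothetical names, and the wrapper is synchronous only to keep the example short.

```typescript
// Hypothetical event shape and helper names; this is not Baselime's API.
interface TelemetryEvent {
  name: string;       // which operation ran
  durationMs: number; // how long it took
  ok: boolean;        // whether it completed without throwing
  error?: string;     // error message when ok is false
}

type Sink = (event: TelemetryEvent) => void;

// Default sink: one JSON line per event on stdout, a format most log pipelines can ingest.
const jsonLogSink: Sink = (event) => {
  console.log(JSON.stringify(event));
};

// Runs `work`, emits one event describing the outcome, and preserves the
// original behaviour: the result is returned and any error is re-thrown.
function observed<T>(name: string, work: () => T, sink: Sink = jsonLogSink): T {
  const start = Date.now();
  try {
    const result = work();
    sink({ name, durationMs: Date.now() - start, ok: true });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sink({ name, durationMs: Date.now() - start, ok: false, error: message });
    throw err; // callers still see the failure; the event makes it visible in production
  }
}

// Usage: wrap a non-critical code path that has no dedicated test suite.
const label = observed("format-button-label", () => "Save".toUpperCase());
console.log(label);
```

With events like these flowing from production, a defect in an untested path shows up as an `ok: false` event shortly after deployment, which is what makes a short recovery time plausible without an exhaustive test suite.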
