Automation of a single monolithic app is pretty straightforward. Split it into a frontend and backend and it's still manageable. Throw in more components or infrastructure and suddenly you're scratching your head at why a build ran - or didn't run. How many pipelines do I need? How many git repos should I have? Let's walk through use cases from small teams who own their entire stack to organizations with central IT units that manage shared infrastructure. Learn which scenarios and criteria determine how to slice but not spaghettify your pipelines.
Infra vs Apps – Where are my Pipelines?
This talk was presented at DevOps.js Conf 2021.
FAQ
Julie is an engineer at Microsoft and part of the Fast Track for Azure program, where she helps onboard customers to Azure.
Before joining Microsoft, Julie was an Enterprise Architect at Allianz Germany and a full-stack engineer and designer. She comes from the Mac and open-source world, likes Node.js and Ruby, and dislikes Windows.
Julie's talk focuses on CI/CD processes, particularly how they work for both applications and infrastructure. She aims to teach the audience how to manage their own CI/CD processes effectively.
Jenkins is Julie's favorite build server of all time, although she also uses Azure DevOps and GitHub Actions.
Julie discusses the complexity of managing CI/CD when applications have multiple deployable components and the challenge of coordinating different triggers and events in the build process.
Julie suggests splitting configurations into different files to manage backend and frontend deployments separately for development and production environments, making it easier to understand and manage the pipeline processes.
Julie emphasizes that end-to-end testing is crucial for verifying that the entire application works as expected before promoting changes to production. However, she acknowledges that end-to-end tests can be challenging to implement and maintain.
Julie recommends that organizations new to DevOps start with managed services to facilitate easier and more frequent deployments, emphasizing the importance of shipping often regardless of the tools used.
1. Introduction to CICD and My Background
Hi, my name is Julie. I'm an engineer at Microsoft and today I'm going to talk to you about CICD and how it all works, and how it all works when you have both applications and infrastructure.
So, a little bit about me. As I said, I'm an engineer at Microsoft. I am part of the Fast Track for Azure program, which means I help onboard customers to Azure. Before that, I was an Enterprise Architect at Allianz Germany, which is a multi-billion dollar insurance company, and actually many of the opinions and recommendations I'm giving you today come from that experience as well as my experience at Microsoft. Before that, I was a full-stack engineer, still am actually, and a designer.
So, I come from the Mac world, open source. I like Node.js, Ruby, I really don't like Windows. I'm a very opinionated person, so I will try to mention when, yeah, something is my personal opinion and recommendation. The photo here I put nostalgically because it was literally the last week of February before lockdown because of Corona. So, it feels very strange not just to work remotely without ever having met your colleagues, but also giving a talk right now over video. But it seems to work.
2. CICD Use Cases and Mono Repo with Jenkins
Today I'm going to give you various use cases for CICD. Let's start with a mono repo and Jenkins as the build server. After pushing to the main branch, Jenkins deploys to the development environment. To know that it actually works and to get to continuous delivery with automated promotion, we run end-to-end tests on the deployed application. If the tests fail, the job ends. If they pass, Jenkins commits the changes to the production branch, triggering another job to deploy it.
Okay, so let's start with a very simple example. Today I'm going to give you various use cases. I'm going to try to start simple, and then it gets really complicated really quickly, but the point being I want to teach you how to fish, and not give you a fish, when it comes to figuring out CICD for yourself.
So let's start with the easiest thing possible, right, a mono repo, because we come from monoliths. Very simple. I'm going to make a push and a build server will pick it up. So I have Jenkins here. Jenkins is my favorite build server of all time. Yeah, I use Azure DevOps and GitHub Actions as well, but I still prefer Jenkins.
Anyway, so let's say I push to the main branch. It's going to deploy to my development environment. Let's say eventually I'm happy. I'm going to somehow, on my local computer, merge those changes into production, and then I'm going to push the change to production and Jenkins pushes that over to my production environment. All is good, I think. How do you know it actually works? You know, like that kind of CI just goes there. It does some tasks. But does it actually work? Right. How do you get to the point of continuous delivery? Can you do automated promotion? That is a little bit more complicated than many people expect when they first do it.
So, we still have the same mono-repo, the same sort of monolithic application. We're going to make a push to our main branch, which remember, corresponds to our development environment. So Jenkins will have deployed it. It's all done. Let's run some end-to-end tests. So in this theoretical example, let's say I have even a single-page application that actually has an end-to-end test suite, that will fire up a browser, click through everything. And what my end user is trying to do in the application, we can verify that it works as expected. So maybe I can buy a t-shirt, for example. Based on the results of that test, if they don't work, then we say, oh, failed, end of job, end of story, end of the build job, that is. Let's say it actually works. What you can do is then have Jenkins make that commit for you to that production branch. Whereas before, you might have sort of, you know, by hand, went and clicked through everything to make sure it works, you could run an end-to-end test suite and say, okay, I'm confident, let's put it into production, which will kick off another job, and then we'll deploy it to production.
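The promotion step described above can be sketched as a small decision function. This is only an illustration in Python, not the Jenkins job from the talk: the names `ScenarioResult` and `next_step` are hypothetical, and the choice to stop when no tests ran at all is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Outcome of one end-to-end scenario, e.g. "buy a t-shirt"."""
    name: str
    passed: bool


def next_step(results: list[ScenarioResult]) -> str:
    """Decide what the build job does after the end-to-end suite has run."""
    if not results:
        # Assumption: no tests means no evidence, so do not promote.
        return "stop"
    if all(result.passed for result in results):
        # Everything passed: commit to the production branch, which
        # kicks off the separate production deployment job.
        return "promote"
    # At least one scenario failed: end of the build job.
    return "stop"


results = [ScenarioResult("buy a t-shirt", True), ScenarioResult("log in", True)]
print(next_step(results))
```

In a real pipeline the "promote" branch would be the step where the build server makes the commit to the production branch on your behalf.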
3. Challenges in CI-CD and Pipeline Configuration
It's challenging to write tests, especially end-to-end tests. Handling events and triggers in CI-CD can be complicated, with various possibilities to keep track of. Putting everything together in the pipeline is not as easy as it seems. YAML files, used in many CI platforms, can be tricky with indentation and exclusion. Managing multiple components, branches, and environments adds complexity. Examples of Jenkins and YAML code highlight the challenges and potential issues that can arise.
It sounds very sort of simple in practice, right? But it's actually much more challenging than you would expect. So it's really, really hard to write tests in general, and then to write end-to-end tests, and then for people who have things like credit card processing, etc, that they have to do, really complicated. And it's okay if you don't have them, and it's okay even if you have them, you still don't do automatic promotion. So I don't do automatic promotion most of the time, for example, but that's mostly because I'm not coding every day, just some days.
Anyway, let's make it a little bit more complicated. Let's say you've been working on your application a couple of years, and you want to split things out, because we have mobile devices, so you're going to split out your back-end. Now we make a change to main. Jenkins is going to say, well, what do you want me to do? Am I supposed to deploy the back-end, front-end, or both? Like super confusing suddenly, but how do you fit everything together? And that's one of the biggest challenges when you're doing CI-CD, especially when you're in that phase where you're not used to actually having multiple deployable components. You have so many sort of events and triggers, and okay, Jenkins is going to run a job, but what, with what code, why? And so usually people, you know, they keep track of pushes, which branch was pushed to, what files sort of changed. So previously you saw there was a sub-folder for front-end, and there was another sub-folder for the back-end. There are other triggers that people forget about, so pull requests. Super easy way to hack somebody, by the way, because if the pull request actually does a deployment, then I could maybe fork the repo and just make a pull request and, ha, I'm in your environment. But I'm a jerk like that because I used to be an enterprise architect, and I would just find holes in people's software. And in terms of events that can trigger things, right, because you want to automate based on, you see on the right here, a list of webhook events for GitHub Actions. You see pull requests, you see the review, just a comment on the pull request, which also makes sense. Maybe you want to be like, hey, Julie, hey, Julie, you got to go look at this. So because we have so many possibilities, it's actually quite scary how many things we have to keep track of. And then once you've sorted it out in your head, you still have to sort of actually put that in your pipeline, which is, it sounds easy, but it's not necessarily.
Because if you look at here, I have two examples. So the first one is pretty simple, right? It's Groovy code for a Jenkinsfile, and it looks like code. It says when, there's brackets. All right, I'm kind of pretty sure what's going to happen. When you have YAML, right, which is true for many CI platforms, including Azure DevOps and GitHub Actions, it's easy to forget something. The indentation is wrong. Like sometimes you have includes, but wait, I want to exclude, because there's some file in there that really shouldn't be triggering a deployment. It's a readme file, et cetera. And suddenly things are happening, and it also just means everything slows down, right? You have all these jobs that are waiting to run, or yeah, you're just spending time figuring out why something is happening when you don't want it to happen. And then, remember we said we had now backend and frontend, and then you have development and production. So you have four suddenly. And you have this one big file here, as an example.
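The include and exclude logic mentioned above can be sketched in a few lines of Python. This is not the Jenkinsfile or YAML shown on the slides. The folder names, the exclude patterns, and the function `components_to_deploy` are assumptions for illustration, and real CI platforms apply their own path-matching rules.

```python
from fnmatch import fnmatch

# Hypothetical layout: one sub-folder per deployable component.
COMPONENT_PATHS = {"frontend": "frontend/*", "backend": "backend/*"}

# Files that should never trigger a deployment on their own.
EXCLUDED = ["*.md", "docs/*"]


def components_to_deploy(changed_files: list[str]) -> set[str]:
    """Return the components whose pipelines should run for a push."""
    to_deploy = set()
    for path in changed_files:
        # Excludes win over includes, e.g. a readme inside frontend/.
        if any(fnmatch(path, pattern) for pattern in EXCLUDED):
            continue
        for component, pattern in COMPONENT_PATHS.items():
            if fnmatch(path, pattern):
                to_deploy.add(component)
    return to_deploy


print(components_to_deploy(["frontend/src/app.js", "backend/README.md"]))
```

Note that `fnmatch` lets `*` match across `/`, so `frontend/*` also covers nested files. Writing the rule down like this makes it easier to answer "why did this build run?" before translating it into the build server's own syntax.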
4. Frontend Development and Production
When it's the frontend in development, do this. When it's the frontend in production, do this. It might just be the when part, okay, let's just change some configuration variables. But the point is that you have sort of four different cases that you have to worry about. So it gets a little bit just sort of like, what is changing and why? I personally like to split them up into different files. The build server will also often show like, oh, the frontend dev pipeline is running or the backend production pipeline is running. It's just easier to see, whereas before, you might just see it's deploying. Well, what is it deploying? That's really hard to figure out.
So many people don't like this because it's too many files and, oh, you know, there's, you have, you're not reusing code in a library, et cetera. I don't care. So I've been working in the web for, wow, 20 years. No, no, no, 15, 15 years as a salaried person. And what's most important actually is how long does it take you to debug something? How long does it take you to fix something when something is broken in production because that costs you money. So yeah, making code sort of nice little libraries is fun. I enjoy doing that as well. But what's more important in terms of priority is being able to fix things and build things super fast. Okay.
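The four cases (frontend and backend, each in development and production) map naturally onto four pipeline files. A minimal sketch, assuming hypothetical file names such as `frontend-dev.yml` and the branch-to-environment mapping used earlier in the talk (main for development, production for production):

```python
from itertools import product
from typing import Optional

COMPONENTS = ["frontend", "backend"]
# main deploys to development, the production branch to production.
BRANCH_TO_ENVIRONMENT = {"main": "dev", "production": "production"}


def pipeline_files() -> list[str]:
    """One small pipeline file per component and environment."""
    environments = BRANCH_TO_ENVIRONMENT.values()
    return [f"{c}-{e}.yml" for c, e in product(COMPONENTS, environments)]


def pipeline_for(component: str, branch: str) -> Optional[str]:
    """Name the single pipeline that should run, or None if nothing deploys."""
    environment = BRANCH_TO_ENVIRONMENT.get(branch)
    if component not in COMPONENTS or environment is None:
        return None
    return f"{component}-{environment}.yml"


print(pipeline_files())
print(pipeline_for("backend", "main"))
```

The duplication between the four files is the trade-off the speaker accepts: each run is named after exactly one component and one environment, which shortens the time needed to debug it.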
5. Microservices, Stability, and Testing
When you're building microservices, it's important to understand the difference between independent services and a distributed monolith. Stability is not a technology problem but a people problem. Testing changes and running end-to-end tests can help ensure the stability of your product.
So the next thing that's complicated, now that we split things up, right, so we're also deploying differently. We have to figure out, oh, from the user perspective, I can't buy a shirt anymore for whatever reason. And to sort of track that down, we're going to look at versions. And then you have different ones for front end and back end. But what does that mean for your overall like sort of application?
However, if you're asking that question, you don't really have sort of the independent services. You thought you were building microservices, but actually what you have is a distributed monolith. So let's say you've kind of figured that out now, you've learned that a little bit, you're going to keep going forward. Now you're going to do microservices.
OK, so I picked the simplest example possible for microservice. All right. So everybody knows what a calculator is. Let's say now I'm just building the front end. The rest of the services, they do something. I talk to the API. I really don't care. Super easy for me. I just version something. Right now the back ends are stable, right, because I'm assuming they do whatever they do. It works. But what's really important to understand is that whether or not something is stable is not a technology problem. It's going to be a people problem. So to illustrate that, let's say the people who are working on Multiply. They're going to do something funky, right. So they make a change and it goes into their development environment. But they don't know if something broke, right, for the user who wants to buy a T-shirt. So what they can do in their pipeline is let me grab those tests from the front-end application and run them. But which version are you running against? So that's also something that's really sort of hard to figure out. And then once you have that test, you can figure, okay, do I move that change all the way through or do I not? The thing is, like, you're looking for that certainty, which you might or might not have. And ultimately, in my experience, what determines whether or not you need end-to-end tests and whether or not you rely 100% on them is sort of the stability of your product, right? So a calculator is super simple, right? It just...one plus one is two. When you're doing insurance, it's a little bit more complicated.
6. CICD Challenges and Infrastructure Complexity
Even if the technology doesn't do what you expect, talking to each other is important. Using Kubernetes adds complexity with multiple data stores and triggers. Managing infrastructure in Kubernetes requires additional skills. Putting microservices in a Kubernetes cluster increases the complexity. Promoting to a central infrastructure team in Kubernetes brings new challenges with certificates and access. Developers rely on the infrastructure team for changes and updates.
And even if the technology doesn't necessarily sort of do what you expect it to do, if the business rules are very clear, it's a bit easier to figure stuff out. Anyway, what's most important is to understand that you talk to each other. That's how you figure stuff out. Even if you don't have those end-to-end tests, or even if you do, go on Chat, whatever, go talk to people.
OK, so let's make this a little bit more complicated, right? Because management comes around and says, we need to scale, and we think PaaS services are expensive, platform-as-a-service services. We're going to use Kubernetes. We hear it's awesome, all the cool kids are doing it, and it's super, super cheap. And so you, as a calculator team, suddenly have a cluster, and you're like, ah, this doesn't look so bad, we have to worry about ingress, we have to do routing on our own and configure it. But actually, it's a little bit more complicated than you think, because suddenly you have multiple, let's say, data stores, right? So you have your container registry for your Docker images, and you also have pipelines as code, as well as infrastructure as code. So before, when we were worried about front-end triggering back-ends, right, in which environments, now you have to worry about your infrastructure code triggering deployments as well, which is also kind of like crazy. Too many triggers. So if we look at this sort of case with the mono-repo, right, you're just getting started. You can sort of do a lot of learning by doing, but as you can see, it's getting already quite complicated and you need more skills. Kubernetes is not application development. It is so much infrastructure with networking and security that you have to configure yourself, and in some ways, okay, I can, in my sandbox, blow things up, but there are so many things that are so easily blown up. So let's make it more complicated and say your management came around and let's put all those services, right, your little monolithic calculator, let's put all the microservice ones in a Kubernetes cluster. And you see here, we have different namespaces and different repositories, and boom, way more sort of triggers and events that can happen and that can sort of cross fire. Obviously, it might not be a problem for you, right? You've practiced CICD, your domains are stable, but still, it can happen, and it's just something that you have to worry about, and you probably will stumble on it for a very long time before you've mastered it.
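A rough way to see why "way more triggers" appear is to count combinations of repository, event type, and environment. The sketch below is only back-of-the-envelope arithmetic: apart from Multiply, the repository names are invented, and the two event types and two environments are assumptions.

```python
from itertools import product

EVENTS = ["push", "pull_request"]
ENVIRONMENTS = ["dev", "production"]


def trigger_combinations(repos: list[str]) -> list[tuple[str, str, str]]:
    """Every (repository, event, environment) combination that could start a job."""
    return list(product(repos, EVENTS, ENVIRONMENTS))


mono_repo = ["calculator"]
# Hypothetical split: a frontend, four services, and an infrastructure repo.
cluster = ["frontend", "add", "subtract", "multiply", "divide", "infrastructure"]

print(len(trigger_combinations(mono_repo)))
print(len(trigger_combinations(cluster)))
```

Each extra repository, event type, or environment multiplies the total, and that is before counting one pipeline triggering another.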
So let's look at infrastructure now because your management is, like, now, like, I don't know, they're on some ego trip. We're going to do a Kubernetes, all the things. It doesn't matter if it makes sense. It doesn't matter if you're ready yet. We're going to do all the things in Kubernetes, and suddenly, you're promoted to a central infrastructure team. So in this diagram, I have three different layers, right? So a sort of foundational infrastructure layer, a middle layer that's building, like, kind of a weird platform as a service for other teams that has a Kubernetes cluster, and let's say the goal of the company is that the application development teams, at layer two, all they do is, like, kind of like in our previous, our very first story scenario, I make a change, push, done. Everything else will be managed by other people. It's still not that simple because you have here, for example, I have routing, and then you're going to have TLS certificates because we want secure connections. Then you have the question, wait, who has those certificates, right? How can I get access to them? If they belong to the team, as you see in layer two at the top, then I have to be able to, from my cluster, which is in a different layer, actually grab those certificates, right? I'm like, oh, I don't want to manage all that. Let me just put all the certificates with me in layer zero, and that's easier for me to configure. It's kind of one credential maybe. But then you have all these developers knocking on your door whenever they need a change, whenever something's about to expire or something expired, and you didn't update it, and suddenly it's your fault.
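Whoever ends up holding the certificates needs some way to notice an expiry before the developers come knocking. A minimal sketch, assuming the expiry dates are already known. The host names under app.com, the dates, and the 30-day warning window are made up for illustration.

```python
from datetime import date, timedelta


def expiring_soon(certificates: dict[str, date], today: date,
                  warn_days: int = 30) -> list[str]:
    """Names of certificates that expire within warn_days (or already have)."""
    cutoff = today + timedelta(days=warn_days)
    return sorted(name for name, expires in certificates.items()
                  if expires <= cutoff)


# Hypothetical inventory kept by the central infrastructure team.
certificates = {
    "shop.app.com": date(2021, 3, 1),
    "api.app.com": date(2021, 12, 1),
}

print(expiring_soon(certificates, today=date(2021, 2, 15)))
```

The same check works whether the inventory lives with the central team or with each application team. What changes is who gets the warning.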
7. Inner-Sourcing, Container Registry, and Security
Whereas if it's their certificate, it's their fault. Not everybody has access to your root domain. As you grow as an organization, you can use inner-sourcing to streamline processes. A container registry is given to the app team to have control over their own images. Configuring Kubernetes clusters and managing access to pull images can be challenging. It's important to consider security when sharing client IDs and secrets. Managed identities can help mitigate these risks. End-to-end governance is crucial, as companies often overlook CICD in their cloud deployments. Various configurations, such as secrets and branch protections, help prevent unauthorized deployments.
Whereas if it's their certificate, it's their fault. No matter what, you're probably going to have people knocking on your door, right? Because for things like DNS and routing, that has to be configured, and it's probably essential. Not everybody has access to your root domain, in most of these examples, app.com. And old school would be fill out an Excel file, email it to me, and one day I will make that record change for you. One day I'll open that firewall port for you. But as you grow as an organization, you can actually use inner-sourcing, right, forking repositories and pull requests to streamline some of that.
The last thing I want to mention here is that there is a container registry that we've given to the app team themselves so that they have control over their own images, that one team can't shoot another team in the foot by accidentally overwriting their image, for example. So that's a security decision that we made, but, again, we have the challenge of configuring various Kubernetes clusters to know which container registry to go to, to pull an image, and then to be able to have access to pull those images. So, backtracking a bit, with inner-source and code examples. So these are actually two examples from open-source repositories that I have. So I used Terraform to do infrastructure-as-code because it's kind of easier to read, and on the left-hand side you would see an example of how you could do that. You could actually even abstract away some of the Terraform and just create a variable. So application developer teams, you don't need to know Terraform, right? But anybody who can write a few lines of code in whatever language can read this file and make a change.
Similarly, what I want to show on the right-hand side, it's a bad example for inner-source, but the question of security, right? What you don't want to do is suddenly have client IDs and client secrets, so kind of usernames and passwords, floating around all over the place, never mind the fact that they eventually expire as well. And so there are ways to get around that, right? So your cloud provider might offer something like managed identities, as we do in Azure, and you can see here that I'm just passing references to something. I don't know what all those credentials are, and I don't need to know it. I'll do other things and role assignments and Azure will figure that out for me. What I kind of want to say is that this is really, really hard. It's super, super hard. In this diagram here, I'm talking about end-to-end governance, because what you'll often have is, when people go to the cloud, they'll say, we locked everything down. You can't deploy anything to, you know, so I talk about Azure, but it could be AWS or GCP, Google Cloud Platform, whatever. It's all locked down. The developers can't do anything, but they forgot about CICD, especially these companies that are new. They're like, yeah, we're on the DevOps train. It's like, can the contractor deploy, push to the production branch? Yeah. That's not a problem. And so if you look at this diagram, there are various places with little red locks as well, because you can configure secrets in various places. You configure branch protections in various places. I want to prevent that contractor from deploying to the production branch.
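The contractor question comes down to a rule about who may push to which branch. The sketch below only models that rule in Python. The role names and the `may_push` helper are hypothetical, and in practice the rule is enforced by the Git host's branch protection settings rather than by code you write yourself.

```python
# Hypothetical policy: roles allowed to push to each protected branch.
PROTECTED_BRANCHES = {"production": {"maintainer"}}


def may_push(role: str, branch: str) -> bool:
    """True if the role may push to the branch under the policy above."""
    allowed_roles = PROTECTED_BRANCHES.get(branch)
    if allowed_roles is None:
        # Unprotected branch: anyone with write access may push.
        return True
    return role in allowed_roles


print(may_push("contractor", "production"))
print(may_push("contractor", "main"))
```

Writing the policy down in one place also makes the follow-up question visible: who is allowed to edit the policy itself?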
8. Importance of People in CICD
It's all about the people, not the technology. You can make it worse with technology or Kubernetes, but the biggest takeaway is that it's about the experience of the people.
I have to be able to configure that. And who has the permission to do that? Because if the contractor can do that, then, well, guess what? You didn't fulfill your security rules. So aside from infrastructure, right? This is super, super complicated, and it's complicated because of the people, not because of technology per se. You can make it worse with technology. You can make it worse with Kubernetes because it's not really up to it. But the biggest takeaway that I want to give you is that it's all about the people, right? It's a lot about the experience of the people.
9. Importance of Learning and Compliance in Business
People can learn the necessary skills, and investing in them provides the advantage of learning with the business domain. Starting with a complex industry like insurance has made smaller tasks easier. Technical limitations reflect the business rules, some of which are non-negotiable due to compliance and security requirements.
So it's not about, oh, you know, I want to hire senior folks, people who have those skills. People can learn those skills. You can invest in them and give them a chance to learn it, and the advantage that you have there is that they're learning it with your business domain, right? So I'm very fortunate in retrospect to have started with insurance, which is super, super complicated because now smaller things like a shop are kind of easy for me to do. All the things about coupling and the dependencies that you have, the technical limitations will really just be a reflection of the business rules that you have. And some of them you don't determine, right? So I come from a compliance-heavy industry in insurance, and security says you have to do things this way. You have to treat a contractor this way. I don't want to. But I have to. That's kind of the rule.
10. Coordination, Complexity, and Open Source
In terms of security, separating everything in the cloud can be complicated, but it's doable. Coordinate your teams and communicate effectively. Remember that triggers grow exponentially, so don't underestimate the complexity. Ask the people who will be running the show for their input on repositories and pipelines. Promoting software manually is also acceptable. You can find my code on GitHub and I share a lot of open-source projects. I also blog and make YouTube videos.
And so I'm going bouncing back and forth with the pros and cons. But one thing I definitely want to mention with the cons is that in terms of security, people say, okay, I can separate everything, the cloud provider lets me do that. That's a lot, a lot of overhead, especially with stuff like credentials that can expire, certificates, et cetera.
And while this all seems super, super complicated, it's totally doable. Right. The biggest thing you want to do is basically coordinate your teams. If you can basically like walk together and move around together, then it'll all kind of work. You'll minimize the sort of pain. And actually, yes, that's the big sort of takeaway on the right-hand side here. Talk to each other. Right. So, you know, in person, chat, videos, issues, pull requests, everything.
And on the left, a couple of technical takeaways. I tend to remind people that triggers grow exponentially. I can't speak English anymore. Lived in Germany too long. And so don't estimate. Don't underestimate that sort of complexity. When it comes to how many repositories, how many pipelines? Ask the people who are going to be doing it and running the show and they will tell you how many they want. It doesn't matter what I want. People always ask me. No, what matters is what they want. And same thing for the level of complexity, whatever they're comfortable with, not what I'm comfortable with. And it's OK to promote your software manually by pushing manually or merging manually into production branches. I do that, too.
So last thing. You can find some of this code on GitHub. A lot of the stuff that I do is open source. I put it all out there. I'm not worried about secrets, etc. I also blog and I make videos on YouTube.
QnA
Summary and Q&A on Terraform and Jenkins Files
If you like any of these examples and want more detail, let me know on GitHub, blog, or YouTube. The consensus is that all questions have been covered in the talk. The results show that Terraform is a standard choice for building production-ready applications. However, there are still other options and legacy systems in use. We have a question from Jonathan about handling large Jenkins files and the suggestion of using bash scripts for abstraction.
So if you like any of these examples, you think they're interesting and I totally sped through all of them. But you want more detail. Let me know somewhere GitHub, blog, YouTube. And, yeah, I will maybe make a video about it. Maybe because I enjoy doing it, but I have to find time to do it as well.
So, all right. That's it. So thanks for joining us. Great talk. The consensus is that all of the questions that people wanted to ask you, have already been covered in your talk, because it was so well done.
So let's first go to the question that you asked everybody else. And I'm looking at the results, and actually, I was looking at it throughout the talk, and before the end of the talk, it was actually above 50% Terraform. Now, it's a little bit under 50%. I don't think that's a surprising result. But what are your thoughts on the results? No, I mean, Terraform is kind of standard and it's just so mature, right? You know, when you're building something for production, you want something that's still going to be around a year from now. If you do go the cloud agnostic route, which is what I'm all about as well, then Terraform is just great. I'm really surprised that it says 19 other. Like what is the other? Right? Yeah. You know, first party. Yeah. I'm guessing it's got maybe some custom stuff, some custom scripts. People are still using a lot of, you know, script where we'll call it. And so it's like a lot of legacy stuff that they really can't get rid of. Some homegrown stuff that, you know, they built even before like there were tools like Terraform. I worked at a company that before Terraform was even around, they were doing like automation. And they have like these old tools that they can't kind of get rid of.
But yeah, so we have some questions coming in from the crowd. Jonathan asks, Jenkins files sometimes tend to be way huge. You just mentioned separating them into files works really well. I've thought creating bash scripts is a good way to abstract logic.
Bash Scripts, Jenkins Files, and Sharing
Understanding the difference between a bash script and a Jenkins file is important. Depending on your workload, you can split it up in different ways. Managing Jenkins pipelines and libraries can be challenging, but putting things in a library can help with sharing. Follow me on YouTube to stay updated on my upcoming talks and activities. You can also find me on Twitter as jng5.
What do you think about that? How did you come up with other sorts of solutions? So it's important to understand the difference between a bash script and a Jenkins file. Right. So where are you putting your logic and how can you share it? So, you know, I think as I mentioned, I use many files in part because if I'm looking at it later, I just want to know what's going on. So sometimes you can put bash scripts into a Jenkins pipeline that you can share across different pipelines. So I do write some bash scripts, but I usually try to keep it to a minimum just because, I don't know, it's not as nice to read. I tend to put things more in Makefiles. It's just easier. I'd rather see make install, make do this, instead of all this bash code. I know it is necessary. Right. So I'm not like, oh, don't do that. But I think depending upon your workload, you can try to split it up in different ways. I know managing Jenkins pipeline libraries is not easy, because you have to deal with versioning and stuff. But because I come from enterprise, where you want to share stuff, I tend to head towards that route, putting things in a library.
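The preference for short `make` calls over inline bash can be sketched in a few lines. This is a minimal illustration in Python rather than an actual Jenkins file, and the target names `install`, `test`, and `build` are assumptions, not targets named in the talk. The idea is that each pipeline stage stays a one-line call, while the real logic lives in the Makefile.

```python
import subprocess
from typing import List, Sequence

# Hypothetical stage names; a real Makefile would define its own targets.
STAGES: Sequence[str] = ("install", "test", "build")


def stage_commands(stages: Sequence[str] = STAGES) -> List[List[str]]:
    """Translate each stage name into a `make <target>` invocation."""
    return [["make", stage] for stage in stages]


def run_stages(stages: Sequence[str] = STAGES, dry_run: bool = True) -> List[str]:
    """Run the stages in order; with dry_run=True only report the commands."""
    executed = []
    for command in stage_commands(stages):
        if not dry_run:
            # check=True stops the pipeline at the first failing target.
            subprocess.run(command, check=True)
        executed.append(" ".join(command))
    return executed


if __name__ == "__main__":
    for line in run_stages():
        print(line)
```

Someone reading the pipeline later sees only the stage names, which is the readability argument made above; the trade-off is that the Makefile becomes one more file to keep in sync.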
Awesome. Thank you. That was very helpful. Well, we have a fan in the audience. Anna asks, Where can I hear you speak again? It was such a good talk. You're incredible. Do you have a place where you share your upcoming talks and what you're up to so people can follow you, even your Twitter? Yeah, so I'm jng5 on Twitter. So I tweet every now and then. But I would say the best thing to do would be to follow me on YouTube. So I tend to make videos on YouTube, because I want to basically teach people to fish instead of giving them fish. It's fun. I get to be my snarky, sassy self. I would say that's probably best because of Corona World, actually. I miss being on stage. I want to walk around and stuff like that, but I can't. I'm here at home.
Challenges of DevOps in Organizations
The trickiest thing about DevOps, regardless of whether you're small or big, is actually figuring out where you are. It's hard to be perfect and have everybody working together at the right rhythm and pace. Coordinating workflows, naming conventions, and releases becomes more challenging as the team grows. Having a distributed monolith adds to the complexity. However, it's a fun challenge with something new to learn. In the context of Conway's Law, the bigger the organization, the more fragmented it becomes.
But I would say that's probably the best way for now. And, of course, as soon as we can do other conferences, preferably in person, I'll probably announce it on Twitter.
Awesome. So just a couple more questions. So what would you say is the trickiest thing when it comes to DevOps and organizations in general, large organizations?
I think the trickiest thing about DevOps, regardless of whether you're small or big, but it's more problematic if you're big, is actually figuring out where you are, right? Everybody says, I want to do DevOps and I want to be perfect from day one. And it's so hard to be perfect. And I feel like I've been doing it for many, many years. I think the first time I saw Jenkins was maybe 10 years ago. And I'm nowhere near perfect. And as you saw on one of the last slides, with the choreography and the dancing, it's so hard just to have everybody working together at the right rhythm, at the right pace. And even I still make those mistakes now. Everybody has their own workflow. And if you work in a small team, it's much easier to agree on something, agree on a rhythm, agree on a naming convention. When do people push? And you're like, oh, don't push now because we're all going to go to lunch or something. That's easier to coordinate, whereas as soon as you get more and more people, you have to work together in some way, even if it's just via the published APIs that they have. Or even worse, you have to share pipeline libraries or you have to coordinate, like, we're going to release together. But as I mentioned in this talk as well, we actually have a distributed monolith, not really microservices, at least not yet. And that's just really hard to coordinate. And you have to just do it together as a team. And it's always a journey. You can always be better, but it's kind of fun like that as a challenge. There's always something new to learn. And yeah, I don't know. That's why I picked the dancing animated GIFs, because you can always learn a new dance. It's true. OK, we have a couple more questions. But yeah, so it's always kind of like in the context of Conway's Law, the bigger you get, the more fragmented. And then you kind of got to figure that out.
Automatic Promotions and Tool Combination
When choosing between automatic and manual promotions, the key consideration is how much testing you have. End-to-end tests with UIs and clicks can be expensive and flaky. It's important to have mature software with good coverage from unit tests and integration tests. If you feel confident and the software is not too complex, automatic promotions can be a good option. However, there's no shame in choosing manual promotions, as it can be less stressful and easier to fix issues as a team. Combining tools like Jenkins, GitLab CI, and TeamCity is a common practice to balance user experience and costs, especially when considering licensing and execution power. Having experienced people to help and manage the tools is crucial. Some organizations go the open source route to avoid licensing costs, while a managed service is a good fit for organizations new to DevOps.
So Matt asks, in which situations would you suggest prioritizing automatic promotions versus manual? So the biggest thing there is tests. How many tests, and how many end-to-end tests, do you have to be really sure that you can do that? And so one of the things I do with customers is I really push them. Are you sure? Are you sure? Are you sure? And what people don't realize as well is that you can't test everything, right? Those end-to-end tests with UIs and clicks and whatnot, they're very expensive. They tend to be very flaky, right, because you need a lot of power to have a WebKit running and doing stuff. And it's just flaky. But if your software matures to a point where you have some end-to-end tests, right, my preference is, you know, you're testing the login, so that the security is at least definitely there, and you have good coverage from unit tests and other integration tests, then it really is ultimately, do you feel comfortable doing it, and it depends on the complexity of your software. If it's just a really tiny piece or an API and you feel confident, then go for it. But if you don't, right, and you want to do it manually, there's no shame in that. Absolutely no shame. And so many people do that. And it's actually even less stressful and easier to deploy as a team and celebrate as a team or fix things as a team than to try to do it automatically and just have things burn.
Yeah, I agree. Couldn't agree more.
There's no issue if you prefer manual to automatic. It's not about being cool by doing automation.
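The criteria in this answer can be written down as a small promotion gate. This is only a sketch of the reasoning, not a rule stated in the talk: the field names, the `MIN_COVERAGE` threshold of 0.8, and the three outcomes are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold; the talk does not name a number.
MIN_COVERAGE = 0.8


@dataclass
class ReleaseCandidate:
    unit_tests_passed: bool
    integration_tests_passed: bool
    login_e2e_passed: bool    # the end-to-end login check mentioned above
    unit_coverage: float      # fraction between 0.0 and 1.0
    team_is_confident: bool   # the "do you feel comfortable?" question


def promotion_mode(rc: ReleaseCandidate) -> str:
    """Return 'blocked', 'automatic', or 'manual' for a release candidate."""
    tests_green = (
        rc.unit_tests_passed
        and rc.integration_tests_passed
        and rc.login_e2e_passed
    )
    if not tests_green:
        # Failing tests stop any promotion, manual or automatic.
        return "blocked"
    if rc.unit_coverage >= MIN_COVERAGE and rc.team_is_confident:
        return "automatic"
    # Green tests but low coverage or low confidence: promote by hand.
    return "manual"


if __name__ == "__main__":
    candidate = ReleaseCandidate(True, True, True, 0.9, team_is_confident=False)
    print(promotion_mode(candidate))  # manual
```

Team confidence is an explicit input here because the answer treats it as the deciding factor once the tests are in place; falling back to manual is a normal outcome, not an error path.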
There are two more questions, slightly long, but, you know, for those who are asking questions, if we don't have enough time for all of them, Julie is still around. She'll be on the Discord, she'll be in the Spatial chat and in her discussion room, so join her there. I'm going to just pick one of these last questions. I'm going to pick the question from Louise, because we haven't heard from them yet. We're in the middle of a POC, researching new tools around CI/CD. One of the critical points in choosing the tool is the price and the cost to create and increase the execution power. If you have licensing per agent or executor, etc., do you think that it's OK to combine some tools like Jenkins plus GitLab CI plus TeamCity to have a balance between UX and costs? Of course! And a lot of people do that, right? And you don't have to just pick one tool. Something like Jenkins, I love Jenkins. You just need people who are experienced with it to help you with it, and somebody to manage it. And if you have that, then that's awesome. Go for it. Some people go the open source route to avoid the licensing. If you're new as an organization to DevOps, then go for a managed service.
Closing Remarks and Q&A
The most important thing is to ship and ship often. Stick around for Julie's panel as well. She'll be answering helpful questions. Thank you, Julie, for joining us. Hope to see you around the community and at DevOps Days Tel Aviv.
That's also fine. The most important thing is to ship and ship often. So mix and match. It's all good.
We have a couple more questions. Oh, awesome! So definitely stick around for Julie's panel as well. She'll be answering a lot of really helpful questions. William and Jonathan, we appreciate your questions. And I'm going to allow Julie to get to them on Discord, as I think we are out of time.
Thank you so much, Julie, for joining us and being with us. No problem. And your excellent talk. So, yeah, hope to see you around the community, and we'll try to see how I can get you to come to DevOps Days Tel Aviv. That's definitely something in the plan, when it's possible. When it's possible, yeah. Awesome. Cool. Thanks, everybody. See you in the discussion room.
Table Of Contents
1. Introduction to CICD and My Background
2. CICD Use Cases and Mono Repo with Jenkins
3. Challenges in CI-CD and Pipeline Configuration
4. Frontend Development and Production
5. Microservices, Stability, and Testing
6. CICD Challenges and Infrastructure Complexity
7. Inner-Sourcing, Container Registry, and Security
8. Importance of People in CICD
9. Importance of Learning and Compliance in Business
10. Coordination, Complexity, and Open Source
Summary and Q&A on Terraform and Jenkins Files
Bash Scripts, Jenkins Files, and Sharing
Challenges of DevOps in Organizations
Automatic Promotions and Tool Combination
Closing Remarks and Q&A