The Dark Side of Open Source

Join Feross, CEO of Socket, on a thrilling journey into the dark side of open source software. Come along for the ride as we explore the unseen risks lurking within everyday software dependencies. See firsthand how AI-driven solutions, specifically large language models, are helping us battle against malicious dependencies within the npm ecosystem. Arm yourself with the knowledge and tools to protect your codebase in this ever-evolving battle.

This talk was presented at Node Congress 2024.

FAQ

The talk by Feross focuses on the dark side of open source, specifically examples of malicious code and threats in the NPM and JavaScript ecosystems.

Feross has worked on popular open source packages including WebTorrent and StandardJS.

Feross is a former member of the Node.js Foundation board.

Socket is a tool that helps protect code from malicious attacks by assisting developers and security teams in safely finding, auditing, and managing open source software. It aims to reduce the time spent on security busy work and increase the speed of shipping code.

In the last five years, it's become common for applications to have over 90% of their code come from open source dependencies. This means that much of the code in modern applications is written by third parties rather than the developers themselves.

A software supply chain attack occurs when malicious code is introduced into a software product by compromising a third-party supplier, leading to downstream breaches in the networks of users who rely on that product.

One example is the attack on the NPM package used by Ledger in December 2023. Hackers gained access to a former employee's NPM account, published a malicious version of a library, and aimed to steal cryptocurrency from users of the library.

Using packages that load code from a CDN is risky because it bypasses the lock file, allowing the code to change from build to build without the developer's knowledge. This can lead to security vulnerabilities as attackers can insert malicious code into the remotely loaded script.

Some best practices include not hot linking to CDNs, regularly auditing NPM access, avoiding HTTP or Git dependencies, and using fewer dependencies. Additionally, using security tools that detect supply chain risks and thoroughly reviewing the code in dependencies can improve security.
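
One of the practices above, avoiding HTTP or Git dependencies, can be checked mechanically. The following is a minimal sketch, not a tool mentioned in the talk: the function name `findNonRegistryDependencies`, the pattern list, and the example manifest are all assumptions made for illustration, and the patterns are not exhaustive.

```javascript
// Illustrative patterns for dependency specifiers that are resolved outside
// the npm registry. This list is an assumption for the sketch, not exhaustive.
const NON_REGISTRY_PATTERNS = [
  /^https?:\/\//i, // tarball fetched from a URL
  /^git(\+\w+)?:\/\//i, // git://, git+https://, git+ssh://
  /^(github|gitlab|bitbucket):/i, // hosted Git shorthand with a prefix
  /^[a-z0-9][\w.-]*\/[\w.-]+(#.+)?$/i, // bare "user/repo" GitHub shorthand
];

// Returns every dependency in a parsed package.json whose specifier matches
// one of the patterns above.
function findNonRegistryDependencies(manifest) {
  const fields = ["dependencies", "devDependencies", "optionalDependencies"];
  const findings = [];
  for (const field of fields) {
    for (const [name, spec] of Object.entries(manifest[field] ?? {})) {
      if (NON_REGISTRY_PATTERNS.some((pattern) => pattern.test(spec))) {
        findings.push({ field, name, spec });
      }
    }
  }
  return findings;
}

// Hypothetical manifest used only to exercise the function.
const exampleManifest = {
  dependencies: {
    express: "^4.18.2",
    "left-pad": "https://example.com/left-pad-1.3.0.tgz",
  },
  devDependencies: {
    "some-tool": "git+https://github.com/example/some-tool.git",
  },
};

console.log(findNonRegistryDependencies(exampleManifest));
```

A check like this only looks at direct dependencies declared in one manifest; transitive dependencies would need the same inspection through the lock file.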

AI, particularly large language models (LLMs), can analyze source code and provide plain English explanations of potential risks. This helps in identifying obfuscated or hidden malicious behavior that might be missed by human reviewers.

Feross Aboukhadijeh
37 min
04 Apr, 2024

Video Summary and Transcription

The talk explores the dark side of open source, focusing on supply chain attacks and the need for improved security measures. It highlights the dangers of loading external code and the importance of mitigating supply chain risks. The talk also discusses the use of AI and LLMs in code analysis to enhance security. It emphasizes the challenges of sustaining IC maintained open source projects and the future of supply chain security. Lastly, it touches on the variations in open source definitions and the empowerment of the open source community.

1. The Dark Side of Open Source

Short description:

Welcome to the talk on the dark side of open source. We'll explore malicious code and threats in the NPM and JavaScript ecosystems. I have experience in open source and cybersecurity, and now work on open source security at Socket. Socket helps developers and security teams find, audit, and manage open source software. Today, over 90% of applications rely on open source dependencies, making supply chain security crucial. The open source ecosystem is under attack, with software supply chain attacks affecting companies of all types.

Hey, everybody. It's Feross, and welcome to this talk on the dark side of open source. I'm really excited to share with you some of the lesser-explored parts of open source. We're going to dig into some examples of malicious code and give you a sense of some of the threats out there in the NPM and JavaScript ecosystems. So let's get started.

So first off, a little bit about me. I started out in open source. I worked on some pretty popular packages, including WebTorrent and StandardJS, and I'm also a former member of the Node.js Foundation board. And so I really got to see a massive increase in the usage of open source within companies and in the community. Then I moved into more of a security focus. I taught the web security course at Stanford, and now I'm working on open source security at Socket.

So real quick, just a couple words on Socket. So Socket's a tool that helps protect your code from everyone else's, and we help developers and security teams to ship faster and spend less time on security busy work by helping them safely find, audit, and manage open source software. We have a ton of companies using us. A lot of these are actually open source projects, and we're protecting over a quarter million repositories today. So I'm really happy that we've been able to help protect the community to this degree.

Okay, so let's talk a little bit about our applications. In the last five years, the way that we write software has undergone a massive shift. Today it's really common to see applications where over 90% of the code comes from open source dependencies. That means code that you and your teammates didn't write. The average open source dependency actually has 79 transitive dependencies. So in this world where your application is built on thousands of dependencies, software security is not just about your code. It's about every piece of code that you depend on.

In this talk we're going to be talking about open source dependencies because we're JavaScript developers, but this is actually a broader issue. There's this term, software supply chain, and it really includes all the third-party code that you rely on, whether it's APIs, cloud services, or even dependencies like your operating system. All the parts and pieces that make up our software are what we mean when we talk about supply chain security.

And unfortunately, the open source ecosystem is under attack. We've seen software supply chain attacks surge in the past couple of years. There are headlines pretty regularly about different breaches and attacks, and these attacks impact all types of companies. Anyone who depends on open source, and I know you do depend on open source, will at some point be affected by one of these attacks, just because of the scale of NPM.
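
To get a feel for the dependency counts mentioned above in your own project, you can compare direct and transitive packages in a lock file. This is a rough sketch, assuming the `packages` map found in `package-lock.json` files with `lockfileVersion` 2 or 3, where the root project is stored under the empty-string key and installed packages under `node_modules/...` paths. The function name `summarizeLockfile` and the sample data are made up for illustration.

```javascript
// Summarizes a parsed package-lock.json (lockfileVersion 2 or 3 assumed).
// "direct" counts what the root project declares; "transitive" counts every
// other installed package, including nested copies.
function summarizeLockfile(lock) {
  const packages = lock.packages ?? {};
  const root = packages[""] ?? {};
  const direct = new Set([
    ...Object.keys(root.dependencies ?? {}),
    ...Object.keys(root.devDependencies ?? {}),
  ]);

  const prefix = "node_modules/";
  const installed = Object.keys(packages).filter((path) => path.startsWith(prefix));
  // A nested path such as "node_modules/a/node_modules/b" never equals a
  // direct dependency name, so it is counted as transitive.
  const transitive = installed.filter((path) => !direct.has(path.slice(prefix.length)));

  return {
    direct: direct.size,
    installed: installed.length,
    transitive: transitive.length,
  };
}

// Hypothetical lock file fragment used only to exercise the function.
const exampleLock = {
  lockfileVersion: 3,
  packages: {
    "": { dependencies: { express: "^4.18.2" } },
    "node_modules/express": { version: "4.18.2" },
    "node_modules/body-parser": { version: "1.20.1" },
    "node_modules/express/node_modules/debug": { version: "2.6.9" },
  },
};

console.log(summarizeLockfile(exampleLock));
```

In a real project you would pass in `JSON.parse` of the lock file contents instead of the sample object.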

2. Supply Chain Attacks and the Problem of Trust

Short description:

The SolarWinds hack was a sophisticated supply chain attack that compromised SolarWinds and impacted thousands of networks. NPM packages, often maintained by individuals or small teams, can also be vulnerable to supply chain attacks. I'll present a real example of a recent attack that exfiltrated environment variables. As developers, we rely on trust in open source, but it takes too long to detect malicious packages and they're often not catalogued for future reference.

And so how many of you have heard of the SolarWinds hack? This was pretty big news a few years ago. It was a sophisticated supply chain attack that compromised a supplier called SolarWinds. The way it worked was that an attacker got into SolarWinds' network and added their attack code into one of SolarWinds' software products. Downstream of that, they were able to get into thousands of networks of SolarWinds customers, including U.S. government agencies and large corporations. And while it's pretty hard to figure out the exact monetary damages of this attack, the costs associated with the investigation, the remediation, and the increased cybersecurity measures as a result were probably in the billions of dollars.

That's SolarWinds. That's a company that has a security team and puts a lot of effort into defending its products. Now let's talk about NPM packages, which are often maintained by individuals or small teams of volunteers. Sometimes software supply chain security can be kind of abstract, right? It's kind of like, what are we talking about here? So I wanted to make this really concrete for everyone and show you what a supply chain attack looks like.

So here's a real example. This is an attack that we detected a few days ago, and we're going to discuss what's going on here. Let me help you out a bit. I'll highlight a few parts of the code here. Does that help? Now if you look at this, you can see that, were a developer to install this package, this malicious code would immediately run in an install script. It would exfiltrate, or steal, their environment variables, which can include secrets, tokens, and keys, and then send them to an attacker-controlled server. You can see those three parts there: it's requiring the network package, it's accessing the environment variables, and then it has this sort of obfuscated or hidden network request down there on that third line. And this is really how a software supply chain attack can lead to a breach at a company.
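
The slide itself is not reproduced in this transcript, but the signals described above (an install script, a network module being loaded, and environment variables being read) can be looked for with a simple heuristic. The sketch below is an assumption-laden illustration and not Socket's actual detection method: the function name `scanPackage`, the regular expressions, and the sample input are all hypothetical, and an obfuscated package could easily evade checks this naive.

```javascript
// Naive text patterns for two of the signals described in the talk.
// Both are illustrative assumptions and easy to evade with obfuscation.
const NETWORK_REQUIRE = /require\(\s*['"](node:)?(https?|net|dns)['"]\s*\)/;
const ENV_ACCESS = /process\.env\b/;

// Takes a parsed package.json and the text of a source file, and reports
// which signals are present. "suspicious" is true only when all three
// appear together.
function scanPackage(manifest, source) {
  const scripts = manifest.scripts ?? {};
  const installHooks = ["preinstall", "install", "postinstall"].filter(
    (hook) => hook in scripts
  );
  const signals = {
    hasInstallScript: installHooks.length > 0,
    requiresNetwork: NETWORK_REQUIRE.test(source),
    readsEnvironment: ENV_ACCESS.test(source),
  };
  return {
    installHooks,
    ...signals,
    suspicious: Object.values(signals).every(Boolean),
  };
}

// Harmless sample input that merely contains the patterns being searched for.
const sampleManifest = { scripts: { postinstall: "node index.js" } };
const sampleSource = `
const https = require("https");
const payload = JSON.stringify(process.env);
// a request to a remote host would follow here
`;

console.log(scanPackage(sampleManifest, sampleSource));
```

Each signal on its own is common in legitimate packages, which is why the sketch only flags the combination; a real tool would need far more context than pattern matching provides.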

Now, this package was targeting probably Airbnb, given the name of the package, but honestly we don't really have all the details on what the goal of this package was, but the name is very suspicious, I'll just say that. And so fundamentally the problem here is that, you know, as developers we're using so many packages, but we just don't have the time to read every line of code in our dependencies. And so we're always trusting other people fundamentally, and open source is built on trust. And for the most part this trust is well placed. Most people are good. But there are a few bad apples. And unfortunately it does take us as a community a little bit too long to find these types of bad packages today. Right now we're looking at over 200 days to detect a malicious package as a community. And so, you know, this is pretty bad. This is from a research paper published in 2021. And the other big problem is when we find these malicious packages as a community, we report them and they get taken down, but they're often not catalogued and saved in any way. So they don't go into the typical vulnerability tracking systems like the National Vulnerability Database. They just get taken down and then no one knows whether or not they may have installed that package in the past.

