Stop Paying for AI APIs: npm Install Your Way to In-Process Inference

Bookmark
Rate this content
Sentry
Promoted
Code breaks, fix it faster

Crashes, slowdowns, regressions in prod. Seer by Sentry unifies traces, replays, errors, profiles to find root causes fast.

Every Node.js developer adding AI to their apps faces the same choice: pay for external APIs or wrestle with some local inference like Ollama (that also requires API calls). But there's a third option nobody's talking about: running ML inference *inside* your Node.js process with Transformers.js. In this talk, I'll show you how to generate embeddings, classify text, and run LLMs with nothing more than `npm install`. No API keys, no network latency, no separate processes. Just JavaScript doing machine learning the way it should: simple, fast, and fully under your control.

This talk has been presented at Node Congress 2026, check out the latest edition of this JavaScript Conference.

FAQ

Ed Silva is a Node.js core collaborator in the developer relations team, focusing on connecting with the community.

Ed Silva works for CodeMirror4UQ, a software boutique specialized in Node.js, JavaScript, Ruby on Rails, and Golang.

The main topic of Ed Silva's talk at Node Congress 2026 is AI and its implementation in software development, particularly using Node.js and Hugging Face Transformers.

Hugging Face is a community platform with models, datasets, and documentation related to AI, often referred to as the 'GitHub of AIs.'

Hugging Face Transformers JS is a library that allows execution of AI models in the browser and Node.js, supporting tasks like NLP, computer vision, and audio processing.

Yes, Hugging Face Transformers JS can be used for local inference, allowing execution of AI models on local machines without external services or network calls.

Some use cases for Transformers JS include sentiment analysis on product reviews, semantic queries on documents, and local text generation using models like LLAMA.

Yes, Hugging Face Transformers JS supports GPU execution using the WebGPU API, although this feature is still in preview for Node.js.

The demos from the talk can be found on Ed Silva's GitHub, which includes examples of sentiment analysis, semantic queries, and text generation.

Trade-offs of using local inference include disk space usage for model downloads, potential blocking of the main thread during execution, and the need for sufficient hardware resources.

Edy Silva
Edy Silva
26 min
26 Mar, 2026

Comments

Sign in or register to post your comment.
Video Summary and Transcription
Introduction by Ed Silva, a Node.js core collaborator, discussing the significance of AI in 2026. Companies facing challenges in AI implementation, focusing on AI integration and the need for developer skills. Demonstrations of Node.js egg cooking using Hug and Face community and model inference with Hug and Face Transformers. Transformers JS extending NLP to computer vision and audio tasks, utilizing O-N-N-X format for model execution. Tasks and examples available in Transformers.js for NLP and computer vision, emphasizing model differentiation and execution processes. Optimization techniques for model download trade-offs, GPU utilization, and experimentation possibilities with models like Hugging Face.
Video transcription and chapters available for users with access.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Node Congress 2022Node Congress 2022
26 min
It's a Jungle Out There: What's Really Going on Inside Your Node_Modules Folder
Top Content
The talk discusses the importance of supply chain security in the open source ecosystem, highlighting the risks of relying on open source code without proper code review. It explores the trend of supply chain attacks and the need for a new approach to detect and block malicious dependencies. The talk also introduces Socket, a tool that assesses the security of packages and provides automation and analysis to protect against malware and supply chain attacks. It emphasizes the need to prioritize security in software development and offers insights into potential solutions such as realms and Deno's command line flags.
ESM Loaders: Enhancing Module Loading in Node.js
JSNation 2023JSNation 2023
22 min
ESM Loaders: Enhancing Module Loading in Node.js
Top Content
ESM Loaders enhance module loading in Node.js by resolving URLs and reading files from the disk. Module loaders can override modules and change how they are found. Enhancing the loading phase involves loading directly from HTTP and loading TypeScript code without building it. The loader in the module URL handles URL resolution and uses fetch to fetch the source code. Loaders can be chained together to load from different sources, transform source code, and resolve URLs differently. The future of module loading enhancements is promising and simple to use.
The State of Node.js 2025
JSNation 2025JSNation 2025
30 min
The State of Node.js 2025
Top Content
The speaker covers a wide range of topics related to Node.js, including its resilience, popularity, and significance in the tech ecosystem. They discuss Node.js version support, organization activity, development updates, enhancements, and security updates. Node.js relies heavily on volunteers for governance and contribution. The speaker introduces an application server for Node.js enabling PHP integration. Insights are shared on Node.js downloads, infrastructure challenges, software maintenance, and the importance of update schedules for security.
Towards a Standard Library for JavaScript Runtimes
Node Congress 2022Node Congress 2022
34 min
Towards a Standard Library for JavaScript Runtimes
Top Content
There is a need for a standard library of APIs for JavaScript runtimes, as there are currently multiple ways to perform fundamental tasks like base64 encoding. JavaScript runtimes have historically lacked a standard library, causing friction and difficulty for developers. The idea of a small core has both benefits and drawbacks, with some runtimes abusing it to limit innovation. There is a misalignment between Node and web browsers in terms of functionality and API standards. The proposal is to involve browser developers in conversations about API standardization and to create a common standard library for JavaScript runtimes.
Out of the Box Node.js Diagnostics
Node Congress 2022Node Congress 2022
34 min
Out of the Box Node.js Diagnostics
This talk covers various techniques for getting diagnostics information out of Node.js, including debugging with environment variables, handling warnings and deprecations, tracing uncaught exceptions and process exit, using the v8 inspector and dev tools, and generating diagnostic reports. The speaker also mentions areas for improvement in Node.js diagnostics and provides resources for learning and contributing. Additionally, the responsibilities of the Technical Steering Committee in the TS community are discussed.
Node.js Compatibility in Deno
Node Congress 2022Node Congress 2022
34 min
Node.js Compatibility in Deno
Deno aims to provide Node.js compatibility to make migration smoother and easier. While Deno can run apps and libraries offered for Node.js, not all are supported yet. There are trade-offs to consider, such as incompatible APIs and a less ideal developer experience. Deno is working on improving compatibility and the transition process. Efforts include porting Node.js modules, exploring a superset approach, and transparent package installation from npm.

Workshops on related topic

Building a RAG System in Node.js: Vector Databases, Embeddings & Chunking
Node Congress 2025Node Congress 2025
98 min
Building a RAG System in Node.js: Vector Databases, Embeddings & Chunking
Featured Workshop
Alex Korzhikov
Pavlik Kiselev
2 authors
Large Language Models (LLMs) are powerful, but they often lack real-time knowledge. Retrieval-Augmented Generation (RAG) bridges this gap by fetching relevant information from external sources before generating responses. In this workshop, we’ll explore how to build an efficient RAG pipeline in Node.js using RSS feeds as a data source. We’ll compare different vector databases (FAISS, pgvector, Elasticsearch), embedding methods, and testing strategies. We’ll also cover the crucial role of chunking—splitting and structuring data effectively for better retrieval performance.Prerequisites- Good understanding of JavaScript or TypeScript- Experience with Node.js and API development- Basic knowledge of databases and LLMs is helpful but not required
Agenda📢 Introduction to RAG💻 Demo - Example Application (RAG with RSS Feeds)📕 Vector Databases (FAISS, pgvector, Elasticsearch) & Embeddings🛠️ Chunking Strategies for Better Retrieval🔬 Testing & Evaluating RAG Pipelines (Precision, Recall, Performance)🏊‍♀️ Performance & Optimization Considerations🥟 Summary & Q&A
Build a MCP (Model Context Protocol) in Node.js
JSNation US 2025JSNation US 2025
97 min
Build a MCP (Model Context Protocol) in Node.js
Featured Workshop
Julián Duque
Julián Duque
Model Context Protocol (MCP) introduces a structured approach to LLM context management that addresses limitations in traditional prompting methods. In this workshop, you'll learn about the Model Context Protocol, its architecture, and how to build and use and MCP with Node.jsTable of Contents:What Is the Model Context Protocol?Types of MCPs (Stdio, SSE, HTTP Streaming)Understanding Tools, Resources, and PromptsBuilding an MCP with the Official TypeScript SDK in Node.jsDeploying the MCP to the Cloud (Heroku)Integrating the MCP with Your Favorite AI Tool (Claude Desktop, Cursor, Windsurf, VS Code Copilot)Security Considerations and Best Practices
Node.js Masterclass
Node Congress 2023Node Congress 2023
109 min
Node.js Masterclass
Top Content
Workshop
Matteo Collina
Matteo Collina
Have you ever struggled with designing and structuring your Node.js applications? Building applications that are well organised, testable and extendable is not always easy. It can often turn out to be a lot more complicated than you expect it to be. In this live event Matteo will show you how he builds Node.js applications from scratch. You’ll learn how he approaches application design, and the philosophies that he applies to create modular, maintainable and effective applications.

Level: intermediate
Build and Deploy a Backend With Fastify & Platformatic
JSNation 2023JSNation 2023
104 min
Build and Deploy a Backend With Fastify & Platformatic
Top Content
WorkshopFree
Matteo Collina
Matteo Collina
Platformatic allows you to rapidly develop GraphQL and REST APIs with minimal effort. The best part is that it also allows you to unleash the full potential of Node.js and Fastify whenever you need to. You can fully customise a Platformatic application by writing your own additional features and plugins. In the workshop, we’ll cover both our Open Source modules and our Cloud offering:- Platformatic OSS (open-source software) — Tools and libraries for rapidly building robust applications with Node.js (https://oss.platformatic.dev/).- Platformatic Cloud (currently in beta) — Our hosting platform that includes features such as preview apps, built-in metrics and integration with your Git flow (https://platformatic.dev/). 
In this workshop you'll learn how to develop APIs with Fastify and deploy them to the Platformatic Cloud.
Building a Hyper Fast Web Server with Deno
JSNation Live 2021JSNation Live 2021
156 min
Building a Hyper Fast Web Server with Deno
Top Content
Workshop
Matt Landers
Will Johnston
2 authors
Deno 1.9 introduced a new web server API that takes advantage of Hyper, a fast and correct HTTP implementation for Rust. Using this API instead of the std/http implementation increases performance and provides support for HTTP2. In this workshop, learn how to create a web server utilizing Hyper under the hood and boost the performance for your web apps.
0 to Auth in an Hour Using NodeJS SDK
Node Congress 2023Node Congress 2023
63 min
0 to Auth in an Hour Using NodeJS SDK
WorkshopFree
Asaf Shen
Asaf Shen
Passwordless authentication may seem complex, but it is simple to add it to any app using the right tool.
We will enhance a full-stack JS application (Node.JS backend + React frontend) to authenticate users with OAuth (social login) and One Time Passwords (email), including:- User authentication - Managing user interactions, returning session / refresh JWTs- Session management and validation - Storing the session for subsequent client requests, validating / refreshing sessions
At the end of the workshop, we will also touch on another approach to code authentication using frontend Descope Flows (drag-and-drop workflows), while keeping only session validation in the backend. With this, we will also show how easy it is to enable biometrics and other passwordless authentication methods.
Table of contents- A quick intro to core authentication concepts- Coding- Why passwordless matters
Prerequisites- IDE for your choice- Node 18 or higher