Scaling Distributed Machine Learning, to the Edge & Back

This talk will cover why and how organizations are distributing data storage and machine learning to the edge. By pushing machine learning to the edge, we can geographically distribute learning so that the models will actually learn different things relevant to specific locations. By delivering both edge database and compute in a single platform, more people can transition to a distributed architecture. The performance gains from this new architecture cement the value that mobile edge computing brings.

This talk was presented at JSNation 2023.

FAQ

HarperDB, where Jaxon Repp works, is a Distributed Application Platform. It is built entirely in Node.js, leveraging JavaScript to provide benefits like simplicity and resource availability for distributed applications, including machine learning.

JavaScript, being ubiquitous on client devices and web browsers, is a strategic choice for deploying machine learning models closer to users. Libraries like TensorFlow.js allow JavaScript to handle machine learning tasks, facilitating edge computing by running models directly on user devices.

The deployment of machine learning models involves training the model with data, testing and validating its performance, and finally deploying it to production where it can provide actionable insights or predictions.

Hierarchical knowledge in machine learning involves processing data at multiple levels, similar to how the human brain processes sensory inputs. JavaScript, due to its flexibility and widespread use, is ideal for refining and deploying machine learning models across various layers, from cloud servers down to edge devices.

Machine learning at the edge faces challenges such as limited computational power, the need for data privacy, and the necessity to handle diverse and localized data. Efficient model training and inference must be balanced with these constraints to ensure performance and user satisfaction.

HarperDB integrates machine learning capabilities by providing a platform for data storage, model training, and distribution. It simplifies the complexity of managing distributed systems, allowing models to be replicated and retrained across various nodes effectively.

TensorFlow.js is a JavaScript library that allows for the training and deployment of machine learning models directly in the browser or on Node.js. It enables JavaScript developers to utilize machine learning without needing a background in Python, making ML more accessible to a broader range of developers.

Using JavaScript for machine learning offers several benefits, including ease of integration with web applications, a large community of developers, and the ability to run on virtually any device. This accessibility makes it easier to deploy and scale machine learning models across different platforms.

Jaxon Repp
21 min
05 Jun, 2023

Video Summary and Transcription
This talk explores JavaScript's role in distributed machine learning at scale, discussing the lack of tooling and the accessibility of machine learning deployments. It also covers cloud-based machine learning architecture, machine learning at the edge, and the use of HarperDB for simplified machine learning deployment. The concept of iterative AI and model training is also discussed.

1. Introduction to JavaScript ML

Short description:

Hi, welcome to my talk for JSNation, entitled "To the Edge and Back: JavaScript's Role in Distributed ML at Scale." I am a recovering developer, father of two daughters, based in Denver, Colorado. I work for HarperDB, a Distributed Application Platform built entirely in Node.js. Today, I will explore the JavaScript machine learning ecosystem, tactical architecture, and systems and methods for delivering performant access to machine learning and AI.

Hi, welcome to my talk for JSNation, entitled "To the Edge and Back: JavaScript's Role in Distributed ML at Scale." My name's Jaxon Repp. I am a recovering developer, father of two daughters. I'm based in Denver, Colorado. I've been a part of eight startups, so I've had two exits, five what I call opportunities for learning. And now I work for HarperDB, which is a Distributed Application Platform. We've been around six years and we've got a lot of production deployments and a fairly robust community.

So when I talk about HarperDB as the place I work, I think what's of more interest to JSNation is the fact that we are, in fact, built entirely in Node.js. So we've leveraged the language you already love. And it was one of those things where we looked around and we could have chosen any language, but we realized there were tremendous benefits in terms of simplicity and availability of resources and deployment platforms. Where can JavaScript run? So we love to focus on the JavaScript community, and machine learning is obviously one of those things that has expanded dramatically in the very recent past. And how does that get done? What are the logistics behind it? And that's what I wanted to explore today.

So the syllabus for this course, I guess, would be understanding the JavaScript machine learning ecosystem. What are the resources we have available to us to build these amazing, cool technologies that function out maybe closer to the user, leveraging a language we all love. And then we have a section called tactical architecture, which is sort of how people do it now or how people did it in the past and where we think it's going over time. How do we continue to deliver performant access to machine learning and AI and these incredibly complex models when running them takes so much horsepower and you don't necessarily have all of the horsepower in the world sitting on your phone or perhaps, you know, in a browser. And finally systems and methods. So how can we approach this problem? What are the considerations we need to have in mind or keep in mind when we're planning a system that is truly distributed and iterative as I'll sort of outline what those architectures look like?

2. Machine Learning Tooling and Tactical Architecture

Short description:

People become aware of machine learning and its potential applications. However, the lack of tooling requires developers to write low-level code to train models and build applications. With the right infrastructure, machine learning deployments become more accessible. ChatGPT has gained significant attention and offers a comprehensive and fast solution. JavaScript is a great choice for pushing machine learning to the edge, with libraries like TensorFlow.js and mobile platforms like CoreML and ML Kit. The hierarchical nature of accessing data suggests opportunities for cloud, near edge, far edge, and mobile deployments. The tactical architecture involves training, testing, and deploying models.

First, people become aware of it, right? They know that machine learning is a thing. They know that it can help them identify stuff in a photo, or they know they can make recommendations using it. But the tooling isn't there. So you're out writing super low-level code to train a model, to build something that can act on user input and give you a recommendation or a classification or accomplish whatever that end goal might be.

And then the infrastructure gets built out behind that to support stuff that we are now capable of deploying because we have the tooling. And with that infrastructure, deployments become more available, which obviously you can roll out to a wider audience, and then it starts to get. So if you look at awareness, the number one thing that everybody's talking about is ChatGPT, to the point that the last three weeks of earnings calls have included mentions of AI and ChatGPT in products that didn't even seem like they would take advantage of them, because the stock price goes up, because everybody's so excited and aware. And ultimately, we want to deliver this product, this solution, this result. And it's simple, accessible, comprehensive and fast. And ChatGPT nailed all of those things. And it's tremendous. If you've ever used it, you know that there's usually a wait to get in line, and commercial accounts are hard to come by and expensive, because it takes a tremendous amount of resources to do something as impressive as what ChatGPT does. Now, obviously it's also a little terrifying in terms of the scope of what it can do. It's a very large model that's been trained on lots of pieces of data, and not everybody needs to deploy a fully comprehensive human-speaking chat engine, but there are a million other applications for machine learning, especially at the edge, that can leverage a lot of the best practices that ChatGPT put in front of us in terms of accessibility.

We look at the tooling then that we have to continue to push this logic out to the edge, right? How do we get closer to those users? And JavaScript obviously, being on every client device and running just about everywhere, is a great choice for that. And while machine learning and machine learning models and AI have traditionally been, you know, on servers with lots of power, a la ChatGPT training a giant model, there's lots of libraries available. TensorFlow.js is the JavaScript cousin to kind of the king of machine learning platforms sponsored by Google. But you've also got lots of other platforms that are available to take data in, generate a model, and ultimately push that out and run it on the edge, as well as mobile platforms like CoreML and CreateML on iOS and ML Kit for Android. So there's lots of ways to push this out as far as you can. Now, again, you have horsepower that's required to ultimately create and use models, so it really depends where you're going to do it. Traditionally, we've done this in the cloud, right? We run a big server with lots of GPU, and we build big models. And then we set up infrastructure on the edge or in another cloud region to leverage that model, take requests from inbound clients, and to take their data and run it against the model and get some sort of a classification or resulting dataset out of it. But as we continue to look at just the hierarchical nature of, say, how we access data, there's probably an opportunity for bifurcation or trifurcation, a division of responsibilities across cloud to the near edge, i.e. the servers that are just in regions closer to you, the far edge, i.e. AWS local zones or on-prem, things that are very, very close to you. And then finally, things you're actually carrying around with you, a mobile app or a browser on your phone or running on a laptop.
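To make the division of responsibilities across tiers a little more concrete, here is a minimal TypeScript sketch of how a request might be routed to the closest tier that can serve it. It is only an illustration: the tier names follow the talk, but the `TierProfile` shape, the `chooseTier` function, and every latency and model-size number are hypothetical assumptions, not anything HarperDB or the speaker describes.

```typescript
// Tiers named in the talk, from the user's own device out to the cloud.
type Tier = "device" | "far-edge" | "near-edge" | "cloud";

// Hypothetical description of what each tier can offer.
interface TierProfile {
  tier: Tier;
  roundTripMs: number; // assumed typical network round trip to reach this tier
  maxModelMb: number;  // assumed largest model this tier can reasonably serve
}

// Illustrative numbers only; real values depend entirely on the deployment.
// The list is ordered from closest to the user to farthest away.
const TIERS: TierProfile[] = [
  { tier: "device", roundTripMs: 0, maxModelMb: 50 },
  { tier: "far-edge", roundTripMs: 10, maxModelMb: 500 },
  { tier: "near-edge", roundTripMs: 40, maxModelMb: 5000 },
  { tier: "cloud", roundTripMs: 120, maxModelMb: Infinity },
];

// Pick the closest tier that can hold the model and still meet the latency budget.
// Returns null when no tier satisfies both constraints.
function chooseTier(modelMb: number, latencyBudgetMs: number): Tier | null {
  for (const profile of TIERS) {
    const fits = modelMb <= profile.maxModelMb;
    const fastEnough = profile.roundTripMs <= latencyBudgetMs;
    if (fits && fastEnough) return profile.tier;
  }
  return null;
}

// Example: a small model with a tight budget stays on the device,
// while a larger one has to move outward to a tier with more horsepower.
const smallModelTier = chooseTier(20, 5);
const largeModelTier = chooseTier(200, 50);
```

The point of ordering the list from closest to farthest is that the first match is always the tier nearest the user, which mirrors the idea of pushing logic out as far as the available horsepower allows.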
So there's lots of things that needed to be put in place and have that tooling so that we could actually deliver the results at a more local level. So we look at a tactical architecture, again, the basics are we want to train a model, we want to test it and validate that it works, and then we want to deploy it. We want to put that out there and have it actually start doing things for us.
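The train, test, and deploy loop described above can be sketched in a few lines of plain TypeScript. This is a deliberately tiny stand-in, assuming a one-variable linear model fitted by gradient descent rather than anything a real library such as TensorFlow.js would produce; the names `train`, `meanSquaredError`, and `deploy`, the learning rate, and the error threshold are all hypothetical choices made for illustration.

```typescript
type Sample = { x: number; y: number };

interface LinearModel {
  weight: number;
  bias: number;
}

// Train: fit y ≈ weight * x + bias with plain batch gradient descent.
function train(data: Sample[], epochs = 1000, learningRate = 0.05): LinearModel {
  let weight = 0;
  let bias = 0;
  for (let epoch = 0; epoch < epochs; epoch++) {
    let gradWeight = 0;
    let gradBias = 0;
    for (const { x, y } of data) {
      const error = weight * x + bias - y;
      gradWeight += (2 * error * x) / data.length;
      gradBias += (2 * error) / data.length;
    }
    weight -= learningRate * gradWeight;
    bias -= learningRate * gradBias;
  }
  return { weight, bias };
}

// Test and validate: mean squared error on samples the model never saw.
function meanSquaredError(model: LinearModel, data: Sample[]): number {
  let total = 0;
  for (const { x, y } of data) {
    const error = model.weight * x + model.bias - y;
    total += error * error;
  }
  return total / data.length;
}

// Deploy: only hand out a predict function if validation passed.
// The JSON round trip stands in for copying the model to another node.
function deploy(
  model: LinearModel,
  validationError: number,
  maxError: number
): ((x: number) => number) | null {
  if (validationError > maxError) return null;
  const shipped: LinearModel = JSON.parse(JSON.stringify(model));
  return (x) => shipped.weight * x + shipped.bias;
}

// Synthetic data following y = 2x + 1, split into training and hold-out sets.
const trainingSet: Sample[] = [0, 1, 2, 3, 4, 5].map((x) => ({ x, y: 2 * x + 1 }));
const holdOutSet: Sample[] = [6, 7].map((x) => ({ x, y: 2 * x + 1 }));

const model = train(trainingSet);
const validationError = meanSquaredError(model, holdOutSet);
const predict = deploy(model, validationError, 0.01);
```

Gating `deploy` on the validation error is one simple way to express "validate that it works" before a model starts serving requests; a real pipeline would likely add versioning and rollback on top.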
