This talk digs into what it really takes to run LLMs efficiently on mobile hardware in a React Native environment. We’ll examine the constraints teams face (memory limits, model loading strategies, inference performance, platform-specific APIs) and how they shape real-world product decisions. From there, we’ll introduce a React Native library that provides two complementary ways to integrate on-device AI: cross-platform, state-of-the-art models that run locally on both Android and iOS, and a dedicated path for leveraging Apple Intelligence capabilities on supported iOS devices.
We’ll walk through the architecture, usage patterns, and trade-offs of each approach, and discuss best practices for delivering smooth, low-latency AI experiences without relying on the cloud. We’ll cover both the native libraries for mobile AI and LLM inference and the available React Native wrappers, comparing their trade-offs, capabilities, hardware compatibility, model format compatibility, and compile-time model optimizations (operator fusion, vectorization, memory planning, hardware-specific accelerated operations), so that the audience can pick the best solution for their specific use cases.
This talk was presented at React Summit US 2026. Check out the latest edition of this React conference.




















