How to Help Agents Remember

LLMs need access to the relevant context to succeed. One key element of Context Engineering that we can use to help agents remember is Memory.

Let's dive into the world of memory management for agents. We'll discuss the difference between short-term and long-term memory, how to manage memory, and the importance of preventing context poisoning and context clash using techniques such as trimming and summarization.

This talk was presented at AI Coding Summit 2026.

FAQ

Memory is crucial for providing LLMs with relevant context to ensure they generate accurate and relevant results. It helps minimize hallucinations by offering the correct information needed for decision-making.

The types of memory discussed include sensory memory, short-term memory, and long-term memory. Sensory memory involves short-lived inputs from senses, short-term memory stores information temporarily, and long-term memory involves storing information for extended periods.

LLMs have a context window, which is the maximum number of tokens they can process at once. Challenges include potential overflow, confusion, distraction, and contradictions if irrelevant or excessive information is included.
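To make the constraint concrete, here is a minimal sketch of trimming a message history to fit a token budget, so the most recent turns survive and nothing overflows the window. The 4-characters-per-token estimate is a rough heuristic added for illustration, not from the talk; real systems should count tokens with the model's own tokenizer.

```typescript
// Keep the most recent messages that fit within a token budget.
type Message = { role: "system" | "user" | "assistant"; content: string };

// Rough heuristic: ~4 characters per token. Use the model's real
// tokenizer in production.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

function trimToBudget(messages: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest so recent turns survive the trim.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = approxTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```

Dropping the oldest turns first is the simplest policy; many systems also pin the system prompt so it is never trimmed away.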

LLMs hallucinate due to limited training data, outdated knowledge bases, overfitting, biases, language ambiguity, and catastrophic forgetting. They may also be rewarded for guessing rather than acknowledging uncertainty.

Techniques for managing memory include summarization, pruning, and context quarantine. These methods help control the size and relevance of stored information, reducing costs and improving efficiency.
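A summarization-based compaction pass might look like the sketch below: older turns are pruned into a single summary message and only the most recent turns are kept verbatim. The `summarize` function here is a stand-in added for illustration; in practice it would call an LLM to condense the older turns.

```typescript
type Msg = { role: string; content: string };

// Stand-in summarizer: a real implementation would prompt a model
// for a faithful summary of these turns.
function summarize(messages: Msg[]): string {
  return `Summary of ${messages.length} earlier messages.`;
}

// Replace everything except the last `keepRecent` turns with one
// summary message, shrinking the context while preserving gist.
function compactHistory(history: Msg[], keepRecent: number): Msg[] {
  if (history.length <= keepRecent) return history;
  const old = history.slice(0, history.length - keepRecent);
  const recent = history.slice(history.length - keepRecent);
  return [{ role: "system", content: summarize(old) }, ...recent];
}
```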

Semantic memory involves using semantic search to find relevant memories based on meaning and context, allowing agents to use past information effectively in decision-making.
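A minimal sketch of semantic recall, assuming memories have already been embedded into vectors. The vectors below are toy values for illustration; a real system would produce them with an embedding model and likely use a vector store rather than an in-memory array.

```typescript
type Memory = { text: string; vector: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the topK memories closest in meaning to the query vector.
function recall(store: Memory[], query: number[], topK: number): Memory[] {
  return [...store]
    .sort((m1, m2) => cosine(m2.vector, query) - cosine(m1.vector, query))
    .slice(0, topK);
}
```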

Short-term memory involves messages within the current session, while long-term memory stores messages for future retrieval. Long-term memory requires storing data in a way that can be accessed in new sessions.
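The distinction can be sketched as two stores: one scoped to the current session, and one keyed so that a brand-new session can retrieve it. The `Map` below is a stand-in for durable storage (disk, a database, or a vector store); the class and method names are illustrative, not from the talk.

```typescript
// Short-term memory: lives and dies with the session.
class SessionMemory {
  private messages: string[] = [];
  add(msg: string): void {
    this.messages.push(msg);
  }
  all(): string[] {
    return [...this.messages];
  }
}

// Long-term memory: keyed by user so a future session can recall it.
class LongTermStore {
  private byUser = new Map<string, string[]>();
  persist(userId: string, messages: string[]): void {
    const existing = this.byUser.get(userId) ?? [];
    this.byUser.set(userId, [...existing, ...messages]);
  }
  recall(userId: string): string[] {
    return this.byUser.get(userId) ?? [];
  }
}
```

At the end of a session, the agent persists what it wants to keep; at the start of the next one, it recalls those memories and folds them back into the prompt.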

Common storage options include disk and file systems, databases, and semantic memory systems that utilize semantic search for context retrieval.

Techniques like 'LLM as a Judge' involve using another LLM to evaluate responses for accuracy and detect hallucinations, helping to identify and correct potential issues.
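One way to structure an "LLM as a Judge" check is to build a grading prompt for a second model and treat anything other than an explicit pass as a failure. This is a sketch under my own assumptions: the prompt wording and verdict labels are illustrative, and `callModel` stands in for whatever client reaches your judge model.

```typescript
// Build the grading prompt the judge model will see.
function buildJudgePrompt(question: string, answer: string, context: string): string {
  return [
    "You are a strict evaluator. Using only the context, decide",
    "whether the answer is fully supported.",
    "Reply with exactly SUPPORTED or HALLUCINATED.",
    `Context: ${context}`,
    `Question: ${question}`,
    `Answer: ${answer}`,
  ].join("\n");
}

// Ask the judge model for a verdict; only an explicit SUPPORTED passes.
async function judgeAnswer(
  callModel: (prompt: string) => Promise<string>,
  question: string,
  answer: string,
  context: string,
): Promise<boolean> {
  const verdict = await callModel(buildJudgePrompt(question, answer, context));
  return verdict.trim() === "SUPPORTED";
}
```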

Long context windows can lead to confusion, distraction, poisoning, and conflicting information, which can degrade the model's ability to generate accurate responses.

Carly Richmond
20 min
26 Feb, 2026

Video Summary and Transcription
Carly discusses memory and context in LLMs, and how to manage information for effective results in agentic systems. LLMs hallucinate due to knowledge gaps, biases, overfitting, and training incentives, so providing the right context is key to minimizing hallucinations. Short-term memory acts like RAM, while long-term memory stores knowledge and experiences; long-term memory can be kept on disk, in file systems, or in semantic memory systems. The talk covers managing message context, retrieval, and concatenation for LLM input, with attention to context length, information evaluation, and memory optimization for accurate responses.

1. Memory and Context in Agentic Systems

Short description:

Carly discusses memory, context in LLMs, and managing information for effective results in agentic systems. Connect with Carly for further assistance.

Hi, everyone. It's so great to be online with you all. My name is Carly, and today I'm going to be talking to you about agents, and specifically memory as well. What we're going to cover is why we actually care about memory, what context LLMs need, and what happens when we don't have that relevant information. We're also going to talk about memory, short-term and long-term, both in humans and in agents, to show the parallels. We'll talk about semantic memory as an example, along with the various ways we can store long-term memory in agents. And then we'll talk about some memory management techniques that you might come across as you're starting to build out your own agentic systems.

So if you haven't met me before, it's lovely to meet you. My name is Carly Richmond. I lead the developer advocacy team within DevRel at Elastic. I've been there for just under four years, and before that I was a front-end engineer for 10. Now, if you do have questions, please come find me on Discord. I would love to help you out. But if you think of something afterwards and go, I really wish I'd asked Carly that, just scan the QR code and find me on whichever social you're on, and I'll be more than happy to do my best to help you.

So if you caught my last talk at AI Coding Summit last year, you would have heard me talk about context engineering, which is a set of practices and tools we have for managing the context window of a large language model when we build agentic systems. And for those who need a reminder about what the context window is: in simple terms, it's the maximum number of tokens an LLM can process at once. LLMs have a particular limit on the number of tokens they can effectively remember, and anything over that they basically won't factor into the results they generate for you. This is known as overflow, as you can see from the picture. But even with the larger context windows we're seeing from all of these LLMs, we also need to think about the quality of the information within a context window, to make sure we give the model the information it needs to perform the relevant tasks and generate the right results, and that we're not passing in information that could cause contradictions or lead to incorrect answers.

2. Challenges in LLMs and Model Hallucination

Short description:

LLMs hallucinate due to knowledge gaps, biases, overfitting, and incentives. Awareness of these issues crucial for accurate results.

So I don't know if you've caught the news or seen any particular situations where an LLM has given you back the wrong answer. I know I certainly have. LLMs make stuff up. This isn't newsworthy; it's something we've known about for a really long time. And LLMs hallucinate for several key reasons. Firstly, it's down to the knowledge base they've been trained on. So, for example, they might not be trained on the proprietary information that you're building agentic systems on top of to make critical decisions or to engage with users, and you want to make sure they're going to provide the correct answers on data they've not been trained on.

Additionally, if you're building something using an older model that perhaps has a knowledge cutoff date of, say, November 2025, and we try to ask it questions about what has been happening in politics over the last week, it's not going to know, and it's probably going to make up an answer, or it might, if you're lucky, say, I don't know. When it comes to model choices, we also have issues such as overfitting, in machine learning terms. This is basically where the parameters of a model are fitted too tightly and don't have the flex to perform the task we've asked the particular model to do. The third is around biases. This, again, is something you might have come across: biases inherent in the data sets these particular models have been trained on, not just in terms of gender but other characteristics too, can lead to them producing answers that are not necessarily quite right. And if they actually discriminate, that can land us in hot water with regard to reputation or legal issues.

There's the fact that the English language, and indeed sometimes other languages, have ambiguity within them, and that can confuse an LLM, although it's becoming less and less common. And then there's this interesting thing called catastrophic forgetting, which is where an LLM spontaneously forgets large swathes of training data and something goes absolutely disastrously wrong. But one of the other things we need to factor in when it comes to hallucination is that models hallucinate because they have been incentivised to. If you look at this paper from some of the researchers at OpenAI and Georgia Tech from September last year, you'll see that they put forward the case that training and evaluation procedures reward an LLM for guessing and giving us an answer over acknowledging uncertainty and simply saying, I don't know. And we need to be on the lookout for these.
