How LLMs “Remember”

Context windows, reflection loops, and vector memory for practical agents.

Figure: Agent memory flow. The reflection module writes embedded reflections to a vector memory database; the context builder retrieves them to condition the LLM prompt.

1️⃣ Overview: How LLMs “Remember”

An LLM has no persistent memory between calls: at inference time it sees only its fixed weights and whatever is in the context window. For an agent to reflect or learn across tasks, it needs an external memory store that holds past experiences and returns them later.

That’s where vector databases such as Chroma, Milvus, or BilberryDB come in: they serve as long-term associative memory, returning stored items by similarity to a query rather than by exact key.
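
To make the context-window limit concrete, here is a minimal sketch of how older messages fall out of view once a token budget is reached. The names `count_tokens` and `fit_to_window` are hypothetical, and counting whitespace-separated words is only a rough stand-in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Rough token estimate: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined size fits in max_tokens."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is invisible to the model
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order


history = [
    "Lesson: verify sources before finalizing summaries.",
    "User: summarize the quarterly report.",
    "Assistant: here is the summary.",
    "User: now summarize the follow-up memo.",
]

visible = fit_to_window(history, max_tokens=16)
print(visible)
```

With a budget of 16 "tokens", the oldest entry (the lesson) no longer fits, which is exactly the kind of loss an external memory store is meant to compensate for.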


2️⃣ Mapping Memory Types to Storage

| Human Memory | Agentic Equivalent | Stored In | Purpose |
|---|---|---|---|
| Semantic memory | The LLM’s pretrained weights | In-model | Static knowledge learned from the training corpus |
| Episodic memory | Past experiences, reflections, outcomes | Vector DB (e.g., Chroma, Milvus, BilberryDB) | Retrieve “lessons” or “context” relevant to new tasks |
| Procedural memory | Patterns for reasoning, reflection, or acting | Prompt scaffolds / tool logic | Encoded as reusable routines (“think → act → reflect”) |

For reflection loops, episodic memory is the dynamic one: it grows and changes with every task the agent completes.
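
One way to picture the mapping is as plain data structures. This is a sketch under assumptions: `Episode`, `EpisodicMemory`, and `THINK_ACT_REFLECT` are hypothetical names, and a Python list stands in for the vector DB. Semantic memory has no counterpart here because it lives in the model’s weights.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Procedural memory: a reusable routine, here reduced to a prompt scaffold.
THINK_ACT_REFLECT = (
    "Task: {task}\n"
    "Think: reason about the task step by step.\n"
    "Act: choose a tool or produce an answer.\n"
    "Reflect: state one lesson for next time."
)


@dataclass
class Episode:
    """One episodic record: which task it came from and what was learned."""
    task_id: str
    reflection: str
    tags: list[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class EpisodicMemory:
    """Episodic memory: gains one record per completed task."""
    episodes: list[Episode] = field(default_factory=list)

    def record(self, task_id: str, reflection: str, tags: tuple[str, ...] = ()) -> Episode:
        episode = Episode(task_id, reflection, list(tags))
        self.episodes.append(episode)
        return episode


memory = EpisodicMemory()
memory.record(
    "task-001",
    "Verify sources before finalizing summaries.",
    ("summarization", "verification"),
)
prompt = THINK_ACT_REFLECT.format(task="Summarize the follow-up memo")
print(len(memory.episodes), "episode(s) stored")
print(prompt)
```

The scaffold is the same on every run, while the episode list is the part that changes as the agent works.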


3️⃣ How It Works (Step by Step)

Here’s a typical flow for a ReAct + Reflexion agent using a vector DB:

  1. Experience: The agent completes a task and generates reflections: “Lesson: Verify sources before finalizing summaries.”
  2. Embed Reflection: Convert that reflection into a vector with an embedding model (e.g., text-embedding-3-small).
  3. Store in Vector DB: Save the vector with metadata (task ID, timestamp, tags like “summarization,” “verification”).
  4. Query Later: Before a new task, query for reflections most similar to the current context.
  5. Condition Prompt: Inject retrieved reflections into the model’s context to guide reasoning.

That’s how agents “remember” past reasoning, and how reflection loops stay grounded in experience.
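
The five steps can be sketched end to end in plain Python. This is a toy under stated assumptions: `embed` is a word-count vector standing in for a real embedding model such as text-embedding-3-small, and `ReflectionMemory` is a hypothetical in-process list standing in for a vector DB. No real embedding API or database client is called.

```python
import math
import re
import time
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Toy embedding (assumption): L2-normalized word counts as a sparse vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity; inputs are already unit length, so a dot product suffices."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


class ReflectionMemory:
    """Hypothetical stand-in for a vector DB collection."""

    def __init__(self) -> None:
        self._rows: list[tuple[dict[str, float], str, dict]] = []

    def add(self, text: str, **metadata) -> None:
        # Steps 2 and 3: embed the reflection, then store it with metadata.
        metadata.setdefault("timestamp", time.time())
        self._rows.append((embed(text), text, metadata))

    def query(self, text: str, k: int = 2) -> list[tuple[str, dict]]:
        # Step 4: rank stored reflections by similarity to the current context.
        q = embed(text)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [(row[1], row[2]) for row in ranked[:k]]


def build_prompt(task: str, reflections: list[str]) -> str:
    # Step 5: inject the retrieved reflections ahead of the task.
    lessons = "\n".join(f"- {text}" for text in reflections)
    return f"Lessons from past tasks:\n{lessons}\n\nCurrent task: {task}"


memory = ReflectionMemory()
# Step 1: reflections produced after earlier tasks.
memory.add("Lesson: Verify sources before finalizing summaries.",
           task_id="t1", tags=["summarization", "verification"])
memory.add("Lesson: Pin dependency versions before running a build.",
           task_id="t2", tags=["build"])

task = "Summarize the report and verify its sources."
top = memory.query(task, k=1)
prompt = build_prompt(task, [text for text, _ in top])
print(prompt)
```

Word-count vectors only match exact words, so this toy would miss paraphrases that a learned embedding model is designed to catch; the control flow around it is the part that carries over.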


4️⃣ Differences Between Chroma, Milvus, BilberryDB

| DB | Strength | Best Use |
|---|---|---|
| Chroma | Simple, local, great for prototypes | Single-agent reflection storage or desktop tools |
| Milvus | Distributed, scalable | Multi-agent systems or long-term learning environments |
| BilberryDB | Memory-native LLM integration (recent) | Agents needing recall and reasoning synchronization |

In short:

  • Use Chroma to learn the concept.
  • Use Milvus when scaling persistent agent memory.
  • Use BilberryDB to experiment with higher-level, cognitive-style memory.
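
If you expect to start on one database and move to another, it may help to keep the agent code behind a small interface. The sketch below is an assumption, not any vendor’s API: `ReflectionStore`, `InMemoryStore`, and `recall_lessons` are hypothetical names, and no real Chroma, Milvus, or BilberryDB client calls are shown. An adapter for each database would implement the same two methods.

```python
from typing import Protocol


class ReflectionStore(Protocol):
    """The only surface the agent depends on (hypothetical interface)."""

    def add(self, text: str, metadata: dict) -> None: ...

    def query(self, text: str, k: int) -> list[str]: ...


class InMemoryStore:
    """Prototype backend: a list, ranked by how many words an item shares with the query."""

    def __init__(self) -> None:
        self._items: list[tuple[str, dict]] = []

    def add(self, text: str, metadata: dict) -> None:
        self._items.append((text, metadata))

    def query(self, text: str, k: int) -> list[str]:
        wanted = set(text.lower().split())
        scored = [
            (len(wanted & set(item.lower().split())), item)
            for item, _ in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Drop items that share no words with the query.
        return [item for score, item in scored[:k] if score > 0]


def recall_lessons(store: ReflectionStore, task: str, k: int = 3) -> list[str]:
    """Agent-side code: works with any backend that satisfies ReflectionStore."""
    return store.query(task, k)


store = InMemoryStore()
store.add("verify sources before finalizing summaries", {"tags": ["summarization"]})
store.add("pin dependency versions before a build", {"tags": ["build"]})

lessons = recall_lessons(store, "draft summaries and verify sources")
print(lessons)
```

Swapping backends then means writing one new adapter class rather than touching the reasoning or reflection code.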

The full loop, from task input to final output:

                   ┌────────────────────────────┐
                   │        User / Task Input   │
                   └────────────┬───────────────┘
                                │
                                ▼
                      ┌────────────────────┐
                      │  REASONING MODULE  │
                      │  (LLM, CoT, ReAct) │
                      └────────────────────┘
                                │
                                ▼
                      ┌────────────────────┐
                      │   ACTION MODULE    │
                      │  (tools, search,   │
                      │   code, etc.)      │
                      └────────────────────┘
                                │
                                ▼
                      ┌────────────────────┐
                      │  OBSERVATION LOOP  │
                      │ (results, feedback)│
                      └────────────────────┘
                                │
                                ▼
                ┌────────────────────────────────┐
                │        REFLECTION MODULE        │
                │  critique → summarize → embed   │
                └────────────────────────────────┘
                                │
                                ▼
        ┌─────────────────────────────────────────────────┐
        │            VECTOR MEMORY DATABASE               │
        │ (Chroma / Milvus / BilberryDB)                  │
        │   • store reflection embeddings + metadata       │
        │   • similarity search for related experiences    │
        └─────────────────────────────────────────────────┘
                                ▲
                                │
              retrieval of relevant reflections
                                │
                                ▼
                      ┌────────────────────┐
                      │ CONTEXT BUILDER    │
                      │ (inject past       │
                      │ reflections, facts)│
                      └────────────────────┘
                                │
                                ▼
                      ┌────────────────────┐
                      │   LLM PROMPT       │
                      │  (reason + reflect)│
                      └────────────────────┘
                                │
                                ▼
                   ┌────────────────────────────┐
                   │   FINAL OUTPUT / ACTION    │
                   └────────────────────────────┘
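
The diagram’s control flow can be sketched with stubbed components. Everything here is a stand-in chosen for illustration: `fake_llm` and `fake_tool` are hypothetical stubs, memory is a plain list, and retrieval is simple word overlap instead of a similarity search in a vector DB.

```python
from typing import Callable


def build_context(task: str, lessons: list[str]) -> str:
    """Context builder: put retrieved lessons ahead of the task."""
    if not lessons:
        return f"Task: {task}"
    bullets = "\n".join(f"- {lesson}" for lesson in lessons)
    return f"Relevant lessons:\n{bullets}\n\nTask: {task}"


def run_task(
    task: str,
    llm: Callable[[str], str],
    tool: Callable[[str], str],
    memory: list[str],
) -> tuple[str, str]:
    # Retrieval: keep stored reflections that share a word with the task.
    task_words = set(task.lower().split())
    lessons = [m for m in memory if task_words & set(m.lower().split())]

    prompt = build_context(task, lessons)           # context builder
    plan = llm(prompt)                              # reasoning module
    observation = tool(plan)                        # action module + observation
    reflection = llm(f"Reflect on: {observation}")  # reflection module
    memory.append(reflection)                       # store for future tasks
    return prompt, observation


def fake_llm(prompt: str) -> str:
    """Stub model: returns a canned plan or a canned lesson."""
    if prompt.startswith("Reflect on:"):
        return "Lesson: verify sources before you summarize"
    return "search then summarize"


def fake_tool(plan: str) -> str:
    """Stub tool: pretends to execute the plan."""
    return f"ran plan '{plan}'"


memory: list[str] = []
first_prompt, _ = run_task("summarize the report", fake_llm, fake_tool, memory)
second_prompt, _ = run_task("summarize the memo", fake_llm, fake_tool, memory)
print(second_prompt)
```

The first run starts with an empty memory, so its prompt contains only the task; the second run’s prompt includes the lesson written at the end of the first.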
