Fact Extraction: Teaching Agents to Listen
Raw conversation logs are useless for AI memory. Here is how DeltaMemory extracts structured facts, deduplicates them, and builds a living profile of every user.

A user tells your AI agent: "I just moved to Austin from Denver. Starting a new job at a fintech startup next week."
Two sentences. Four facts. A location change, a previous location, a new employer, and a start date.
Most AI memory systems would store this as a single conversation turn embedded as a vector, then hope that similarity search retrieves it later. Maybe it does. Maybe it surfaces when the user asks about restaurants in Austin. But it probably will not surface when the user mentions their commute, or when the agent needs to understand their industry context.
The problem is not storage. The problem is that raw conversation logs are the wrong unit of memory.
Why conversation logs fail
Think about how you remember a conversation with a friend. You do not replay the transcript word for word. You remember the key facts: they moved, they got a new job, they seem excited about it. The actual words fade. The meaning persists.
AI agents that store raw conversation turns have the opposite problem. They remember the exact words but lose the meaning. A vector embedding of "I just moved to Austin from Denver" captures the semantic neighborhood of that sentence, but it does not create a structured fact that the agent can reason about.
This matters when the user says something related weeks later. "The weather here is brutal in August." The agent needs to connect "here" to "Austin" — a fact it extracted, not a sentence it memorized.
The extraction pipeline
When a message comes into DeltaMemory, it does not just get stored. It runs through a cognitive pipeline that extracts structured knowledge.
The first stage is summarization. The raw conversation gets condensed into a summary that preserves temporal context. "User mentioned moving to Austin from Denver" is more useful than the full back-and-forth that led to that statement.
The second stage is profile extraction. The summary gets parsed into structured facts organized by topic and sub-topic. "I just moved to Austin" becomes a profile entry: topic basic_info, sub-topic location, content Austin, with a confidence score based on how explicit the statement was.
DeltaMemory recognizes a fixed taxonomy of topics: basic info, demographics, work, education, interests, personality, relationships. Each has specific sub-topics. This structure means the agent can look up a user's location directly instead of hoping vector search returns the right conversation turn.
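The taxonomy and entry shape can be sketched as a small data structure. This is a minimal illustration, assuming a flat entry record; the field names and the sub-topic lists shown are assumptions, not DeltaMemory's actual schema.

```python
from dataclasses import dataclass

@dataclass
class ProfileEntry:
    topic: str        # e.g. "basic_info"
    sub_topic: str    # e.g. "location"
    content: str      # e.g. "Austin"
    confidence: float # 0.5-1.0, based on how explicit the statement was

# The seven topics named above; sub-topic names here are illustrative guesses.
TAXONOMY = {
    "basic_info": ["name", "location", "age"],
    "demographics": ["language", "nationality"],
    "work": ["employer", "role", "industry"],
    "education": ["degree", "school"],
    "interests": ["hobbies", "skills"],
    "personality": ["traits"],
    "relationships": ["family", "pets"],
}

# "I just moved to Austin" becomes a direct lookup key, not a search query.
entry = ProfileEntry("basic_info", "location", "Austin", confidence=0.9)
```

With a fixed taxonomy, "what is this user's location?" is a dictionary lookup rather than a similarity search.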
Confidence scoring
Not all facts are equally reliable. "My name is Sarah" deserves high confidence. "I think I might try learning piano" deserves lower confidence.
Every extracted fact gets a confidence score between 0.5 and 1.0 based on how explicit the information is. Direct statements score high. Inferences and hedged language score lower. This confidence feeds into retrieval scoring later — when the agent is building context, high-confidence facts get priority.
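One way to approximate this scoring is a heuristic over hedged language. The word list, penalty values, and function shape below are assumptions for illustration, not DeltaMemory's actual rubric; the real system scores explicitness during extraction.

```python
HEDGES = ("i think", "maybe", "might", "probably", "i guess")

def score_confidence(statement: str, inferred: bool = False) -> float:
    """Toy heuristic: direct statements score high, hedged or
    inferred ones lower. Clamped to the 0.5-1.0 range."""
    score = 0.95
    text = statement.lower()
    if any(h in text for h in HEDGES):
        score -= 0.3   # hedged language lowers confidence
    if inferred:
        score -= 0.15  # inferences score lower than direct statements
    return max(0.5, min(1.0, score))

score_confidence("My name is Sarah")                    # high
score_confidence("I think I might try learning piano")  # lower
```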
The deduplication problem
Here is where things get interesting. Users repeat themselves. They also update information without explicitly saying "this replaces what I told you before."
Session one: "I work at a startup."
Session twelve: "I just joined Acme Corp as a senior engineer."
A naive system stores both. Now the agent has contradictory information. Is the user at a startup or at Acme Corp? Maybe both — maybe Acme Corp is a startup. The agent does not know.
DeltaMemory solves this with merge logic. When a new fact is extracted, it gets compared against existing profiles for that user. The system makes one of four decisions:
Insert — No existing fact on this topic. Store it as new.
Update — A fact already exists, but the new information is more complete or more recent. Replace it. "Alice" becomes "Alice Smith" when the user shares their full name.
Append — Both facts are valid and complementary. The user's hobbies list grows from "painting" to "painting; piano." This applies to list-like sub-topics where multiple values make sense.
Skip — The new fact is a duplicate or less complete than what already exists. Do not store it.
For simple cases, this decision happens without an LLM call. If the new content contains the existing content, it is an update. If they are identical, it is a skip. If the sub-topic is list-like (hobbies, skills, pets), it is an append. This keeps costs down for the common cases.
For ambiguous cases — where the new fact contradicts the existing one and neither is clearly more complete — the system can use an LLM to make the merge decision. But the simple path handles the majority of cases.
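The fast path described above can be sketched as a single decision function. This is a sketch under stated assumptions: the check ordering, the set of list-like sub-topics, and the `needs_llm` marker (standing in for the LLM fallback) are illustrative, not DeltaMemory's actual implementation.

```python
from typing import Optional

LIST_LIKE = {"hobbies", "skills", "pets"}  # illustrative set of list-like sub-topics

def merge_decision(existing: Optional[str], new: str, sub_topic: str) -> str:
    """Fast-path merge logic; ambiguous conflicts fall through
    to an LLM decision (not shown here)."""
    if existing is None:
        return "insert"        # no existing fact on this topic
    if new == existing:
        return "skip"          # exact duplicate
    if sub_topic in LIST_LIKE:
        return "append"        # multiple values make sense
    if existing in new:
        return "update"        # new content subsumes the old
    if new in existing:
        return "skip"          # old content subsumes the new
    return "needs_llm"         # contradictory, neither clearly more complete

merge_decision(None, "Austin", "location")      # → "insert"
merge_decision("Alice", "Alice Smith", "name")  # → "update"
merge_decision("painting", "piano", "hobbies")  # → "append"
merge_decision("Alice Smith", "Alice", "name")  # → "skip"
```

Note that substring containment is a deliberately cheap proxy for "more complete"; anything it cannot resolve is exactly the ambiguous case the text defers to an LLM.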
Caching extraction results
Fact extraction requires an LLM call, and LLM calls are not free. If the same conversation content gets processed twice (which can happen during retries or reprocessing), you do not want to pay for extraction again.
DeltaMemory caches extraction results per user with a content-based hash key. The cache has a short TTL (five minutes by default) and LRU eviction, so it handles the burst case without growing unbounded. If the same content comes through within the TTL window, the cached facts are returned instantly.
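A cache with those properties can be sketched in a few lines. Class and method names here are assumptions for illustration; only the behavior (per-user content hashing, five-minute TTL, LRU eviction) comes from the description above.

```python
import hashlib
import time
from collections import OrderedDict

class ExtractionCache:
    """Sketch of a per-user, content-hashed cache with TTL and LRU eviction."""

    def __init__(self, max_entries: int = 1000, ttl_seconds: float = 300.0):
        self._data = OrderedDict()  # key -> (expires_at, facts)
        self.max_entries = max_entries
        self.ttl = ttl_seconds

    @staticmethod
    def _key(user_id: str, content: str) -> str:
        # Content-based hash, scoped per user.
        return hashlib.sha256(f"{user_id}:{content}".encode()).hexdigest()

    def get(self, user_id: str, content: str):
        key = self._key(user_id, content)
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, facts = item
        if time.monotonic() > expires_at:
            del self._data[key]      # expired: pay for extraction again
            return None
        self._data.move_to_end(key)  # mark as recently used
        return facts

    def put(self, user_id: str, content: str, facts) -> None:
        key = self._key(user_id, content)
        self._data[key] = (time.monotonic() + self.ttl, facts)
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
```

The short TTL is the point: the cache only needs to absorb retries and reprocessing bursts, not serve as long-term storage.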
From facts to context
Extracted facts do not just sit in storage. They are the first thing the agent reaches for when building context for a response.
DeltaMemory uses a profile-first approach to context building. Before doing any vector search or temporal retrieval, it loads the user's structured profile. Name, location, job, preferences — these facts are always available to the agent, regardless of when they were mentioned.
This is a different philosophy from "retrieve the top K similar memories." Structured facts provide a stable foundation. Vector search adds situational context on top. The agent always knows who it is talking to, and then layers in relevant episodic memories based on the current conversation.
The context builder manages a token budget, allocating portions to profiles, events, and episodic memories. Profiles get priority because they are the most information-dense. A single profile entry like "location: Austin" conveys the same information as an entire conversation turn, but costs a fraction of the tokens.
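The allocation strategy can be sketched as follows. The share split, the whitespace word count standing in for a real tokenizer, and the function shape are all assumptions for illustration; the real budget logic is not specified here.

```python
def build_context(profiles, events, episodes, budget=2000,
                  shares=(0.4, 0.3, 0.3)):
    """Toy budget allocator: profiles are filled first, then events,
    then episodic memories, each capped at its share of the budget."""
    def cost(item: str) -> int:
        return len(item.split())  # crude stand-in for a token count

    context, remaining = [], budget
    for items, share in zip((profiles, events, episodes), shares):
        allotment = min(remaining, int(budget * share))
        for item in items:
            c = cost(item)
            if c <= allotment:
                context.append(item)
                allotment -= c
                remaining -= c
    return context
```

Because a profile entry like "location: Austin" costs two words where a conversation turn costs dozens, profiles nearly always fit, which is exactly why they get first claim on the budget.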
Concept extraction and knowledge graphs
Facts about the user are one layer. DeltaMemory also extracts concepts and relationships, building a knowledge graph over time.
When the user mentions "Acme Corp," that becomes a concept with a type (company) and an importance score. When they say "I work at Acme Corp," that creates a relationship between the user concept and the company concept. When they later mention "Acme is in the healthcare space," another relationship forms.
Now the agent can traverse these connections. It does not need to be told that the user works in healthcare. It can infer it: user → works at → Acme Corp → is in → healthcare. This multi-hop reasoning happens at retrieval time, not during the conversation, so it adds no latency to the response.
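The traversal itself is a standard graph search. A minimal sketch, assuming an adjacency-list graph; the node and relation names mirror the example above but are otherwise illustrative.

```python
from collections import deque

# Minimal adjacency list: node -> [(relation, neighbor), ...]
edges = {
    "user": [("works_at", "Acme Corp")],
    "Acme Corp": [("is_in", "healthcare")],
}

def reachable(graph, start, target, max_hops=3):
    """Breadth-first multi-hop traversal: return the relationship path
    connecting start to target within max_hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) >= max_hops:
            continue
        for relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None

reachable(edges, "user", "healthcare")
# → [("user", "works_at", "Acme Corp"), ("Acme Corp", "is_in", "healthcare")]
```

The `max_hops` cap is what keeps this cheap enough to run at retrieval time: the search explores a small neighborhood around the user, not the whole graph.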
The compound effect
The real power of fact extraction shows up over time. After ten conversations, the agent has a structured profile with dozens of facts, a growing knowledge graph, and a set of episodic memories ranked by salience.
Compare this to an agent that stored ten raw conversation logs. The structured agent knows the user's name, location, job, company, industry, preferences, and relationships — all indexed and instantly accessible. The raw-log agent has ten blobs of text that might or might not surface the right information on any given query.
The difference compounds with every interaction. Each conversation adds new facts, refines existing ones, and extends the knowledge graph. The agent gets smarter about this specific user in a way that raw storage never achieves.
Teaching agents to listen is not about recording more. It is about understanding what was said.