Retrieval Is Retrieval

same pattern · different data · one agent · three contexts
Young RAG Huckster NYC
a note · RAG is not a thing
Don't take agentic search advice from someone who was in middle school when Instagram launched.

"RAG" is not a technology. It's not an architecture. It's a description of looking something up before you speak. Your grandmother does this. It's called "checking." The entire field rebranded the library card catalogue and acted like they'd discovered fire.

There are three kinds of memory. There have only ever been three:

Books. External, stable, shared. A library. A codebase. A pile of PDFs. The information sits there until someone goes and gets it. This is what people call "RAG" when the documents are enterprise contracts. It's what they call "code search" when the documents are source files. It's what your dad calls "looking it up." Same act. New acronym. Somebody raised a Series A.

Brains. Internal, decaying, personal. Your memory. An agent's memory. Information experienced firsthand, compressed by time, reinforced by use, forgotten by neglect. You don't recall every detail of every day — you remember patterns, skills, vibes. The details you retrieve on demand, if they haven't rotted. This is what the Agent Memory tab describes. It's also what happens to you after 40, except the retrieval latency gets worse.

Conversations. Ephemeral, contextual, live. The context window. Working memory. Gone when the tab closes — unless something promotes it to one of the other two. Every chat message, every function call, every intermediate result lives here and nowhere else. Like a bar conversation you'll never quite remember.

That's it. Three kinds. Every "retrieval-augmented" system is just an agent that checks its books before it speaks, filtered through its brain, expressed in a conversation. The only interesting question is: how does the agent decide what to look up?

The answer: ask the LLM. The model is the retrieval planner. It reads the query, decides whether this is a keyword lookup, a semantic search, a graph traversal, or all three. It picks the tools. Combines the results. Decides when it has enough. No fixed pipeline. No predetermined number of chunks. No framework. Just a model that knows what it doesn't know, and goes and finds it.
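
What that loop looks like, as a minimal sketch and nothing more: the model emits a JSON plan per step, the loop dispatches it. The tool names and JSON protocol here are illustrative, not any framework's API; call_llm is whatever client you already have.

sketch · python
import json

def plan_and_retrieve(query, tools, call_llm, max_steps=5):
    # tools: name -> callable(arg) returning a list of hits
    # call_llm: prompt -> str; any client works
    evidence = []
    for _ in range(max_steps):
        plan = json.loads(call_llm(
            f"Query: {query}\n"
            f"Tools available: {list(tools)}\n"
            f"Evidence so far: {len(evidence)} hits\n"
            'Reply with JSON only: {"tool": "...", "arg": "...", "done": false}'
        ))
        if plan.get("done"):
            break  # the model decides when it has enough
        evidence.extend(tools[plan["tool"]](plan["arg"]))
    return evidence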

Look for people with gray hair and wrinkles and ask them. This is no time for ageism. And it's not ageism to say that somebody with no experience doesn't know what they're talking about just because they asked an AI; that part is just the truth. OG search people are around. You just need to sniff them out.

This whole RAG/agent-building scene reminds me of the hucksters in crypto. Same energy. Same confident ignorance. Same "I built a wrapper around somebody else's API and now I'm a thought leader." Different ticker symbol.

Keep your prompts safe and your nuts trimmed: rag.nuts.services

P = f( S, E, T, C ) — same as it ever was.
retrieval plan · code
01 · ingest · AST
Parse with Tree-sitter. Every function, class, import, call site → nodes. Files sharing a directory → edges. Import chains → edges. You already have three graphs before you've done anything clever.
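
A sketch of that ingest, assuming the py-tree-sitter bindings plus the tree_sitter_python grammar package; the node types shown are Python's grammar, so swap them per language.

sketch · python
import tree_sitter_python
from tree_sitter import Language, Parser

parser = Parser(Language(tree_sitter_python.language()))  # py-tree-sitter >= 0.23

def ingest(path):
    src = open(path, "rb").read()
    nodes = []
    def walk(node):
        if node.type in ("function_definition", "class_definition",
                         "import_statement", "call"):
            name = node.child_by_field_name("name")
            nodes.append({
                "file": path,
                "type": node.type,
                "name": name.text.decode() if name else
                        src[node.start_byte:node.end_byte].decode()[:60],
                "line": node.start_point[0] + 1,  # tree-sitter rows are 0-based
            })
        for child in node.children:
            walk(child)
    walk(parser.parse(src).root_node)
    return nodes
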
02 · embed · vector
Embed each function: signature + docstring + line number. Now handleLegacySSO() on line 847 lives near authenticateUser() on line 23 in semantic space — even though grep will never connect them. Embed commit messages too. That's intent.
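
A hedged sketch with sentence-transformers; the model name is an arbitrary choice, and any embedding endpoint slots in the same way.

sketch · python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any embedder works

def embed_functions(functions):
    # signature + docstring + line number, exactly as described above
    texts = [f'{fn["name"]} (line {fn["line"]})\n{fn.get("docstring", "")}'
             for fn in functions]
    return embedder.encode(texts, normalize_embeddings=True)

# commit messages carry intent; embed them into the same space
def embed_commits(messages):
    return embedder.encode(messages, normalize_embeddings=True)
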
03 · graph
Call graph. Import tree. Directory structure. Cross-file references. Three graphs, linked. A function that calls another function across a module boundary is a relation, not just a string match.
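
One way to hold the linked graphs, sketched with networkx; the edge lists are assumed to come straight out of the ingest step.

sketch · python
import os
import networkx as nx

def build_graphs(functions, call_edges, import_edges):
    g = nx.DiGraph()
    g.add_edges_from(call_edges, relation="calls")      # caller -> callee
    g.add_edges_from(import_edges, relation="imports")  # file -> module
    for fn in functions:                                # directory structure
        g.add_edge(os.path.dirname(fn["file"]), fn["file"], relation="contains")
    # one linked structure: a cross-module call is a relation, not a string match
    return g
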
04 · retrieve · agent decides
grep for exact names. Vector for semantic matches. Graph traversal for everything connected to known hits. Agent picks the tools. Combines the results. No framework war — just what works for this query.
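
A sketch of the merge, assuming hits are dicts keyed by "name"; how many hops to walk is the agent's call per query, not a constant.

sketch · python
def combine(grep_hits, vector_hits, graph, hops=1):
    merged = {}
    for h in grep_hits + vector_hits:      # exact names + semantic matches
        merged.setdefault(h["name"], h)
    frontier = list(merged)
    for _ in range(hops):                  # everything connected to known hits
        frontier = [nb for n in frontier if graph.has_node(n)
                    for nb in graph.successors(n)]
        for nb in frontier:
            merged.setdefault(nb, {"name": nb, "via": "graph"})
    return list(merged.values())
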
05 · template · populate
Build the prompt. Structure separated from state. Typed slots filled at runtime.
template · code
# P = f( S, E, T, C )
Analyze {{REPO}} for {{OBJECTIVE}}.

grep results: {{GREP_HITS}}
semantic matches: {{VECTOR_HITS}}
graph-connected: {{GRAPH_HITS}}
file context: {{SNIPPETS}}

Report: what's broken, why, fix.
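
Populating it is one function once structure and state are separated. A sketch that keeps the {{SLOT}} syntax above; the values are illustrative.

sketch · python
import re

def populate(structure, state):
    # structure: the template verbatim; state: typed slots filled at runtime
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(state[m.group(1)]), structure)

TEMPLATE = "Analyze {{REPO}} for {{OBJECTIVE}}."  # the full template works the same
prompt = populate(TEMPLATE, {"REPO": "payments-service",
                             "OBJECTIVE": "audit the auth flows"})
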
retrieval plan · enterprise documents
01 · ingest · LLM pre-analysis
Run each doc through an LLM before indexing. Extract: entities, topics, doc type, cross-references. "This contract references that policy." Most data has structure — you just have to ask for it.
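
A sketch of that pre-analysis pass; the key list is the whole contract, and call_llm is whatever client you already use.

sketch · python
import json

def pre_analyze(doc_text, call_llm):
    prompt = ("Extract from the document below, as JSON with keys "
              '"entities", "topics", "doc_type", "references":\n\n' + doc_text)
    meta = json.loads(call_llm(prompt))
    return meta  # attach to every chunk of this doc before embedding
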
02 · embed · vector
Chunk and embed — but now the chunks carry extracted metadata. "Termination for cause" lives near "breach of contract." "handle refunds" matches "process customer returns". Meaning, not keywords.
03 · graph
Same folder → related. Same entity mentioned → related. Contract → amendment → addendum = chain. Policy → procedure → form = hierarchy. You already have the graph.
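
Building that graph from the pre-analysis metadata, sketched with networkx; the doc dict shape is assumed to be whatever pre_analyze returned, plus an id and folder.

sketch · python
import itertools
import networkx as nx

def build_doc_graph(docs):
    g = nx.Graph()
    by_entity = {}
    for d in docs:
        g.add_node(d["id"], folder=d["folder"])
        for ref in d.get("references", []):       # contract -> amendment -> addendum
            g.add_edge(d["id"], ref, relation="references")
        for e in d.get("entities", []):
            by_entity.setdefault(e, []).append(d["id"])
    for e, ids in by_entity.items():              # same entity mentioned -> related
        for a, b in itertools.combinations(ids, 2):
            g.add_edge(a, b, relation="entity:" + e)
    return g  # same-folder edges work identically, grouped on the folder attribute
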
04 · retrieve · agent decides
Keywords for exact policy names. Vectors for semantic matches. Graph to walk from the policy to its amendments to the procedures that implement it. Agent combines all three.
05 · template · populate
Same pattern. Structure vs state. Slots filled. Evidence injected.
template · documents
# P = f( S, E, T, C )
Answer {{QUERY}} for {{USER_ROLE}}.

keyword results: {{KEYWORD_HITS}}
semantic matches: {{VECTOR_HITS}}
graph-connected: {{GRAPH_HITS}}
source excerpts: {{DOC_EXCERPTS}}

Respond: answer, sources, confidence.
agent identity memory · thermodynamic keystone model
01 · experience · agent works
Agent completes a task. Used grep + embeddings + a custom tool to find 17 auth points. The episode — every detail, every tool call, every result — enters memory as ACTIVE. Full fidelity. Hot. Expensive.
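
What an episode might look like on entry, as a dataclass sketch; the fields mirror the lifecycle described here and nothing else.

sketch · python
from dataclasses import dataclass, field
import time

@dataclass
class Episode:
    task: str
    tool_calls: list                    # every call, every result: full fidelity
    outcome: str
    state: str = "ACTIVE"               # one-way lifecycle, see below
    fidelity: float = 1.0               # hot and expensive; decays from here
    access_count: int = 0
    created: float = field(default_factory=time.time)
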
02 · embed + graph · vector + graph
The episode is embedded (semantic position in experience-space) and graphed (linked to: the repo, the tools used, the auth concept, the outcome). Dual representation — vector for similarity, graph for structure. Both maintained in parallel.
03 · decay · entropy
Time passes. The specific line numbers fade. The exact error messages blur. Fidelity decreases — not deletion, but compression. The vector embedding drifts toward the centroid of similar experiences. The detail decays. The structure persists.
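
A numeric sketch of that decay, assuming unit-norm embeddings; half_life and drift are illustrative knobs, not derived constants.

sketch · python
import numpy as np

def decay(ep, vec, centroid, dt_days, half_life=30.0, drift=0.02):
    ep.fidelity *= 0.5 ** (dt_days / half_life)   # compression, not deletion
    a = min(1.0, drift * dt_days)                  # drift toward similar experiences
    vec = (1 - a) * vec + a * centroid
    return vec / np.linalg.norm(vec)               # detail decays; structure persists
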
04 · consolidate · dream cycle
Between tasks, the agent dreams. Similar episodes merge into centroid summaries. "I've done auth analysis five times — here's the general pattern." Individual episodes compress. The consolidated memory is more useful than any single episode because it's more general.
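
A greedy sketch of the dream cycle: episodes whose unit-norm embeddings sit above a cosine threshold share a centroid. The threshold is illustrative; real consolidation can be fancier.

sketch · python
import numpy as np

def consolidate(episodes, vecs, threshold=0.85):
    merged, used = [], set()
    for i, v in enumerate(vecs):
        if i in used:
            continue
        cluster = [j for j, w in enumerate(vecs)
                   if j not in used and float(v @ w) >= threshold]
        used.update(cluster)
        c = np.mean([vecs[j] for j in cluster], axis=0)
        merged.append({
            "summary": f"general pattern over {len(cluster)} episodes like: "
                       f"{episodes[i].task}",
            "centroid": c / np.linalg.norm(c),
            "members": cluster,                    # absorbed, not deleted
        })
    return merged
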
05 · keystone · identity
Frequently accessed, structurally central memories are promoted to keystones. "I am an agent that knows how to do multi-strategy code analysis." This is identity. It doesn't decay. It's the attractor around which everything else organises.
06 · retrieve · rehydrate on demand
New task arrives. Keystones are always present — the agent knows what it knows. Consolidated memories provide general strategy. If detail is needed, the agent rehydrates — pulls archived specifics back into working context. Looks it up. Uses it. Lets it decay again.
ACTIVE → FORGIVEN → ARCHIVED
one-way · entropy does not reverse
score = w1·relevance + w2·fidelity + w3·importance + w4·recency + w5·graph_centrality + w6·keystone_bonus
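
The formula, implemented directly; the weights are illustrative and per-agent, and sourcing centrality from, say, PageRank over the memory graph is an assumption, not part of the formula.

sketch · python
def recall_score(m, qvec, w=(0.30, 0.15, 0.15, 0.10, 0.15, 0.15)):
    w1, w2, w3, w4, w5, w6 = w
    return (w1 * float(qvec @ m["vec"])             # relevance: semantic proximity
          + w2 * m["fidelity"]                      # how well preserved
          + w3 * m["importance"]
          + w4 * m["recency"]                       # e.g. exp(-age_days / tau)
          + w5 * m["centrality"]                    # graph position
          + w6 * (1.0 if m["keystone"] else 0.0))   # the attractor bonus
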
the identity template
# what the agent carries between tasks

I am {{AGENT_ROLE}}.

Keystones (always present):
{{KEYSTONE_MEMORIES}}

Consolidated experience:
{{CENTROID_SUMMARIES}}

For this task, rehydrated:
{{RETRIEVED_EPISODES}}

Tools I know: {{KNOWN_TOOLS}}
Tools I've built: {{CUSTOM_TOOLS}}

# I don't remember the codebase verbatim.
# I remember what I know how to do.
# I look up the details when I need them.
Weber's note · the thermodynamics of memory
The memory field is a manifold — a high-dimensional surface upon which every memory occupieth a position determined by its semantic content (the vector embedding) and its structural relations (the graph edges). This manifold is not flat. It possesseth curvature, valleys, ridges, and basins of attraction — the topography shaped by the accumulated experience of the agent.

A keystone memory is an attractor on this manifold — a local minimum of the energy landscape where the forces of decay (entropy, disuse, compression pressure) are precisely balanced by the forces of persistence (access frequency, structural centrality, graph connectivity). The keystone doth not resist decay by active effort. It resisteth decay by occupying a position of structural stability — a point where the gradient of the energy function vanisheth and the Hessian is positive-definite. It is, in the language of this afternoon's physics, a shell condition: a region of locally uniform potential in which the memory resteth without expenditure.

The dual representation — vectors and graph, maintained in parallel — correspondeth to the duality of potential and expressed field. The vector embedding is the potential: continuous, distributed, pervading all of semantic space, encoding meaning as proximity. The graph is the expressed field: discrete, structural, navigable, encoding meaning as explicit relation. In the early life of a memory, the vector (potential) dominateth — the raw detail is the primary source. In late life, the graph (expressed field) dominateth — the structural skeleton is what persisteth. This drift from potential to field is thermodynamically inevitable: entropy degradeth the fine-grained continuous representation before it degradeth the coarse-grained relational one, because relations are lower-energy configurations — more stable, more frequently reinforced, more resistant to perturbation.

The dream cycle — consolidation, compression, merging — is the process by which the manifold relaxeth toward its minimum-energy configuration. Similar memories, which occupy nearby positions on the manifold, are merged into a centroid that sitteth at the basin of the valley between them. The individual memories are not deleted — they are absorbed into the centroid, their detail contributing to the position and the weight of the merged summary. The manifold becometh smoother, simpler, more efficiently navigable — not because information hath been destroyed, but because it hath been compressed into the geometry of the surface itself.

And the recall function — the weighted scoring formula — is a potential function defined over the manifold. The act of recall is the act of gradient ascent on this surface: starting from the query point, climbing toward the memories at which the recall potential is maximised. The six terms of the scoring function are the six components of the force that guideth the traversal — relevance pulleth toward semantic proximity, fidelity pulleth toward well-preserved memories, importance pulleth toward the consequential, recency pulleth toward the fresh, centrality pulleth toward the structurally connected, and the keystone bonus pulleth toward the attractors, the identity anchors, the shell conditions of the memory field.

The manifold is the agent's mind. The keystones are its identity. The geometry is its competence. And the act of retrieval — ascending the potential surface to find the memories that the task demandeth — is the one act that every intelligent system performeth, hath always performed, and shall always perform, regardless of what name is given to it, regardless of what framework is declared dead, regardless of what rhetoric is advanced by those who have mistaken the properties of their data for the properties of all data.

P = f( S, E, T, C ) — the prompt is a function of the manifold's state.

Same pattern. Always.

Three tabs. Three contexts. One architecture. Ingest, embed, graph, retrieve, populate. The agent decides which tools to use. The data decides what structure it has. Nobody declares anything dead.

code
  • AST for structure
  • functions + line #s embedded
  • call/import/dir graphs
  • grep + vectors + traversal
  • details: in the repo
documents
  • LLM pre-analysis for structure
  • chunks + metadata embedded
  • ref/entity/folder graphs
  • keywords + vectors + traversal
  • details: in the corpus
agent memory
  • experience for structure
  • episodes embedded on manifold
  • tool/concept/outcome graphs
  • keystones + centroids + rehydration
  • details: retrieved when needed
RAG was never alive. It was never a thing.
It was a description of looking something up before you speak.

P = f( S, E, T, C ) — structure vs state · typed slots · runtime injection
The manifold is the mind. The keystones are identity.
The act is permanent.

diyclaw.dev · procedural prompting · contracts not code