Systems

Postgres is the answer to 90% of your database questions.

Queues, search, vectors, JSON, time-series. You probably do not need a second database.

by Vincent Tat·Apr 8, 2026·6 min

For two years, every AI product had the same spine: chunk documents, embed, stuff top-k into the prompt, pray. Retrieval-augmented generation became synonymous with "enterprise AI." Vendors sold it. Consultants deployed it. Conference talks celebrated it.

Then the context windows grew. Not a little — a thousand-fold. And the question quietly shifted from "what should we retrieve?" to "what should we leave out?"

The retrieval tax

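Every RAG pipeline pays three hidden costs. Chunking destroys structure: fixed-size splits cut across headings, tables, and cross-references. Embeddings flatten meaning: a passage becomes a single vector, and distinctions such as negation or scope can be lost. Top-k ranking discards the long tail: anything ranked below the cutoff never reaches the model. For a customer support bot answering FAQs, fine. For reasoning across a codebase, a legal corpus, or a research archive, catastrophic.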
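To make the first of those costs concrete, here is a minimal sketch of fixed-size chunking cutting through a unit of meaning. The `split` helper and the sample policy text are illustrative assumptions, not any particular library's chunker.

```python
def split(text: str, size: int) -> list[str]:
    """Naive fixed-size chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical sample text: a rule followed by its exception.
doc = (
    "Refunds are available within 30 days. "
    "Exception: digital goods are not refundable once downloaded."
)

chunks = split(doc, size=48)
for i, chunk in enumerate(chunks):
    print(i, repr(chunk))

# Check whether any single chunk still holds both the rule and its exception.
rule_and_exception_together = any(
    "Refunds are available" in c and "not refundable" in c for c in chunks
)
print(rule_and_exception_together)  # False
```

A retriever that surfaces only the first chunk would hand the model the rule without its exception. Structure-aware splitters reduce this, but any splitter has to put a boundary somewhere.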

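# The old way: chunk, embed, retrieve, generate.
# split, embed, top_k, and llm stand in for whatever your stack provides.
chunks = split(doc, size=512)                  # fixed-size chunks
vectors = embed(chunks)                        # one vector per chunk
hits = top_k(embed([query])[0], vectors, k=5)  # assumed to return indices of the 5 nearest chunks
context = "\n\n".join(chunks[i] for i in hits)
answer = llm(f"{context}\n\n{query}")

# The new way: send the whole document.
answer = llm(f"{doc}\n\n{query}")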

What replaces it

Context engineering. The discipline of deciding what a model sees, in what order, at what fidelity — without pre-emptively throwing information away. It looks less like database design and more like film editing.
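One way to read "in what order, at what fidelity" is as a budgeted assembly step: keep every passage in document order and lower the fidelity of the least important ones instead of dropping them. The sketch below assumes a hypothetical `salience` score per passage, a whitespace word count as a stand-in for a tokenizer, and a truncating `compress` as a stand-in for a real summarizer.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    salience: float  # hypothetical relevance score in [0, 1], higher is more important


def count_tokens(text: str) -> int:
    # Crude whitespace proxy; a real tokenizer would give different counts.
    return len(text.split())


def compress(text: str, max_words: int = 8) -> str:
    # Stand-in for a summarizer: keep the first few words and mark the cut.
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " [...]"


def build_context(passages: list[Passage], budget: int) -> str:
    # Start with every passage at full fidelity, in document order.
    rendered = [p.text for p in passages]
    # Lower fidelity for the least salient passages first, until the budget fits.
    for i in sorted(range(len(passages)), key=lambda j: passages[j].salience):
        if sum(count_tokens(r) for r in rendered) <= budget:
            break
        rendered[i] = compress(passages[i].text)
    return "\n".join(rendered)


passages = [
    Passage(" ".join(["alpha"] * 20), salience=0.9),
    Passage(" ".join(["beta"] * 20), salience=0.1),
    Passage(" ".join(["gamma"] * 20), salience=0.5),
]
context = build_context(passages, budget=50)
print(context)
```

Here only the low-salience middle passage is shortened, and all three passages keep their original order. If compressing every passage still leaves the context over budget, this sketch returns it over budget rather than discarding anything; a real system would need a policy for that case.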

The tools are different. Caching layers that remember context per session. Summarizers that compress low-salience passages. Routers that decide when to fall back to retrieval and when to stream the full document. This is the new stack.
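As a rough illustration of two of these pieces, the sketch below pairs a router with a per-session cache. Everything in it is an assumption made for the example: the names `route` and `SessionCache`, the whitespace token count, the 128,000-token window, and the 1,000-token reserve for the model's reply.

```python
import hashlib
from typing import Callable


def count_tokens(text: str) -> int:
    # Crude whitespace proxy; a real tokenizer would give different counts.
    return len(text.split())


def route(doc: str, query: str, context_window: int, reserve: int = 1000) -> str:
    """Return "full" to send the whole document, or "retrieve" to fall back."""
    needed = count_tokens(doc) + count_tokens(query) + reserve
    return "full" if needed <= context_window else "retrieve"


class SessionCache:
    """Remembers prepared context per (session, document) pair."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], str] = {}

    def get_or_build(self, session_id: str, doc: str, build: Callable[[str], str]) -> str:
        key = (session_id, hashlib.sha256(doc.encode("utf-8")).hexdigest())
        if key not in self._store:
            self._store[key] = build(doc)  # only runs on a cache miss
        return self._store[key]


short_doc = "word " * 500
long_doc = "word " * 500_000
print(route(short_doc, "what changed?", context_window=128_000))  # full
print(route(long_doc, "what changed?", context_window=128_000))   # retrieve

calls = []


def prepare(doc: str) -> str:
    # Hypothetical preparation step; records each call so reuse is visible.
    calls.append(1)
    return doc.strip()


cache = SessionCache()
cache.get_or_build("session-1", short_doc, prepare)
cache.get_or_build("session-1", short_doc, prepare)
print(len(calls))  # 1
```

The second request in the same session reuses the prepared context instead of rebuilding it. A production router would likely weigh cost and latency as well as size, which this sketch ignores.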

