Every six weeks a vendor slide deck promises that the vendor's reasoning model has “obsoleted RAG.” Every six weeks another deck claims the opposite. The engineering answer is boring: the two techniques do different jobs.

What each technique is actually good at

Retrieval-augmented generation (RAG) is an information-transport mechanism: it moves text from a specific corpus into the model's context, so the output can be grounded in material the model doesn't carry in its weights. This shines for:

  • Knowledge that changes faster than the model trains.
  • Compliance-grade citation requirements.
  • Long-tail facts.
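
To make "information transport" concrete, here is a minimal sketch of the last hop in a RAG system: packing already-retrieved passages into the prompt with source ids the model can cite. The function name, the prompt wording, and the (source_id, text) pair format are all assumptions made for illustration, not any particular library's API.

```python
from typing import List, Tuple


def build_grounded_prompt(question: str, passages: List[Tuple[str, str]]) -> str:
    """Pack retrieved passages into a prompt that asks for cited answers.

    `passages` is a list of (source_id, text) pairs that some retriever
    has already returned. Both the pair format and the instruction
    wording are hypothetical.
    """
    # One line per passage, prefixed with its id so the model can cite it.
    sources = "\n".join(f"[{source_id}] {text}" for source_id, text in passages)
    return (
        "Answer using only the sources below. "
        "Cite the source id in brackets after each claim. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n"
    )
```

The model never needs to have seen the corpus during training; everything it is allowed to assert arrives through the prompt, which is also what makes the citations checkable.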

Reasoning models spend test-time compute on a multi-step chain of thought, usually hidden from the user, before they commit to an answer. This shines for:

  • Problems whose answer requires a proof, a plan, or a multi-hop deduction.
  • Problems where the input is self-contained (no external facts needed).
  • Verification: checking whether a generated solution is actually correct.

Where the two combine

The best 2026 systems we've deployed use reasoning around retrieval. The pipeline:

  1. Retrieve a broad set of candidate passages (high recall).
  2. Ask a reasoning model to plan the query decomposition.
  3. Re-retrieve with the decomposed sub-queries.
  4. Have the reasoning model assemble the final answer with citations.

This costs more than plain RAG, but it is more accurate on LIACC's legal-text benchmark by a margin that is hard to ignore.
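
The four steps can be sketched as a single function that takes the retriever and the two reasoning-model calls as plain callables. This is a sketch under stated assumptions, not the deployed system: the names (Passage, reason_around_retrieval, broad_k, focused_k) are hypothetical, and the toy stand-ins use keyword overlap and string splitting where a real system would call a vector index and a reasoning model.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str


# Hypothetical interfaces: plug in a real index and a real model here.
Retriever = Callable[[str, int], List[Passage]]
Planner = Callable[[str, List[Passage]], List[str]]
Answerer = Callable[[str, List[Passage]], str]


def reason_around_retrieval(
    question: str,
    retrieve: Retriever,
    plan: Planner,
    answer: Answerer,
    broad_k: int = 50,
    focused_k: int = 5,
) -> str:
    # Step 1: broad, high-recall retrieval on the raw question.
    candidates = retrieve(question, broad_k)
    # Step 2: the reasoning model plans the decomposition.
    sub_queries = plan(question, candidates)
    # Step 3: re-retrieve per sub-query, de-duplicating by document id.
    evidence: Dict[str, Passage] = {}
    for sub_query in sub_queries:
        for passage in retrieve(sub_query, focused_k):
            evidence.setdefault(passage.doc_id, passage)
    # Step 4: the reasoning model assembles the cited answer.
    return answer(question, list(evidence.values()))


# Toy stand-ins so the sketch runs end to end without external services.
CORPUS = [
    Passage("lease-4.2", "The tenant must give 60 days notice before termination."),
    Passage("lease-7.1", "The deposit is returned within 30 days of move-out."),
    Passage("faq-3", "Office hours are 9 to 5 on weekdays."),
]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_retrieve(query: str, k: int) -> List[Passage]:
    # Rank by keyword overlap; drop passages that share no words at all.
    scored = [(len(_tokens(query) & _tokens(p.text)), p) for p in CORPUS]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored if score > 0][:k]


def toy_plan(question: str, candidates: List[Passage]) -> List[str]:
    # Stand-in for the reasoning model: split on " and ".
    # A real planner would also read the candidates; this one ignores them.
    return [part.strip() for part in question.rstrip("?").split(" and ")]


def toy_answer(question: str, evidence: List[Passage]) -> str:
    # Stand-in for the reasoning model: quote each passage with its id.
    return " ".join(f"{p.text} [{p.doc_id}]" for p in evidence)


if __name__ == "__main__":
    print(reason_around_retrieval(
        "What notice is required before termination and when is the deposit returned?",
        toy_retrieve, toy_plan, toy_answer,
    ))
```

Passing the model calls in as callables keeps the control flow testable on its own. It also makes the extra cost visible: one planning call and one retrieval per sub-query, on top of what plain RAG already spends.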

The 2026 heuristic

Write down the task. If the ideal answer would be found in a manual, use RAG. If the ideal answer would be derived in a notebook, use reasoning. If it's both, use the composed pipeline above and accept the inference bill.
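
The heuristic is small enough to write as a decision function. This is a sketch: the names are hypothetical, the two boolean inputs still have to come from a human reading the task, and the PLAIN fallback for tasks that need neither lookup nor derivation is an assumption the heuristic itself does not cover.

```python
from enum import Enum


class Strategy(Enum):
    RAG = "rag"
    REASONING = "reasoning"
    COMPOSED = "composed"
    PLAIN = "plain"  # assumption: not part of the heuristic as stated


def choose_strategy(found_in_manual: bool, derived_in_notebook: bool) -> Strategy:
    """Map the two questions from the heuristic onto a strategy."""
    if found_in_manual and derived_in_notebook:
        return Strategy.COMPOSED   # both: pay for the composed pipeline
    if found_in_manual:
        return Strategy.RAG        # the answer lives in a corpus
    if derived_in_notebook:
        return Strategy.REASONING  # the answer has to be worked out
    return Strategy.PLAIN
```

For instance, a question about what a contract's termination clause says would map to RAG, while working out whether two clauses contradict each other needs both lookup and deduction, so it maps to COMPOSED.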