RAG

#ai

RAG stands for Retrieval-Augmented Generation — enriching an LLM's answer with information retrieved on demand.

  • Retrieval: find the relevant pieces in an external index (often a vector database like Pinecone or Qdrant).
  • Augmented: inject those pieces into the prompt as extra context.
  • Generation: the model produces the answer with that context in view.

This is how you give an LLM access to knowledge it was never trained on — private docs, fresh data, your own codebase — without retraining the model.
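
A minimal sketch of the three steps, to make the flow concrete. Everything here is an assumption for illustration: the toy corpus `DOCS`, the word-count `embed` function (standing in for a real embedding model), the in-memory search (standing in for a vector database), and the stubbed `generate` (standing in for an actual LLM call).

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for private docs.
DOCS = [
    "The deploy script lives in tools/deploy.sh and needs the STAGE variable.",
    "Vacation requests go through the HR portal at least two weeks ahead.",
    "The billing service retries failed payments three times before alerting.",
]


def embed(text):
    """Toy embedding: lowercase word counts (a real system would use an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Retrieval: rank docs by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def augment(query, chunks):
    """Augmented: put the retrieved chunks into the prompt as context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def generate(prompt):
    """Generation: placeholder only; a real system would send the prompt to an LLM."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"


def rag_answer(query, docs, k=1):
    """Run the three steps in order: retrieve, augment, generate."""
    chunks = retrieve(query, docs, k)
    prompt = augment(query, chunks)
    return generate(prompt)


if __name__ == "__main__":
    question = "How many times are failed payments retried?"
    print(augment(question, retrieve(question, DOCS)))
    print(rag_answer(question, DOCS))
```

The model itself never changes here: only the prompt does. Swapping in a real embedding model and a vector database would change `embed` and `retrieve`, but the overall shape should stay the same.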