RAG
RAG stands for Retrieval-Augmented Generation — enriching an LLM's answer with information retrieved on demand.
- Retrieval — find the relevant pieces from an external index (usually a vector database like Pinecone or Qdrant).
- Augmented — inject those pieces into the prompt as extra context.
- Generation — the model produces the answer with that context in view.
This is how you give an LLM access to knowledge it was never trained on — private docs, fresh data, your own codebase — without retraining the model.
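The three steps can be sketched end to end in a few lines. This is a toy illustration, not a production pipeline: the bag-of-words "embedding" and cosine ranking stand in for a real embedding model and vector database, and the final LLM call is left as a comment since it depends on whichever provider you use. All names here (`embed`, `retrieve`, `build_prompt`, the sample docs) are invented for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts.
    # A real system would use a trained embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Retrieval: rank the indexed documents by similarity to the query
    # and keep the top k. A vector DB does this at scale with ANN search.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    # Augmentation: inject the retrieved passages into the prompt.
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

# A tiny private "knowledge base" the model was never trained on.
docs = [
    "The deploy script lives in tools/deploy.sh and requires VPN access.",
    "Quarterly revenue figures are stored in the finance wiki.",
    "Our Python style guide mandates type hints on public functions.",
]

query = "Where is the deploy script?"
prompt = build_prompt(query, retrieve(query, docs, k=1))
print(prompt)
# Generation: hand the augmented prompt to the model, e.g.
# answer = llm.generate(prompt)   # hypothetical client call
```

The retrieved snippet about `tools/deploy.sh` ends up inside the prompt, so the model can answer from context rather than from whatever it happened to memorize during training.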