Better Context for your RAG with Contextual Retrieval

Better chunks = better RAG?

Venelin Valkov
9 min read · Sep 30, 2024

No matter how advanced your large language model (LLM) is, if the retrieved chunks don’t provide the right information, the model won’t generate accurate answers. In this tutorial, we’ll explore a technique called contextual retrieval that improves the quality of the context chunks in your RAG system.

To build some intuition, let’s start with a simple example. Imagine you have a document split into multiple chunks, and you want to ask a question answered by one of them. Here’s a sample chunk:

For more information, please refer to 
[the documentation of `vllm`](https://docs.vllm.ai/en/stable/).

Now, you can have fun with Qwen2.5 models.

This is a good example of a chunk that could benefit from additional context; on its own, it’s not very informative. Here’s the same chunk with context added:

For more information, please refer to 
[the documentation of `vllm`](https://docs.vllm.ai/en/stable/).

Now, you can have fun with Qwen2.5 models.
The chunk is situated at the end of the document, following the section on
deploying Qwen2.5 models with vLLM, and serves as a concluding remark
encouraging users to explore the capabilities of Qwen2.5 models.
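
How does that context get there? The usual approach is to ask an LLM to read the full document alongside the chunk and produce a short sentence that situates the chunk, which is then attached to the chunk before embedding and indexing. Below is a minimal sketch of that step, assuming the `openai` Python client with an API key in the environment; the `add_context` helper and the prompt wording are illustrative (modeled on the prompt from Anthropic’s contextual retrieval post), not the exact ones used later in this tutorial:

```python
# Minimal sketch: generate a situating context for a chunk with an LLM.
# Assumes the `openai` package and OPENAI_API_KEY are available; the model
# name and prompt are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

CONTEXT_PROMPT = """<document>
{document}
</document>

Here is the chunk we want to situate within the whole document:

<chunk>
{chunk}
</chunk>

Give a short, succinct context to situate this chunk within the overall
document for the purposes of improving search retrieval of the chunk.
Answer only with the succinct context and nothing else."""


def add_context(document: str, chunk: str, model: str = "gpt-4o-mini") -> str:
    """Return the chunk with an LLM-generated context sentence appended."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": CONTEXT_PROMPT.format(document=document, chunk=chunk),
            }
        ],
        temperature=0,
    )
    context = response.choices[0].message.content.strip()
    # Attach the context to the chunk, as in the example above.
    return f"{chunk}\n{context}"
```

Running `add_context(full_document, chunk)` on the sample above would yield something like the contextualized chunk shown, which you can then embed and index in place of the raw chunk.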
