Contextual Retrieval System

A hybrid retrieval system combining semantic search and BM25 with context enrichment, achieving a (naively measured) average accuracy of 2.92/3.0 on complex queries.

Overview

This project implements a hybrid document retrieval system that combines vector embeddings (semantic search) with BM25 (lexical search) to improve query results. Before indexing, the system enriches each document chunk with an AI-generated contextual description, an approach guided by Anthropic's blog post on Contextual Retrieval.

The system processes documents by splitting them into manageable chunks, generating a contextual description for each chunk, and then creating both embeddings and BM25 indices. When a query arrives, both retrieval methods run and their results are combined using a rank fusion algorithm.
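The result-combination step can be sketched as follows. The project does not say which rank fusion algorithm it uses, so this example assumes reciprocal rank fusion (RRF), a common choice. The function name, the chunk IDs, and the constant k=60 are illustrative and not taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical top-3 results from each retriever for the same query.
semantic_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

RRF uses only rank positions, so the cosine similarities from the embedding search and the BM25 scores never have to be put on a common scale.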

Key Features

Hybrid retrieval combining semantic search with BM25
AI-powered context enrichment of document chunks
Rank fusion algorithm for result combination
Comprehensive token usage tracking and statistics
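
Context enrichment and token tracking could fit together roughly as in the minimal sketch below. Every name in it (TokenUsage, enrich_chunks, fake_context_generator) is hypothetical. The generator is a stand-in for a real LLM call, and it counts whitespace-separated words as a crude proxy for tokens. A real implementation would read the token counts reported by the provider instead.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Running totals for the LLM calls made during enrichment."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    calls: int = 0

    def record(self, prompt_tokens, completion_tokens):
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens
        self.calls += 1

    @property
    def total(self):
        return self.prompt_tokens + self.completion_tokens


def fake_context_generator(document, chunk):
    """Stand-in for an LLM call (assumption: no real API is contacted).

    Returns (context, prompt_tokens, completion_tokens), using word
    counts as a rough substitute for real token counts.
    """
    context = f"From a document of {len(document.split())} words."
    prompt_tokens = len(document.split()) + len(chunk.split())
    return context, prompt_tokens, len(context.split())


def enrich_chunks(document, chunks, generate_context, usage):
    """Prepend a generated context to each chunk and record token usage."""
    enriched = []
    for chunk in chunks:
        context, prompt_tokens, completion_tokens = generate_context(document, chunk)
        usage.record(prompt_tokens, completion_tokens)
        enriched.append(f"{context}\n\n{chunk}")
    return enriched


chunks = ["BM25 ranks by term overlap.", "Embeddings rank by meaning."]
document = " ".join(chunks)
usage = TokenUsage()
enriched = enrich_chunks(document, chunks, fake_context_generator, usage)
print(usage)
```

The enriched strings, rather than the raw chunks, are what would be embedded and added to the BM25 index.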

Challenges & Solutions

The main challenge was balancing retrieval accuracy with computational efficiency. Context enrichment improved accuracy but added token overhead and processing time. I concluded that in real-world applications, the accuracy gains from contextual embeddings were small relative to the added latency of calling an LLM provider's API for every chunk.
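
A back-of-the-envelope estimate makes the latency cost concrete. The sketch below assumes one LLM call per chunk and a fixed time per call. The function name and the input numbers are made up for illustration and are not measurements from this project.

```python
import math


def enrichment_overhead_seconds(num_chunks, seconds_per_call, concurrency=1):
    """Estimate the time added when every chunk needs one LLM call.

    Assumes calls run in batches of `concurrency` and each batch takes
    `seconds_per_call`. This is a simplification that ignores rate limits
    and retries.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    batches = math.ceil(num_chunks / concurrency)
    return batches * seconds_per_call


# Illustrative inputs only: 1000 chunks at 1.5 seconds per call.
sequential = enrichment_overhead_seconds(1000, 1.5)
parallel = enrichment_overhead_seconds(1000, 1.5, concurrency=8)
print(sequential, parallel)
```

Under these assumptions the overhead grows linearly with the number of chunks, so parallel calls reduce the wall-clock cost but do not remove it.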

Status

Completed

Tech Stack

Python, NLP, Sentence Transformers, BM25, OpenAI API, Pandas, NumPy, NLTK

Year

2024
