Contextual Retrieval System
A hybrid retrieval system combining semantic search and BM25 with context enrichment, achieving a (naively measured) average accuracy of 2.92/3.0 on complex queries.
Overview
This project implements a hybrid document retrieval system that combines vector embeddings (semantic search) with BM25 (lexical search) to improve query results. Before indexing, the system enriches each document chunk with an AI-generated contextual description, an approach guided by Anthropic's blog post on Contextual Retrieval.

The system processes documents by splitting them into manageable chunks, generating a contextual description for each chunk, and then building both an embedding index and a BM25 index. When a query arrives, both retrieval methods run and their results are combined using a rank fusion algorithm.
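The result-combination step can be sketched as follows. The project does not say which rank fusion algorithm it uses, so this example assumes reciprocal rank fusion (RRF), one common choice. The function name, the chunk IDs, and the constant k=60 are illustrative assumptions, not taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical top-3 results from each retriever.
semantic_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Chunks that both retrievers return accumulate two contributions, so they tend to rise above chunks found by only one method, which is the main reason to fuse by rank rather than by raw score.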
Key Features
- Hybrid retrieval combining semantic search with BM25
- AI-powered context enrichment of document chunks
- Rank fusion algorithm for result combination
- Comprehensive token usage tracking and statistics
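A minimal sketch of how context enrichment and token tracking might fit together is shown here. Everything in it is an assumption made for illustration: the `TokenStats` class, the word-based chunking, the whitespace token estimate, and the `describe` callback that stands in for a real LLM call are hypothetical and not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class TokenStats:
    """Running totals for the (hypothetical) LLM calls made during enrichment."""
    calls: int = 0
    input_tokens: int = 0
    output_tokens: int = 0


def rough_token_count(text):
    # Crude whitespace proxy; a real system would use the provider's tokenizer.
    return len(text.split())


def split_into_chunks(text, chunk_size=50):
    # Fixed-size word windows; an assumed chunking strategy.
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def enrich_chunks(document, describe, stats, chunk_size=50):
    """Prepend a contextual description to each chunk and record token usage."""
    enriched = []
    for chunk in split_into_chunks(document, chunk_size):
        context = describe(document, chunk)  # stand-in for one LLM API call
        stats.calls += 1
        # The whole document plus the chunk is sent as input on every call.
        stats.input_tokens += rough_token_count(document) + rough_token_count(chunk)
        stats.output_tokens += rough_token_count(context)
        enriched.append(f"{context}\n\n{chunk}")
    return enriched


def fake_describe(document, chunk):
    # Placeholder for an LLM-generated description.
    return "Context: excerpt from a sample document."


stats = TokenStats()
doc = " ".join(f"word{i}" for i in range(120))
chunks = enrich_chunks(doc, fake_describe, stats, chunk_size=50)
print(len(chunks), stats)
```

Because the full document accompanies every chunk, input tokens grow roughly with document length times chunk count, which is the overhead the token statistics are meant to expose.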
Challenges & Solutions
The main challenge was balancing retrieval accuracy with computational efficiency. Context enrichment improved accuracy but added token overhead and processing time. I concluded that in real-world applications, the small gains in RAG accuracy from contextual embeddings do not justify the much higher latency caused by calling an LLM provider's API for every chunk.
Status
Completed
Tech Stack
Year
2024
Related Projects
Merin
An intelligent email platform reimagined for the AI era, designed to help users process emails faster with AI-powered assistance.
Obsidian Vercel
A tool for Obsidian users to avoid paying for publish/sync and host their notes on Vercel via a CI/CD pipeline.
UBC Purity Test
A fun platform that allows UBC students to test their innocence level with custom surveys for different faculties.