Contextual Retrieval System
A hybrid retrieval system that combines semantic search and BM25 with context enrichment, achieving a (naively measured) average accuracy of 2.92/3.0 on complex queries.
Overview
This project implements a hybrid document retrieval system that combines vector embeddings (semantic search) with BM25 (lexical search) to improve query results. Before indexing, the system enriches each document chunk with an AI-generated contextual description, following the approach described in Anthropic's blog post on Contextual Retrieval.
The system processes documents by splitting them into manageable chunks, generating a contextual description for each chunk, and then building both an embedding index and a BM25 index. When a query arrives, both retrieval methods run and their results are combined using a rank fusion algorithm.
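The fusion step can be illustrated with a minimal sketch. The project does not say which rank fusion algorithm it uses, so this example assumes Reciprocal Rank Fusion (RRF), a common choice for merging semantic and BM25 rankings. The function name, the chunk IDs, and the constant k=60 are illustrative assumptions rather than details taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Assumption: RRF is used here for illustration; the project's actual
    fusion algorithm is not specified. Each list contributes
    1 / (k + rank) to the score of every chunk it contains.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical top results from each retriever, best match first.
semantic_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Chunks that appear in both lists accumulate score from each, so they tend to rise above chunks found by only one retriever, which is the effect hybrid retrieval is after.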
Key Features
- Hybrid retrieval combining semantic search with BM25
- AI-powered context enrichment of document chunks
- Rank fusion algorithm for result combination
- Comprehensive token usage tracking and statistics
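As a rough sketch of how context enrichment and token tracking might fit together, the example below prepends a generated description to each chunk and records approximate token counts per call. Everything named here is hypothetical: `TokenStats`, `enrich_chunks`, and `stub_describe` are not from the project, the stub stands in for a real LLM call, and whitespace-separated word counts are only a crude stand-in for a provider's actual token counts.

```python
from dataclasses import dataclass


@dataclass
class TokenStats:
    """Running totals for enrichment calls (hypothetical helper)."""
    calls: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, prompt, completion):
        # Assumption: word counts approximate tokens; a real system would
        # read the usage figures returned by the LLM provider instead.
        self.calls += 1
        self.input_tokens += len(prompt.split())
        self.output_tokens += len(completion.split())


def stub_describe(document, chunk):
    """Placeholder for an LLM call that situates a chunk in its document."""
    return "From a document beginning: " + document[:40].strip()


def enrich_chunks(document, chunks, describe, stats):
    """Prepend a contextual description to each chunk before indexing."""
    enriched = []
    for chunk in chunks:
        prompt = document + "\n\n" + chunk  # one call per chunk
        context = describe(document, chunk)
        stats.record(prompt, context)
        enriched.append(context + "\n\n" + chunk)
    return enriched


document = "Alpha beta gamma. Delta epsilon."
chunks = ["Alpha beta gamma.", "Delta epsilon."]
stats = TokenStats()
enriched = enrich_chunks(document, chunks, stub_describe, stats)
print(stats)
```

Because the full document is sent with every chunk, input tokens grow with both document length and chunk count, which is why tracking usage per call is useful when judging the cost of enrichment.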
Challenges & Solutions
The main challenge was balancing retrieval accuracy with computational efficiency. Context enrichment improved accuracy but added token overhead and processing time. I concluded that in real-world applications, the small gains in RAG accuracy from contextual embeddings rarely justify the substantially higher latency of calling an LLM provider's API for every chunk.
Status
Completed
Tech Stack
Year
2024
Related Projects
Heida
An AI command center that unifies 220+ AI models with your own API keys, featuring document intelligence, interactive tools, and persistent knowledge graphs.
MNIST Digit Classifier
Neural network implementation from scratch with 98.32% accuracy on the MNIST handwritten digit dataset.
UBC Metrics
Course difficulty prediction system with 4.84% error rate based on historical grade distributions.