Hybrid search: combining embeddings with BM25 for better RAG
Victor Huang, Mar 15, 2026
Pure vector search misses keyword-heavy queries. I've implemented hybrid search (vector + BM25) and the improvement is significant.
Architecture
1. Store documents in both a vector DB (Pinecone) and a text search engine (Elasticsearch)
2. Query both in parallel
3. Use Reciprocal Rank Fusion to combine results
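Querying both backends in parallel (step 2) can be sketched with a thread pool, since both calls are network-bound. This is only a minimal sketch, not the setup from the post: `search_both`, `vector_search`, and `keyword_search` are hypothetical names, and the two callables stand in for whatever wraps the Pinecone and Elasticsearch queries.

```python
from concurrent.futures import ThreadPoolExecutor


def search_both(query, vector_search, keyword_search, k=10):
    """Run two retrievers concurrently and return both ranked lists.

    vector_search and keyword_search are hypothetical callables that take
    (query, k) and return a ranked list of document IDs.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        vector_future = pool.submit(vector_search, query, k)
        keyword_future = pool.submit(keyword_search, query, k)
        # result() blocks until the call finishes and re-raises any exception.
        return vector_future.result(), keyword_future.result()
```

With this shape, total latency is roughly that of the slower backend rather than the sum of the two.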
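def hybrid_search(query, k=10, alpha=0.7):
    # Vector search: embed the query and fetch the k nearest neighbours
    embedding = get_embedding(query)
    vector_results = pinecone_index.query(vector=embedding, top_k=k)
    # BM25 search: Elasticsearch scores match queries with BM25 by default;
    # "size" asks for k hits (without it the default is 10)
    bm25_results = es.search(index="docs", body={"query": {"match": {"text": query}}, "size": k})
    # Reciprocal Rank Fusion (rrf_combine is not shown here;
    # alpha presumably weights the vector ranking against BM25)
    return rrf_combine(vector_results, bm25_results, alpha=alpha)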
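The post does not show `rrf_combine`, so here is one way it could look, as a sketch under stated assumptions. Standard Reciprocal Rank Fusion scores each document as the sum of 1 / (c + rank) over the rankings it appears in, with c commonly set to 60. The version below is a weighted variant and assumes `alpha` is the weight on the vector ranking and `1 - alpha` the weight on BM25. It also assumes the inputs are already plain ranked lists of string document IDs, so the IDs would first need to be pulled out of the Pinecone and Elasticsearch responses. The `c` and `top_n` parameters are my additions.

```python
from collections import defaultdict


def rrf_combine(vector_ids, bm25_ids, alpha=0.7, c=60, top_n=10):
    """Weighted Reciprocal Rank Fusion over two ranked lists of document IDs.

    Assumption: alpha weights the vector ranking and (1 - alpha) the BM25
    ranking. c is the usual RRF smoothing constant.
    """
    scores = defaultdict(float)
    for weight, ranked_ids in ((alpha, vector_ids), (1.0 - alpha, bm25_ids)):
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties fall back to the ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:top_n]]
```

Because RRF only looks at ranks, the raw cosine and BM25 scores never need to be normalized against each other, which is the main reason to prefer it over a direct score blend.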
On our benchmark, hybrid search raised recall@10 from 0.78 (vector-only) to 0.91.
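For anyone who wants to run the same comparison on their own data, recall@k can be computed as below. The post does not say how the benchmark was scored, so this is just the usual definition (the fraction of a query's relevant documents that appear in the top k), averaged over queries. The names `recall_at_k` and `mean_recall_at_k` are mine.

```python
def recall_at_k(retrieved_ids, relevant_ids, k=10):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    hits = relevant.intersection(retrieved_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k=10):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs, one per query."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)
```

Scoring the vector-only and hybrid result lists against the same relevance labels gives directly comparable numbers.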