Embedding drift: model updates breaking vector search
Has anyone dealt with embedding drift when OpenAI updates an embedding model? I have 2M vectors in Pinecone, generated with text-embedding-3-large. If the model weights change, all my stored vectors become incompatible with new query vectors, since embeddings from different model versions don't live in the same vector space.
Questions:
1. Does OpenAI version their embedding models like they do with GPT (e.g., text-embedding-3-large-20240101)?
2. If the model is updated, do I need to re-embed all my documents?
3. Has anyone built a migration strategy for this?
This is a real concern at scale. Re-embedding 2M documents would cost ~$260 and take significant compute time.
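For reference, the kind of migration I'm imagining: tag every vector's metadata with the model version that produced it, then re-embed in batches only the vectors whose tag doesn't match the target model. This is a minimal sketch with hypothetical field names (`embed_model`, `text`); `embed_fn` stands in for the actual OpenAI/Pinecone calls, which would go where the comment indicates.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class VectorRecord:
    """In-memory stand-in for a vector DB record (id, embedding, metadata)."""
    id: str
    values: list[float]
    metadata: dict = field(default_factory=dict)

def migrate_index(
    records: list[VectorRecord],
    target_model: str,
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
) -> int:
    """Re-embed only records tagged with a different model version.

    Assumes the raw source text was kept in metadata["text"] at ingest time,
    so documents never need to be re-fetched. Returns the number of records
    migrated; a second run is a no-op, so the job is safely resumable.
    """
    stale = [r for r in records if r.metadata.get("embed_model") != target_model]
    for i in range(0, len(stale), batch_size):
        batch = stale[i : i + batch_size]
        texts = [r.metadata["text"] for r in batch]
        new_vecs = embed_fn(texts)  # real version: OpenAI embeddings + Pinecone upsert
        for rec, vec in zip(batch, new_vecs):
            rec.values = vec
            rec.metadata["embed_model"] = target_model
    return len(stale)
```

Because only mismatched records are touched, the same loop doubles as a gradual rollout: you can migrate a slice of the index per night and filter queries by `embed_model` in the meantime so old and new vectors are never compared against each other.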
I combine approaches #2 and #3: summarize messages older than 20 turns, then embed the summaries. When the user references something from early in the conversation, RAG retrieves the relevant summary.
Works surprisingly well for long customer support sessions.
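A sketch of that rolling-summary step, with hypothetical `summarize_fn` and `summary_store` standing in for the actual LLM summarizer and the vector index the summaries get embedded into:

```python
from typing import Callable

def compact_history(
    turns: list[str],
    summarize_fn: Callable[[list[str]], str],
    summary_store: list[str],
    keep_recent: int = 20,
) -> list[str]:
    """Keep the most recent turns verbatim; summarize everything older.

    The summary is appended to summary_store, where the real system would
    embed it so RAG can retrieve it when the user references early context.
    """
    if len(turns) <= keep_recent:
        return turns
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary_store.append(summarize_fn(older))
    return recent
```

Running this after each exchange keeps the live prompt bounded at `keep_recent` turns while the summaries accumulate as retrievable memory.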