Add an option to prevent immediate embeddings (re)computation #859
                  
                    
bpolaszek started this conversation in Feedback & Feature Proposal
            Replies: 0 comments
  
Hello there,
I've been playing with multiple embedders recently on a dataset of 3M+ documents, mostly for a POC to decide which embedder is the most relevant for semantic search on my data.

When your index has too many entries, adding a new embedder via `PATCH /indexes/:indexUid/settings/embedders` triggers API calls for all your documents, likely leading to 429 / 500 errors because of the number of entries to process, and the task never actually completes. The user has no control over the embedder's API rate limit, and Meilisearch is likely to process documents in huge batches.

Another use case I had was editing the `documentTemplate` of an existing embedder: even if the rendered document template doesn't change, Meilisearch triggers an immediate recomputation for your whole dataset (in my case: an index of 40GB!). If you cancel the task because of that recomputation, Meilisearch also cancels the update of the embedder settings.

Could you consider adding a parameter like `recompute: true/false` (defaults to `true`) on `PATCH /indexes/:indexUid/settings/embedders`, or something similar? The idea is that whenever `recompute` is `false`, Meilisearch just updates the embedder settings, period. Embeddings would only be computed on subsequent document additions or updates, and the user would have to provide vectors for existing documents themselves.
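To make the request concrete, here is a minimal Python sketch of what such a call could look like. It only builds the request and sends nothing over the network. Several things here are assumptions and not existing Meilisearch behavior: the `recompute` flag is the hypothetical parameter proposed above, passing it as a query-string parameter is just one possible design, and the helper names `build_embedder_patch` and `with_user_vectors` are made up for illustration. The `_vectors` document field reflects my understanding of how user-provided vectors are attached to documents, and the embedder name `default` is only an example.

```python
import json
from typing import Any


def build_embedder_patch(
    index_uid: str,
    embedders: dict[str, Any],
    recompute: bool = True,
) -> tuple[str, str, str]:
    """Build (method, path, body) for an embedders settings update.

    `recompute` is HYPOTHETICAL: it is the flag proposed in this post and
    is not an existing Meilisearch parameter. Sending it as a query-string
    parameter is an assumption made for this sketch.
    """
    path = f"/indexes/{index_uid}/settings/embedders"
    if not recompute:
        # Proposed behavior: update the settings only, skip re-embedding.
        path += "?recompute=false"
    return "PATCH", path, json.dumps(embedders)


def with_user_vectors(
    doc: dict[str, Any],
    embedder: str,
    vector: list[float],
) -> dict[str, Any]:
    """Return a copy of `doc` carrying a precomputed vector for `embedder`.

    Assumes vectors are supplied under the `_vectors` document field,
    keyed by embedder name. The input document is not mutated.
    """
    vectors = {**doc.get("_vectors", {}), embedder: vector}
    return {**doc, "_vectors": vectors}


if __name__ == "__main__":
    method, path, body = build_embedder_patch(
        "movies",
        {"default": {"source": "userProvided", "dimensions": 3}},
        recompute=False,
    )
    print(method, path)
    print(body)
    print(with_user_vectors({"id": 1, "title": "Example"}, "default", [0.1, 0.2, 0.3]))
```

With something like this, the settings change would be applied on its own, and existing documents could be backfilled with vectors at whatever pace the embedder's API rate limit allows.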