-
Notifications
You must be signed in to change notification settings - Fork 327
Embedder model router #2232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Embedder model router #2232
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -90,7 +90,11 @@ def maybe_inject_local_hf_embedder(task_config: Dict[str, Any], transform_config | |||||||||||||||||||||
| if has_endpoint or not use_local: | ||||||||||||||||||||||
| return | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| from nemo_retriever.models import create_local_embedder, resolve_embed_model, is_vl_embed_model | ||||||||||||||||||||||
| from nemo_retriever.model import ( | ||||||||||||||||||||||
| create_local_embedder, | ||||||||||||||||||||||
| resolve_embed_model, | ||||||||||||||||||||||
| resolve_embed_model_use_vl, | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
Comment on lines
+93
to
+97
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Prompt To Fix With AIThis is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/models/inference/processor.py
Line: 93-97
Comment:
The import uses `nemo_retriever.model` (singular), but the package is `nemo_retriever.models` (plural). There is no `nemo_retriever/model.py` module in the repo, so this raises `ModuleNotFoundError` at runtime on every call to `maybe_inject_local_hf_embedder` when no remote endpoint is configured — i.e., the exact case this function is meant to handle.
```suggestion
from nemo_retriever.models import (
create_local_embedder,
resolve_embed_model,
resolve_embed_model_use_vl,
)
```
How can I resolve this? If you propose a fix, please make it concise. |
||||||||||||||||||||||
|
|
||||||||||||||||||||||
| embed_model = resolve_embed_model( | ||||||||||||||||||||||
| task_config.get("embed_model_name") | ||||||||||||||||||||||
|
|
@@ -103,6 +107,7 @@ def maybe_inject_local_hf_embedder(task_config: Dict[str, Any], transform_config | |||||||||||||||||||||
|
|
||||||||||||||||||||||
| ingest_backend = (task_config.get("local_ingest_embed_backend") or "vllm").strip().lower() | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| model_arch = task_config.get("embed_model_arch") | ||||||||||||||||||||||
| embedder_instance = create_local_embedder( | ||||||||||||||||||||||
| embed_model, | ||||||||||||||||||||||
| backend=ingest_backend, | ||||||||||||||||||||||
|
|
@@ -112,11 +117,14 @@ def maybe_inject_local_hf_embedder(task_config: Dict[str, Any], transform_config | |||||||||||||||||||||
| enforce_eager=_to_bool(task_config.get("enforce_eager"), default=False), | ||||||||||||||||||||||
| dimensions=task_config.get("dimensions"), | ||||||||||||||||||||||
| query_max_length=int(task_config.get("query_max_length", 128)), | ||||||||||||||||||||||
| model_arch=model_arch, | ||||||||||||||||||||||
| ) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| prefix = f"{transform_config.input_type}: " if getattr(transform_config, "input_type", None) else "" | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| if is_vl_embed_model(embed_model): | ||||||||||||||||||||||
| use_vl = resolve_embed_model_use_vl(embed_model, model_arch=model_arch) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| if use_vl: | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
| def _embed(texts): | ||||||||||||||||||||||
| vecs = embedder_instance.embed(texts, batch_size=local_batch_size) | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this still here? Shouldn't this go away if we accept universal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like these might be architectures? Please elaborate, on why this distinction would still be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this implementation I wasn't planning on accepting universal embedding models for now but rather focusing on either vl or text that nvidia already has on huggingface.
We would need this explicit distinction in the case to avoid routing a VL checkpoint through the text embedder and vice versa.