[Bugfix] Fix nomic max_model_len #18755

noooop · 2025-05-27T08:26:23Z

This PR fixes max_model_len for the nomic models. The fix is needed because we did not use the nomic context extension method.

For nomic-ai/nomic-embed-text-v1, an input length greater than 2048 results in NaN.
For nomic-embed-text-v2-moe, the length is set to 512 by sentence_bert_config.json.

context extension

offline

from vllm import LLM

rope_theta = 1000
factor = 4.0
original_max_position_embeddings = 2048

# Use yarn to extend context
hf_overrides = {
    "rope_theta": rope_theta,
    "rope_scaling": {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": original_max_position_embeddings
    },
    "max_model_len": int(original_max_position_embeddings * factor)
}

llm = LLM(model="nomic-ai/nomic-embed-text-v1",
          trust_remote_code=True,
          task="embed",
          hf_overrides=hf_overrides)

online

vllm serve nomic-ai/nomic-embed-text-v1 --trust-remote-code --hf-overrides '{"rope_theta": 1000, "rope_scaling": {"rope_type": "yarn", "factor": 4.0, "original_max_position_embeddings": 2048}, "model_max_length": 8192}'
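As a rough sketch of how the two forms above relate, the snippet below builds the override dictionary from the offline example and serializes it to a JSON string of the kind passed to --hf-overrides. The helper name build_yarn_overrides is hypothetical, and the key names simply follow the offline example. It only illustrates the arithmetic (2048 * 4.0 = 8192), not how vLLM consumes the values.

```python
import json


def build_yarn_overrides(rope_theta, factor, original_max_position_embeddings):
    """Hypothetical helper: assemble the YaRN override dict from the offline example."""
    return {
        "rope_theta": rope_theta,
        "rope_scaling": {
            "rope_type": "yarn",
            "factor": factor,
            "original_max_position_embeddings": original_max_position_embeddings,
        },
        # Extended context length = original length * scaling factor.
        "max_model_len": int(original_max_position_embeddings * factor),
    }


overrides = build_yarn_overrides(1000, 4.0, 2048)

# JSON string of the kind that could be pasted after --hf-overrides.
cli_json = json.dumps(overrides)

print(overrides["max_model_len"])
print(cli_json)
```

Deriving the length from the factor and the original length keeps the three numbers from drifting apart when one of them changes.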

github-actions · 2025-05-27T08:26:34Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

noooop · 2025-05-27T09:21:09Z

@DarkLight1337

I have written two examples of context_extension, but I don't know where the documentation should be placed.

examples/offline_inference/context_extension/chat.py

noooop · 2025-05-27T12:40:50Z

@DarkLight1337

This is the best I could do with the code and tests. This piece of code looks very funny.

noooop · 2025-05-27T12:58:56Z

Is it possible to merge this fix before the v0.9.0 release?

vllm/config.py

vllm/model_executor/models/bert_with_rope.py

noooop · 2025-05-27T13:42:06Z

@DarkLight1337

How about now?

vllm/config.py

vllm/model_executor/models/bert_with_rope.py

tests/models/language/pooling/test_nomic_max_model_len.py

DarkLight1337

See if the test passes now

noooop · 2025-05-27T15:09:44Z

Thanks for reviewing

noooop · 2025-05-28T03:24:26Z

FAILED models/language/pooling/test_gte.py::test_models_mteb[model_info9] - AssertionError

should be fixed by #18747


In [1]: import pytest

In [2]: MTEB_EMBED_TOL = 1e-4

In [3]: vllm_main_score = 0.7583957427443586

In [5]: st_main_score = 0.758473459018872

In [6]: st_main_score == pytest.approx(vllm_main_score, abs=MTEB_EMBED_TOL)
Out[6]: True

In [7]: st_main_score == pytest.approx(vllm_main_score, rel=MTEB_EMBED_TOL)
Out[7]: False
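The two results above differ only because of what the tolerance is measured against. A minimal sketch of the arithmetic without pytest, assuming that approx compares the difference against abs directly, and against rel multiplied by the magnitude of the expected value (here vllm_main_score):

```python
MTEB_EMBED_TOL = 1e-4
vllm_main_score = 0.7583957427443586
st_main_score = 0.758473459018872

# Absolute difference between the two scores (roughly 7.77e-5).
diff = abs(st_main_score - vllm_main_score)

# Absolute tolerance: the difference is compared against 1e-4 directly.
abs_ok = diff <= MTEB_EMBED_TOL

# Relative tolerance: the allowed difference shrinks to 1e-4 * |expected|
# (roughly 7.58e-5), which is slightly smaller than the actual difference.
rel_ok = diff <= MTEB_EMBED_TOL * abs(vllm_main_score)

print(diff, abs_ok, rel_ok)
```

Because the scores are below 1, the relative tolerance is the stricter of the two, which is why the same pair of scores passes with abs and fails with rel.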

fix nomic max_model_len

1e0edcc

+ examples

994ac46

mergify bot added the documentation label May 27, 2025

DarkLight1337 reviewed May 27, 2025

examples/offline_inference/context_extension/chat.py (outdated)

noooop added 4 commits May 27, 2025 18:16

fix

6333d80

fix

b2846e2

fix

e1e920d

fix

75ed7be

noooop added 2 commits May 27, 2025 20:43

fix

52d9ce7

fix

c594729

noooop marked this pull request as ready for review May 27, 2025 12:58

noooop requested a review from ywang96 as a code owner May 27, 2025 12:58

fix

3ab82d6

DarkLight1337 reviewed May 27, 2025

vllm/config.py (outdated)

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py (outdated)

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py (outdated)

noooop added 2 commits May 27, 2025 21:39

fix

5466920

fix

9966739

DarkLight1337 reviewed May 27, 2025

vllm/config.py (outdated)

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py (outdated)

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py (outdated)

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py (outdated)

noooop added 2 commits May 27, 2025 22:02

fix

d06275a

fix

653d573

DarkLight1337 reviewed May 27, 2025

vllm/model_executor/models/bert_with_rope.py

noooop added 3 commits May 27, 2025 22:23

fix

c04050a

fix

b5fa6bd

fix

c746de9

DarkLight1337 reviewed May 27, 2025

tests/models/language/pooling/test_nomic_max_model_len.py (outdated)

noooop added 2 commits May 27, 2025 23:05

fix

886eb32

fix

de07c5d

DarkLight1337 approved these changes May 27, 2025


DarkLight1337 enabled auto-merge (squash) May 27, 2025 15:11

github-actions bot added the ready label May 27, 2025

noooop added 2 commits May 27, 2025 23:13

Merge branch 'vllm-project:main' into fix_nomic

cc0f499

Merge branch 'vllm-project:main' into fix_nomic

ada5fb6

noooop mentioned this pull request May 28, 2025

[Model]: Fused MoE for nomic-embed-text-v2-moe #18321

Open

fix

961ef1a

auto-merge was automatically disabled May 28, 2025 01:56
Head branch was pushed to by a user without write access

Merge branch 'vllm-project:main' into fix_nomic

5ac1f2f

vllm-bot merged commit 3e9ce60 into vllm-project:main May 28, 2025
56 of 67 checks passed

noooop mentioned this pull request May 29, 2025

[Bugfix] Fix the failing gte embedding test #18720

Merged
