[WIP] Minimal Tokenizer Implementation #513

Draft
wants to merge 14 commits into base: main
updated main.py to use get_tokenizer() in run_model_cli()
snico432 committed Nov 28, 2024
commit fdcdb24e452d48f183c26e933c5b51cfcc448865
2 changes: 1 addition & 1 deletion exo/main.py
@@ -172,7 +172,7 @@ async def run_model_cli(node: Node, inference_engine: InferenceEngine, model_nam
   if not shard:
     print(f"Error: Unsupported model '{model_name}' for inference engine {inference_engine.__class__.__name__}")
     return
-  tokenizer = await resolve_tokenizer(get_repo(shard.model_id, inference_class))
+  tokenizer = await node.inference_engine.get_tokenizer(shard)
   request_id = str(uuid.uuid4())
   callback_id = f"cli-wait-response-{request_id}"
   callback = node.on_token.register(callback_id)