Skip to content

SAFETENSORS and OpenAI style endpoint #388

Open
@RonanKMcGovern

Description

@RonanKMcGovern

System Info

I have searched the repo here and the main server repo but don't see any information on either a) support for Safetensors (many models are saved that way on HF) and also b) whether there is an openai style endpoint that can be hit?

Seems like this library is getting better speed over vLLM and TGI for DBRX (and in general) so I'd be keen to test this out and see if I can get a docker image running.

Who can help?

@juney-nvidia

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

See docs

Expected behavior

It would be helpful to have info on endpoint style and also supported file types (specifically safetensors) in the docs.

actual behavior

NA

additional notes

NA

Metadata

Metadata

Assignees

Labels

questionFurther information is requestedtriagedIssue has been triaged by maintainers

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions