feat: add machine tag and inference timings #4577
Conversation
core/http/endpoints/openai/files.go
Outdated
@@ -120,6 +122,7 @@ func getFileFromRequest(c *fiber.Ctx) (*schema.File, error) {
 // @Router /v1/files/{file_id} [get]
 func GetFilesEndpoint(cm *config.BackendConfigLoader, appConfig *config.ApplicationConfig) func(c *fiber.Ctx) error {
 	return func(c *fiber.Ctx) error {
+		c.Set("LocalAI-Machine-Tag", appConfig.MachineTag)
this could be quite sensitive to add to each reply without the user explicitly opting in. Maybe it should be behind a feature flag in the config.
yeah, I'll create that opt-in as well
core/http/endpoints/openai/files.go
Outdated
@@ -82,6 +83,7 @@ func getNextFileId() int64 {
 func ListFilesEndpoint(cm *config.BackendConfigLoader, appConfig *config.ApplicationConfig) func(c *fiber.Ctx) error {
 	return func(c *fiber.Ctx) error {
+		c.Set("LocalAI-Machine-Tag", appConfig.MachineTag)
maybe we should have a middleware instead that adds this to each response?
let me know if you need any help here!
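For context, a minimal sketch of what such a middleware could look like, assuming Fiber v2 and the `appConfig.MachineTag` field already shown in the diffs above (the function name and the empty-tag guard are hypothetical, not taken from this PR):

```go
// machineTagMiddleware stamps every response with the configured machine
// tag, so individual handlers no longer need to set the header themselves.
func machineTagMiddleware(appConfig *config.ApplicationConfig) fiber.Handler {
	return func(c *fiber.Ctx) error {
		// Only emit the header when a tag was explicitly configured (opt-in).
		if appConfig.MachineTag != "" {
			c.Set("Machine-Tag", appConfig.MachineTag)
		}
		return c.Next()
	}
}
```

Registered once with `app.Use(machineTagMiddleware(appConfig))`, this would cover every endpoint instead of repeating the `c.Set` call per handler.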
Liking that, thanks! The addition of two new fields in the reply looks reasonable. My other comments are mostly about adding the machine tag via a middleware to avoid repetition, and about controlling with a flag whether that information is returned or not.
I've written a help message for the LOCALAI_MACHINE_TAG config env variable. I have also renamed the LocalAI-Machine-Tag header to just Machine-Tag, since the '-'-separated parts of a header name can only be capitalized: in further use the tag transforms into Localai-Machine-Tag and confuses some header parser boilerplate, which is strange, because AFAIK headers are case insensitive. Waiting for your feedback!
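The transformation described here is Go's standard header canonicalization: each '-'-separated token gets its first letter upper-cased and the rest lowered. A quick way to see it:

```go
package main

import (
	"fmt"
	"net/textproto"
)

func main() {
	// "LocalAI" is one token, so it canonicalizes to "Localai".
	fmt.Println(textproto.CanonicalMIMEHeaderKey("LocalAI-Machine-Tag")) // Localai-Machine-Tag
	// "Machine-Tag" is already in canonical form and survives unchanged.
	fmt.Println(textproto.CanonicalMIMEHeaderKey("Machine-Tag")) // Machine-Tag
}
```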
 		CompletionTokens: tokenUsage.Completion,
 		TotalTokens:      tokenUsage.Prompt + tokenUsage.Completion,
 	}
+	if extraUsage {
any specific reason to put this under a feature flag?
Since the data is already part of the response there is no extra computation penalty as far as I can see, and as it's just statistical data it would probably be safe to always return it as part of the response
I thought that since LocalAI is a drop-in replacement for the OpenAI API, it needs to have strictly the same response models, since some parsers can fail due to extra fields
if you think that it's safe, I can just remove that feature flag so it's always included
or it can be made opt-out as well
I see your point. I'm fine keeping it behind a flag, but we would need to update the docs in this case, as it would otherwise go unnoticed (it does not surface in the CLI help either)
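For reference, a hedged sketch of what the extended usage block could look like behind the flag discussed here (the timing field names are illustrative assumptions, not copied from the PR):

```go
// Usage mirrors the OpenAI usage block, plus the opt-in timing fields
// that are only populated when extraUsage is enabled.
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
	// Extra, non-OpenAI fields: inference timings in milliseconds.
	TimingPromptProcessing float64 `json:"timing_prompt_processing,omitempty"`
	TimingTokenGeneration  float64 `json:"timing_token_generation,omitempty"`
}
```

Keeping the extra fields behind `omitempty` plus the flag preserves byte-compatibility with strict OpenAI response parsers by default.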
Thank you @mintyleaf! Just a few questions inline - direction looks good to me! Also - could you sign off your commits?
…> endpoint extraUsage data is broken for now Signed-off-by: mintyleaf <[email protected]>
Signed-off-by: mintyleaf <[email protected]>
Signed-off-by: mintyleaf <[email protected]>
Force-pushed from 708c566 to 4f9ed3a
@mintyleaf I'm merging this as-is, but will have to make sure to adapt the docs for the headers that enable extra stats - maybe care to open a follow-up? Thanks anyway!
Description
For my P2P endeavors I need to measure inference time and to know exactly which machine handled a given request.
Since #3687 is unfinished (it was never even wired up to the routers) and didn't provide enough control in a P2P environment, I created an opt-in extension to the OpenAI response that includes timing data in ms. Token count plus duration in ms is enough to derive the other data, like tokens per second, which the grpc-server.cpp glue backend calculates just for printing out the timings.
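To illustrate that derivation (a sketch with hypothetical names, not code from this PR), tokens per second falls straight out of the two values the response carries:

```go
// tokensPerSecond derives throughput from the generated token count and
// the generation duration in milliseconds returned in the extra usage data.
func tokensPerSecond(completionTokens int, generationMs float64) float64 {
	if generationMs <= 0 {
		return 0 // avoid division by zero on empty or instant replies
	}
	return float64(completionTokens) * 1000.0 / generationMs
}
```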
There is also a header carrying the machine hostname, or an opt-in custom tag if one is specified, which is useful when roaming through rented machines on services like vast.ai to get a stable identifier for them.
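A hedged usage sketch for the header (the endpoint path and default port are assumptions about a typical local setup): any HTTP client can read the tag off the response.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Assumes a local LocalAI instance on the default port with the
	// opt-in machine tag configured.
	resp, err := http.Get("http://localhost:8080/v1/models")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("served by:", resp.Header.Get("Machine-Tag"))
}
```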
I'm still not sure about the naming, and about putting two more doubles into the Reply protobuf packet, so I'm waiting for feedback.
Signed commits