
feat: add machine tag and inference timings #4577

Merged: 3 commits, Jan 17, 2025

Conversation

@mintyleaf (Contributor) commented Jan 11, 2025

Description
For my P2P endeavors I need to measure inference time and to know exactly which machine a request was sent from.
Since #3687 is unfinished and was never even wired up to the routers, and it doesn't provide enough control in a P2P environment, I created an opt-in extension to the OpenAI response that includes timing data in milliseconds (a token count plus a duration in ms is enough to derive the remaining figures, such as tokens per second, which the grpc-server.cpp glue backend calculates anyway just for printing out the timings; see the short sketch below).
There is also a header carrying the machine hostname, or an opt-in custom tag if one is specified, which is useful when roaming across rented machines on services like vast.ai to get a stable identifier for them.

I'm still not sure about the naming, and about putting two more doubles into the Reply protobuf packet, so I'm waiting for feedback.
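
A minimal sketch of that derivation (illustrative Go, not code from this PR): given a token count and a duration in milliseconds, tokens per second follows directly, so only the raw timings need to travel in the response.

package main

import "fmt"

// tokensPerSecond derives throughput from the two values the response would
// carry: a token count and a duration in milliseconds.
func tokensPerSecond(tokens int, durationMs float64) float64 {
	if durationMs <= 0 {
		return 0
	}
	return float64(tokens) / (durationMs / 1000.0)
}

func main() {
	// e.g. 128 completion tokens generated in 2500 ms -> 51.2 tok/s
	fmt.Printf("%.1f tok/s\n", tokensPerSecond(128, 2500))
}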

Signed commits

  • Yes, I signed my commits.

netlify bot commented Jan 11, 2025

Deploy Preview for localai ready!

🔨 Latest commit: 4f9ed3a
🔍 Latest deploy log: https://app.netlify.com/sites/localai/deploys/678a29c3f91dc10008c7ba47
😎 Deploy Preview: https://deploy-preview-4577--localai.netlify.app

@@ -120,6 +122,7 @@ func getFileFromRequest(c *fiber.Ctx) (*schema.File, error) {
// @Router /v1/files/{file_id} [get]
func GetFilesEndpoint(cm *config.BackendConfigLoader, appConfig *config.ApplicationConfig) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
c.Set("LocalAI-Machine-Tag", appConfig.MachineTag)
mudler (Owner):

This could be quite sensitive to add to each reply without the user explicitly opting in. Maybe it should be behind a feature flag in the config.

mintyleaf (Contributor, author):

Yeah, I'll make that opt-in as well.

@@ -82,6 +83,7 @@ func getNextFileId() int64 {
func ListFilesEndpoint(cm *config.BackendConfigLoader, appConfig *config.ApplicationConfig) func(c *fiber.Ctx) error {

return func(c *fiber.Ctx) error {
c.Set("LocalAI-Machine-Tag", appConfig.MachineTag)
mudler (Owner):

Maybe we should have a middleware instead that adds this to each response?
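
For reference, a minimal sketch of what such a middleware could look like, assuming fiber v2; the wiring (a plain string parameter, registering with app.Use) is illustrative and not taken from this PR:

package http

import "github.com/gofiber/fiber/v2"

// machineTagMiddleware sets the tag header on every response when a tag is
// configured, instead of repeating c.Set(...) in each endpoint handler.
func machineTagMiddleware(tag string) fiber.Handler {
	return func(c *fiber.Ctx) error {
		if tag != "" {
			c.Set("LocalAI-Machine-Tag", tag)
		}
		return c.Next()
	}
}

Registered once, e.g. app.Use(machineTagMiddleware(appConfig.MachineTag)), it would apply to all endpoints at once.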

mudler (Owner):

let me know if you need any help here!

@mudler (Owner) commented Jan 13, 2025

Liking that, thanks! The addition of two new fields in the reply looks reasonable. My other comments are mostly about adding the machine tag via a middleware to avoid repetition, and about controlling with a flag whether that information is returned or not.

@mudler changed the title from "Machine tag and inference timings data feature" to "feat: add machine tag and inference timings" on Jan 13, 2025
@mintyleaf (Contributor, author):

@mudler

I've written a help message for the LOCALAI_MACHINE_TAG config env variable.
Now, if the MachineTag option is not empty, the HTTP layer just uses a middleware that sets the header to that specific tag.
The hostname fallback seems unreasonable: in the scenario where this feature is most likely to be used, the tag shouldn't depend on the varying hostname of a rented machine.

I have also renamed the LocalAI-Machine-Tag header to just Machine-Tag, since only the first letter of each '-'-separated part of a header name stays capitalized; in further use the tag gets transformed into Localai-Machine-Tag, which confuses some header-parser boilerplate. That is strange, because AFAIK headers are case-insensitive.

Waiting for your feedback!
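
As an aside, the canonicalization described above matches what the Go standard library does with MIME header keys; a tiny illustrative snippet (not code from this PR):

package main

import (
	"fmt"
	"net/textproto"
)

func main() {
	// Only the first letter of each '-'-separated part is kept upper-case,
	// the rest is lower-cased, so mixed-case names get rewritten:
	fmt.Println(textproto.CanonicalMIMEHeaderKey("LocalAI-Machine-Tag")) // Localai-Machine-Tag
	fmt.Println(textproto.CanonicalMIMEHeaderKey("machine-tag"))         // Machine-Tag
}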

CompletionTokens: tokenUsage.Completion,
TotalTokens: tokenUsage.Prompt + tokenUsage.Completion,
}
if extraUsage {
mudler (Owner):

Any specific reason to put this under a feature flag?

Since the data is already part of the response there is no extra computation penalty as far as I can see, and as it's just statistical data it would probably be safe to always return it as part of the response.

mintyleaf (Contributor, author):

I thought that, since LocalAI is a drop-in replacement for the OpenAI API, it needs to have strictly the same response models, since some parsers can fail on extra fields.

mintyleaf (Contributor, author) commented Jan 17, 2025:

If you think it's safe, I'll just remove the feature flag and this will always be included.

Or it can be made opt-out as well.

mudler (Owner):

I see your point. I'm fine with keeping it behind a flag, but in that case we would need to update the docs, as otherwise it would go unnoticed (it doesn't surface in the CLI help either).
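
For clarity, a rough sketch of how such an opt-in flag could gate the extra fields; the header name Extra-Usage, the field names, and the buildUsage helper are assumptions for illustration, not necessarily what this PR ships:

package openai

import "github.com/gofiber/fiber/v2"

// usage mirrors the OpenAI usage object; the timing fields (in ms) are omitted
// from the JSON unless they are explicitly set.
type usage struct {
	PromptTokens           int     `json:"prompt_tokens"`
	CompletionTokens       int     `json:"completion_tokens"`
	TotalTokens            int     `json:"total_tokens"`
	TimingPromptProcessing float64 `json:"timing_prompt_processing,omitempty"`
	TimingTokenGeneration  float64 `json:"timing_token_generation,omitempty"`
}

// buildUsage fills in the extra stats only when the client opts in via a
// (hypothetical) request header, so default responses stay strictly OpenAI-shaped.
func buildUsage(c *fiber.Ctx, prompt, completion int, promptMs, genMs float64) usage {
	u := usage{
		PromptTokens:     prompt,
		CompletionTokens: completion,
		TotalTokens:      prompt + completion,
	}
	if c.Get("Extra-Usage") != "" {
		u.TimingPromptProcessing = promptMs
		u.TimingTokenGeneration = genMs
	}
	return u
}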

@mudler (Owner) commented Jan 17, 2025

Thank you @mintyleaf! Just a few questions inline; the direction looks good to me!

Also, could you sign off your commits? Running git rebase origin/master --signoff will do.

@mintyleaf force-pushed the feature/token_timings branch from 708c566 to 4f9ed3a on January 17, 2025, 09:58
@mudler (Owner) commented Jan 17, 2025

@mintyleaf I'm merging this as-is, but we will have to make sure to adapt the docs for the headers that enable the extra stats. Maybe you'd care to open a follow-up? Thanks anyway!

@mudler merged commit 96f8ec0 into mudler:master on Jan 17, 2025
29 checks passed