[Bug] Increasing Delay for Probes in Quickstart Model #773

Open: jolfr wants to merge 2 commits into main

@jolfr (Contributor) commented Feb 28, 2025

Pull Request Description

Increases the initial delay of the liveness and readiness probes in the quickstart model definition from 120 seconds to 240 seconds.

Related Issues

Resolves: #772

Important: Before submitting, please complete the description above and review the checklist below.


Contribution Guidelines (Expand for Details)

We appreciate your contribution to aibrix! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:

Pull Request Title Format

Your PR title should start with one of these prefixes to indicate the nature of the change:

  • [Bug]: Corrections to existing functionality
  • [CI]: Changes to build process or CI pipeline
  • [Docs]: Updates or additions to documentation
  • [API]: Modifications to aibrix's API or interface
  • [CLI]: Changes or additions to the Command Line Interface
  • [Misc]: For changes not covered above (use sparingly)

Note: For changes spanning multiple categories, use multiple prefixes in order of importance.

Submission Checklist

  • PR title includes appropriate prefix(es)
  • Changes are clearly explained in the PR description
  • New and existing tests pass successfully
  • Code adheres to project style and best practices
  • Documentation updated to reflect changes (if applicable)
  • Thorough testing completed, no regressions introduced

By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.

jolfr and others added 2 commits February 28, 2025 18:21
Changing probe time to 240 seconds, as the model does not need that long to download

Signed-off-by: Thomas Jack Carroll <[email protected]>
@@ -51,7 +51,7 @@ spec:
 path: /health
 port: 8000
 scheme: HTTP
-initialDelaySeconds: 120
+initialDelaySeconds: 240
Collaborator

This is tricky because the waiting time depends heavily on the model size and the network. Maybe it makes more sense for the quickstart to use a smaller model, like opt-125m.

Contributor Author

Yeah, reducing the model size is one way to go; it would guarantee quicker startup times. Another way could be increasing the failure threshold: that way we get quick startup times if the network is fast, but don't get pod restarts if it is slow.
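
To put rough numbers on that trade-off, here is a minimal sketch in Python. It approximates the time a container has before the kubelet restarts it as `initialDelaySeconds + failureThreshold * periodSeconds`. The `periodSeconds` and `failureThreshold` values are assumptions (the Kubernetes defaults of 10 and 3), since the diff only shows `initialDelaySeconds`; probe timeouts and scheduling jitter are ignored, and the threshold of 15 is purely illustrative.

```python
def restart_deadline_seconds(initial_delay, period=10, failure_threshold=3):
    """Rough upper bound, in seconds after container start, before a
    continuously failing liveness probe causes a restart.

    period and failure_threshold default to the Kubernetes defaults
    (periodSeconds=10, failureThreshold=3); the manifest in this PR may
    set different values.
    """
    return initial_delay + failure_threshold * period


# Before this PR: initialDelaySeconds of 120 with assumed defaults.
original = restart_deadline_seconds(120)

# This PR: initialDelaySeconds raised to 240.
longer_delay = restart_deadline_seconds(240)

# Alternative: keep the 120 s delay but allow more failures (illustrative value).
higher_threshold = restart_deadline_seconds(120, failure_threshold=15)

print(original, longer_delay, higher_threshold)
```

Under these assumptions the last two options give the same restart deadline, but with the higher threshold probing still starts at 120 seconds, so the pod can be marked ready sooner when the download is fast.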

Contributor Author

I should have included machine specs in my original bug ticket. I'm running on a g2-standard-4 from GCP, which has a max bandwidth of 10 Gb/s, the slowest of the GPU machines, so model download is worst-case.
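
As a back-of-the-envelope check, the sketch below estimates the ideal transfer time from a model size and link bandwidth. The 4 GB model size and the `efficiency` factor are hypothetical; this thread does not state the size of the quickstart model or the throughput actually achieved, which may be well below the 10 Gb/s line rate.

```python
def download_seconds(model_size_gb, bandwidth_gbit_per_s, efficiency=1.0):
    """Estimated seconds to transfer model_size_gb gigabytes over a link of
    bandwidth_gbit_per_s gigabits per second.

    efficiency is the assumed fraction of line rate actually achieved
    (1.0 means the full nominal bandwidth).
    """
    if bandwidth_gbit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # 1 byte = 8 bits, so gigabytes * 8 gives gigabits.
    return model_size_gb * 8 / (bandwidth_gbit_per_s * efficiency)


# Hypothetical 4 GB model at the full 10 Gb/s line rate.
ideal = download_seconds(4, 10)

# Same model if only 10% of the line rate is achieved (assumed figure).
degraded = download_seconds(4, 10, efficiency=0.1)

print(f"ideal: {ideal:.1f} s, degraded: {degraded:.1f} s")
```

Comparing an estimate like this against the probe's initial delay gives a quick sense of whether the download alone can exhaust it on a given machine.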

Successfully merging this pull request may close these issues.

[bug] Probes for quickstart model kill pod
2 participants