
Discussion: Unifying versions for helm and router #80

Open

gaocegege opened this issue Feb 7, 2025 · 11 comments
Labels
question Further information is requested

Comments

@gaocegege
Collaborator

Currently, the stack version in the router is hard-coded at this line.

The Helm chart has its own version management. Should we maintain separate versions for different components, or just use the same version across the board?

gaocegege added the question label Feb 7, 2025
@gaocegege
Collaborator Author

gaocegege commented Feb 7, 2025

I really think we should steer clear of using the latest tag (lmcache/lmstack-router:latest) in the Helm chart. It always pulls from the Docker registry, which can take extra time, and it makes troubleshooting tricky since we won't know exactly which commit the user is running.

Related to #74
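
As a side note, pinning a tag also changes Kubernetes' default pull behavior. A minimal sketch (the tag value here is illustrative):

      containers:
      - name: router
        # With :latest (or no tag), Kubernetes defaults imagePullPolicy to
        # Always and contacts the registry on every pod start. A pinned tag
        # defaults to IfNotPresent, so a cached image is reused.
        image: lmcache/lmstack-router:v0.0.6
        imagePullPolicy: IfNotPresent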

@YuhanLiu11
Collaborator

Good question! @ApostaC what do you think?

@ApostaC
Collaborator

ApostaC commented Feb 7, 2025

@gaocegege Do you know what the industry best practice is for this?

My current feeling is that we should automate the router Docker image build and upload it to the GitHub container registry, with the version ID defined in the router (or maybe the commit ID) as the tag.

But I'm not sure how best to maintain compatibility between the router and the Helm chart.

@gaocegege
Collaborator Author

We usually keep the Helm chart and router in separate repos. The Helm chart repo gets published via GitHub Pages (there's a handy guide in the Helm docs here), so users can easily pull and use the chart, with the help of a Helm chart operator on K8s (example here) or just via the CLI:

helm repo add vllm-production-stack https://production-stack.github.io/charts

The image tags for the deployed applications will use .Chart.AppVersion by default:

      containers:
      - name: router
        image: "{{ .Values.image }}:{{ default .Chart.AppVersion .Values.imageTag }}"

In this case, the router could use the chart's appVersion, while the chart itself uses the chart version. This setup accommodates different deployment cycles, since the Helm chart may change more or less often than the router.

This approach is pretty common in the industry, but I’m not sure it’s the best fit for us since we’re also deploying things like lmcache and other components, each with their own versions. We might want to skip using AppVersion and just define imageTag for each component in the values instead. That way, we can keep versioning separate for each piece, even if it means bumping the Helm chart version more often. Just a thought.
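
For example, a sketch of per-component tags in values.yaml (the key names and versions are assumptions, not the chart's actual layout):

    routerSpec:
      repository: lmcache/lmstack-router
      tag: v0.0.6          # pinned; avoid "latest"
    lmcacheSpec:
      repository: lmcache/lmcache
      tag: v0.1.0          # versioned independently of the router

The template would then render {{ .Values.routerSpec.repository }}:{{ .Values.routerSpec.tag }} directly instead of falling back to .Chart.AppVersion.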

@gaocegege
Collaborator Author

Here’s what we could do:

  • Turn this repo into a Helm Chart Repository
    • Set up GitHub Pages to host the charts
    • Add a workflow to automate Helm chart releases (see the sketch below)
    • Pin the router’s image tags to specific versions
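
A minimal sketch of the release workflow using helm/chart-releaser-action, assuming the charts live under a helm/ directory (the path and trigger are assumptions):

    # .github/workflows/helm-release.yml (sketch)
    name: Release Charts
    on:
      push:
        branches: [main]
    jobs:
      release:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0   # chart-releaser needs the full tag history
          - name: Configure Git
            run: |
              git config user.name "$GITHUB_ACTOR"
              git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
          - name: Run chart-releaser
            uses: helm/chart-releaser-action@v1
            with:
              charts_dir: helm   # assumed chart location
            env:
              CR_TOKEN: "${{ secrets.GITHUB_TOKEN }}"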

@gaocegege
Collaborator Author

I strongly feel we should introduce a Kubernetes Custom Resource Definition (CRD) to handle the complexity if we add more components like lmcache. At that point, managing configurations with just a Helm chart becomes too cumbersome, and we'll need to write code to handle the logic and orchestration properly.
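
To make that concrete, here's a hypothetical custom resource an operator could reconcile; the API group, kind, and fields below are all invented for illustration:

    apiVersion: production-stack.vllm.ai/v1alpha1   # hypothetical API group
    kind: VLLMStack                                 # hypothetical kind
    metadata:
      name: example
    spec:
      router:
        image: lmcache/lmstack-router
        tag: v0.0.6        # each component pins its own version
      lmcache:
        enabled: true
        tag: v0.1.0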

@ApostaC
Collaborator

ApostaC commented Feb 9, 2025

> We usually keep the Helm chart and router in separate repos. The Helm chart repo gets published via GitHub Pages (there's a handy guide in the Helm docs here), so users can easily pull and use the chart, with the help of a Helm chart operator on K8s (example here) or just via the CLI:
>
> helm repo add vllm-production-stack https://production-stack.github.io/charts
>
> The image tags for the deployed applications will use .Chart.AppVersion by default:
>
>       containers:
>       - name: router
>         image: "{{ .Values.image }}:{{ default .Chart.AppVersion .Values.imageTag }}"
>
> In this case, the router could use the chart's appVersion, while the chart itself uses the chart version. This setup accommodates different deployment cycles, since the Helm chart may change more or less often than the router.
>
> This approach is pretty common in the industry, but I'm not sure it's the best fit for us since we're also deploying things like lmcache and other components, each with their own versions. We might want to skip using AppVersion and just define imageTag for each component in the values instead. That way, we can keep versioning separate for each piece, even if it means bumping the Helm chart version more often. Just a thought.

Thanks for the suggestion!

The Helm chart is already hosted on GitHub Pages at https://vllm-project.github.io/production-stack/index.yaml, and a new release is created every time the version is bumped.

Since the current repo is under vllm-project, I'd prefer not to split it into multiple repos for now, as that may add extra logistical complexity.

@0xThresh
Contributor

Since the router component and model component(s) have separate services and deployments, and the router code is independent of the Helm chart, I would say that having separate versioning makes sense. For Open WebUI we have two separate charts with separate versioning, one for the actual Open WebUI deployment and another for an optional add-on called Pipelines: https://github.com/open-webui/helm-charts/tree/main/charts

We then add the Pipelines chart as an optional dependency to the main Open WebUI chart to allow people to easily deploy and manage it as part of their OWUI chart. I don't think the separate chart for the router is necessary here, just throwing that out there as an example.
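
For reference, the optional-dependency pattern described above looks roughly like this in the parent Chart.yaml (the versions and repository URL are illustrative):

    apiVersion: v2
    name: open-webui
    version: 5.0.0          # chart version, bumped independently
    dependencies:
      - name: pipelines
        version: ">=0.0.1"
        repository: https://open-webui.github.io/helm-charts   # assumed URL
        condition: pipelines.enabled   # makes the sub-chart optional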

@mikeengland

Out of interest, why is the router container stored under the lmcache project as lmcache/lm-router? This confused me at first, as the router is part of this vllm repository, and not lmcache!

Was this just done for convenience?

+1 on pinning a router version, with the option to override it via a helm variable.
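
For instance, a user could keep the chart's pinned default and override it only when needed via a values file (the routerSpec.tag key is an assumption):

    # override-values.yaml, passed via `helm install -f override-values.yaml ...`
    routerSpec:
      tag: v0.0.7   # overrides the chart's pinned default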

@ApostaC
Collaborator

ApostaC commented Feb 11, 2025

> Out of interest, why is the router container stored under the lmcache project as lmcache/lm-router? This confused me at first, as the router is part of this vllm repository, and not lmcache!
>
> Was this just done for convenience?

Yeah, it's just for convenience right now before we find a better place to host the docker image.

@gaocegege
Collaborator Author

Currently, our Docker image action re-pushes the image with the same tag if the router version in setup.py remains unchanged, which is not common practice; versioned images should be immutable.

https://github.com/vllm-project/production-stack/blob/main/.github/workflows/router-docker-release.yml#L45

I suggest tagging pull-request builds with the Git commit SHA, and only incrementing the version during the release process.
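
A sketch of per-commit tagging with docker/metadata-action; the image name and trigger are assumptions, and the registry login step is omitted:

    # .github/workflows/router-docker-pr.yml (sketch)
    name: Router image (per-commit)
    on:
      pull_request:
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - id: meta
            uses: docker/metadata-action@v5
            with:
              images: lmcache/lmstack-router   # assumed image name
              tags: |
                type=sha   # tags with the short commit SHA, e.g. sha-1a2b3c4
          - uses: docker/build-push-action@v6
            with:
              context: .
              push: true
              tags: ${{ steps.meta.outputs.tags }}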

gaocegege changed the title from "Discussion: Unifying versions for helm and router, or not" to "Discussion: Unifying versions for helm and router" on Feb 14, 2025