Skip to content

Conversation

ckadner
Copy link
Collaborator

@ckadner ckadner commented Mar 24, 2025

Changes in this PR:

  • Update GH workflow for Spyre tests to ...
    • build and cache a test image
    • use the cached test image for both v0 and v1 tests
    • run v0 and v1 tests in parallel
  • Add build stages to Dockerfile, with separate stage for tests
  • Add .dockerignore
  • Delete untagged cache images
  • AFTER MERGE: Update branch protection rules for test-spyre as merge requirement
    • Remove test-spyre:

      image

    • Add test-spyre-v0 and test-spyre-v1:

      image

Optional:

  • Setup registry for large build cache images (Quay.io)
    • Setup Github secrets to authenticate to image registry (requires repo admin permissions)
  • Reconsider running tests sequentially if image uplodad/download time outweighs overall time savings
    • parallel wins if one of the 2 test suites runs for more than 3 mins (90s upload + 90s download/load)
  • Consider building cache images for branches (release, dev, nighlty, feature-*, ...)

Resolves #34

- Update GH workflow for Spyre tests to ...
  - build and cache a test image
  - use the cached test image for both v0 and v1 tests
  - run v0 and v1 tests in parallel
- Add build stages to Dockerfile, with separate stage for tests
- Add .dockerignore

Signed-off-by: Christian Kadner <[email protected]>
Copy link

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes:

pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

@ckadner ckadner marked this pull request as draft March 24, 2025 23:43
@ckadner
Copy link
Collaborator Author

ckadner commented Mar 27, 2025

The full image build here took 3+ min. A cached image build takes 30+ sec.

There won't be any actual cached images until the next PR after this PR gets merged. Here is a snapshot of the execution times for a cached build on my fork:

image image

@ckadner
Copy link
Collaborator Author

ckadner commented Mar 27, 2025

I updated the TODO list in the PR description (initial comment)

...

Optional:

  • Setup registry for large build cache images (Quay.io)
    • Setup Github secrets to authenticate to image registry (requires repo admin permissions)
  • Reconsider running tests sequentially if image uplodad/download time outweighs overall time savings
    • parallel wins if one of the 2 test suites runs for more than 3 mins (90s upload + 90s download/load)
  • Consider building cache images for branches (release, dev, nighlty, feature-*, ...)

Signed-off-by: Christian Kadner <[email protected]>
@ckadner ckadner force-pushed the test-with-cached-build branch from 564490c to 1079461 Compare March 28, 2025 02:20
@ckadner ckadner changed the title [WIP] [CI/BUILD] Use build cache for Spyre tests [CI/BUILD] Use build cache for Spyre tests Mar 28, 2025
@ckadner ckadner marked this pull request as ready for review March 28, 2025 02:24
@ckadner
Copy link
Collaborator Author

ckadner commented Mar 28, 2025

@joerunde -- I am done with my intended changes and removed draft status for this PR.

Would you kindly give it another review?

@ckadner ckadner requested a review from joerunde March 28, 2025 21:23
@joerunde
Copy link
Collaborator

joerunde commented Apr 3, 2025

@ckadner are we still planning on merging this, or moving to a non-docker approach instead?

@ckadner
Copy link
Collaborator Author

ckadner commented Apr 4, 2025

@ckadner are we still planning on merging this, or moving to a non-docker approach instead?

@joerunde I have a pip based version and a uv based version of the test workflow. The one using uv is in PR #70

Either of them take about a minute longer than this Docker based version (when cached) 😄 🤷🏻 🤯

I don't know what other platforms we may want to test on in the future. Sticking with Docker, we could use cross-platform compilation to build and test Z/OS images.

@joerunde
Copy link
Collaborator

joerunde commented Apr 7, 2025

Thanks @ckadner - just to stick this on here for posterity I think for now all of our multi-platform builds are being done internally, so we're safe to delete the docker workflow here.

Mind re-purposing this PR to delete the Dockerfile after merging #70?

@ckadner ckadner mentioned this pull request Apr 8, 2025
1 task
@ckadner
Copy link
Collaborator Author

ckadner commented Apr 14, 2025

Close in favor of #70

@ckadner ckadner closed this Apr 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] Use proper docker builds with caching in github actions

2 participants