Automate production of CAPI VM images #13
base: main
Conversation
elastisys-staffan left a comment:
Nice work! 🎉 This doesn't actually push or publish the artifact anywhere right?
The OpenStack part doesn't for now; Azure does, since it creates a VM there and all (it's the default behaviour).
HaoruiPeng left a comment:
Good job!
But I have a question concerning OpenStack.
What would the follow-up steps be after the image is created? The workflow stops here, so I assume the image is gone when the runner is destroyed if you don't upload it anywhere.
Does it make sense to upload the image to the dev projects of all our OpenStack infra providers, so we can share the images to the production projects from there during maintenance?
Another idea is to have a self-hosted runner that does not get destroyed after the actions finish.
We will need credentials to upload and share the OpenStack images; that part is on hold until a decision is made about it.
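For reference, a rough sketch of what uploading and sharing from a dev project could look like as a workflow step; the output path, image name, and the PROD_PROJECT_ID secret are hypothetical, and the step assumes OpenStack credentials are already available in the environment:

```yaml
- name: upload and share OpenStack image
  env:
    TAG: ${{ inputs.tag }}
    PROD_PROJECT_ID: ${{ secrets.PROD_PROJECT_ID }}   # hypothetical secret
  run: |
    # Upload the built qcow2 into the dev project (output path is an assumption).
    openstack image create \
      --disk-format qcow2 --container-format bare \
      --file images/capi/output/ubuntu-2404.qcow2 \
      "capi-ubuntu-2404-${TAG}"
    # Make it shareable and add the production project as a member;
    # the production project still has to accept the membership.
    openstack image set --shared "capi-ubuntu-2404-${TAG}"
    openstack image add project "capi-ubuntu-2404-${TAG}" "${PROD_PROJECT_ID}"
```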
Apologies for not being able to look at this sooner. Do you think we could break the jobs up so that building is one job and publishing is another? The logs are quite long, so that would make it easier to follow the process. Also, I think it would add some flexibility.
That can be done for OpenStack, once we decide on how to handle the credentials.
Would it make sense to use qemu or similar to build the Azure VHD locally and then publish it in a separate job? As I understand it, that is how the OpenStack flow works, so if we can make it work for Azure too they would harmonize nicely.
I don't know actually, but it feels like it adds more work for us.
It might be worth the effort actually. I see several benefits:
Imagine we build the images here and push them to object storage or similar. Then, pushing and publishing them can be tied to some other process, like creating a capi release. We could even keep the step manual if we want to.
I like this, and we could add the images as artifacts of the action. Then some other job could just download them as the output of this action and push them wherever needed.
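A minimal sketch of that idea; the artifact name and output path are assumptions, not taken from this PR:

```yaml
- name: upload built image as a workflow artifact
  uses: actions/upload-artifact@v4
  with:
    name: openstack-capi-image            # hypothetical artifact name
    path: images/capi/output/*.qcow2      # hypothetical build output path
    retention-days: 7
```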
```yaml
    sudo udevadm control --reload-rules
    sudo udevadm trigger --name-match=kvm
- name: install qemu-kvn
```
NIT:

```diff
- - name: install qemu-kvn
+ - name: install qemu-kvm
```
```yaml
path: |
  ~/.config/packer/plugins
```
Question: Any reason this is multiline? Wouldn't this work?
```diff
- path: |
-   ~/.config/packer/plugins
+ path: ~/.config/packer/plugins
```
I don't remember exactly, should be changed. Will look at it.
```yaml
run: |
  sudo apt update && \
  sudo apt upgrade -y && \
  sudo apt install -y qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils qemu-utils
```
Question: Could we not build an image that has these things already and use that? Would save some time and be less error-prone
Will check that.
Force-pushed from 8c019db to c791d3a.
```bash
sudo udevadm control --reload-rules
sudo udevadm trigger --name-match=kvm
```
Question: Could this also be incorporated into the image? I assume that this is running on the docker_image image now?
Same with other "apt install/pip install" in other workflows as well
The building runs on the docker image, but it still needs the host machine to have KVM enabled, as it is disabled by default on the runners.
The pip install was me trying to get the storing of the image to work on Safespring, before I realised it's not an openstackclient version problem.
I also avoided modifying the docker_image to keep it aligned with what is expected upstream.
Right, so this image is not running on another image. It's the builder that runs on the docker_image image?
If so, shouldn't you be able to build a custom image that already has this set (like in this example)?
For this particular step it's not as critical since it's fast, I guess, but this step, for example, should be possible to speed up and make more stable if we have a pre-created image with it already installed.
I opted out of using that example as it caused so much headache when I tried it for the building part.
But I guess it can be done for that step, so I can give it a try.
simonklb left a comment:
Please rebase and request a re-review! 😄
Force-pushed from ce5b032 to caa78fa.
Force-pushed from caa78fa to 2dc7ef9.
simonklb left a comment:
The rebase got some wonky changes. Both a failed conflict resolution and some extra whitespace issues. Please sort them out and re-request a review!
Force-pushed from 2dc7ef9 to ba6d047.
Should be good now.
simonklb left a comment:
Might be worth inviting people to a walkthrough meeting for this. There are a lot of moving parts. 😅
At least update the PR description to explain how all of this should work and what the flow looks like!
Great job doing all of this! 👍
```yaml
env:
  version: ${{ inputs.version }}
  tag: ${{ inputs.tag }}
  docker_image: "ghcr.io/elastisys/image-builder-amd64:Automate-production-of-CAPI-VM-images-09c9dac9dc61dc069b72ac55e654cbe1a9190911"
```
Is this image never rebuilt or can we really have it hardcoded like this?
It does not; the plan is to make it a variable so we can pick up the last one built from main, which is what the workflow will do on push.
I just didn't get to that yet.
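One possible shape for that, as a sketch assuming the workflow is (or becomes) a reusable workflow_call workflow; the job name and the default tag are hypothetical:

```yaml
on:
  workflow_call:
    inputs:
      docker_image:
        description: image-builder container image to use
        type: string
        default: "ghcr.io/elastisys/image-builder-amd64:latest"   # hypothetical default tag

jobs:
  build-image:
    runs-on: ubuntu-latest
    env:
      docker_image: ${{ inputs.docker_image }}   # replaces the hardcoded value
    steps:
      - run: echo "building with ${docker_image}"
```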
```bash
sed -r \
  -e "s/\\\$KUBERNETES_SERIES/${series}/" \
  -e "s/\\\$KUBERNETES_VERSION/${version}/" \
  -e "s/\\\$KUBERNETES_DEB_VERSION/${package}/" \
  -e "s/\\\$IMAGE_TAG/${tag}/" \
  <"template.json" >"kubernetes.json"
```
Is envsubst not available?
I just copied what we had done in other places.
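A minimal sketch of the envsubst alternative, assuming the same shell variables (series, version, package, tag) are in scope as in the sed version; the step name is made up:

```yaml
- name: render kubernetes.json from template
  run: |
    # Export under the names used in the template; listing them for envsubst
    # limits substitution to just these tokens.
    export KUBERNETES_SERIES="${series}" KUBERNETES_VERSION="${version}" \
           KUBERNETES_DEB_VERSION="${package}" IMAGE_TAG="${tag}"
    envsubst '$KUBERNETES_SERIES $KUBERNETES_VERSION $KUBERNETES_DEB_VERSION $IMAGE_TAG' \
      <template.json >kubernetes.json
```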
```bash
docker run -i --rm \
  -e PACKER_VAR_FILES -e PACKER_GITHUB_API_TOKEN=${{ secrets.GITHUB_TOKEN }} \
  -e SIG_IMAGE_DEFINITION -e SIG_PUBLISHER -e SIG_OFFER -e SIG_SKU \
  -e AZURE_SUBSCRIPTION_ID -e AZURE_CLIENT_ID -e AZURE_CLIENT_SECRET -e AZURE_TENANT_ID -e AZURE_LOCATION \
  -e RESOURCE_GROUP_NAME -e GALLERY_NAME -e BUILD_RESOURCE_GROUP_NAME \
  -v ${{ github.workspace }}/images/capi:/tmp/host \
  ${{ env.docker_image }} build-azure-sig-ubuntu-2404-gen2
```
This looks odd, why do you have to start the container yourself instead of running it as a workflow job task?
Edit: Found multiple manual docker run executions that I don't understand why they couldn't be normal tasks. Please enlighten me! 😄
There is an option to run the job inside a container by default.
I tried it multiple times but I couldn't figure out why it did not work, so I opted for the surefire method.
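For reference, a rough sketch of the jobs.<job_id>.container option being discussed, reusing the image pinned elsewhere in this PR; the job layout, checkout, and make invocation are assumptions, not what this PR does:

```yaml
jobs:
  build-azure:
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/elastisys/image-builder-amd64:Automate-production-of-CAPI-VM-images-09c9dac9dc61dc069b72ac55e654cbe1a9190911
    steps:
      - uses: actions/checkout@v4
      - name: build azure image
        working-directory: images/capi
        env:
          PACKER_GITHUB_API_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: make build-azure-sig-ubuntu-2404-gen2   # assumes the image exposes the upstream make targets
```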
```yaml
on:
  # push:

  workflow_dispatch:
    inputs:
```
Cleanup?
```yaml
# store-openstack-image-safespring:
#   uses: ./.github/workflows/store-openstack-capi-image-safespring.yml
#   needs: build-openstack-image
#   with:
#     version: ${{ inputs.version || '1.33.1' }}
#     tag: ${{ inputs.tag || '0.8' }}
#   secrets: inherit
```
Cleanup?
```yaml
on:
  push:
    branches:
      - main
  # pull_request:
```
Cleanup?
```yaml
- name: get tag
  id: get-tag
  run: |
    if [ "${{ github.event_name }}" == "pull_request" ]; then
      PR_TITLE="${{ github.event.pull_request.title }}"
      PR_TAG=$(echo "${PR_TITLE}" | sed -e 's/ /-/g')
      echo "TAG=${PR_TAG}-${{ github.sha }}" >> $GITHUB_OUTPUT
    else
      echo "TAG=${GITHUB_REF##*/}-${{ github.sha }}" >> $GITHUB_OUTPUT
    fi
  shell: bash
```
This feels brittle. It assumes that everyone will be good and name their PR correctly, and that we never change that policy.
Is it not possible to only trigger this on tags being pushed? Then you can also just rely on ${{ github.ref_name }}.
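A minimal sketch of the tag-triggered variant being suggested; the tag pattern is an assumption:

```yaml
on:
  push:
    tags:
      - "v*"

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: get tag
        id: get-tag
        run: echo "TAG=${{ github.ref_name }}-${{ github.sha }}" >> "$GITHUB_OUTPUT"
```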
That part actually should be removed, as the image is not intended to be built from anything but main.
The file will be cleaned up accordingly.
```yaml
username: ${{github.actor}}
password: ${{secrets.GITHUB_TOKEN}}
```
nit
```diff
- username: ${{github.actor}}
- password: ${{secrets.GITHUB_TOKEN}}
+ username: ${{ github.actor }}
+ password: ${{ secrets.GITHUB_TOKEN }}
```
```bash
sed -r \
  -e "s/\\\$KUBERNETES_SERIES/${series}/" \
  -e "s/\\\$KUBERNETES_VERSION/${version}/" \
  -e "s/\\\$KUBERNETES_DEB_VERSION/${package}/" \
  -e "s/\\\$IMAGE_TAG/${tag}/" \
  <"template.json" >"kubernetes.json"
```
Should use envsubst here as well
Empty file?
Yes, it seems to be needed for SSH capability; I copied it from https://github.com/elastisys/ck8s-cluster-api/blob/main/scripts/image/roles/sshca/README.md
elastisys-staffan left a comment:
I'm a little confused. The store jobs actually push the images straight to the cloud providers, right? When we had the discussion meeting, didn't we agree that the images should be built here and stored as artifacts, and that the distribution/consumption would be handled separately for now? Here are the meeting notes for reference: https://docs.google.com/document/d/18inrhGuT2yyaHINowxFbBGj7tYSwafYC2p1vrPy-i3Y/edit?tab=t.736zxsrcd694
Great work on the massive effort you put into this thing so far, and maybe I'm misunderstanding something but I think we need to sort this out before moving forward.
That is true, I will update it accordingly; I got side-tracked by other things and forgot about it.
elastisys-staffan left a comment:
Even if it is work in progress I'm cool with adding this to main for testing purposes, as long as we keep upstream files unaltered. Regardless, I'm really stoked about this! 😄
```dockerfile
COPY --chown=imagebuilder:imagebuilder packer packer/
COPY --chown=imagebuilder:imagebuilder Makefile Makefile
COPY --chown=imagebuilder:imagebuilder azure_targets.sh azure_targets.sh
COPY --chown=imagebuilder:imagebuilder template.json template.json
```
I'm a little hesitant to merge this because Dockerfile and .dockerignore exist upstream and I want to avoid future conflicts if possible. If the files need altering, maybe we can do it with a git patch or something like that and apply it as part of the action?
I am not very familiar with the suggested method 😅
Here's how you do it:

- Restore Dockerfile and .dockerignore
- Add the changes but don't commit
- `git diff > patches/dockerfile.patch` (let's use a dedicated dir that can be used for future patches too)
- To apply the changes when needed: `git apply patches/dockerfile.patch`

The idea is to commit the patch file, which can then be applied in the GH action prior to every build. WDYT?
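A rough sketch of applying the patch in the workflow, assuming the patch is committed to this repo under patches/ and the upstream checkout lives in images/capi (both paths are assumptions):

```yaml
- name: apply local Dockerfile patch
  working-directory: images/capi   # assumption: where the upstream Dockerfile lives
  run: git apply "${GITHUB_WORKSPACE}/patches/dockerfile.patch"
```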