Allow building images to any Docker registry and image builds in containers #1545

Open
wants to merge 46 commits into
base: main
from

Conversation

lewijacn
Collaborator

@lewijacn lewijacn commented May 28, 2025

Description

This change introduces a new pattern for building images in our tooling. It uses Jib for our Java applications (except RFS for now, since it has additional Dockerfile steps) and BuildKit for our other images, so that images can be built to any specified Docker registry, whether a local registry or a remote one such as AWS ECR. The pattern also allows images to be built inside a container, which the bootstrapK8s Helm chart uses to pull this GitHub repository and build images within a K8s pod.

At a top level this can be executed, after following the setup steps in buildImages.md, like so:

./gradlew buildImagesToRegistry

It also includes options for specifying the registry endpoint and the architecture to build for:

./gradlew buildImagesToRegistry -PregistryEndpoint=123456789012.dkr.ecr.us-west-2.amazonaws.com/my-ecr-repo -PimageArch=arm64

TODO:

  • References to this branch should be replaced with main before merging

Issues Resolved

https://opensearch.atlassian.net/browse/MIGRATIONS-2523

Testing

Local and EKS testing

Check List

  • New functionality includes testing
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

lewijacn added 30 commits May 10, 2025 10:14
Signed-off-by: Tanner Lewis <[email protected]>
@lewijacn lewijacn temporarily deployed to migrations-cicd May 28, 2025 16:37 — with GitHub Actions Inactive
Member

@peternied peternied left a comment

Haven't gone into much depth on Jib or BuildKit, but had comments on the overall organization.

static def getFullBaseImageIdentifier(String baseImageRegistryEndpoint, String baseImageGroup, String baseImageName,
String baseImageTag) {
def baseImage = ""
def isEcr = baseImageRegistryEndpoint.contains(".ecr.") && baseImageRegistryEndpoint.contains(".amazonaws.com")
Member

Code smell: this looks like what an interface with Docker and ECR classes would be good at solving.

Collaborator Author

Let me investigate an interface for this

Collaborator Author

Added an interface for this now, thanks 👍
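
For readers following this thread, the suggested interface might look roughly like the sketch below. It is written in Java for illustration and is not the code added in the PR: the names `RegistryFormats`, `RegistryFormat`, `DockerRegistryFormat`, `EcrRegistryFormat`, and `forEndpoint` are hypothetical, and both reference layouts (in particular the ECR tag scheme) are assumptions. Only the endpoint check is taken from the snippet under review.

```java
/** Hypothetical sketch: one formatting strategy per registry type. */
public final class RegistryFormats {
    private RegistryFormats() {}

    /** Builds a full image reference from its parts. */
    public interface RegistryFormat {
        String fullImageIdentifier(String endpoint, String group, String name, String tag);
    }

    /** Assumed generic layout: [endpoint/][group/]name:tag. */
    public static final class DockerRegistryFormat implements RegistryFormat {
        @Override
        public String fullImageIdentifier(String endpoint, String group, String name, String tag) {
            StringBuilder ref = new StringBuilder();
            if (!endpoint.isEmpty()) {
                ref.append(endpoint).append('/');
            }
            if (!group.isEmpty()) {
                ref.append(group).append('/');
            }
            return ref.append(name).append(':').append(tag).toString();
        }
    }

    /** Assumed ECR layout: a single repository, with group, name and tag folded into the tag. */
    public static final class EcrRegistryFormat implements RegistryFormat {
        @Override
        public String fullImageIdentifier(String endpoint, String group, String name, String tag) {
            String prefix = group.isEmpty() ? "" : group + "_";
            return endpoint + ":" + prefix + name + "_" + tag;
        }
    }

    /** Picks a format using the same endpoint check as the code under review. */
    public static RegistryFormat forEndpoint(String endpoint) {
        boolean isEcr = endpoint.contains(".ecr.") && endpoint.contains(".amazonaws.com");
        return isEcr ? new EcrRegistryFormat() : new DockerRegistryFormat();
    }
}
```

With this shape, callers ask `forEndpoint(...)` once and never branch on the registry type themselves, which is the duplication the "code smell" comment points at.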

}

void applyJibConfigurations(Project rootProject) {
def projectsToConfigure = [
Member

Let's shift how this is declared to the place (/build.gradle?) where the image is defined.

Collaborator Author

Not sure I'm following this comment. I'd like to keep this common logic together, as well as keep the top-level build.gradle as small as I can. Was this geared toward adding some of this logic to the different subprojects?

Member

@peternied peternied May 28, 2025

HTML formatting got me; I meant {project}/build.gradle. For example, the replayer's build.gradle file would include those details instead of having them spread over the project.

{
  baseImageName : "amazoncorretto",
  baseImageTag : "17-al2023-headless",
  imageName : "traffic_replayer",
  imageTag  : "latest"
}

Collaborator

I was going to say +1, but I really like the idea of keeping consistency for base images. It would be good to get those consolidated in one place and have the image names/tags defined within the subprojects.

After going through the rest of the PR... I think keeping more of the docker stuff outside the scope of the java projects makes sense. If somebody comes along and wants to build RFS as a java application for their own image (or AMI, etc) they shouldn't be concerned with the docker configs at all - even the image name. Those aren't details that matter for a java project in isolation.

Maybe we could rename or lift the configurations for all of the Docker projects rather than keeping them within the Utils script. Would it make sense to have a new Gradle project just to track the images?
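
One way to picture the "single place that tracks the images" idea is a small catalog that keeps image settings apart from the Java projects. This is a Java sketch under assumptions, not the structure the PR adopts: `ImageCatalog`, `ImageSpec`, and the use of a project path as the lookup key are hypothetical, and the fields simply mirror the example earlier in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: image settings tracked in one place, outside the Java projects. */
public final class ImageCatalog {

    /** Settings for one image; the fields mirror the example earlier in the thread. */
    public static final class ImageSpec {
        public final String baseImageName;
        public final String baseImageTag;
        public final String imageName;
        public final String imageTag;

        public ImageSpec(String baseImageName, String baseImageTag, String imageName, String imageTag) {
            this.baseImageName = baseImageName;
            this.baseImageTag = baseImageTag;
            this.imageName = imageName;
            this.imageTag = imageTag;
        }

        /** Base image reference in name:tag form. */
        public String baseImage() {
            return baseImageName + ":" + baseImageTag;
        }
    }

    // Keyed by a project path chosen by the caller (an assumption of this sketch).
    private final Map<String, ImageSpec> specsByProject = new LinkedHashMap<>();

    /** Registers the image for a project; rejects a second registration for the same project. */
    public ImageCatalog register(String projectPath, ImageSpec spec) {
        if (specsByProject.putIfAbsent(projectPath, spec) != null) {
            throw new IllegalArgumentException("Image already registered for " + projectPath);
        }
        return this;
    }

    /** Looks up the image settings for a project, failing loudly if none were registered. */
    public ImageSpec specFor(String projectPath) {
        ImageSpec spec = specsByProject.get(projectPath);
        if (spec == null) {
            throw new IllegalArgumentException("No image registered for " + projectPath);
        }
        return spec;
    }
}
```

The trade-off discussed in the thread shows up here: base images stay consistent because they are declared side by side, while the Java projects themselves carry no Docker details.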

Co-authored-by: Peter Nied <[email protected]>
Signed-off-by: Tanner Lewis <[email protected]>
@lewijacn lewijacn temporarily deployed to migrations-cicd May 28, 2025 19:56 — with GitHub Actions Inactive
@lewijacn lewijacn temporarily deployed to migrations-cicd May 28, 2025 20:00 — with GitHub Actions Inactive
@lewijacn lewijacn temporarily deployed to migrations-cicd May 28, 2025 20:40 — with GitHub Actions Inactive
@lewijacn lewijacn temporarily deployed to migrations-cicd May 29, 2025 16:50 — with GitHub Actions Inactive
@@ -1,4 +1,6 @@
import com.google.cloud.tools.jib.gradle.JibTask
Member

We discussed this on a call, but I'm capturing it here. I'm apprehensive about approving this PR as is: it adds a new way to create Docker images, and I suppose I am asking why we don't replace the existing Docker plugin with this change as it is. I'd rather we only use one tool at a time.

I know this might be at odds with how this PR couples the change to registry publication, but I'd rather we replace com.bmuschko:gradle-docker-plugin with Jib all at once and then introduce the container registry copying separately.

Collaborator

We shouldn't make a wholesale switch because the new images aren't binary compatible with the old ones. They need to be invoked in a totally separate way. To do that, we'd need to add a large amount of code to other systems that are on the deprecation path (docker compose and ECS).

Letting the two coexist for a little while allows us to track upstream changes more easily while we transition. Yes, it's a task, but given the staffing/importance of making the switch, it seems like a good tradeoff.

Collaborator

@gregschohn gregschohn left a comment

Thanks Tanner - there's tons of exciting stuff in here.
I've left a lot of feedback, much of it just so that I can understand more about what is going on. Please provide responses where you know the answers and if you don't, feel free to ignore.
I'd like to get some more understanding, then figure out what should be changed now, what doesn't matter, and what we can do later.
Thanks again!

@@ -28,7 +28,9 @@ dockerFilesForExternalServices.each { dockerImageName, projectName ->
project(":CreateSnapshot"),
project(":MetadataMigration")
]
def syncTask = getMigrationConsoleSyncTask(project, dockerImageName, escapedProjectName, libraries, applications)
// Setup additional sync task which doesn't depend on the base image being built, for other projects to use
Collaborator

Can you explain this one some more? Is this for any base image, or the test console one? Either way, I don't understand how this would be useful/how it wouldn't be problematic in some circumstances.

// Create a single sync task to copy the required files
def destDir = "build/docker/${dockerImageName}_${escapedProjectName}"
def syncTask = project.tasks.create("syncArtifact_${dockerImageName}_${escapedProjectName}", Sync) {
def taskName = "syncArtifact_${dockerImageName}_${escapedProjectName}"
Collaborator

I recall putting escapedProjectName in place for some workaround. Do we still need it?

@@ -1,4 +1,5 @@
FROM migrations/elasticsearch_client_test_console:latest AS migration-console-base
ARG BASE_IMAGE=migrations/elasticsearch_client_test_console:latest
Collaborator

Why do you define this and why is it defined twice in the file (albeit with the same values)?

@@ -0,0 +1,7 @@
java \
Collaborator

What is this file used for now?

Comment on lines +82 to +84
{{- if .Values.keepJobAlive }}
tail -f /dev/null
{{- end }}
Collaborator

Would anybody NOT working on this job ever use this flag?

ports:
- name: buildkit
protocol: TCP
port: 1234
Collaborator

Does BuildKit have a default port? 1234 sounds made-up :)

ports:
- containerPort: 1234
securityContext:
privileged: true
Collaborator

Again, just curious: if this gets turned off, what does the user see?

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: migration-role
Collaborator

Should this be more targeted to the bootstrapping phase or is it used afterwards for migrations?

@@ -18,8 +18,14 @@ start() {
helm repo add opensearch-operator https://opensearch-project.github.io/opensearch-k8s-operator/
helm repo add strimzi https://strimzi.io/charts/

minikube start
# Development setup to allow using an insecure registry
minikube start --insecure-registry="0.0.0.0/0"
Collaborator

WOOHOO

3 participants