
Conversation

@jiridanek jiridanek commented Nov 14, 2025

https://issues.redhat.com/browse/RHAIENG-1965

Follows up on

Description

This script is inspired by the AIPCC replace-markers.sh script, which is invoked from make regen:
https://gitlab.com/redhat/rhel-ai/core/base-images/app/-/blob/main/containerfiles/replace-markers.sh

The original AIPCC version uses the ed command to replace everything between
### BEGIN <filename> and ### END <filename> with the content of <filename>.
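The ed-based idea (replace everything between the two marker lines while keeping the markers themselves) can be sketched in Python; the marker name and fragment body below are hypothetical, not taken from the actual repositories:

```python
import re

# Hypothetical Dockerfile content; the marker name "oc-client" is invented.
dockerfile = """\
FROM ubi9
### BEGIN oc-client
RUN echo old
### END oc-client
CMD ["bash"]
"""

fragment = "RUN echo new\n"  # stand-in for the real fragment body
name = "oc-client"

# Replace everything between the BEGIN and END markers with the fragment,
# keeping the marker lines themselves in place.
pattern = re.compile(
    rf"(### BEGIN {re.escape(name)}\n).*?(### END {re.escape(name)})",
    re.DOTALL,
)
updated = pattern.sub(lambda m: m.group(1) + fragment + m.group(2), dockerfile)
print(updated)
```

Running this leaves the surrounding lines untouched and swaps only the block body.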

This script currently still keeps the data inline, but I've

  • removed the mis-feature that created a new block at the end of the file when the block was not yet present
  • changed the markers to ### BEGIN and ### END, for better readability
  • put all blocks into a Python dictionary so I can then check whether there are any unexpected blocks (probably a typo)
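A minimal sketch of that dictionary-plus-check approach, with invented keys and bodies rather than the script's real ones:

```python
# Invented fragment bodies keyed by marker name; the real script keeps its own set.
replacements = {
    "upgrade image": "RUN dnf -y upgrade && dnf clean all\n",
    "install micropipenv and uv": "RUN pip install micropipenv uv\n",
}

def check_markers(text: str, path: str = "<dockerfile>") -> None:
    """Reject any ### BEGIN/### END marker whose name is not a known key."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for prefix in ("### BEGIN ", "### END "):
            if line.startswith(prefix):
                name = line[len(prefix):].strip()
                if name not in replacements:
                    raise ValueError(f"unknown marker {name!r} at {path}:{line_no}")

check_markers("### BEGIN upgrade image\nRUN ...\n### END upgrade image\n")  # passes silently
try:
    check_markers("### BEGIN upgrade imgae\n### END upgrade imgae\n")  # typo in the name
except ValueError as exc:
    print(exc)
```

A misspelled marker fails fast with the file and line number instead of being silently ignored.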

How Has This Been Tested?

./scripts/dockerfile_fragments.py

Works for me

Self checklist (all need to be checked):

  • Ensure that you have run make test (gmake on macOS) before asking for review
  • Changes to everything except Dockerfile.konflux files should be done in odh/notebooks and automatically synced to rhds/notebooks. For Konflux-specific changes, modify Dockerfile.konflux files directly in rhds/notebooks as these require special attention in the downstream repository and flow to the upcoming RHOAI release.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

  • Chores
    • Standardized build-file section markers across container images for clearer, consistent annotations.
    • Updated build tooling to apply and validate multiple named configuration fragments, improving reliability of image assembly.
  • Refactor
    • Streamlined insertion behavior so files without markers remain unchanged; added validation and safer read/write handling.

@github-actions github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Nov 14, 2025
coderabbitai bot commented Nov 14, 2025

Walkthrough

Standardizes Dockerfile section markers across 20+ image definitions by replacing informal "# ... begin/end" comments with explicit "### BEGIN ..."/"### END ..." blocks, and refactors scripts/dockerfile_fragments.py to support the new marker format, add replacements-driven insertion, and validate markers.

Changes

Cohort / File(s) Summary
Codeserver Dockerfile
codeserver/ubi9-python-3.12/Dockerfile.cpu
Replaced informal begin/end comments with ### BEGIN ... / ### END ... markers around upgrade, micropipenv/uv, and oc client sections (cosmetic only).
Jupyter Dockerfiles
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu, jupyter/minimal/ubi9-python-3.12/Dockerfile.*, jupyter/pytorch+llmcompressor/.../Dockerfile.cuda, jupyter/pytorch/.../Dockerfile.cuda, jupyter/rocm/pytorch/.../Dockerfile.rocm, jupyter/rocm/tensorflow/.../Dockerfile.rocm, jupyter/tensorflow/.../Dockerfile.cuda, jupyter/trustyai/.../Dockerfile.cpu
Replaced multiple inline comment delimiters with standardized ### BEGIN ... / ### END ... markers across upgrade, micropipenv/uv, oc client, PDF export, and related dependency blocks (no command or control-flow changes).
RStudio Dockerfiles
rstudio/c9s-python-3.12/Dockerfile.*, rstudio/rhel9-python-3.12/Dockerfile.*
Converted start/end comments to ### BEGIN/### END markers for upgrade and micropipenv/uv sections (cosmetic only).
Runtimes Dockerfiles
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu, runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu, runtimes/pytorch+llmcompressor/.../Dockerfile.cuda, runtimes/pytorch/.../Dockerfile.cuda, runtimes/rocm-pytorch/.../Dockerfile.rocm, runtimes/rocm-tensorflow/.../Dockerfile.rocm, runtimes/tensorflow/.../Dockerfile.cuda
Replaced informal begin/end comments with ### BEGIN/### END markers across upgrade, micropipenv/uv, oc client, and dependency installation sections (no functional changes).
Scripts — fragment handling
scripts/dockerfile_fragments.py
Refactored: introduced a replacements dictionary and looped application, updated blockinfile() signature to `blockinfile(filename: str

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Rationale: Many Dockerfile edits are homogeneous and low-risk, but the scripts/dockerfile_fragments.py refactor changes the public function signature and behavior (it now no-ops when markers are absent), adds validation, and removes helpers; these require careful review and test validation across the affected Dockerfiles and CI.
  • Attention areas:
    • Ensure all ### BEGIN ... / ### END ... marker names in Dockerfiles correspond exactly to keys in the script's replacements dictionary.
    • Verify the new blockinfile() signature and its callers use the updated parameter types and behavior.
    • Confirm removal of is_jupyter() and is_rstudio() doesn't break other tooling or tests.
    • Review tests updated to ensure they cover both marker-present replacement and no-op when markers are absent.
    • Check file I/O error handling and that dry-run / CI flows are unaffected.

Possibly related PRs

Suggested reviewers

  • atheo89
  • daniellutz

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 33.33%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title clearly and specifically describes the main change, implementing the AIPCC replace-markers.sh functionality in the Notebooks CLI, and references ticket RHAIENG-1965.
  • Description check (✅ Passed): the PR description includes all required sections: description of changes, testing instructions, and completion of the self-checklist and merge criteria.


@openshift-ci openshift-ci bot requested review from atheo89 and dibryant November 14, 2025 19:05
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
@jiridanek jiridanek changed the title Jd fragments hash3 begin RHAIENG-1965: Notebooks CLI: implement the AIPCC replace-markers.sh functionality, as a start Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
scripts/dockerfile_fragments.py (3)

36-75: Replacements dict and marker sanity check are solid; consider improving diagnostics

Centralizing the fragment bodies in replacements and rejecting unknown ### BEGIN/END markers is a nice safety net. You can also (a) avoid recomputing begin/end inside the line loop, and (b) make the error easier to act on (and satisfy Ruff’s B007) by including the line number:

-        with open(dockerfile, "rt") as fp:
-            for line_no, line in enumerate(fp):
-                begin = f"{"#" * 3} BEGIN"
-                end = f"{"#" * 3} END"
-                for prefix in (begin, end):
+        begin = "#" * 3 + " BEGIN"
+        end = "#" * 3 + " END"
+        with open(dockerfile, "rt") as fp:
+            for line_no, line in enumerate(fp, start=1):
+                for prefix in (begin, end):
                     if line.rstrip().startswith(prefix):
                         suffix = line[len(prefix) + 1:].rstrip()
                         if suffix not in replacements:
-                            raise ValueError(f"Expected replacement for '{prefix} {suffix}' not found in {dockerfile}")
+                            raise ValueError(
+                                f"Expected replacement for '{prefix} {suffix}' "
+                                f"not found in {dockerfile}:{line_no}"
+                            )

84-127: blockinfile behavior is clear; think about multiple-block and creation semantics

The updated blockinfile correctly (a) enforces matching BEGIN/END pairs, (b) no‑ops when markers are absent (matching the new design), and (c) normalizes trailing newlines so HEREDOCs keep a single empty line before the END marker. One thing to be aware of is that if a file ever contained multiple BEGIN/END pairs for the same prefix, this implementation would silently treat them as one big block (last BEGIN, last END); if that’s undesirable, you might want to detect and error on multiple matches for a given prefix rather than relying on last‑seen indices.


138-166: Inline tests cover key cases; dry‑run could assert on actual replacements

The tests nicely pin down the “no markers → no change”, “update in‑place”, and newline handling behaviors, and test_dry_run at least guarantees main() finishes against real Dockerfile trees. If you want stronger regression protection, you could extend test_dry_run (or add a focused test) to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected, decoupled from the full repo layout.
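One possible shape for such a focused test, using a synthetic Dockerfile in a temporary directory instead of the repo layout; replace_block below is a simplified stand-in for the real blockinfile(), not the script's implementation:

```python
import pathlib
import tempfile

def replace_block(path: pathlib.Path, name: str, body: str) -> None:
    # Stand-in for blockinfile(): swap the lines between the markers for `body`.
    lines = path.read_text().splitlines(keepends=True)
    begin = lines.index(f"### BEGIN {name}\n")
    end = lines.index(f"### END {name}\n")
    path.write_text("".join(lines[: begin + 1]) + body + "".join(lines[end:]))

with tempfile.TemporaryDirectory() as tmp:
    dockerfile = pathlib.Path(tmp) / "Dockerfile"
    dockerfile.write_text("FROM ubi9\n### BEGIN upgrade\nRUN echo old\n### END upgrade\n")
    replace_block(dockerfile, "upgrade", "RUN echo new\n")
    text = dockerfile.read_text()
    assert "RUN echo new" in text and "RUN echo old" not in text
    print("ok")
```

Asserting on a known marker in a tiny synthetic file keeps the regression check independent of the full repository tree.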

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ee8518 and fef65fc.

📒 Files selected for processing (23)
  • codeserver/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/rocm/pytorch/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/tensorflow/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • rstudio/c9s-python-3.12/Dockerfile.cpu (2 hunks)
  • rstudio/c9s-python-3.12/Dockerfile.cuda (3 hunks)
  • rstudio/rhel9-python-3.12/Dockerfile.cpu (3 hunks)
  • rstudio/rhel9-python-3.12/Dockerfile.cuda (3 hunks)
  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • runtimes/rocm-pytorch/ubi9-python-3.12/Dockerfile.rocm (4 hunks)
  • runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm (4 hunks)
  • runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • scripts/dockerfile_fragments.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.4)
scripts/dockerfile_fragments.py

67-67: Loop control variable line_no not used within loop body

Rename unused line_no to _line_no

(B007)


74-74: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (53)
  • GitHub Check: Red Hat Konflux / odh-workbench-codeserver-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-trustyai-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: build (runtime-cuda-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-cuda-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: build (codeserver-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (runtime-cuda-pytorch-llmcompressor-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-cuda-tensorflow-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (cuda-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-trustyai-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-rstudio-c9s-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-pytorch-llmcompressor-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: build (codeserver-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-runtime-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: build (cuda-jupyter-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rstudio-c9s-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: build (rocm-runtime-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-minimal-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (runtime-minimal-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: build (cuda-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: build (runtime-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-rhel9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-rhel9-on-pull-request
  • GitHub Check: build (cuda-rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
  • GitHub Check: build (rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
🔇 Additional comments (23)
runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda (1)

26-62: Marker standardization LGTM.

All three sections are properly marked with consistent ### BEGIN/END delimiters. The formatting aligns with the PR's standardization objectives and introduces no functional changes.

jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm (1)

24-90: Marker standardization consistent across sections.

All four marked sections follow the standardized ### BEGIN/END pattern with descriptive labels. The cosmetic updates preserve all build logic unchanged.

runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu (1)

24-66: Consistent marker application across runtime variants.

The three marked sections follow the established pattern. No deviations from the standardization scheme observed.

rstudio/rhel9-python-3.12/Dockerfile.cuda (1)

34-55: Marker standardization appropriate for RStudio variant.

Two sections are marked as expected for this Dockerfile variant. The reduced number of marked sections aligns with the RStudio-specific build requirements.

runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda (1)

28-64: Marker standardization applied consistently.

All three sections follow the established pattern with properly formatted ### BEGIN/END markers.

jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (1)

62-292: Marker standardization preserves complex build logic.

Four sections are marked with consistent ### BEGIN/END delimiters. The changes preserve all conditional logic, multi-stage builds, and architecture-specific configurations unchanged.

jupyter/tensorflow/ubi9-python-3.12/Dockerfile.cuda (1)

42-100: Marker standardization applied across multi-stage build.

Four sections are marked with proper ### BEGIN/END delimiters. All build stages and their logic remain functionally identical.

runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (1)

26-62: Marker standardization applied consistently across all variants.

All three sections follow the established pattern. The standardization is complete and uniform across the Dockerfile suite.

codeserver/ubi9-python-3.12/Dockerfile.cpu (1)

86-96: Marker formatting is consistent and properly paired across all sections.

All three marker pairs (upgrade, micropipenv/uv, and oc client) use the standardized ### BEGIN <description> and ### END <description> format with matching descriptions. The Docker instructions within each marked section remain functionally unchanged.

Also applies to: 117-119, 121-130

rstudio/c9s-python-3.12/Dockerfile.cuda (1)

18-28: Marker pairs are properly formatted and closed.

Both sections have consistent BEGIN/END markers with matching descriptions.

Also applies to: 37-39

jupyter/rocm/pytorch/ubi9-python-3.12/Dockerfile.rocm (1)

40-50: All marker pairs are properly formatted with consistent naming.

The four marked sections (upgrade, micropipenv/uv, oc client, and PDF dependencies) follow the standardized format across files.

Also applies to: 63-65, 67-76, 94-98

rstudio/rhel9-python-3.12/Dockerfile.cpu (1)

34-44: Markers are properly formatted and paired.

Both sections have well-formed BEGIN/END markers with matching descriptions.

Also applies to: 53-55

jupyter/minimal/ubi9-python-3.12/Dockerfile.cuda (1)

26-36: All marker pairs are properly formatted with matching descriptions.

The four marked sections follow the standardized convention consistently.

Also applies to: 49-51, 53-62, 88-92

runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm (1)

24-34: Markers are properly formatted and consistently named across sections.

All three marked sections have well-formed BEGIN/END pairs.

Also applies to: 47-49, 51-60

runtimes/rocm-pytorch/ubi9-python-3.12/Dockerfile.rocm (1)

24-34: All marker pairs are properly formatted with consistent naming.

The three marked sections (upgrade, micropipenv/uv, and oc client) follow the standardized format.

Also applies to: 47-49, 51-60

runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (2)

28-38: Markers are properly formatted and consistently applied across marked sections.

All three marked sections have well-formed BEGIN/END pairs with matching descriptions.

Also applies to: 108-110, 112-121


1-330: Verify marker compatibility with scripts/dockerfile_fragments.py expectations.

All provided Dockerfile changes consistently apply the ### BEGIN <description> and ### END <description> marker format. The marker names across files are standardized (upgrade, micropipenv/uv, oc client, PDF dependencies), but I cannot verify whether they exactly match the expectations and dictionary keys in the dockerfile_fragments.py script since it was not provided for review. Ensure the script's replacements dictionary includes all observed marker names.

jupyter/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (1)

42-52: Standardized markers look consistent with the fragment script

All four BEGIN/END blocks use names that exactly match the keys in scripts/dockerfile_fragments.py, and the enclosed commands are unchanged, so the new fragment replacement flow should work without behavioral changes.

Also applies to: 65-67, 69-78, 96-100

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (1)

40-50: BEGIN/END markerization is consistent and non‑functional

The upgrade, micropipenv/uv, oc client, and PDF export sections are now wrapped in standardized markers with text matching the fragment replacement keys, and the shell/pip commands are unchanged, so behavior should remain identical while enabling scripted updates.

Also applies to: 63-65, 67-76, 92-96

jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu (1)

65-75: TrustyAI Dockerfile markers align with the fragment replacement scheme

The four annotated sections use standardized BEGIN/END markers with names matching the Python replacements dict, and the underlying upgrade, micropipenv/uv, oc client, and PDF export commands are unchanged, so this should be a no‑op for runtime behavior while making scripted regeneration possible.

Also applies to: 88-90, 92-101, 118-122

jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu (1)

24-34: Minimal Jupyter markers are correctly standardized

The upgrade, micropipenv/uv, oc client, and PDF dependency sections are now wrapped in consistent ### BEGIN/END … markers whose labels match those used by dockerfile_fragments.py, with no changes to the actual commands, so the refactoring should be behavior‑neutral.

Also applies to: 47-49, 51-60, 86-90

jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda (1)

42-52: CUDA PyTorch markers are in sync with the fragment script

The four BEGIN/END blocks are named identically to the keys in the fragment replacement mapping and wrap the same upgrade, micropipenv/uv, oc client, and PDF export logic, so this is a straightforward marker standardization with no functional impact.

Also applies to: 65-67, 69-78, 96-100

rstudio/c9s-python-3.12/Dockerfile.cpu (1)

15-17: RStudio Dockerfile markers are correct and tool‑friendly

The micropipenv/uv and base upgrade sections are wrapped in standardized markers with names matching dockerfile_fragments.py’s replacements, and the underlying commands remain the same, so this is a safe, tooling‑oriented change.

Also applies to: 22-32

@jiridanek jiridanek added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
Member Author


@coderabbitai create follow-up github issue for

think about multiple-block and creation semantics

The updated blockinfile correctly (a) enforces matching BEGIN/END pairs, (b) no‑ops when markers are absent (matching the new design), and (c) normalizes trailing newlines so HEREDOCs keep a single empty line before the END marker. One thing to be aware of is that if a file ever contained multiple BEGIN/END pairs for the same prefix, this implementation would silently treat them as one big block (last BEGIN, last END); if that’s undesirable, you might want to detect and error on multiple matches for a given prefix rather than relying on last‑seen indices.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Create a follow-up GitHub issue for multiple-block and creation semantics

gh issue create \
  --repo opendatahub-io/notebooks \
  --title "Consider handling of multiple BEGIN/END block pairs with same prefix in blockinfile" \
  --body "## Context

This issue is a follow-up from PR #2682 and [this comment](https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528723448).

Requested by: @jiridanek

## Current Behavior

The \`blockinfile\` function in \`scripts/dockerfile_fragments.py\` currently uses a simple approach for finding BEGIN/END marker pairs. When scanning a file, it updates the \`begin\` and \`end\` indices each time it encounters a matching marker:

\`\`\`python
for line_no, line in enumerate(original_lines):
    if line.rstrip() == begin_marker:
        begin = line_no
    elif line.rstrip() == end_marker:
        end = line_no
\`\`\`

This means if a file contains multiple BEGIN/END pairs for the same prefix, the function will silently treat them as one large block (using the last BEGIN and last END indices encountered).

## Concern

This behavior might be undesirable if:
- Multiple blocks with the same prefix are unintentional (e.g., copy-paste errors, typos)
- Users expect each block to be independently managed
- The \"last seen\" behavior could lead to unexpected replacements spanning unrelated sections of the file

## Suggestion

Consider detecting and raising an error when multiple matches for a given prefix are found, rather than silently relying on the last-seen indices. This would make the tool more predictable and help catch configuration errors early.

## Additional Considerations

- Should the tool support multiple blocks with the same prefix intentionally?
- What are the use cases for creation semantics (when markers are absent)?
- Should there be options to control this behavior?

---
**Related PR:** #2682  
**Related Comment:** https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528723448" \
  --assignee jiridanek

Length of output: 55


@jiridanek I've created a follow-up GitHub issue for you: #2684

The issue covers:

  • The current behavior where multiple BEGIN/END pairs with the same prefix are silently treated as one large block (using the last-seen indices)
  • The concern about this potentially undesirable behavior
  • A suggestion to detect and error on multiple matches for a given prefix
  • Additional considerations about creation semantics and whether multiple blocks should be intentionally supported

The issue is assigned to you and includes backlinks to this PR and comment for context.


🧬 Code Graph Analysis Results
  • File: scripts/dockerfile_fragments.py

    • Snippet: blockinfile function (full)
      • Description: The function that reads a file, detects BEGIN/END markers (optionally prefixed), validates marker pairing, replaces the block between BEGIN and END with provided contents, and writes back if changed. Includes error handling for read/write issues and marker mismatches, and preserves trailing newlines for HEREDOC compatibility.
      • Key details:
        • Parameters: filename (str or path), contents (str), prefix (str or None), keyword-only: comment (default "#")
        • Returns: None
        • Behavior:
          • Builds begin_marker and end_marker using the optional prefix.
          • Reads original file content; detects positions of begin and end markers (last seen positions are used if multiple exist).
          • Validation: raises ValueError if only one of BEGIN/END is found or if BEGIN appears after END.
          • If no markers found, no changes are made.
          • Replaces the block with:
            BEGIN marker line
            contents (dedented, with trailing newline normalized to preserve a single trailing newline)
            END marker line
          • Writes back only if changes occurred.
        • Error handling:
          • IOErrors/OSErrors on read raise RuntimeError with context.
  • File: scripts/dockerfile_fragments.py

    • Snippet: TestBlockinfile class (full)
      • Description: Unit tests validating behavior of blockinfile:
        • test_adding_new_block: ensures file unchanged when no block exists and a new block is added.
        • test_updating_value_in_block: ensures updating content within an existing BEGIN/END block.
        • test_lastnewline_removal: ensures proper handling of trailing newlines to maintain a single empty line before END.
        • test_dry_run: exercises the main flow with real directories/files via fake filesystem integration.
      • Key details:
        • Uses pyfakefs FakeFilesystem.
        • Sets up files with and without existing blocks and verifies the final contents after blockinfile is invoked.
        • Verifies no modification when no markers present.
        • Verifies block replacement with normalized content and preserved markers.

Member Author


@coderabbitai create followup github issue for

138-166: Inline tests cover key cases; dry‑run could assert on actual replacements

The tests nicely pin down the “no markers → no change”, “update in‑place”, and newline handling behaviors, and test_dry_run at least guarantees main() finishes against real Dockerfile trees. If you want stronger regression protection, you could extend test_dry_run (or add a focused test) to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected, decoupled from the full repo layout.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a follow-up GitHub issue for test improvement suggestion

gh issue create \
  --title "Improve test coverage for dockerfile_fragments.py with synthetic Dockerfile assertions" \
  --body "## Context

This is a follow-up from PR #2682 to improve test coverage in \`scripts/dockerfile_fragments.py\`.

## Suggestion

The current \`test_dry_run\` test validates that \`main()\` completes against real Dockerfile trees, but doesn't assert on actual replacements. For stronger regression protection, consider:

- Extending \`test_dry_run\` to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected
- Adding a focused test that validates marker replacement behavior decoupled from the full repository layout

This would provide better regression protection beyond just verifying the script doesn't crash.

## References

- PR: https://github.com/opendatahub-io/notebooks/pull/2682
- Review comment: https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528725273
- Requested by: @jiridanek" \
  --assignee jiridanek

Length of output: 55


@jiridanek I've created a follow-up issue for the test improvement suggestion:

Issue #2683: Improve test coverage for dockerfile_fragments.py with synthetic Dockerfile assertions

The issue has been assigned to you and includes references back to this PR and review comment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
scripts/dockerfile_fragments.py (2)

36-84: Replacements mapping and marker validation are solid; consider minor refactor & lint appeasement

The replacements dict plus the pre-pass that validates ### BEGIN/END markers against known keys is a nice safety net against typos and drift between Dockerfiles and this script.

Two small, optional improvements:

  • You rebuild replacements on every Dockerfile; lifting it to a module-level constant (e.g., FRAGMENTS) would avoid repeated construction and make it easier to share with other code/tests if needed.
  • Ruff’s TRY003 warning on the ValueError message (lines 74-77) can be silenced by either shortening the message or assigning it to a variable before raising, e.g.:
-                            raise ValueError(
-                                f"Expected replacement for '{prefix} {suffix}' "
-                                f"not found in {dockerfile}:{line_no}"
-                            )
+                            msg = (
+                                f"Expected replacement for '{prefix} {suffix}' "
+                                f"not found in {dockerfile}:{line_no}"
+                            )
+                            raise ValueError(msg)

Both are non-blocking quality-of-life tweaks.


87-99: blockinfile behavior change looks correct; note single-block assumption

The updated blockinfile API and behavior look good:

  • The new comment * 3 convention lines up with the ### BEGIN / ### END markers in the Dockerfiles.
  • Error handling for I/O plus mismatched or out-of-order markers is clear and defensive.
  • The “no markers found ⇒ return without modification” branch is consistent with the new semantics and the updated tests.

One design assumption worth calling out: the scan only tracks a single begin/end pair per call (last occurrence wins). If a file ever contains multiple blocks with the same prefix, only the last will be updated. If that’s not desired, you may want to detect multiple matches and raise instead of silently using the last pair.

Also applies to: 125-127

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fef65fc and aedeeb4.

📒 Files selected for processing (1)
  • scripts/dockerfile_fragments.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.4)
scripts/dockerfile_fragments.py

74-77: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-trustyai-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-rhel9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-codeserver-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-rhel9-on-pull-request
  • GitHub Check: build (cuda-rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
  • GitHub Check: build (rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
🔇 Additional comments (2)
scripts/dockerfile_fragments.py (2)

4-14: Docstring clearly documents intent and future extensibility

The high-level docstring does a good job explaining the relationship to the original replace-markers.sh and hints at future extensions (file-based or function-based fragments), which will make this script easier to evolve.


143-169: Tests nicely capture the new semantics and key edge cases

The tests around blockinfile look well targeted:

  • test_adding_new_block now correctly asserts that a file without markers is left untouched, matching the new behavior.
  • test_updating_value_in_block and test_lastnewline_removal exercise both replacement and the subtle newline trimming logic.
  • test_dry_run wiring ROOT_DIR’s real Dockerfile directories into pyfakefs and running main() is a pragmatic way to ensure the script stays in sync with the actual Dockerfiles without adding brittle content assertions.

Overall, the test coverage for the changes looks solid.

@openshift-ci
Contributor

openshift-ci bot commented Nov 14, 2025

@jiridanek: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/rocm-runtime-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test rocm-runtime-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-tf-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-tf-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-cuda-tf-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-cuda-tf-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-tai-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-tai-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-ds-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-ds-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-rocm-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-rocm-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-rocm-jupyter-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-rocm-jupyter-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/codeserver-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test codeserver-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-cuda-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-cuda-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/images aedeeb4 link true /test images
ci/prow/rocm-notebooks-py312-e2e-tests aedeeb4 link true /test rocm-notebooks-py312-e2e-tests
ci/prow/notebooks-py312-ubi9-e2e-tests aedeeb4 link true /test notebooks-py312-ubi9-e2e-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci
Contributor

openshift-ci bot commented Nov 15, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer, ysok
Once this PR has been reviewed and has the lgtm label, please assign harshad16 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek jiridanek merged commit 8da6079 into opendatahub-io:main Nov 15, 2025
45 of 82 checks passed
@jiridanek jiridanek deleted the jd_fragments_Hash3Begin branch November 15, 2025 11:45
jiridanek added a commit to red-hat-data-services/notebooks that referenced this pull request Nov 15, 2025
…nctionality, as a start (opendatahub-io#2682) (#1709)

* document the new behavior

* migrate existing files to new marker format

* apply the migration

* and remove the migration code

* stop adding missing block at the end of file

* remove the exclusions that I needed before

* restructure around a dictionary of replacements

* implement rabbit suggestion
jiridanek added a commit to jiridanek/notebooks that referenced this pull request Nov 15, 2025
jiridanek added a commit to jiridanek/notebooks that referenced this pull request Nov 15, 2025
jiridanek added a commit to red-hat-data-services/notebooks that referenced this pull request Nov 15, 2025

Labels

lgtm, review-requested (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel), size/l, tide/merge-method-squash (denotes a PR that should be squashed by tide when it merges)
