
Conversation

@jiridanek jiridanek commented Nov 14, 2025

https://issues.redhat.com/browse/RHAIENG-1965

Follows up on

Description

This script is inspired by the AIPCC replace-markers.sh script, which is invoked from make regen:
https://gitlab.com/redhat/rhel-ai/core/base-images/app/-/blob/main/containerfiles/replace-markers.sh

The original AIPCC version uses the ed command to replace everything between
### BEGIN <filename> and ### END <filename> with the content of <filename>.
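The ed-based idea (replace everything between the two marker lines while keeping the markers themselves) can be sketched in Python; the marker name and fragment body below are hypothetical, not taken from the actual repositories:

```python
import re

# Hypothetical Dockerfile content; the marker name "oc-client" is invented.
dockerfile = """\
FROM ubi9
### BEGIN oc-client
RUN echo old
### END oc-client
CMD ["bash"]
"""

fragment = "RUN echo new\n"  # stand-in for the real fragment body
name = "oc-client"

# Replace everything between the BEGIN and END markers with the fragment,
# keeping the marker lines themselves in place.
pattern = re.compile(
    rf"(### BEGIN {re.escape(name)}\n).*?(### END {re.escape(name)})",
    re.DOTALL,
)
updated = pattern.sub(lambda m: m.group(1) + fragment + m.group(2), dockerfile)
print(updated)
```

Running this leaves the surrounding lines untouched and swaps only the block body.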

This script currently still keeps the data inline, but I've

  • removed the mis-feature that created a new block at the end of the file when the block was not yet present
  • changed the markers to ### BEGIN and ### END, for better readability
  • put all blocks into a Python dictionary so I can then check whether there are any unexpected blocks (probably a typo)
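A minimal sketch of that dictionary-plus-check approach, with invented keys and bodies rather than the script's real ones:

```python
# Invented fragment bodies keyed by marker name; the real script keeps its own set.
replacements = {
    "upgrade image": "RUN dnf -y upgrade && dnf clean all\n",
    "install micropipenv and uv": "RUN pip install micropipenv uv\n",
}

def check_markers(text: str, path: str = "<dockerfile>") -> None:
    """Reject any ### BEGIN/### END marker whose name is not a known key."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for prefix in ("### BEGIN ", "### END "):
            if line.startswith(prefix):
                name = line[len(prefix):].strip()
                if name not in replacements:
                    raise ValueError(f"unknown marker {name!r} at {path}:{line_no}")

check_markers("### BEGIN upgrade image\nRUN ...\n### END upgrade image\n")  # passes silently
try:
    check_markers("### BEGIN upgrade imgae\n### END upgrade imgae\n")  # typo in the name
except ValueError as exc:
    print(exc)
```

A misspelled marker fails fast with the file and line number instead of being silently ignored.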

How Has This Been Tested?

./scripts/dockerfile_fragments.py

Works for me

Self checklist (all need to be checked):

  • Ensure that you have run make test (gmake on macOS) before asking for review
  • Changes to everything except Dockerfile.konflux files should be done in odh/notebooks and automatically synced to rhds/notebooks. For Konflux-specific changes, modify Dockerfile.konflux files directly in rhds/notebooks as these require special attention in the downstream repository and flow to the upcoming RHOAI release.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

  • Chores
    • Standardized build-file section markers across container images for clearer, consistent annotations.
    • Updated build tooling to apply and validate multiple named configuration fragments, improving reliability of image assembly.
  • Refactor
    • Streamlined insertion behavior so files without markers remain unchanged; added validation and safer read/write handling.

@github-actions github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Nov 14, 2025
coderabbitai bot commented Nov 14, 2025

Walkthrough

Standardizes Dockerfile section markers across 20+ image definitions by replacing informal "# ... begin/end" comments with explicit "### BEGIN ..."/"### END ..." blocks, and refactors scripts/dockerfile_fragments.py to support the new marker format, add replacements-driven insertion, and validate markers.

Changes

Cohort / File(s) Summary
Codeserver Dockerfile
codeserver/ubi9-python-3.12/Dockerfile.cpu
Replaced informal begin/end comments with ### BEGIN ... / ### END ... markers around upgrade, micropipenv/uv, and oc client sections (cosmetic only).
Jupyter Dockerfiles
jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu, jupyter/minimal/ubi9-python-3.12/Dockerfile.*, jupyter/pytorch+llmcompressor/.../Dockerfile.cuda, jupyter/pytorch/.../Dockerfile.cuda, jupyter/rocm/pytorch/.../Dockerfile.rocm, jupyter/rocm/tensorflow/.../Dockerfile.rocm, jupyter/tensorflow/.../Dockerfile.cuda, jupyter/trustyai/.../Dockerfile.cpu
Replaced multiple inline comment delimiters with standardized ### BEGIN ... / ### END ... markers across upgrade, micropipenv/uv, oc client, PDF export, and related dependency blocks (no command or control-flow changes).
RStudio Dockerfiles
rstudio/c9s-python-3.12/Dockerfile.*, rstudio/rhel9-python-3.12/Dockerfile.*
Converted start/end comments to ### BEGIN/### END markers for upgrade and micropipenv/uv sections (cosmetic only).
Runtimes Dockerfiles
runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu, runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu, runtimes/pytorch+llmcompressor/.../Dockerfile.cuda, runtimes/pytorch/.../Dockerfile.cuda, runtimes/rocm-pytorch/.../Dockerfile.rocm, runtimes/rocm-tensorflow/.../Dockerfile.rocm, runtimes/tensorflow/.../Dockerfile.cuda
Replaced informal begin/end comments with ### BEGIN/### END markers across upgrade, micropipenv/uv, oc client, and dependency installation sections (no functional changes).
Scripts — fragment handling
scripts/dockerfile_fragments.py
Refactored: introduced a replacements dictionary and looped application, updated blockinfile() signature to `blockinfile(filename: str

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Rationale: Many Dockerfile edits are homogeneous and low-risk, but the scripts/dockerfile_fragments.py refactor changes the public function signature and behavior (it now no-ops when markers are absent), adds validation, and removes helpers; these require careful review and test validation across the affected Dockerfiles and CI.
  • Attention areas:
    • Ensure all ### BEGIN ... / ### END ... marker names in Dockerfiles correspond exactly to keys in the script's replacements dictionary.
    • Verify the new blockinfile() signature and its callers use the updated parameter types and behavior.
    • Confirm removal of is_jupyter() and is_rstudio() doesn't break other tooling or tests.
    • Review tests updated to ensure they cover both marker-present replacement and no-op when markers are absent.
    • Check file I/O error handling and that dry-run / CI flows are unaffected.

Possibly related PRs

Suggested reviewers

  • atheo89
  • daniellutz

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 33.33%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title clearly and specifically describes the main change, implementing the AIPCC replace-markers.sh functionality in the Notebooks CLI, and references ticket RHAIENG-1965.
  • Description check (✅ Passed): the PR description includes all required sections: description of changes, testing instructions, and completion of the self-checklist and merge criteria.


@openshift-ci openshift-ci bot requested review from atheo89 and dibryant November 14, 2025 19:05
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
@jiridanek jiridanek changed the title Jd fragments hash3 begin RHAIENG-1965: Notebooks CLI: implement the AIPCC replace-markers.sh functionality, as a start Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
scripts/dockerfile_fragments.py (3)

36-75: Replacements dict and marker sanity check are solid; consider improving diagnostics

Centralizing the fragment bodies in replacements and rejecting unknown ### BEGIN/END markers is a nice safety net. You can also (a) avoid recomputing begin/end inside the line loop, and (b) make the error easier to act on (and satisfy Ruff’s B007) by including the line number:

-        with open(dockerfile, "rt") as fp:
-            for line_no, line in enumerate(fp):
-                begin = f"{"#" * 3} BEGIN"
-                end = f"{"#" * 3} END"
-                for prefix in (begin, end):
+        begin = "#" * 3 + " BEGIN"
+        end = "#" * 3 + " END"
+        with open(dockerfile, "rt") as fp:
+            for line_no, line in enumerate(fp, start=1):
+                for prefix in (begin, end):
                     if line.rstrip().startswith(prefix):
                         suffix = line[len(prefix) + 1:].rstrip()
                         if suffix not in replacements:
-                            raise ValueError(f"Expected replacement for '{prefix} {suffix}' not found in {dockerfile}")
+                            raise ValueError(
+                                f"Expected replacement for '{prefix} {suffix}' "
+                                f"not found in {dockerfile}:{line_no}"
+                            )

84-127: blockinfile behavior is clear; think about multiple-block and creation semantics

The updated blockinfile correctly (a) enforces matching BEGIN/END pairs, (b) no‑ops when markers are absent (matching the new design), and (c) normalizes trailing newlines so HEREDOCs keep a single empty line before the END marker. One thing to be aware of is that if a file ever contained multiple BEGIN/END pairs for the same prefix, this implementation would silently treat them as one big block (last BEGIN, last END); if that’s undesirable, you might want to detect and error on multiple matches for a given prefix rather than relying on last‑seen indices.


138-166: Inline tests cover key cases; dry‑run could assert on actual replacements

The tests nicely pin down the “no markers → no change”, “update in‑place”, and newline handling behaviors, and test_dry_run at least guarantees main() finishes against real Dockerfile trees. If you want stronger regression protection, you could extend test_dry_run (or add a focused test) to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected, decoupled from the full repo layout.
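One possible shape for such a focused test, using a synthetic Dockerfile in a temporary directory instead of the repo layout; replace_block below is a simplified stand-in for the real blockinfile(), not the script's implementation:

```python
import pathlib
import tempfile

def replace_block(path: pathlib.Path, name: str, body: str) -> None:
    # Stand-in for blockinfile(): swap the lines between the markers for `body`.
    lines = path.read_text().splitlines(keepends=True)
    begin = lines.index(f"### BEGIN {name}\n")
    end = lines.index(f"### END {name}\n")
    path.write_text("".join(lines[: begin + 1]) + body + "".join(lines[end:]))

with tempfile.TemporaryDirectory() as tmp:
    dockerfile = pathlib.Path(tmp) / "Dockerfile"
    dockerfile.write_text("FROM ubi9\n### BEGIN upgrade\nRUN echo old\n### END upgrade\n")
    replace_block(dockerfile, "upgrade", "RUN echo new\n")
    text = dockerfile.read_text()
    assert "RUN echo new" in text and "RUN echo old" not in text
    print("ok")
```

Asserting on a known marker in a tiny synthetic file keeps the regression check independent of the full repository tree.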

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ee8518 and fef65fc.

📒 Files selected for processing (23)
  • codeserver/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/rocm/pytorch/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (5 hunks)
  • jupyter/tensorflow/ubi9-python-3.12/Dockerfile.cuda (5 hunks)
  • jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu (5 hunks)
  • rstudio/c9s-python-3.12/Dockerfile.cpu (2 hunks)
  • rstudio/c9s-python-3.12/Dockerfile.cuda (3 hunks)
  • rstudio/rhel9-python-3.12/Dockerfile.cpu (3 hunks)
  • rstudio/rhel9-python-3.12/Dockerfile.cuda (3 hunks)
  • runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu (4 hunks)
  • runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • runtimes/rocm-pytorch/ubi9-python-3.12/Dockerfile.rocm (4 hunks)
  • runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm (4 hunks)
  • runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda (4 hunks)
  • scripts/dockerfile_fragments.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.4)
scripts/dockerfile_fragments.py

67-67: Loop control variable line_no not used within loop body

Rename unused line_no to _line_no

(B007)


74-74: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (53)
  • GitHub Check: Red Hat Konflux / odh-workbench-codeserver-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-trustyai-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: build (runtime-cuda-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-cuda-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: build (codeserver-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (runtime-cuda-pytorch-llmcompressor-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-cuda-tensorflow-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (cuda-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-trustyai-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-rstudio-c9s-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-pytorch-llmcompressor-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: build (jupyter-datascience-ubi9-python-3.12, 3.12, linux/ppc64le, false) / build
  • GitHub Check: build (codeserver-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-runtime-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: build (cuda-jupyter-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (runtime-datascience-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rstudio-c9s-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-pytorch-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: build (rocm-runtime-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (rocm-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: build (cuda-jupyter-minimal-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: build (runtime-minimal-ubi9-python-3.12, 3.12, linux/s390x, false) / build
  • GitHub Check: build (cuda-jupyter-tensorflow-ubi9-python-3.12, 3.12, linux/arm64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: build (runtime-minimal-ubi9-python-3.12, 3.12, linux/amd64, false) / build
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-rhel9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-rhel9-on-pull-request
  • GitHub Check: build (cuda-rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
  • GitHub Check: build (rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
🔇 Additional comments (23)
runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda (1)

26-62: Marker standardization LGTM.

All three sections are properly marked with consistent ### BEGIN/END delimiters. The formatting aligns with the PR's standardization objectives and introduces no functional changes.

jupyter/minimal/ubi9-python-3.12/Dockerfile.rocm (1)

24-90: Marker standardization consistent across sections.

All four marked sections follow the standardized ### BEGIN/END pattern with descriptive labels. The cosmetic updates preserve all build logic unchanged.

runtimes/minimal/ubi9-python-3.12/Dockerfile.cpu (1)

24-66: Consistent marker application across runtime variants.

The three marked sections follow the established pattern. No deviations from the standardization scheme observed.

rstudio/rhel9-python-3.12/Dockerfile.cuda (1)

34-55: Marker standardization appropriate for RStudio variant.

Two sections are marked as expected for this Dockerfile variant. The reduced number of marked sections aligns with the RStudio-specific build requirements.

runtimes/tensorflow/ubi9-python-3.12/Dockerfile.cuda (1)

28-64: Marker standardization applied consistently.

All three sections follow the established pattern with properly formatted ### BEGIN/END markers.

jupyter/datascience/ubi9-python-3.12/Dockerfile.cpu (1)

62-292: Marker standardization preserves complex build logic.

Four sections are marked with consistent ### BEGIN/END delimiters. The changes preserve all conditional logic, multi-stage builds, and architecture-specific configurations unchanged.

jupyter/tensorflow/ubi9-python-3.12/Dockerfile.cuda (1)

42-100: Marker standardization applied across multi-stage build.

Four sections are marked with proper ### BEGIN/END delimiters. All build stages and their logic remain functionally identical.

runtimes/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (1)

26-62: Marker standardization applied consistently across all variants.

All three sections follow the established pattern. The standardization is complete and uniform across the Dockerfile suite.

codeserver/ubi9-python-3.12/Dockerfile.cpu (1)

86-96: Marker formatting is consistent and properly paired across all sections.

All three marker pairs (upgrade, micropipenv/uv, and oc client) use the standardized ### BEGIN <description> and ### END <description> format with matching descriptions. The Docker instructions within each marked section remain functionally unchanged.

Also applies to: 117-119, 121-130

rstudio/c9s-python-3.12/Dockerfile.cuda (1)

18-28: Marker pairs are properly formatted and closed.

Both sections have consistent BEGIN/END markers with matching descriptions.

Also applies to: 37-39

jupyter/rocm/pytorch/ubi9-python-3.12/Dockerfile.rocm (1)

40-50: All marker pairs are properly formatted with consistent naming.

The four marked sections (upgrade, micropipenv/uv, oc client, and PDF dependencies) follow the standardized format across files.

Also applies to: 63-65, 67-76, 94-98

rstudio/rhel9-python-3.12/Dockerfile.cpu (1)

34-44: Markers are properly formatted and paired.

Both sections have well-formed BEGIN/END markers with matching descriptions.

Also applies to: 53-55

jupyter/minimal/ubi9-python-3.12/Dockerfile.cuda (1)

26-36: All marker pairs are properly formatted with matching descriptions.

The four marked sections follow the standardized convention consistently.

Also applies to: 49-51, 53-62, 88-92

runtimes/rocm-tensorflow/ubi9-python-3.12/Dockerfile.rocm (1)

24-34: Markers are properly formatted and consistently named across sections.

All three marked sections have well-formed BEGIN/END pairs.

Also applies to: 47-49, 51-60

runtimes/rocm-pytorch/ubi9-python-3.12/Dockerfile.rocm (1)

24-34: All marker pairs are properly formatted with consistent naming.

The three marked sections (upgrade, micropipenv/uv, and oc client) follow the standardized format.

Also applies to: 47-49, 51-60

runtimes/datascience/ubi9-python-3.12/Dockerfile.cpu (2)

28-38: Markers are properly formatted and consistently applied across marked sections.

All three marked sections have well-formed BEGIN/END pairs with matching descriptions.

Also applies to: 108-110, 112-121


1-330: Verify marker compatibility with scripts/dockerfile_fragments.py expectations.

All provided Dockerfile changes consistently apply the ### BEGIN <description> and ### END <description> marker format. The marker names across files are standardized (upgrade, micropipenv/uv, oc client, PDF dependencies), but I cannot verify whether they exactly match the expectations and dictionary keys in the dockerfile_fragments.py script since it was not provided for review. Ensure the script's replacements dictionary includes all observed marker names.

jupyter/pytorch+llmcompressor/ubi9-python-3.12/Dockerfile.cuda (1)

42-52: Standardized markers look consistent with the fragment script

All four BEGIN/END blocks use names that exactly match the keys in scripts/dockerfile_fragments.py, and the enclosed commands are unchanged, so the new fragment replacement flow should work without behavioral changes.

Also applies to: 65-67, 69-78, 96-100

jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm (1)

40-50: BEGIN/END markerization is consistent and non‑functional

The upgrade, micropipenv/uv, oc client, and PDF export sections are now wrapped in standardized markers with text matching the fragment replacement keys, and the shell/pip commands are unchanged, so behavior should remain identical while enabling scripted updates.

Also applies to: 63-65, 67-76, 92-96

jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu (1)

65-75: TrustyAI Dockerfile markers align with the fragment replacement scheme

The four annotated sections use standardized BEGIN/END markers with names matching the Python replacements dict, and the underlying upgrade, micropipenv/uv, oc client, and PDF export commands are unchanged, so this should be a no‑op for runtime behavior while making scripted regeneration possible.

Also applies to: 88-90, 92-101, 118-122

jupyter/minimal/ubi9-python-3.12/Dockerfile.cpu (1)

24-34: Minimal Jupyter markers are correctly standardized

The upgrade, micropipenv/uv, oc client, and PDF dependency sections are now wrapped in consistent ### BEGIN/END … markers whose labels match those used by dockerfile_fragments.py, with no changes to the actual commands, so the refactoring should be behavior‑neutral.

Also applies to: 47-49, 51-60, 86-90

jupyter/pytorch/ubi9-python-3.12/Dockerfile.cuda (1)

42-52: CUDA PyTorch markers are in sync with the fragment script

The four BEGIN/END blocks are named identically to the keys in the fragment replacement mapping and wrap the same upgrade, micropipenv/uv, oc client, and PDF export logic, so this is a straightforward marker standardization with no functional impact.

Also applies to: 65-67, 69-78, 96-100

rstudio/c9s-python-3.12/Dockerfile.cpu (1)

15-17: RStudio Dockerfile markers are correct and tool‑friendly

The micropipenv/uv and base upgrade sections are wrapped in standardized markers with names matching dockerfile_fragments.py’s replacements, and the underlying commands remain the same, so this is a safe, tooling‑oriented change.

Also applies to: 22-32

@jiridanek jiridanek added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
@openshift-ci openshift-ci bot added size/l and removed size/l labels Nov 14, 2025
Member Author


@coderabbitai create follow-up github issue for

think about multiple-block and creation semantics

The updated blockinfile correctly (a) enforces matching BEGIN/END pairs, (b) no‑ops when markers are absent (matching the new design), and (c) normalizes trailing newlines so HEREDOCs keep a single empty line before the END marker. One thing to be aware of is that if a file ever contained multiple BEGIN/END pairs for the same prefix, this implementation would silently treat them as one big block (last BEGIN, last END); if that’s undesirable, you might want to detect and error on multiple matches for a given prefix rather than relying on last‑seen indices.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Create a follow-up GitHub issue for multiple-block and creation semantics

gh issue create \
  --repo opendatahub-io/notebooks \
  --title "Consider handling of multiple BEGIN/END block pairs with same prefix in blockinfile" \
  --body "## Context

This issue is a follow-up from PR #2682 and [this comment](https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528723448).

Requested by: @jiridanek

## Current Behavior

The \`blockinfile\` function in \`scripts/dockerfile_fragments.py\` currently uses a simple approach for finding BEGIN/END marker pairs. When scanning a file, it updates the \`begin\` and \`end\` indices each time it encounters a matching marker:

\`\`\`python
for line_no, line in enumerate(original_lines):
    if line.rstrip() == begin_marker:
        begin = line_no
    elif line.rstrip() == end_marker:
        end = line_no
\`\`\`

This means if a file contains multiple BEGIN/END pairs for the same prefix, the function will silently treat them as one large block (using the last BEGIN and last END indices encountered).

## Concern

This behavior might be undesirable if:
- Multiple blocks with the same prefix are unintentional (e.g., copy-paste errors, typos)
- Users expect each block to be independently managed
- The \"last seen\" behavior could lead to unexpected replacements spanning unrelated sections of the file

## Suggestion

Consider detecting and raising an error when multiple matches for a given prefix are found, rather than silently relying on the last-seen indices. This would make the tool more predictable and help catch configuration errors early.

## Additional Considerations

- Should the tool support multiple blocks with the same prefix intentionally?
- What are the use cases for creation semantics (when markers are absent)?
- Should there be options to control this behavior?

---
**Related PR:** #2682  
**Related Comment:** https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528723448" \
  --assignee jiridanek

Length of output: 55


@jiridanek I've created a follow-up GitHub issue for you: #2684

The issue covers:

  • The current behavior where multiple BEGIN/END pairs with the same prefix are silently treated as one large block (using the last-seen indices)
  • The concern about this potentially undesirable behavior
  • A suggestion to detect and error on multiple matches for a given prefix
  • Additional considerations about creation semantics and whether multiple blocks should be intentionally supported

The issue is assigned to you and includes backlinks to this PR and comment for context.


🧬 Code Graph Analysis Results
  • File: scripts/dockerfile_fragments.py

    • Snippet: blockinfile function (full)
      • Description: The function that reads a file, detects BEGIN/END markers (optionally prefixed), validates marker pairing, replaces the block between BEGIN and END with provided contents, and writes back if changed. Includes error handling for read/write issues and marker mismatches, and preserves trailing newlines for HEREDOC compatibility.
      • Key details:
        • Parameters: filename (str or path), contents (str), prefix (str or None), keyword-only: comment (default "#")
        • Returns: None
        • Behavior:
          • Builds begin_marker and end_marker using the optional prefix.
          • Reads original file content; detects positions of begin and end markers (last seen positions are used if multiple exist).
          • Validation: raises ValueError if only one of BEGIN/END is found or if BEGIN appears after END.
          • If no markers found, no changes are made.
          • Replaces the block with:
            BEGIN marker line
            contents (dedented, with trailing newline normalized to preserve a single trailing newline)
            END marker line
          • Writes back only if changes occurred.
        • Error handling:
          • IOErrors/OSErrors on read raise RuntimeError with context.
  • File: scripts/dockerfile_fragments.py

    • Snippet: TestBlockinfile class (full)
      • Description: Unit tests validating behavior of blockinfile:
        • test_adding_new_block: ensures file unchanged when no block exists and a new block is added.
        • test_updating_value_in_block: ensures updating content within an existing BEGIN/END block.
        • test_lastnewline_removal: ensures proper handling of trailing newlines to maintain a single empty line before END.
        • test_dry_run: exercises the main flow with real directories/files via fake filesystem integration.
      • Key details:
        • Uses pyfakefs FakeFilesystem.
        • Sets up files with and without existing blocks and verifies the final contents after blockinfile is invoked.
        • Verifies no modification when no markers present.
        • Verifies block replacement with normalized content and preserved markers.

Member Author


@coderabbitai create followup github issue for

138-166: Inline tests cover key cases; dry‑run could assert on actual replacements

The tests nicely pin down the “no markers → no change”, “update in‑place”, and newline handling behaviors, and test_dry_run at least guarantees main() finishes against real Dockerfile trees. If you want stronger regression protection, you could extend test_dry_run (or add a focused test) to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected, decoupled from the full repo layout.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a follow-up GitHub issue for test improvement suggestion

gh issue create \
  --title "Improve test coverage for dockerfile_fragments.py with synthetic Dockerfile assertions" \
  --body "## Context

This is a follow-up from PR #2682 to improve test coverage in \`scripts/dockerfile_fragments.py\`.

## Suggestion

The current \`test_dry_run\` test validates that \`main()\` completes against real Dockerfile trees, but doesn't assert on actual replacements. For stronger regression protection, consider:

- Extending \`test_dry_run\` to assert that at least one known marker in a small synthetic Dockerfile gets rewritten as expected
- Adding a focused test that validates marker replacement behavior decoupled from the full repository layout

This would provide better regression protection beyond just verifying the script doesn't crash.

## References

- PR: https://github.com/opendatahub-io/notebooks/pull/2682
- Review comment: https://github.com/opendatahub-io/notebooks/pull/2682#discussion_r2528725273
- Requested by: @jiridanek" \
  --assignee jiridanek

Length of output: 55


@jiridanek I've created a follow-up issue for the test improvement suggestion:

Issue #2683: Improve test coverage for dockerfile_fragments.py with synthetic Dockerfile assertions

The issue has been assigned to you and includes references back to this PR and review comment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
scripts/dockerfile_fragments.py (2)

36-84: Replacements mapping and marker validation are solid; consider minor refactor & lint appeasement

The replacements dict plus the pre-pass that validates ### BEGIN/END markers against known keys is a nice safety net against typos and drift between Dockerfiles and this script.

Two small, optional improvements:

  • You rebuild replacements on every Dockerfile; lifting it to a module-level constant (e.g., FRAGMENTS) would avoid repeated construction and make it easier to share with other code/tests if needed.
  • Ruff’s TRY003 warning on the ValueError message (lines 74-77) can be silenced by either shortening the message or assigning it to a variable before raising, e.g.:
-                            raise ValueError(
-                                f"Expected replacement for '{prefix} {suffix}' "
-                                f"not found in {dockerfile}:{line_no}"
-                            )
+                            msg = (
+                                f"Expected replacement for '{prefix} {suffix}' "
+                                f"not found in {dockerfile}:{line_no}"
+                            )
+                            raise ValueError(msg)

Both are non-blocking quality-of-life tweaks.


87-99: blockinfile behavior change looks correct; note single-block assumption

The updated blockinfile API and behavior look good:

  • The new comment * 3 convention lines up with the ### BEGIN / ### END markers in the Dockerfiles.
  • Error handling for I/O plus mismatched or out-of-order markers is clear and defensive.
  • The “no markers found ⇒ return without modification” branch is consistent with the new semantics and the updated tests.

One design assumption worth calling out: the scan only tracks a single begin/end pair per call (last occurrence wins). If a file ever contains multiple blocks with the same prefix, only the last will be updated. If that’s not desired, you may want to detect multiple matches and raise instead of silently using the last pair.

Also applies to: 125-127

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fef65fc and aedeeb4.

📒 Files selected for processing (1)
  • scripts/dockerfile_fragments.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.14.4)
scripts/dockerfile_fragments.py

74-77: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (24)
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-llmcompressor-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-trustyai-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-rocm-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py312-rhel9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-cuda-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-codeserver-datascience-cpu-py312-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py312-rhel9-on-pull-request
  • GitHub Check: build (cuda-rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
  • GitHub Check: build (rstudio-rhel9-python-3.12, 3.12, linux/amd64, true) / build
🔇 Additional comments (2)
scripts/dockerfile_fragments.py (2)

4-14: Docstring clearly documents intent and future extensibility

The high-level docstring does a good job explaining the relationship to the original replace-markers.sh and hints at future extensions (file-based or function-based fragments), which will make this script easier to evolve.


143-169: Tests nicely capture the new semantics and key edge cases

The tests around blockinfile look well targeted:

  • test_adding_new_block now correctly asserts that a file without markers is left untouched, matching the new behavior.
  • test_updating_value_in_block and test_lastnewline_removal exercise both replacement and the subtle newline trimming logic.
  • test_dry_run wiring ROOT_DIR’s real Dockerfile directories into pyfakefs and running main() is a pragmatic way to ensure the script stays in sync with the actual Dockerfiles without adding brittle content assertions.

Overall, the test coverage for the changes looks solid.

@openshift-ci
Contributor

openshift-ci bot commented Nov 14, 2025

@jiridanek: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/rocm-runtime-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test rocm-runtime-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-tf-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-tf-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-cuda-tf-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-cuda-tf-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-tai-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-tai-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-ds-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-ds-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-rocm-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-rocm-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-ds-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-rocm-jupyter-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-rocm-jupyter-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/codeserver-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test codeserver-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-ubi9-python-3-12-pr-image-mirror
ci/prow/notebook-cuda-jupyter-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test notebook-cuda-jupyter-ubi9-python-3-12-pr-image-mirror
ci/prow/runtime-cuda-pt-ubi9-python-3-12-pr-image-mirror aedeeb4 link true /test runtime-cuda-pt-ubi9-python-3-12-pr-image-mirror
ci/prow/images aedeeb4 link true /test images
ci/prow/rocm-notebooks-py312-e2e-tests aedeeb4 link true /test rocm-notebooks-py312-e2e-tests
ci/prow/notebooks-py312-ubi9-e2e-tests aedeeb4 link true /test notebooks-py312-ubi9-e2e-tests

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci
Contributor

openshift-ci bot commented Nov 15, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer, ysok
Once this PR has been reviewed and has the lgtm label, please assign harshad16 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jiridanek jiridanek merged commit 8da6079 into opendatahub-io:main Nov 15, 2025
45 of 82 checks passed
@jiridanek jiridanek deleted the jd_fragments_Hash3Begin branch November 15, 2025 11:45
jiridanek added a commit to red-hat-data-services/notebooks that referenced this pull request Nov 15, 2025
…nctionality, as a start (opendatahub-io#2682) (#1709)

* document the new behavior

* migrate existing files to new marker format

* apply the migration

* and remove the migration code

* stop adding missing block at the end of file

* remove the exclusions that I needed before

* restructure around a dictionary of replacements

* implement rabbit suggestion
jiridanek added a commit to jiridanek/notebooks that referenced this pull request Nov 15, 2025
jiridanek added a commit to jiridanek/notebooks that referenced this pull request Nov 15, 2025
jiridanek added a commit to red-hat-data-services/notebooks that referenced this pull request Nov 15, 2025

Labels

lgtm, review-requested (GitHub Bot creates notification on #pr-review-ai-ide-team slack channel), size/l, tide/merge-method-squash (denotes a PR that should be squashed by tide when it merges)
