
Conversation

@shruthis4 shruthis4 (Contributor) commented Nov 17, 2025

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

  • Tests
    • Notebooks are automatically executed and validated (with timeout handling, cleanup, and a CLI entry point), including checks that the notebooks directory and its files exist.
  • Chores
    • CI workflow now runs notebook-related checks on PRs and push events.
    • Build system updated with new composite targets to run notebook formatting, parameter checks, execution tests, and full test suites.

- Updated Makefile to include new test targets for notebook execution and overall testing.
- Added a new test script for executing notebooks using papermill, ensuring they run without errors.
- Modified GitHub Actions workflow to set up Python, install dependencies, and execute all notebooks as part of the CI process.
@shruthis4 shruthis4 requested a review from a team as a code owner November 17, 2025 15:54
@coderabbitai coderabbitai bot commented Nov 17, 2025

Walkthrough

The pull request adds CI notebook execution: a GitHub Actions workflow now triggers on PRs touching notebooks and the workflow file, runs setup steps on an EC2 runner (checkout, Python 3.12, deps, kernel registration) and executes pytest; the Makefile gains notebook test targets; a new pytest module runs notebooks via papermill with timeout and cleanup.

Changes

  • CI/CD Workflow Automation — .github/workflows/execute-all-notebooks.yml: Added a pull_request trigger scoped to notebooks/**/*.ipynb and the workflow file itself; preserved the existing push trigger; inserted setup steps on the EC2-runner path: actions/checkout@v5, actions/setup-python@v6 (Python 3.12 with pip cache), installation of requirements-dev.txt plus papermill, ipykernel, and jupyter, registration of the IPython kernel, and finally pytest for notebook verification.
  • Test Target Orchestration — Makefile: Added ALL_NOTEBOOKS and ALL_PYTHON_FILES variables; extended .PHONY with test-notebook-execution, test-notebooks, and test-all; added targets test-notebook-execution (runs pytest tests/test_notebook_execution.py), test-notebooks (composite: format, params, execution), and test-all (runs all tests), each with progress echo messages.
  • Notebook Execution Testing — tests/test_notebook_execution.py: New test module that uses papermill to execute notebooks discovered by get_notebook_files; adds get_test_parameters(), execute_single_notebook(notebook_path: Path, timeout: int = 300) -> bool, a parameterized test_notebook_executes_without_error() (which skips backup/checkpoint files), test_notebooks_directory_exists(), and a CLI-style main block for direct invocation (sketched below).
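
For orientation, here is a minimal sketch of what such a test module could look like, assuming the helper names listed above (get_test_parameters, execute_single_notebook, test_notebook_executes_without_error) and a get_notebook_files() discovery helper provided by tests/conftest.py; the actual implementation in the PR may differ in its details.

import tempfile
from pathlib import Path

import papermill as pm
import pytest

from conftest import get_notebook_files  # assumed conftest-provided discovery helper


def get_test_parameters() -> dict:
    """Parameters injected into each notebook's parameters cell for fast CI runs."""
    return {"test_mode": True, "quick_run": True}


def execute_single_notebook(notebook_path: Path, timeout: int = 300) -> bool:
    """Execute one notebook via papermill into a throwaway output file."""
    with tempfile.NamedTemporaryFile(suffix=".ipynb", delete=False) as tmp:
        output_path = Path(tmp.name)
    try:
        pm.execute_notebook(
            str(notebook_path),
            str(output_path),
            parameters=get_test_parameters(),
            kernel_name="python3",
            execution_timeout=timeout,  # per-cell timeout forwarded to the nbclient engine
        )
        return True
    finally:
        output_path.unlink(missing_ok=True)  # cleanup regardless of success or failure


@pytest.mark.parametrize("notebook_path", get_notebook_files(), ids=str)
def test_notebook_executes_without_error(notebook_path: Path) -> None:
    if ".ipynb_checkpoints" in notebook_path.parts or notebook_path.name.endswith(".bak.ipynb"):
        pytest.skip(f"Skipping backup/checkpoint file: {notebook_path}")
    assert execute_single_notebook(notebook_path, timeout=300)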

Sequence Diagram(s)

sequenceDiagram
    participant PR as Pull Request
    participant GH as GitHub Actions
    participant EC2 as EC2 Runner
    participant OS as Runner OS
    participant PY as Python Setup
    participant TEST as pytest / Papermill
    participant NB as Notebook Files

    PR->>GH: open/update PR touching notebooks/** or workflow
    GH->>EC2: dispatch job (execute-all-notebooks)
    EC2->>OS: run pre-setup steps
    EC2->>PY: actions/checkout@v5
    EC2->>PY: actions/setup-python@v6 (3.12 + pip cache)
    EC2->>PY: install requirements-dev.txt, papermill, ipykernel, jupyter
    EC2->>PY: register IPython kernel
    EC2->>TEST: run pytest tests/test_notebook_execution.py
    TEST->>NB: for each notebook from discovery
    NB->>TEST: executed via papermill (with params, timeout)
    alt success
        TEST-->>EC2: pass
    else failure / timeout
        TEST-->>EC2: fail (error reported)
    end
    EC2-->>GH: post job results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review points:
    • GitHub Actions: verify trigger filters, runner selection, setup-python inputs, and dependency install steps.
    • Makefile: ensure targets and variable expansions integrate with existing targets and CI usage.
    • tests/test_notebook_execution.py: inspect notebook discovery, papermill invocation, timeout/cleanup handling, and CLI block.

Poem

🐰
Notebooks wake and softly hum,
Papermill hops, the tests will run,
Kernels bloom in Python light,
CI cheers on each notebook flight,
A rabbit hops — automation done.

Pre-merge checks

✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: The title 'Add test to execute notebooks' directly and concisely describes the main change: adding a new test module that executes notebooks, which is the core objective reflected across the workflow, Makefile, and test file changes.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
tests/test_notebook_execution.py (1)

27-54: Solid implementation with proper error handling and cleanup.

The function correctly handles temporary files and preserves exception context. One minor suggestion: consider making the timeout configurable via an environment variable (e.g., NOTEBOOK_TIMEOUT) to allow easier adjustment in different CI environments without code changes.
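
One possible shape for that suggestion, assuming an environment variable named NOTEBOOK_TIMEOUT (not part of this PR):

import os

# Fall back to 300 seconds when NOTEBOOK_TIMEOUT is unset; CI can override it per job.
DEFAULT_NOTEBOOK_TIMEOUT = int(os.environ.get("NOTEBOOK_TIMEOUT", "300"))

def execute_single_notebook(notebook_path, timeout: int = DEFAULT_NOTEBOOK_TIMEOUT) -> bool:
    ...  # unchanged papermill execution logic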

.github/workflows/execute-all-notebooks.yml (1)

92-95: Simplify pytest invocation by running from repository root.

Instead of changing to the tests directory, run pytest from the repository root to maintain consistency with typical pytest usage patterns.

Apply this diff:

       - name: Execute All Notebooks via Pytest
         run: |
-          cd tests
-          python -m pytest test_notebook_execution.py -v --tb=short
+          python -m pytest tests/test_notebook_execution.py -v --tb=short
Makefile (1)

1-1: Optional: Consider adding conventional Makefile targets for broader compatibility.

While not required, adding conventional targets like all, clean, and test (as an alias to test-all) can improve compatibility with standard Make conventions and developer expectations.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96e05f7 and 770fec6.

📒 Files selected for processing (3)
  • .github/workflows/execute-all-notebooks.yml (2 hunks)
  • Makefile (2 hunks)
  • tests/test_notebook_execution.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_notebook_execution.py (1)
tests/conftest.py (1)
  • get_notebook_files (11-19)
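
For context, get_notebook_files is presumably a small discovery helper along these lines (a sketch only; the real tests/conftest.py is not shown in this review):

from pathlib import Path

def get_notebook_files(notebooks_dir: Path = Path("notebooks")) -> list[Path]:
    """Return every .ipynb under notebooks/, skipping checkpoint copies."""
    return sorted(
        p for p in notebooks_dir.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )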
🪛 checkmake (0.2.2)
Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Summary
🔇 Additional comments (9)
tests/test_notebook_execution.py (5)

1-16: LGTM!

The imports and module structure are well-organized and the docstring clearly describes the module's purpose.


18-24: LGTM!

The test parameters are appropriate for CI/CD execution with test_mode and quick_run enabled.


70-77: LGTM!

Good validation test that ensures the notebooks directory is properly configured before executing notebooks.


80-95: LGTM!

The CLI interface is a nice addition for local testing and debugging of individual notebooks.


57-67: Redundant assertion after function that raises on failure.

Since execute_single_notebook() raises an exception on failure (line 51) and doesn't return False, the assertion on line 67 is redundant. The test will fail from the raised exception if the notebook execution fails.

Apply this diff:

     # Execute the notebook
-    success = execute_single_notebook(notebook_path, timeout=300)
-    assert success, f"Failed to execute notebook: {notebook_path}"
+    execute_single_notebook(notebook_path, timeout=300)

Likely an incorrect or invalid review comment.

.github/workflows/execute-all-notebooks.yml (2)

14-18: LGTM!

The pull request trigger is properly configured with appropriate event types and path filtering to run only when notebooks or the workflow itself changes.


78-85: LGTM!

Modern action versions and Python 3.12 with pip caching configured properly.

Makefile (2)

1-1: LGTM!

The PHONY declaration correctly includes all the new test targets.


33-44: LGTM!

The new test targets are well-structured with clear dependencies and helpful progress messages. The composite test-notebooks target appropriately chains formatting, parameter validation, and execution tests.

Comment on lines +86 to +91
- name: Install Dependencies
run: |
pip install -r requirements-dev.txt
pip install papermill ipykernel jupyter
ipython kernel install --name "python3" --user

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Potential dependency version conflicts from dual installation.

Installing from requirements-dev.txt and then separately installing papermill ipykernel jupyter could lead to version conflicts if these packages are already specified in requirements-dev.txt. Consider consolidating all dependencies into requirements-dev.txt to ensure consistent versions.


🏁 Script executed:

#!/bin/bash
# Check if papermill, ipykernel, or jupyter are already in requirements-dev.txt

echo "=== Checking requirements-dev.txt for potential conflicts ==="
if [ -f requirements-dev.txt ]; then
  echo "File exists. Checking for papermill, ipykernel, jupyter..."
  grep -E "(papermill|ipykernel|jupyter)" requirements-dev.txt || echo "None found in requirements-dev.txt"
else
  echo "requirements-dev.txt not found"
fi

Length of output: 130


🏁 Script executed:

#!/bin/bash
# Get full requirements-dev.txt and check papermill line + context
echo "=== Full requirements-dev.txt content ==="
cat requirements-dev.txt

echo ""
echo "=== Workflow file lines 86-95 for context ==="
sed -n '86,95p' .github/workflows/execute-all-notebooks.yml

Length of output: 482


Remove redundant papermill installation to ensure consistent dependency resolution.

Papermill is already specified in requirements-dev.txt but is reinstalled on line 89 alongside ipykernel and jupyter. This dual installation can cause version conflicts if requirements-dev.txt pins a specific version of papermill. Instead, add ipykernel and jupyter to requirements-dev.txt and remove the redundant pip install papermill ipykernel jupyter line.

🤖 Prompt for AI Agents
In .github/workflows/execute-all-notebooks.yml around lines 86 to 91, the
workflow redundantly reinstalls papermill (via pip install papermill ipykernel
jupyter) even though papermill is already pinned in requirements-dev.txt; remove
the extra pip install line and instead add ipykernel and jupyter to
requirements-dev.txt, then keep only pip install -r requirements-dev.txt
followed by the ipython kernel install --name "python3" --user step so all
dependencies (including papermill, ipykernel, jupyter) are installed from the
single requirements file.

import tempfile
from pathlib import Path

import papermill as pm

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify papermill dependency is declared in requirements.

Ensure that papermill and its required dependencies (ipykernel) are declared in the project's requirements files.


🏁 Script executed:

#!/bin/bash
# Check if papermill is declared in requirements files

echo "=== Checking for papermill in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "papermill" {} \;

echo ""
echo "=== Checking for ipykernel in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "ipykernel" {} \;

echo ""
echo "=== Listing all requirements files ==="
fd -t f "requirements.*\.txt$"

Length of output: 343


I need to verify whether ipykernel is actually a required dependency and check the project's own dependency declaration files:


🏁 Script executed:

#!/bin/bash
# Check for setup.py and pyproject.toml in the project root

echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1

echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1

echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txt

Length of output: 170


Let me re-run the script with proper repository access:


🏁 Script executed:

#!/bin/bash
# Check for setup.py and pyproject.toml in the project root

echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1

echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1

echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txt

echo ""
echo "=== Checking if ipykernel is a transitive dependency of papermill ==="
grep -E "^(jupyter|nbclient|ipykernel)" requirements-dev.txt || echo "Not explicitly listed in requirements-dev.txt"

Length of output: 290


Let me check the pyproject.toml and verify papermill's dependencies:


🏁 Script executed:

#!/bin/bash
# Check pyproject.toml contents and look for dependency sections

echo "=== Contents of pyproject.toml ==="
cat pyproject.toml

echo ""
echo "=== Checking test dependencies in pyproject.toml ==="
grep -A 20 "\[project.optional-dependencies\]" pyproject.toml || echo "No optional dependencies found"

Length of output: 1013



🌐 Web query:

papermill package dependencies ipykernel

💡 Result:

Short answer

  • papermill itself (the core package) does not strictly declare ipykernel as a direct dependency on PyPI, but it requires a Jupyter kernel to execute notebooks — so you must have an appropriate kernel (usually provided by ipykernel) available in the environment that will run the notebook. [1][2]

Details / recommended installs

  • If you run notebooks with papermill, install ipykernel in the execution environment (or ensure the notebook's kernelspec points to an environment that exposes a kernel). Example: pip/conda install ipykernel and register the kernel. [2][6]
  • If you use the Apache Airflow papermill provider, that provider does declare ipykernel as a dependency (ipykernel >= 6.29.4). So installing apache-airflow-providers-papermill will require ipykernel >= 6.29.4. [3][4]

Useful references

  • papermill on PyPI (install info): papermill package. [1]
  • Papermill troubleshooting (NoSuchKernel / advice to install ipykernel or register kernels): Papermill docs. [2]
  • Apache Airflow provider requirements (lists ipykernel>=6.29.4): apache-airflow-providers-papermill docs/PyPI. [3][4]

Now let me check what the test file actually does to determine if ipykernel is needed:


🏁 Script executed:

#!/bin/bash
# Check what the test file actually does

echo "=== Contents of tests/test_notebook_execution.py ==="
cat tests/test_notebook_execution.py

Length of output: 3008


Add ipykernel to requirements-dev.txt

The test file executes notebooks using pm.execute_notebook() (lines 48-54), which requires a Jupyter kernel to be available at runtime. While papermill itself doesn't declare ipykernel as a strict dependency, ipykernel is necessary for notebook execution in this test environment. Add ipykernel to requirements-dev.txt alongside papermill to ensure tests can execute successfully.

🤖 Prompt for AI Agents
In tests/test_notebook_execution.py around line 12, the test uses papermill to
execute notebooks but the development requirements are missing ipykernel; update
requirements-dev.txt to include ipykernel (e.g., add a new line with "ipykernel"
next to "papermill") so the Jupyter kernel is available at test runtime, then
re-run tests to confirm notebooks execute successfully.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/execute-all-notebooks.yml (1)

86-90: Redundant dependency installation can cause version conflicts.

Line 88 installs from requirements-dev.txt, then line 89 separately reinstalls papermill, ipykernel, and jupyter. If these packages are already pinned in requirements-dev.txt, the redundant pip install can override those versions and create conflicts. Consolidate all dependencies into requirements-dev.txt and remove the redundant line 89 to ensure consistent, single-source-of-truth dependency resolution.

Apply this diff to resolve the redundancy:

       - name: Install Dependencies
         run: |
           pip install -r requirements-dev.txt
-          pip install papermill ipykernel jupyter
           ipython kernel install --name "python3" --user

Then ensure requirements-dev.txt includes papermill, ipykernel, and jupyter with appropriate version pins.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 770fec6 and 3467502.

📒 Files selected for processing (1)
  • .github/workflows/execute-all-notebooks.yml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/execute-all-notebooks.yml (3)

14-18: PR trigger and path filters look appropriate.

The pull_request trigger with type filters and path-based filtering for notebooks and the workflow file itself enables CI execution on relevant changes while avoiding unnecessary runs.


80-84: Python setup and caching configuration looks good.

Python 3.12 with pip cache is appropriately configured for the EC2 runner.


92-95: Test execution command and path are clear.

The pytest invocation in the tests directory targeting test_notebook_execution.py with verbose output and short traceback is appropriate for notebook execution testing.

run: |
echo "hello. i'm running from inside of EC2 instance ${{ needs.launch-ec2-runner.outputs.ec2-instance-id }}"
- uses: actions/checkout@v5

⚠️ Potential issue | 🟡 Minor

Checkout action version inconsistency.

Line 78 uses actions/checkout@v5, but line 40 uses actions/checkout@v4. Maintain consistency across the workflow by upgrading line 40 to @v5 or downgrading line 78 to @v4 (preferably keep both on the latest stable version).

🤖 Prompt for AI Agents
.github/workflows/execute-all-notebooks.yml around lines 40 and 78: the workflow
uses mixed checkout action versions (line 40 is actions/checkout@v4 while line
78 is actions/checkout@v5); make them consistent by updating the older reference
to the preferred stable version (recommend upgrading the @v4 usage at line 40 to
actions/checkout@v5) or alternatively downgrade line 78 to @v4—ensure both
checkout steps reference the same tag and run the workflow to verify no breaking
changes.

@courtneypacheco courtneypacheco changed the base branch from main to aws-test November 17, 2025 16:35
@courtneypacheco courtneypacheco (Collaborator) left a comment

For testing

@courtneypacheco courtneypacheco merged commit aff21c8 into opendatahub-io:aws-test Nov 17, 2025
4 of 6 checks passed