Add test to execute notebooks #58

shruthis4 · 2025-11-17T15:54:08Z

Description

How Has This Been Tested?

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

Tests
- Notebooks are automatically executed and validated (with timeout handling, cleanup, and a CLI entry) and include checks that notebook directories and files exist.
Chores
- CI workflow now runs notebook-related checks on PRs and push events.
- Build system updated with new composite targets to run notebook formatting, parameter checks, execution tests, and full test suites.

- Updated Makefile to include new test targets for notebook execution and overall testing.
- Added a new test script for executing notebooks using papermill, ensuring they run without errors.
- Modified GitHub Actions workflow to set up Python, install dependencies, and execute all notebooks as part of the CI process.

coderabbitai · 2025-11-17T15:54:17Z

Walkthrough

The pull request adds CI notebook execution: a GitHub Actions workflow now triggers on PRs touching notebooks and the workflow file, runs setup steps on an EC2 runner (checkout, Python 3.12, deps, kernel registration) and executes pytest; the Makefile gains notebook test targets; a new pytest module runs notebooks via papermill with timeout and cleanup.

Changes

Cohort / File(s)	Summary
CI/CD Workflow Automation `.github/workflows/execute-all-notebooks.yml`	Added `pull_request` trigger scoped to `notebooks/*/.ipynb` and the workflow file; preserved `push`; inserted EC2-run path setup steps: `actions/checkout@v5`, `actions/setup-python@v6` (Python 3.12 + pip cache), install `requirements-dev.txt` plus `papermill`, `ipykernel`, `jupyter`, register IPython kernel, then run `pytest` for notebook verification.
Test Target Orchestration `Makefile`	Added `ALL_NOTEBOOKS` and `ALL_PYTHON_FILES` variables; extended `.PHONY` with `test-notebook-execution`, `test-notebooks`, `test-all`; added targets `test-notebook-execution` (runs pytest tests/test_notebook_execution.py), `test-notebooks` (composite: format/params/execution), and `test-all` (runs all tests), with progress echo messages.
Notebook Execution Testing `tests/test_notebook_execution.py`	New test module using `papermill` to execute notebooks discovered by `get_notebook_files`; adds `get_test_parameters()`, `execute_single_notebook(notebook_path: Path, timeout: int = 300) -> bool`, parameterized `test_notebook_executes_without_error()` (skips backup/checkpoint files), `test_notebooks_directory_exists()`, and a CLI-style main block for direct invocation.
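The notebook execution helper summarized in the last row can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `executor` parameter, the `is_backup_or_checkpoint` helper, and the exact skip rules are assumptions added so the sketch can run without papermill installed. It assumes papermill's `execute_notebook` accepts `execution_timeout` as a per-cell timeout in seconds.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def is_backup_or_checkpoint(notebook_path: Path) -> bool:
    """Return True for files the test is described as skipping."""
    # Assumption: "backup/checkpoint" means Jupyter's .ipynb_checkpoints
    # directories and file names containing "-checkpoint" or "backup".
    name = notebook_path.name.lower()
    return (
        ".ipynb_checkpoints" in notebook_path.parts
        or "-checkpoint" in name
        or "backup" in name
    )


def execute_single_notebook(
    notebook_path: Path,
    timeout: int = 300,
    parameters: Optional[dict] = None,
    executor: Optional[Callable[..., object]] = None,  # hypothetical hook
) -> bool:
    """Run one notebook into a temporary output file, then clean it up."""
    if executor is None:
        # Imported lazily so the helper can be exercised without papermill.
        import papermill as pm

        executor = pm.execute_notebook

    # Write the executed copy to a temp file so the source stays untouched.
    with tempfile.NamedTemporaryFile(suffix=".ipynb", delete=False) as tmp:
        output_path = Path(tmp.name)
    try:
        executor(
            str(notebook_path),
            str(output_path),
            parameters=parameters or {},
            execution_timeout=timeout,  # seconds allowed per cell
        )
        return True
    except Exception as exc:
        # Chain the original exception so its context is preserved.
        raise RuntimeError(f"Failed to execute notebook: {notebook_path}") from exc
    finally:
        # Always remove the temporary output, on success or failure.
        output_path.unlink(missing_ok=True)
```

Because the helper raises on failure, a caller only ever sees `True` or an exception, which is the behavior the later review comment about the redundant assertion refers to.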

Sequence Diagram(s)

sequenceDiagram
    participant PR as Pull Request
    participant GH as GitHub Actions
    participant EC2 as EC2 Runner
    participant OS as Runner OS
    participant PY as Python Setup
    participant TEST as pytest / Papermill
    participant NB as Notebook Files

    PR->>GH: open/update PR touching notebooks/** or workflow
    GH->>EC2: dispatch job (execute-all-notebooks)
    EC2->>OS: run pre-setup steps
    EC2->>PY: actions/checkout@v5
    EC2->>PY: actions/setup-python@v6 (3.12 + pip cache)
    EC2->>PY: install requirements-dev.txt, papermill, ipykernel, jupyter
    EC2->>PY: register IPython kernel
    EC2->>TEST: run pytest tests/test_notebook_execution.py
    TEST->>NB: for each notebook from discovery
    NB->>TEST: executed via papermill (with params, timeout)
    alt success
        TEST-->>EC2: pass
    else failure / timeout
        TEST-->>EC2: fail (error reported)
    end
    EC2-->>GH: post job results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Review points:
- GitHub Actions: verify trigger filters, runner selection, setup-python inputs, and dependency install steps.
- Makefile: ensure targets and variable expansions integrate with existing targets and CI usage.
- tests/test_notebook_execution.py: inspect notebook discovery, papermill invocation, timeout/cleanup handling, and CLI block.

Poem

🐰
Notebooks wake and softly hum,
Papermill hops, the tests will run,
Kernels bloom in Python light,
CI cheers on each notebook flight,
A rabbit hops — automation done.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Add test to execute notebooks' directly and concisely describes the main change: adding a new test module that executes notebooks, which is the core objective reflected across the workflow, Makefile, and test file changes.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

tests/test_notebook_execution.py (1)

27-54: Solid implementation with proper error handling and cleanup.

The function correctly handles temporary files and preserves exception context. One minor suggestion: consider making the timeout configurable via an environment variable (e.g., NOTEBOOK_TIMEOUT) to allow easier adjustment in different CI environments without code changes.
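One way to make the timeout configurable is sketched below. The environment variable name `NOTEBOOK_TIMEOUT` comes from the suggestion above; the helper name `get_notebook_timeout` and the fallback behavior for invalid values are assumptions, not part of the PR.

```python
import os

DEFAULT_TIMEOUT = 300  # seconds, matching the default in the test module


def get_notebook_timeout(default: int = DEFAULT_TIMEOUT) -> int:
    """Read the timeout from NOTEBOOK_TIMEOUT, falling back to the default."""
    raw = os.environ.get("NOTEBOOK_TIMEOUT")
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw)
    except ValueError:
        # Assumption: an unparsable value falls back rather than failing CI.
        return default
    # Non-positive timeouts make no sense here, so fall back as well.
    return value if value > 0 else default
```

A test could then call `execute_single_notebook(notebook_path, timeout=get_notebook_timeout())`, and a slower CI environment could export `NOTEBOOK_TIMEOUT=900` without a code change.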
.github/workflows/execute-all-notebooks.yml (1)
92-95: Simplify pytest invocation by running from repository root.

Instead of changing to the tests directory, run pytest from the repository root to maintain consistency with typical pytest usage patterns.

Apply this diff:
       - name: Execute All Notebooks via Pytest
         run: |
-          cd tests
-          python -m pytest test_notebook_execution.py -v --tb=short
+          python -m pytest tests/test_notebook_execution.py -v --tb=short
Makefile (1)

1-1: Optional: Consider adding conventional Makefile targets for broader compatibility.

While not required, adding conventional targets like all, clean, and test (as an alias to test-all) can improve compatibility with standard Make conventions and developer expectations.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96e05f7 and 770fec6.

📒 Files selected for processing (3)

.github/workflows/execute-all-notebooks.yml (2 hunks)
Makefile (2 hunks)
tests/test_notebook_execution.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/test_notebook_execution.py (1)

tests/conftest.py (1)

get_notebook_files (11-19)

🪛 checkmake (0.2.2)

Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)

[warning] 1-1: Missing required phony target "clean"

(minphony)

[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Summary

🔇 Additional comments (9)

tests/test_notebook_execution.py (5)
1-16: LGTM!

The imports and module structure are well-organized and the docstring clearly describes the module's purpose.

18-24: LGTM!

The test parameters are appropriate for CI/CD execution with test_mode and quick_run enabled.

70-77: LGTM!

Good validation test that ensures the notebooks directory is properly configured before executing notebooks.

80-95: LGTM!

The CLI interface is a nice addition for local testing and debugging of individual notebooks.

57-67: Redundant assertion after function that raises on failure.

Since execute_single_notebook() raises an exception on failure (line 51) and doesn't return False, the assertion on line 67 is redundant. The test will fail from the raised exception if the notebook execution fails.

Apply this diff:
     # Execute the notebook
-    success = execute_single_notebook(notebook_path, timeout=300)
-    assert success, f"Failed to execute notebook: {notebook_path}"
+    execute_single_notebook(notebook_path, timeout=300)
Likely an incorrect or invalid review comment.
.github/workflows/execute-all-notebooks.yml (2)

14-18: LGTM!

The pull request trigger is properly configured with appropriate event types and path filtering to run only when notebooks or the workflow itself changes.

78-85: LGTM!

Modern action versions and Python 3.12 with pip caching configured properly.

Makefile (2)

1-1: LGTM!

The PHONY declaration correctly includes all the new test targets.

33-44: LGTM!

The new test targets are well-structured with clear dependencies and helpful progress messages. The composite test-notebooks target appropriately chains formatting, parameter validation, and execution tests.

coderabbitai · 2025-11-17T15:59:45Z

.github/workflows/execute-all-notebooks.yml

+      - name: Install Dependencies
+        run: |
+          pip install -r requirements-dev.txt
+          pip install papermill ipykernel jupyter
+          ipython kernel install --name "python3" --user
+


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Potential dependency version conflicts from dual installation.

Installing from requirements-dev.txt and then separately installing papermill ipykernel jupyter could lead to version conflicts if these packages are already specified in requirements-dev.txt. Consider consolidating all dependencies into requirements-dev.txt to ensure consistent versions.

🏁 Script executed:

#!/bin/bash
# Check if papermill, ipykernel, or jupyter are already in requirements-dev.txt
echo "=== Checking requirements-dev.txt for potential conflicts ==="
if [ -f requirements-dev.txt ]; then
  echo "File exists. Checking for papermill, ipykernel, jupyter..."
  grep -E "(papermill|ipykernel|jupyter)" requirements-dev.txt || echo "None found in requirements-dev.txt"
else
  echo "requirements-dev.txt not found"
fi

Length of output: 130

🏁 Script executed:

#!/bin/bash
# Get full requirements-dev.txt and check papermill line + context
echo "=== Full requirements-dev.txt content ==="
cat requirements-dev.txt
echo ""
echo "=== Workflow file lines 86-95 for context ==="
sed -n '86,95p' .github/workflows/execute-all-notebooks.yml

Length of output: 482

Remove redundant papermill installation to ensure consistent dependency resolution.

Papermill is already specified in requirements-dev.txt but is reinstalled on line 89 alongside ipykernel and jupyter. This dual installation can cause version conflicts if requirements-dev.txt pins a specific version of papermill. Instead, add ipykernel and jupyter to requirements-dev.txt and remove the redundant pip install papermill ipykernel jupyter line.

🤖 Prompt for AI Agents

In .github/workflows/execute-all-notebooks.yml around lines 86 to 91, the workflow redundantly reinstalls papermill (via pip install papermill ipykernel jupyter) even though papermill is already pinned in requirements-dev.txt; remove the extra pip install line and instead add ipykernel and jupyter to requirements-dev.txt, then keep only pip install -r requirements-dev.txt followed by the ipython kernel install --name "python3" --user step so all dependencies (including papermill, ipykernel, jupyter) are installed from the single requirements file.
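The overlap the review describes can also be checked mechanically. The sketch below is an illustration only: the function names are hypothetical, and the parsing covers simple `name==version` style lines rather than the full requirements file syntax.

```python
import re


def _normalize(name: str) -> str:
    """Normalize a package name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text: str) -> set:
    """Collect package names from the text of a requirements file."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(_normalize(match.group(0)))
    return names


def already_declared(requirements_text: str, extra_packages: list) -> list:
    """Return the extra packages that the requirements file already lists."""
    declared = requirement_names(requirements_text)
    return [pkg for pkg in extra_packages if _normalize(pkg) in declared]


# Hypothetical file contents for illustration; the real pins are not shown here.
sample = "pytest>=8\npapermill==2.6.0  # pinned\n-r base.txt\n"
print(already_declared(sample, ["papermill", "ipykernel", "jupyter"]))
```

Any package the function returns is installed twice by the workflow step, so the second `pip install` could replace a pinned version.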

coderabbitai · 2025-11-17T15:59:46Z

tests/test_notebook_execution.py

+import tempfile
+from pathlib import Path
+
+import papermill as pm


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify papermill dependency is declared in requirements.

Ensure that papermill and its required dependencies (ipykernel) are declared in the project's requirements files.

🏁 Script executed:

#!/bin/bash
# Check if papermill is declared in requirements files
echo "=== Checking for papermill in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "papermill" {} \;
echo ""
echo "=== Checking for ipykernel in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "ipykernel" {} \;
echo ""
echo "=== Listing all requirements files ==="
fd -t f "requirements.*\.txt$"

Length of output: 343

I need to verify whether ipykernel is actually a required dependency and check for other dependency declaration methods.

Now let me check if there are other dependency declaration files in the repo:

Let me check the project's own dependency declarations:

🏁 Script executed:

#!/bin/bash
# Check for setup.py and pyproject.toml in the project root
echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1
echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1
echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txt

Length of output: 170

Let me re-run the script with proper repository access:

🏁 Script executed:

#!/bin/bash
# Check for setup.py and pyproject.toml in the project root
echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1
echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1
echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txt
echo ""
echo "=== Checking if ipykernel is a transitive dependency of papermill ==="
grep -E "^(jupyter|nbclient|ipykernel)" requirements-dev.txt || echo "Not explicitly listed in requirements-dev.txt"

Length of output: 290

Let me check the pyproject.toml and verify papermill's dependencies:

🏁 Script executed:

#!/bin/bash
# Check pyproject.toml contents and look for dependency sections
echo "=== Contents of pyproject.toml ==="
cat pyproject.toml
echo ""
echo "=== Checking test dependencies in pyproject.toml ==="
grep -A 20 "\[project.optional-dependencies\]" pyproject.toml || echo "No optional dependencies found"

Length of output: 1013

🌐 Web query:

papermill package dependencies ipykernel

💡 Result:

Short answer

papermill itself (the core package) does not strictly declare ipykernel as a direct dependency on PyPI, but it requires a Jupyter kernel to execute notebooks — so you must have an appropriate kernel (usually provided by ipykernel) available in the environment that will run the notebook. [1][2]

Details / recommended installs

If you run notebooks with papermill, install ipykernel in the execution environment (or ensure the notebook's kernelspec points to an environment that exposes a kernel). Example: pip/conda install ipykernel and register the kernel. [2][6]

If you use the Apache Airflow papermill provider, that provider does declare ipykernel as a dependency (ipykernel >= 6.29.4). So installing apache-airflow-providers-papermill will require ipykernel >= 6.29.4. [3][4]

Useful references

papermill on PyPI (install info): papermill package. [1]

Papermill troubleshooting (NoSuchKernel / advice to install ipykernel or register kernels): Papermill docs. [2]

Apache Airflow provider requirements (lists ipykernel>=6.29.4): apache-airflow-providers-papermill docs/PyPI. [3][4]

Now let me check what the test file actually does to determine if ipykernel is needed:

🏁 Script executed:

#!/bin/bash
# Check what the test file actually does
echo "=== Contents of tests/test_notebook_execution.py ==="
cat tests/test_notebook_execution.py

Length of output: 3008

Add ipykernel to requirements-dev.txt

The test file executes notebooks using pm.execute_notebook() (line 48-54), which requires a Jupyter kernel to be available at runtime. While papermill itself doesn't declare ipykernel as a strict dependency, ipykernel is necessary for notebook execution in this test environment. Add ipykernel to requirements-dev.txt alongside papermill to ensure tests can execute successfully.

🤖 Prompt for AI Agents

In tests/test_notebook_execution.py around line 12, the test uses papermill to execute notebooks but the development requirements are missing ipykernel; update requirements-dev.txt to include ipykernel (e.g., add a new line with "ipykernel" next to "papermill") so the Jupyter kernel is available at test runtime, then re-run tests to confirm notebooks execute successfully.
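A cheap preflight check can surface a missing kernel package before any notebook runs. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical. Note the limit of this check: it confirms that a package is importable, not that a kernelspec named `python3` has been registered.

```python
import importlib.util


def missing_modules(module_names: list) -> list:
    """Return the top-level modules that cannot be found in this environment."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Modules the notebook execution tests are described as needing at runtime.
required = ["papermill", "ipykernel"]
absent = missing_modules(required)
if absent:
    print(f"Missing modules for notebook execution: {', '.join(absent)}")
```

In a pytest module, the same list could feed a `pytest.skip` or an early assertion with a clear message, rather than letting papermill fail later with a kernel lookup error.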

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

.github/workflows/execute-all-notebooks.yml (1)
86-90: Redundant dependency installation can cause version conflicts.

Line 88 installs from requirements-dev.txt, then line 89 separately reinstalls papermill, ipykernel, and jupyter. If these packages are already pinned in requirements-dev.txt, the redundant pip install can override those versions and create conflicts. Consolidate all dependencies into requirements-dev.txt and remove the redundant line 89 to ensure consistent, single-source-of-truth dependency resolution.

Apply this diff to resolve the redundancy:
       - name: Install Dependencies
         run: |
           pip install -r requirements-dev.txt
-          pip install papermill ipykernel jupyter
           ipython kernel install --name "python3" --user
Then ensure requirements-dev.txt includes papermill, ipykernel, and jupyter with appropriate version pins.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 770fec6 and 3467502.

📒 Files selected for processing (1)

.github/workflows/execute-all-notebooks.yml (2 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Summary

🔇 Additional comments (3)

.github/workflows/execute-all-notebooks.yml (3)

14-18: PR trigger and path filters look appropriate.

The pull_request trigger with type filters and path-based filtering for notebooks and the workflow file itself enables CI execution on relevant changes while avoiding unnecessary runs.

80-84: Python setup and caching configuration looks good.

Python 3.12 with pip cache is appropriately configured for the EC2 runner.

92-95: Test execution command and path are clear.

The pytest invocation in the tests directory targeting test_notebook_execution.py with verbose output and short traceback is appropriate for notebook execution testing.

coderabbitai · 2025-11-17T16:34:20Z

.github/workflows/execute-all-notebooks.yml

        run: |
          echo "hello. i'm running from inside of EC2 instance ${{ needs.launch-ec2-runner.outputs.ec2-instance-id }}"

+      - uses: actions/checkout@v5


⚠️ Potential issue | 🟡 Minor

Checkout action version inconsistency.

Line 78 uses actions/checkout@v5, but line 40 uses actions/checkout@v4. Maintain consistency across the workflow by upgrading line 40 to @v5 or downgrading line 78 to @v4 (preferably keep both on the latest stable version).

🤖 Prompt for AI Agents

.github/workflows/execute-all-notebooks.yml around lines 40 and 78: the workflow uses mixed checkout action versions (line 40 is actions/checkout@v4 while line 78 is actions/checkout@v5); make them consistent by updating the older reference to the preferred stable version (recommend upgrading the @v4 usage at line 40 to actions/checkout@v5) or alternatively downgrade line 78 to @v4—ensure both checkout steps reference the same tag and run the workflow to verify no breaking changes.
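Mixed action versions like this can be detected with a small script. The sketch below is an illustration under simple assumptions: it scans the workflow text with a regular expression instead of parsing YAML, so commented-out `uses:` lines are counted too, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches entries such as "uses: actions/checkout@v5".
USES_PATTERN = re.compile(r"uses:\s*([\w.-]+/[\w./-]+)@(\S+)")


def action_versions(workflow_text: str) -> dict:
    """Map each action name to the set of version refs it is used with."""
    versions = defaultdict(set)
    for match in USES_PATTERN.finditer(workflow_text):
        versions[match.group(1)].add(match.group(2))
    return dict(versions)


def inconsistent_actions(workflow_text: str) -> dict:
    """Return only the actions referenced with more than one version."""
    return {
        action: sorted(refs)
        for action, refs in action_versions(workflow_text).items()
        if len(refs) > 1
    }


# Minimal stand-in for the workflow described in the review comment.
sample_workflow = """
      - uses: actions/checkout@v4
      - uses: actions/checkout@v5
      - uses: actions/setup-python@v6
"""
print(inconsistent_actions(sample_workflow))
```

An empty result means every action is referenced with a single version throughout the file.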

courtneypacheco

For testing

shruthis4 added 2 commits November 17, 2025 10:36

Modify to run the workflow on pull request

770fec6

shruthis4 requested a review from a team as a code owner November 17, 2025 15:54

coderabbitai bot reviewed Nov 17, 2025

Update GitHub Actions workflow to trigger on push events for notebook…

3467502

… execution

coderabbitai bot reviewed Nov 17, 2025

courtneypacheco changed the base branch from main to aws-test November 17, 2025 16:35

courtneypacheco approved these changes Nov 17, 2025

courtneypacheco merged commit aff21c8 into opendatahub-io:aws-test Nov 17, 2025
4 of 6 checks passed
