-
Notifications
You must be signed in to change notification settings - Fork 8
Add test to execute notebooks #58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Updated Makefile to include new test targets for notebook execution and overall testing. - Added a new test script for executing notebooks using papermill, ensuring they run without errors. - Modified GitHub Actions workflow to set up Python, install dependencies, and execute all notebooks as part of the CI process.
WalkthroughThe pull request adds CI notebook execution: a GitHub Actions workflow now triggers on PRs touching notebooks and the workflow file, runs setup steps on an EC2 runner (checkout, Python 3.12, deps, kernel registration) and executes pytest; the Makefile gains notebook test targets; a new pytest module runs notebooks via papermill with timeout and cleanup. Changes
Sequence Diagram(s)sequenceDiagram
participant PR as Pull Request
participant GH as GitHub Actions
participant EC2 as EC2 Runner
participant OS as Runner OS
participant PY as Python Setup
participant TEST as pytest / Papermill
participant NB as Notebook Files
PR->>GH: open/update PR touching notebooks/** or workflow
GH->>EC2: dispatch job (execute-all-notebooks)
EC2->>OS: run pre-setup steps
EC2->>PY: actions/checkout@v5
EC2->>PY: actions/setup-python@v6 (3.12 + pip cache)
EC2->>PY: install requirements-dev.txt, papermill, ipykernel, jupyter
EC2->>PY: register IPython kernel
EC2->>TEST: run pytest tests/test_notebook_execution.py
TEST->>NB: for each notebook from discovery
NB->>TEST: executed via papermill (with params, timeout)
alt success
TEST-->>EC2: pass
else failure / timeout
TEST-->>EC2: fail (error reported)
end
EC2-->>GH: post job results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
tests/test_notebook_execution.py (1)
27-54: Solid implementation with proper error handling and cleanup.The function correctly handles temporary files and preserves exception context. One minor suggestion: consider making the timeout configurable via an environment variable (e.g.,
NOTEBOOK_TIMEOUT) to allow easier adjustment in different CI environments without code changes..github/workflows/execute-all-notebooks.yml (1)
92-95: Simplify pytest invocation by running from repository root.Instead of changing to the tests directory, run pytest from the repository root to maintain consistency with typical pytest usage patterns.
Apply this diff:
- name: Execute All Notebooks via Pytest run: | - cd tests - python -m pytest test_notebook_execution.py -v --tb=short + python -m pytest tests/test_notebook_execution.py -v --tb=shortMakefile (1)
1-1: Optional: Consider adding conventional Makefile targets for broader compatibility.While not required, adding conventional targets like
all,clean, andtest(as an alias totest-all) can improve compatibility with standard Make conventions and developer expectations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/execute-all-notebooks.yml(2 hunks)Makefile(2 hunks)tests/test_notebook_execution.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_notebook_execution.py (1)
tests/conftest.py (1)
get_notebook_files(11-19)
🪛 checkmake (0.2.2)
Makefile
[warning] 1-1: Missing required phony target "all"
(minphony)
[warning] 1-1: Missing required phony target "clean"
(minphony)
[warning] 1-1: Missing required phony target "test"
(minphony)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Summary
🔇 Additional comments (9)
tests/test_notebook_execution.py (5)
1-16: LGTM!The imports and module structure are well-organized and the docstring clearly describes the module's purpose.
18-24: LGTM!The test parameters are appropriate for CI/CD execution with test_mode and quick_run enabled.
70-77: LGTM!Good validation test that ensures the notebooks directory is properly configured before executing notebooks.
80-95: LGTM!The CLI interface is a nice addition for local testing and debugging of individual notebooks.
57-67: Redundant assertion after function that raises on failure.Since
execute_single_notebook()raises an exception on failure (line 51) and doesn't return False, the assertion on line 67 is redundant. The test will fail from the raised exception if the notebook execution fails.Apply this diff:
# Execute the notebook - success = execute_single_notebook(notebook_path, timeout=300) - assert success, f"Failed to execute notebook: {notebook_path}" + execute_single_notebook(notebook_path, timeout=300)Likely an incorrect or invalid review comment.
.github/workflows/execute-all-notebooks.yml (2)
14-18: LGTM!The pull request trigger is properly configured with appropriate event types and path filtering to run only when notebooks or the workflow itself changes.
78-85: LGTM!Modern action versions and Python 3.12 with pip caching configured properly.
Makefile (2)
1-1: LGTM!The PHONY declaration correctly includes all the new test targets.
33-44: LGTM!The new test targets are well-structured with clear dependencies and helpful progress messages. The composite
test-notebookstarget appropriately chains formatting, parameter validation, and execution tests.
| - name: Install Dependencies | ||
| run: | | ||
| pip install -r requirements-dev.txt | ||
| pip install papermill ipykernel jupyter | ||
| ipython kernel install --name "python3" --user | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Potential dependency version conflicts from dual installation.
Installing from requirements-dev.txt and then separately installing papermill ipykernel jupyter could lead to version conflicts if these packages are already specified in requirements-dev.txt. Consider consolidating all dependencies into requirements-dev.txt to ensure consistent versions.
🏁 Script executed:
#!/bin/bash
# Check if papermill, ipykernel, or jupyter are already in requirements-dev.txt
echo "=== Checking requirements-dev.txt for potential conflicts ==="
if [ -f requirements-dev.txt ]; then
echo "File exists. Checking for papermill, ipykernel, jupyter..."
grep -E "(papermill|ipykernel|jupyter)" requirements-dev.txt || echo "None found in requirements-dev.txt"
else
echo "requirements-dev.txt not found"
fiLength of output: 130
🏁 Script executed:
#!/bin/bash
# Get full requirements-dev.txt and check papermill line + context
echo "=== Full requirements-dev.txt content ==="
cat requirements-dev.txt
echo ""
echo "=== Workflow file lines 86-95 for context ==="
sed -n '86,95p' .github/workflows/execute-all-notebooks.ymlLength of output: 482
Remove redundant papermill installation to ensure consistent dependency resolution.
Papermill is already specified in requirements-dev.txt but is reinstalled on line 89 alongside ipykernel and jupyter. This dual installation can cause version conflicts if requirements-dev.txt pins a specific version of papermill. Instead, add ipykernel and jupyter to requirements-dev.txt and remove the redundant pip install papermill ipykernel jupyter line.
🤖 Prompt for AI Agents
In .github/workflows/execute-all-notebooks.yml around lines 86 to 91, the
workflow redundantly reinstalls papermill (via pip install papermill ipykernel
jupyter) even though papermill is already pinned in requirements-dev.txt; remove
the extra pip install line and instead add ipykernel and jupyter to
requirements-dev.txt, then keep only pip install -r requirements-dev.txt
followed by the ipython kernel install --name "python3" --user step so all
dependencies (including papermill, ipykernel, jupyter) are installed from the
single requirements file.
| import tempfile | ||
| from pathlib import Path | ||
|
|
||
| import papermill as pm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify papermill dependency is declared in requirements.
Ensure that papermill and its required dependencies (ipykernel) are declared in the project's requirements files.
🏁 Script executed:
#!/bin/bash
# Check if papermill is declared in requirements files
echo "=== Checking for papermill in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "papermill" {} \;
echo ""
echo "=== Checking for ipykernel in requirements files ==="
fd -t f "requirements.*\.txt$" --exec grep -H "ipykernel" {} \;
echo ""
echo "=== Listing all requirements files ==="
fd -t f "requirements.*\.txt$"Length of output: 343
I need to verify whether ipykernel is actually a required dependency and check for other dependency declaration methods.
Now let me check if there are other dependency declaration files in the repo:
Let me check the project's own dependency declarations:
🏁 Script executed:
#!/bin/bash
# Check for setup.py and pyproject.toml in the project root
echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1
echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1
echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txtLength of output: 170
Let me re-run the script with proper repository access:
🏁 Script executed:
#!/bin/bash
# Check for setup.py and pyproject.toml in the project root
echo "=== Checking for setup.py ==="
fd -t f "setup.py" --max-depth 1
echo ""
echo "=== Checking for pyproject.toml ==="
fd -t f "pyproject.toml" --max-depth 1
echo ""
echo "=== Checking contents of requirements-dev.txt ==="
cat requirements-dev.txt
echo ""
echo "=== Checking if ipykernel is a transitive dependency of papermill ==="
grep -E "^(jupyter|nbclient|ipykernel)" requirements-dev.txt || echo "Not explicitly listed in requirements-dev.txt"Length of output: 290
Let me check the pyproject.toml and verify papermill's dependencies:
🏁 Script executed:
#!/bin/bash
# Check pyproject.toml contents and look for dependency sections
echo "=== Contents of pyproject.toml ==="
cat pyproject.toml
echo ""
echo "=== Checking test dependencies in pyproject.toml ==="
grep -A 20 "\[project.optional-dependencies\]" pyproject.toml || echo "No optional dependencies found"Length of output: 1013
🌐 Web query:
papermill package dependencies ipykernel
💡 Result:
Short answer
- papermill itself (the core package) does not strictly declare ipykernel as a direct dependency on PyPI, but it requires a Jupyter kernel to execute notebooks — so you must have an appropriate kernel (usually provided by ipykernel) available in the environment that will run the notebook. [1][2]
Details / recommended installs
- If you run notebooks with papermill, install ipykernel in the execution environment (or ensure the notebook's kernelspec points to an environment that exposes a kernel). Example: pip/conda install ipykernel and register the kernel. [2][6]
- If you use the Apache Airflow papermill provider, that provider does declare ipykernel as a dependency (ipykernel >= 6.29.4). So installing apache-airflow-providers-papermill will require ipykernel >= 6.29.4. [3][4]
Useful references
- papermill on PyPI (install info): papermill package. [1]
- Papermill troubleshooting (NoSuchKernel / advice to install ipykernel or register kernels): Papermill docs. [2]
- Apache Airflow provider requirements (lists ipykernel>=6.29.4): apache-airflow-providers-papermill docs/PyPI. [3][4]
Now let me check what the test file actually does to determine if ipykernel is needed:
🏁 Script executed:
#!/bin/bash
# Check what the test file actually does
echo "=== Contents of tests/test_notebook_execution.py ==="
cat tests/test_notebook_execution.pyLength of output: 3008
Add ipykernel to requirements-dev.txt
The test file executes notebooks using pm.execute_notebook() (line 48-54), which requires a Jupyter kernel to be available at runtime. While papermill itself doesn't declare ipykernel as a strict dependency, ipykernel is necessary for notebook execution in this test environment. Add ipykernel to requirements-dev.txt alongside papermill to ensure tests can execute successfully.
🤖 Prompt for AI Agents
In tests/test_notebook_execution.py around line 12, the test uses papermill to
execute notebooks but the development requirements are missing ipykernel; update
requirements-dev.txt to include ipykernel (e.g., add a new line with "ipykernel"
next to "papermill") so the Jupyter kernel is available at test runtime, then
re-run tests to confirm notebooks execute successfully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
.github/workflows/execute-all-notebooks.yml (1)
86-90: Redundant dependency installation can cause version conflicts.Line 88 installs from
requirements-dev.txt, then line 89 separately reinstallspapermill,ipykernel, andjupyter. If these packages are already pinned inrequirements-dev.txt, the redundantpip installcan override those versions and create conflicts. Consolidate all dependencies intorequirements-dev.txtand remove the redundant line 89 to ensure consistent, single-source-of-truth dependency resolution.Apply this diff to resolve the redundancy:
- name: Install Dependencies run: | pip install -r requirements-dev.txt - pip install papermill ipykernel jupyter ipython kernel install --name "python3" --userThen ensure
requirements-dev.txtincludespapermill,ipykernel, andjupyterwith appropriate version pins.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/execute-all-notebooks.yml(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Summary
🔇 Additional comments (3)
.github/workflows/execute-all-notebooks.yml (3)
14-18: PR trigger and path filters look appropriate.The pull_request trigger with type filters and path-based filtering for notebooks and the workflow file itself enables CI execution on relevant changes while avoiding unnecessary runs.
80-84: Python setup and caching configuration looks good.Python 3.12 with pip cache is appropriately configured for the EC2 runner.
92-95: Test execution command and path are clear.The pytest invocation in the
testsdirectory targetingtest_notebook_execution.pywith verbose output and short traceback is appropriate for notebook execution testing.
| run: | | ||
| echo "hello. i'm running from inside of EC2 instance ${{ needs.launch-ec2-runner.outputs.ec2-instance-id }}" | ||
| - uses: actions/checkout@v5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkout action version inconsistency.
Line 78 uses actions/checkout@v5, but line 40 uses actions/checkout@v4. Maintain consistency across the workflow by upgrading line 78 to @v5 or downgrading line 40 to @v4 (preferably keep both on the latest stable version).
🤖 Prompt for AI Agents
.github/workflows/execute-all-notebooks.yml around lines 40 and 78: the workflow
uses mixed checkout action versions (line 40 is actions/checkout@v4 while line
78 is actions/checkout@v5); make them consistent by updating the older reference
to the preferred stable version (recommend upgrading the @v4 usage at line 40 to
actions/checkout@v5) or alternatively downgrade line 78 to @v4—ensure both
checkout steps reference the same tag and run the workflow to verify no breaking
changes.
courtneypacheco
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing
aff21c8
into
opendatahub-io:aws-test
Description
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit