Notebook unittests #346

AmandaBirmingham · 2025-11-06T01:01:00Z

This is a big one that adds unit tests for the most frequently used notebooks (metaG, metaT, and amplicon). notebook_test_helpers now defines a small framework that injects variable settings, runs the notebooks programmatically using papermill, and then checks that the output files they create are as expected. Note that this is hardly comprehensive: for example, it does not check that the executed notebook ipynb files themselves have the expected contents (which would be useful for checking e.g. the inline plots they create), nor does it check that the notebooks aren't ALSO creating some other unexpected files.
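For readers unfamiliar with papermill, the run-and-check pattern described above can be sketched roughly as follows. This is a minimal illustration and not the code in notebook_test_helpers: the function names `run_notebook` and `find_missing_outputs`, and any paths or parameter names passed to them, are hypothetical. The only library call used is `papermill.execute_notebook(input_path, output_path, parameters=...)`.

```python
import os


def run_notebook(notebook_fp, executed_fp, params):
    """Execute a notebook with injected parameters (requires papermill)."""
    # Imported lazily so the file-checking helper below works without papermill.
    import papermill as pm

    # papermill writes `params` into an injected cell, runs every cell,
    # and saves the executed copy of the notebook to `executed_fp`.
    pm.execute_notebook(notebook_fp, executed_fp, parameters=params)


def find_missing_outputs(expected_fps):
    """Return the expected output paths that do not exist on disk."""
    return [fp for fp in expected_fps if not os.path.isfile(fp)]
```

A test built this way would call `run_notebook(...)` and then assert that `find_missing_outputs(...)` returns an empty list before comparing file contents against known-good copies.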

In order for this to work, I had to comment out the inline cells scattered throughout the notebooks that set the default inputs as examples (because otherwise they overwrite the test settings that papermill injects into the first cell of the notebook). I marked each with ## INPUT to make them easy to see and moved them so they no longer share cells with non-input code. They are set up so a real user can uncomment the entire cell with a keyboard shortcut and then modify the input setting for their specific run (which they MUST DO ANYWAY, and it is a PEBKAC when they do not). If they fail to do this and try to keep running the notebook anyway, they will get an immediate and fairly helpful error from Jupyter (a NameError) saying that they are trying to use a variable that isn't defined.
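The behavior described above can be imitated outside Jupyter with a small sketch. It is an assumption-laden stand-in, not the notebooks' actual cells: the variable name `plate_map_fp`, the path, and the `run_cells` helper are all hypothetical, and `exec` in a shared namespace only approximates a notebook kernel.

```python
# A commented-out input cell in the style described above
# (the variable name and path are hypothetical).
INPUT_CELL = '''
## INPUT
# plate_map_fp = "./test_data/plate_map.tsv"
'''

# A later cell that uses the input.
USE_CELL = "checked_fp = plate_map_fp"


def run_cells(cells, injected=None):
    """Run cell sources in one shared namespace, roughly like a kernel."""
    namespace = dict(injected or {})
    for source in cells:
        exec(source, namespace)
    return namespace


# With nothing injected and the input cell still commented out,
# the first use of the variable raises NameError.
try:
    run_cells([INPUT_CELL, USE_CELL])
except NameError as err:
    print(f"NameError: {err}")

# With a value injected up front (as papermill does), the same cells run.
result = run_cells([INPUT_CELL, USE_CELL], {"plate_map_fp": "injected.tsv"})
```

Because the input cell contains only comments, it can no longer overwrite an injected value, which is the point of commenting it out.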

I also had to deal with a few input variables that were defined and then redefined later in the notebooks, as this is not compatible with papermill's "set all the variables at once at the top" strategy. To get around this, I gave all input variables distinct names. I also fixed some bugs in the file path checking code in the metaT notebook.
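Why a reused name clashes with setting everything once at the top can be shown with a small sketch. The variable names (`input_fp`, `step1_input_fp`, `step2_input_fp`) and the `run_cells` helper are hypothetical and only stand in for the notebooks' real inputs.

```python
def run_cells(cells, injected):
    """Run cell sources in one shared namespace seeded with injected values."""
    namespace = dict(injected)
    for source in cells:
        exec(source, namespace)
    return namespace


# Reused name: only one value can be injected at the top,
# so both steps end up seeing the same file.
reused = run_cells(
    ["step1_used = input_fp", "step2_used = input_fp"],
    {"input_fp": "step1.csv"},
)

# Distinct names: each step gets its own injected value.
distinct = run_cells(
    ["step1_used = step1_input_fp", "step2_used = step2_input_fp"],
    {"step1_input_fp": "step1.csv", "step2_input_fp": "step2.csv"},
)
```

In the first run there is no way to give the second step a different file without editing the notebook; in the second run both values are supplied up front.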


Copilot

Pull Request Overview

This PR refactors notebook testing infrastructure by introducing a base TestNotebook class and renaming test classes for consistency. The changes include:

Creating a new helper base class (TestNotebook) for notebook testing
Renaming test classes to follow a consistent naming pattern with "Notebook" suffix
Adding three new comprehensive test files for different notebook pipelines
Updating an output test data file with additional columns

Reviewed Changes

Copilot reviewed 10 out of 14 changed files in this pull request and generated 3 comments.


File	Description
notebooks/tests/test_tellseq_D_variable_volume_pooling.py	Renamed test class from `TestTellseqD` to `TestTellseqDNotebook` and test method for clarity
notebooks/tests/test_metatranscriptomics_matrix_pipeline_seqcount_norm.py	New test file for metatranscriptomics pipeline with comprehensive parameter setup
notebooks/tests/test_matrix_tube_pipeline_seqcount_norm.py	New test file for matrix tube pipeline with both standard and absquant test cases
notebooks/tests/test_amplicon_pre_prep_file_generator.py	New test file for amplicon preprocessing with detailed compression layout configuration
notebooks/tests/notebook_test_helpers.py	New base class providing common test infrastructure for notebook validation
notebooks/test_output/amplicon/20230203_IL515fBC_806_ABTX_11052_Plates_174_178_182_185_ADAPT_12986_Plate_16_17_18_21_merged.txt	Updated test output file with new sample_name column and additional location columns


Copilot · 2025-11-06T01:03:00Z

notebooks/tests/notebook_test_helpers.py

+import os
+import re
+
+SAVE_DIR = "/Users/amandabirmingham/Desktop"


Hardcoded user-specific path should not be committed to the repository. This path is specific to a single developer's machine and will not work for other contributors or in CI/CD environments. Consider using a temporary directory or making this configurable through environment variables.

Suggested change

-SAVE_DIR = "/Users/amandabirmingham/Desktop"
+SAVE_DIR = os.environ.get("UNMATCHED_OUTPUT_SAVE_DIR", tempfile.gettempdir())
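One detail to watch when applying this suggestion: it calls `tempfile.gettempdir()`, but the imports shown in the diff are only `os` and `re`, so an `import tempfile` would also be needed. A minimal sketch of the fallback follows; the wrapper function `get_save_dir` and its `env` argument are hypothetical and exist only to make the behavior easy to check.

```python
import os
import tempfile


def get_save_dir(env=None):
    """Pick the directory for unmatched outputs.

    Uses UNMATCHED_OUTPUT_SAVE_DIR when it is set, otherwise the
    system temporary directory.
    """
    env = os.environ if env is None else env
    return env.get("UNMATCHED_OUTPUT_SAVE_DIR", tempfile.gettempdir())


SAVE_DIR = get_save_dir()
```

Reading the environment through a function rather than at import time also makes it simpler to override the directory inside a test.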

Copilot · 2025-11-06T01:03:00Z

notebooks/tests/notebook_test_helpers.py

+    _ZERO_DATES_FUNC_KEY = "zero_dates_func"  # func to replace for dates
+
+    # TODO: turn off before committing
+    _SAVE_UNMATCHED_OUTPUTS = True  # whether to save unmatched outputs


The TODO comment indicates this debug flag should be set to False before committing. This flag enables saving unmatched output files to the hardcoded SAVE_DIR, which should be disabled in production code to avoid unexpected file writes.

Suggested change

-    _SAVE_UNMATCHED_OUTPUTS = True  # whether to save unmatched outputs
+    _SAVE_UNMATCHED_OUTPUTS = False  # whether to save unmatched outputs

Copilot · 2025-11-06T01:03:01Z

notebooks/tests/notebook_test_helpers.py

+                                          zero_dates_func=None):
+        """Helper function to compare two text files for exact match."""
+
+        filename = filename if not filename else f"{filename} "


The ternary is hard to read and passes a falsy filename through unchanged. A truthy filename is already formatted with a trailing space, but None stays None instead of becoming an empty string, so it would render as "None" if interpolated into a message. A clearer form that always yields a string is: filename = f"{filename} " if filename else ""

Suggested change

-        filename = filename if not filename else f"{filename} "
+        filename = f"{filename} " if filename else ""
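A quick check of how the two expressions behave, as a sketch with hypothetical function names: both give the same result for a truthy filename, and they differ only for falsy input, where the original returns the value unchanged (so None stays None) and the suggested form returns an empty string.

```python
def label_original(filename):
    """The expression as it appears in the diff."""
    return filename if not filename else f"{filename} "


def label_suggested(filename):
    """The expression from the suggested change."""
    return f"{filename} " if filename else ""


for value in ("a.txt", "", None):
    # repr() makes the difference between None and "" visible.
    print(repr(value), repr(label_original(value)), repr(label_suggested(value)))
```

The suggested form is the safer one if the result is later placed in an f-string, since it never yields None.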

coveralls · 2025-11-06T01:10:34Z

Pull Request Test Coverage Report for Build 19121750241

Details

91 of 101 (90.1%) changed or added relevant lines in 5 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.1%) to 92.687%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
notebooks/tests/notebook_test_helpers.py	54	64	84.38%

Totals
Change from base Build 19117964099:	0.1%
Covered Lines:	6477
Relevant Lines:	6988

💛 - Coveralls

AmandaBirmingham added 14 commits November 5, 2025 16:50

576efd2 extend notebook test framework, add updated known good files and tests for amplicon notebook
8b89d06 rename reused vars, comment out inputs, remove trailing spaces on code lines
1fc4177 initial notebook tests
54b5b90 extend definitions of output params
b3fb37f utils for notebook testing
b13b97a restructure output params, add ability to save non-matched files for human investigation
50f9846 updated metaT test files to have limited decimal places, in current style
de2ff5a labeled inputs, removed contains_replicates from bioinformatics, trimmed whitespace at the end of code lines
e420a00 moved inputs into own cells, fixed errors in valid file check
d76e621 correct typo in comment
25cd3e2 commented out inputs, replaced re-defined variables, changed well col settings to string constants
dba31ea added metatranscriptomics tests, linted other notebook test files
38355a0 temporarily set iseqnormpool output as NOT a filepath to remove contents checking
da6e07e fix typo

AmandaBirmingham requested a review from Copilot November 6, 2025 01:01

AmandaBirmingham changed the title ~~Pacbio unittests~~ Notebook unittests Nov 6, 2025

Copilot AI reviewed Nov 6, 2025


AmandaBirmingham added 2 commits November 5, 2025 17:12

9fd4e18 fix copilot code review issues
95d0d61 removed superfluous __main__s
