Add comprehensive tests for kwargs validation across all functions#314
Add comprehensive tests for kwargs validation across all functions#314Shlokpalrecha wants to merge 2 commits intoarviz-devs:mainfrom
Conversation
- Add 26 tests covering kwargs validation for all major user-facing functions - Tests ensure functions properly raise TypeError for invalid keyword arguments - Covers sampling diagnostics, summary statistics, LOO functions, metrics, and accessors - Includes tests for common typos (e.g., 'dims' instead of 'dim') - All tests pass successfully (26/26) - Full test suite passes (2475 passed, 2 skipped) This addresses issue arviz-devs#142 by adding comprehensive test coverage to ensure kwargs validation continues to work properly and prevent regressions. While PR arviz-devs#143 initially fixed the validation for hdi/eti/kde/histogram, investigation shows all functions now properly validate kwargs, but test coverage was incomplete. Related to arviz-devs#142
OriolAbril
left a comment
There was a problem hiding this comment.
Thanks this is a good start. I am starting with a few global comments and we can iterate from there. There might be some exceptions to the general comments I am mentioning but it will be easier to see those once the file is shorter and with less redundancy
tests/test_kwargs_validation.py
Outdated
|
|
||
| # pylint: disable=redefined-outer-name, no-self-use, unexpected-keyword-arg | ||
| import pytest | ||
| from arviz_base import load_arviz_data |
There was a problem hiding this comment.
take a look at the other test files that import arviz_base. We want arviz-stats to also be usable without arviz-base installed and we run tests in minimal environments so it needs some extra care. Ref: https://python.arviz.org/projects/stats/en/latest/contributing/testing.html#how-to-write-tests
tests/test_kwargs_validation.py
Outdated
| @pytest.fixture(scope="class") | ||
| def idata(self): | ||
| """Load test data.""" | ||
| return load_arviz_data("non_centered_eight") |
There was a problem hiding this comment.
there is no need to define this fixture, the ones in conftest.py are available from any test file. For these tests you can use datatree directly
tests/test_kwargs_validation.py
Outdated
| def test_ess_rejects_invalid_kwargs(self, idata): | ||
| """Test that ess() raises TypeError for invalid kwargs.""" | ||
| import arviz_stats as azs | ||
|
|
||
| with pytest.raises(TypeError, match=".*unexpected keyword argument.*"): | ||
| azs.ess(idata.posterior, invalid_kwarg="test") | ||
|
|
||
| # Also test common typos | ||
| with pytest.raises(TypeError, match=".*unexpected keyword argument.*"): | ||
| azs.ess(idata.posterior, methods="bulk") # should be 'method' |
There was a problem hiding this comment.
for testing functions a couple comments.
The first and main one is we want to test we don't silently ignore **kwargs. Therefore, these types of tests will only be relevant to functions that take **kwargs. There are many functions that do but not all of them do. ess in particular does not, so this test is testing basic builtin features of python functions. If we can't rely on that this test failing would be the least of our problems. Only functions that take **kwargs should get these kind of tests.
The second is about the test content and style so it is not relevant for ess, rhat, and other functions that do not take **kwargs but it is for all the ones who do. We ignore doc related link checks in tests because the test name and body is usually self explanatory, we don't see any value added by adding a docstring that is basically the test name with better grammar. arviz_stats is the library we are testing so it should be imported only once at the top of the file with the other imports, and the two pytest.raises are doing the same check so one is enough, when it comes to checking if invalid_kwarg or methods is an argument of the function the function is doing exactly the same, we'd add the two different ones if we are triggering a different error or triggering the same error through a different code branch.
After doing all of that, the tests can be parametrized by the function to test on. As you probably have seen in #143, we have a single test function that checks multiple accessors through pytest.mark.parametrize. Here the same should happen with the relevant top level functions which I would rope together even if their source code is defined in different source files.
INVESTIGATION_SUMMARY.md
Outdated
There was a problem hiding this comment.
this file should not be commited
Address review feedback from OriolAbril on PR arviz-devs#314: - Only test accessor methods that accept **kwargs - Use pytest.mark.parametrize to reduce test duplication - Remove test docstrings (test names are self-explanatory) - Use importorskip pattern for minimal environment support - Use datatree fixture from conftest.py instead of redefining - Remove INVESTIGATION_SUMMARY.md from repository - Import numpy at module level instead of inside test functions - Follow the same pattern as tests in PR arviz-devs#143 The parametrized test covers 20 accessor methods, with 3 additional tests for methods that require positional arguments before **kwargs. All 23 tests pass successfully.
|
Thanks for the feedback! I've addressed all your points:
All 23 tests pass. Ready for another look. |
OriolAbril
left a comment
There was a problem hiding this comment.
now that tests are parametrized and don't take as much space they should be moved (and combined/integrated with) the tests in base/test_stats.py. Otherwise we will be extremely confused in the future as to why the kde accessor is tested in one place but the ecdf one is tested somewhere else. Organization of tests here in arviz-stats is quite a disaster so we might also want to move them somewhere else which will be much easier if they are all in the same place
tests/test_kwargs_validation.py
Outdated
| getattr(accessor, func)(invalid_kwarg="value") | ||
|
|
||
|
|
||
| def test_loo_score_kwargs_raise(datatree): |
There was a problem hiding this comment.
these bottom 3 can also be parametrized into one test function
tests/test_kwargs_validation.py
Outdated
| ), | ||
| ) | ||
| def test_accessor_kwargs_raise(datatree, func): | ||
| accessor = datatree.posterior.ds.azstats |
There was a problem hiding this comment.
| accessor = datatree.posterior.ds.azstats | |
| accessor = datatree.posterior.dataset.azstats |
minor nit, .ds is the older syntax. not fully deprecated but discouraged so it should not be used in new code
tests/test_kwargs_validation.py
Outdated
|
|
||
| def test_loo_score_kwargs_raise(datatree): | ||
| accessor = datatree.posterior.ds.azstats | ||
| y_obs = xr.DataArray(np.random.randn(8)) |
There was a problem hiding this comment.
| y_obs = xr.DataArray(np.random.randn(8)) | |
| rng = np.random.default_rng() | |
| y_obs = xr.DataArray(rng.normal(size=8)) |
we should also use the numpy random Generators instead of RandomState classes (either explicitly or through the global np.random.<distribution>.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #314 +/- ##
==========================================
- Coverage 84.42% 84.16% -0.26%
==========================================
Files 42 42
Lines 5829 5930 +101
==========================================
+ Hits 4921 4991 +70
- Misses 908 939 +31 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Address review feedback from OriolAbril on PR arviz-devs#314: - Only test accessor methods that accept **kwargs - Use pytest.mark.parametrize to reduce test duplication - Remove test docstrings (test names are self-explanatory) - Use importorskip pattern for minimal environment support - Use datatree fixture from conftest.py instead of redefining - Remove INVESTIGATION_SUMMARY.md from repository - Import numpy at module level instead of inside test functions - Follow the same pattern as tests in PR arviz-devs#143 The parametrized test covers 20 accessor methods, with 3 additional tests for methods that require positional arguments before **kwargs. All 23 tests pass successfully.
142ae20 to
d869747
Compare
|
Moved everything into |
What this PR does
Adds test coverage for kwargs validation across arviz-stats functions (relates to #142).
Background
While working on #142, I tested whether functions properly reject invalid kwargs. Turns out they all work correctly! But I noticed we only have tests for 4 functions (from PR #143), so I added comprehensive coverage.
My investigation process
I wrote a few scripts to systematically test functions with:
invalid_kwarg="test"dimsinstead ofdim, etc.).azstatsaccessorAll 24+ functions I tested handle invalid kwargs properly by raising TypeError. Great!
What I added
tests/test_kwargs_validation.py - 26 tests organized by function category:
.azstats.*calls)Testing