Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 71 additions & 0 deletions INVESTIGATION_SUMMARY.md
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file should not be commited

Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
# Investigation Notes: Kwargs Validation

Looking into issue #142 - checking if functions properly reject invalid kwargs.

## Background

Issue #142 is about making sure functions raise TypeError for invalid keyword arguments instead of silently ignoring them. PR #143 already fixed this for hdi(), eti(), kde(), and histogram().

Wanted to see if other functions validate kwargs too.

## What I did

Wrote some test scripts to check the main functions. My approach:
- Tested 24+ functions with invalid kwargs like `invalid_kwarg="test"`
- Also tried common typos (`dims` vs `dim`, `methods` vs `method`)
- Checked both direct calls and `.azstats` accessor methods

## Results

Good news - everything works! All the functions I tested properly reject invalid kwargs.

Functions tested:
- Sampling diagnostics: ess(), rhat(), mcse(), bfmi(), diagnose()
- Summary stats: summary(), mean(), median(), mode(), ci_in_rope()
- Density/viz: hdi(), eti(), kde(), histogram(), qds(), ecdf()
- LOO stuff: loo(), compare()
- Metrics: bayesian_r2(), metrics()
- Other: thin(), weight_predictions(), bayes_factor(), psense()

### How it works

Functions with `**kwargs` either pass them downstream (where they get validated) or check for unexpected kwargs explicitly and raise TypeError.

Two error patterns I saw:
- `"got an unexpected keyword argument..."` (most common)
- `"got multiple values for argument 'dims'"` (when typo conflicts with internal param)

Both work fine for catching bad inputs.

## The actual problem

Only 4 functions have tests for kwargs validation (from PR #143). The functionality is there, but test coverage is incomplete.

So I added 26 tests covering all the functions above. Tests check both direct calls and accessor methods, plus common typo scenarios.

## Test results

```
tests/test_kwargs_validation.py: 26 passed
Full test suite: 2475 passed, 2 skipped
```

## Tests added

`tests/test_kwargs_validation.py` has 26 tests organized into:
- TestKwargsValidationSamplingDiagnostics (5 tests)
- TestKwargsValidationSummaryStatistics (5 tests)
- TestKwargsValidationVisualizationFunctions (2 tests)
- TestKwargsValidationLOOFunctions (2 tests)
- TestKwargsValidationMetrics (2 tests)
- TestKwargsValidationOtherFunctions (4 tests)
- TestKwargsValidationAccessors (6 tests)

## Why add tests if validation already works?

- Prevents regressions if someone refactors the validation logic later
- Documents expected behavior for new contributors
- Covers edge cases and common typos
- Matches the test coverage level I saw elsewhere in the codebase

Issue #142 is technically resolved functionality-wise, but this adds the test documentation we were missing.
274 changes: 274 additions & 0 deletions tests/test_kwargs_validation.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
"""
Comprehensive tests for kwargs validation across arviz-stats functions.

These tests ensure that functions properly raise TypeError when given
unexpected keyword arguments, preventing silent failures from typos.

Related to issue #142 and PR #143.
"""

# pylint: disable=redefined-outer-name, no-self-use, unexpected-keyword-arg
import pytest
from arviz_base import load_arviz_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take a look at the other test files that import arviz_base. We want arviz-stats to also be usable without arviz-base installed and we run tests in minimal environments so it needs some extra care. Ref: https://python.arviz.org/projects/stats/en/latest/contributing/testing.html#how-to-write-tests



class TestKwargsValidationSamplingDiagnostics:
"""Test kwargs validation for sampling diagnostic functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no need to define this fixture, the ones in conftest.py are available from any test file. For these tests you can use datatree directly


def test_ess_rejects_invalid_kwargs(self, idata):
"""Test that ess() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.ess(idata.posterior, invalid_kwarg="test")

# Also test common typos
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.ess(idata.posterior, methods="bulk") # should be 'method'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for testing functions a couple comments.


The first and main one is we want to test we don't silently ignore **kwargs. Therefore, these types of tests will only be relevant to functions that take **kwargs. There are many functions that do but not all of them do. ess in particular does not, so this test is testing basic builtin features of python functions. If we can't rely on that this test failing would be the least of our problems. Only functions that take **kwargs should get these kind of tests.


The second is about the test content and style so it is not relevant for ess, rhat, and other functions that do not take **kwargs but it is for all the ones who do. We ignore doc related link checks in tests because the test name and body is usually self explanatory, we don't see any value added by adding a docstring that is basically the test name with better grammar. arviz_stats is the library we are testing so it should be imported only once at the top of the file with the other imports, and the two pytest.raises are doing the same check so one is enough, when it comes to checking if invalid_kwarg or methods is an argument of the function the function is doing exactly the same, we'd add the two different ones if we are triggering a different error or triggering the same error through a different code branch.

After doing all of that, the tests can be parametrized by the function to test on. As you probably have seen in #143, we have a single test function that checks multiple accessors through pytest.mark.parametrize. Here the same should happen with the relevant top level functions which I would rope together even if their source code is defined in different source files.


def test_rhat_rejects_invalid_kwargs(self, idata):
"""Test that rhat() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.rhat(idata.posterior, invalid_kwarg="test")

# Test common typo
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.rhat(idata.posterior, methods="rank") # should be 'method'

def test_mcse_rejects_invalid_kwargs(self, idata):
"""Test that mcse() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.mcse(idata.posterior, invalid_kwarg="test")

# Test common typo
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.mcse(idata.posterior, methods="mean") # should be 'method'

def test_bfmi_rejects_invalid_kwargs(self, idata):
"""Test that bfmi() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.bfmi(idata, invalid_kwarg="test")

def test_diagnose_rejects_invalid_kwargs(self, idata):
"""Test that diagnose() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.diagnose(idata, invalid_kwarg="test")


class TestKwargsValidationSummaryStatistics:
"""Test kwargs validation for summary statistic functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")

def test_summary_rejects_invalid_kwargs(self, idata):
"""Test that summary() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.summary(idata, invalid_kwarg="test")

# Test common typo
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.summary(idata, sample_dim=["chain", "draw"]) # should be 'sample_dims'

def test_mean_rejects_invalid_kwargs(self, idata):
"""Test that mean() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.mean(idata.posterior, invalid_kwarg="test")

# Test common typo (errors with "multiple values" due to internal dims variable)
with pytest.raises(TypeError):
azs.mean(idata.posterior, dims="draw") # should be 'dim'

def test_median_rejects_invalid_kwargs(self, idata):
"""Test that median() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.median(idata.posterior, invalid_kwarg="test")

# Test common typo (errors with "multiple values" due to internal dims variable)
with pytest.raises(TypeError):
azs.median(idata.posterior, dims="draw") # should be 'dim'

def test_mode_rejects_invalid_kwargs(self, idata):
"""Test that mode() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.mode(idata.posterior, invalid_kwarg="test")

# Test common typo (errors with "multiple values" due to internal dims variable)
with pytest.raises(TypeError):
azs.mode(idata.posterior, dims="draw") # should be 'dim'

def test_ci_in_rope_rejects_invalid_kwargs(self, idata):
"""Test that ci_in_rope() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.ci_in_rope(idata.posterior["mu"], rope=(-1, 1), invalid_kwarg="test")


class TestKwargsValidationVisualizationFunctions:
"""Test kwargs validation for visualization/density functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")

def test_qds_rejects_invalid_kwargs(self, idata):
"""Test that qds() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.qds(idata.posterior["mu"], invalid_kwarg="test")

def test_ecdf_rejects_invalid_kwargs(self, idata):
"""Test that ecdf() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.ecdf(idata.posterior["mu"], invalid_kwarg="test")


class TestKwargsValidationLOOFunctions:
"""Test kwargs validation for LOO/model comparison functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")

def test_loo_rejects_invalid_kwargs(self, idata):
"""Test that loo() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.loo(idata, invalid_kwarg="test")

def test_compare_rejects_invalid_kwargs(self, idata):
"""Test that compare() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.compare({"model": idata}, invalid_kwarg="test")


class TestKwargsValidationMetrics:
"""Test kwargs validation for metric functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")

def test_bayesian_r2_rejects_invalid_kwargs(self, idata):
"""Test that bayesian_r2() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.bayesian_r2(idata, invalid_kwarg="test")

def test_metrics_rejects_invalid_kwargs(self, idata):
"""Test that metrics() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.metrics(idata, invalid_kwarg="test")


class TestKwargsValidationOtherFunctions:
"""Test kwargs validation for other utility functions."""

@pytest.fixture(scope="class")
def idata(self):
"""Load test data."""
return load_arviz_data("non_centered_eight")

def test_thin_rejects_invalid_kwargs(self, idata):
"""Test that thin() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.thin(idata, factor=2, invalid_kwarg="test")

def test_weight_predictions_rejects_invalid_kwargs(self, idata):
"""Test that weight_predictions() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.weight_predictions(idata.posterior_predictive, invalid_kwarg="test")

def test_bayes_factor_rejects_invalid_kwargs(self, idata):
"""Test that bayes_factor() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.bayes_factor(idata.posterior, var_names="mu", invalid_kwarg="test")

def test_psense_rejects_invalid_kwargs(self, idata):
"""Test that psense() raises TypeError for invalid kwargs."""
import arviz_stats as azs

with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
azs.psense(idata, var_name="mu", invalid_kwarg="test")


class TestKwargsValidationAccessors:
"""Test kwargs validation for accessor methods."""

@pytest.fixture(scope="class")
def data_array(self):
"""Load test data and extract DataArray."""
idata = load_arviz_data("non_centered_eight")
return idata.posterior["mu"]

def test_accessor_ess_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.ess() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.ess(invalid_kwarg="test")

def test_accessor_rhat_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.rhat() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.rhat(invalid_kwarg="test")

def test_accessor_mcse_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.mcse() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.mcse(invalid_kwarg="test")

def test_accessor_mean_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.mean() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.mean(invalid_kwarg="test")

def test_accessor_median_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.median() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.median(invalid_kwarg="test")

def test_accessor_mode_rejects_invalid_kwargs(self, data_array):
"""Test that .azstats.mode() raises TypeError for invalid kwargs."""
with pytest.raises(TypeError, match=".*unexpected keyword argument.*"):
data_array.azstats.mode(invalid_kwarg="test")