Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Remove CrossValidationReport._MetricsAccessor #1318

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

auguste-probabl
Copy link
Contributor

Since a CrossValidationReport can be considered a special case of a ComparisonReport, this PR uses inheritance to avoid code duplication. Of note are the following design choices:

  • ComparisonReport metrics have the aggregate parameter in order to make them available to CrossValidationReport.
  • ComparisonReport can now accept just 1 EstimatorReport, so that even if cross-validation is interrupted, the initialization succeeds.

Todo:

  • Make sure plots are correct for both Reports

Copy link
Contributor

Coverage

Coverage Report for backend
FileStmtsMissCoverMissing
venv/lib/python3.12/site-packages/skore
   __init__.py140100% 
   __main__.py880%3–19
   exceptions.py440%4–23
venv/lib/python3.12/site-packages/skore/cli
   __init__.py550%3–8
   cli.py22220%3–70
   color_format.py49490%3–116
venv/lib/python3.12/site-packages/skore/persistence
   __init__.py00100% 
venv/lib/python3.12/site-packages/skore/persistence/item
   __init__.py56393%96–99
   altair_chart_item.py19191%14
   item.py22195%86
   matplotlib_figure_item.py36195%19
   media_item.py220100% 
   numpy_array_item.py27194%16
   pandas_dataframe_item.py29194%14
   pandas_series_item.py29194%14
   pickle_item.py220100% 
   pillow_image_item.py25193%15
   plotly_figure_item.py20192%14
   polars_dataframe_item.py27194%14
   polars_series_item.py22192%14
   primitive_item.py23291%13–15
   sklearn_base_estimator_item.py29194%15
   skrub_table_report_item.py10186%11
venv/lib/python3.12/site-packages/skore/persistence/repository
   __init__.py20100% 
   item_repository.py59591%15–16, 202–203, 226
venv/lib/python3.12/site-packages/skore/persistence/storage
   __init__.py40100% 
   abstract_storage.py220100% 
   disk_cache_storage.py33195%44
   in_memory_storage.py200100% 
venv/lib/python3.12/site-packages/skore/persistence/view
   __init__.py220%3–5
   view.py550%3–20
venv/lib/python3.12/site-packages/skore/project
   __init__.py30100% 
   _launch.py150199%278
   _open.py90100% 
   project.py62199%236
venv/lib/python3.12/site-packages/skore/sklearn
   __init__.py50100% 
   _base.py1401490%36, 48, 58, 91, 94, 147–156, 168–>173, 183–184
   find_ml_task.py49197%103–>111, 135
   types.py20100% 
venv/lib/python3.12/site-packages/skore/sklearn/_comparison
   __init__.py50100% 
   metrics_accessor.py141099%162–>173
   report.py400100% 
venv/lib/python3.12/site-packages/skore/sklearn/_cross_validation
   __init__.py20100% 
   report.py890100% 
venv/lib/python3.12/site-packages/skore/sklearn/_estimator
   __init__.py50100% 
   metrics_accessor.py255995%148–157, 182–>191, 190, 230–>232, 256, 283–287, 302, 320
   report.py120099%213–>219, 221–>223
   utils.py11110%1–19
venv/lib/python3.12/site-packages/skore/sklearn/_plot
   __init__.py40100% 
   precision_recall_curve.py119198%229–>246, 317
   prediction_error.py95198%159, 173–>176
   roc_curve.py1260100% 
   utils.py89593%23, 47–49, 53
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split
   __init__.py00100% 
   train_test_split.py36294%16–17
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split/warning
   __init__.py80100% 
   high_class_imbalance_too_few_examples_warning.py17378%16–18, 80
   high_class_imbalance_warning.py18288%16–18
   random_state_unset_warning.py11187%15
   shuffle_true_warning.py9091%44–>exit
   stratify_is_set_warning.py11187%15
   time_based_column_warning.py22189%17, 69–>exit
   train_test_split_warning.py5180%21
venv/lib/python3.12/site-packages/skore/ui
   __init__.py00100% 
   app.py32320%3–83
   dependencies.py440%3–10
   project_routes.py41410%3–92
   serializers.py77770%3–194
   server.py17170%3–40
venv/lib/python3.12/site-packages/skore/utils
   __init__.py60100% 
   _accessor.py70100% 
   _environment.py26097%29–>34
   _logger.py21484%14–18
   _patch.py11546%19–35
   _progress_bar.py300100% 
   _show_versions.py310100% 
TOTAL262835386% 

Tests Skipped Failures Errors Time
570 3 💤 0 ❌ 0 🔥 53.549s ⏱️

Copy link
Contributor

github-actions bot commented Feb 13, 2025

Documentation preview @ 6395ad4

@glemaitre
Copy link
Member

I would delay this refactoring once the ComparisonReport is accepting CrossValidationReport, i.e.:

ComparisonReport(reports={
    "LR": CrossValidationReport(LogisticRegression(), X=X, y=y),
    "GBDT": CrossValidationReport(HistGradientBoostingClassifier(), X=X, y=y),
})

I want to know if we should have a BaseComparisonReport from which both classes should inherit from or we are fine if the current proposal.

Here, for the moment I find it weird that some child of ComparisonReport are given to the ComparisonReport itself.

Base automatically changed from comparator to main February 19, 2025 15:17
@auguste-probabl auguste-probabl force-pushed the refactor-cv-metrics-accessor branch from a191cbb to 99a3530 Compare February 19, 2025 16:40
@thomass-dev
Copy link
Collaborator

Please delay this PR after the merge of #1309.

@auguste-probabl auguste-probabl force-pushed the refactor-cv-metrics-accessor branch 2 times, most recently from 1883b29 to 239c6db Compare February 26, 2025 15:35
@auguste-probabl auguste-probabl force-pushed the refactor-cv-metrics-accessor branch 3 times, most recently from a6f497c to 52f19c3 Compare March 4, 2025 16:10
@thomass-dev thomass-dev removed their request for review March 4, 2025 16:15
@auguste-probabl auguste-probabl added this to the spare week 0.8-0.9 milestone Mar 6, 2025
Trying to make `cross_validation._MetricsAccessor` inherit from `comparison._MetricsAccessor` fails because CrossValidationReport inherits from ComparisonReport, therefore it already has a `metrics` accessor registered, and pandas's `_register_accessor` refuses to register a new one with the same name. I could simply bypass it with `setattr`, but that's dirty.
@auguste-probabl auguste-probabl force-pushed the refactor-cv-metrics-accessor branch from 52f19c3 to 6e4b387 Compare March 12, 2025 13:19
@auguste-probabl auguste-probabl force-pushed the refactor-cv-metrics-accessor branch from 6e4b387 to 6395ad4 Compare March 12, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants