feat: Change the default of aggregate parameter for cross-validation report #1440

Merged: 11 commits merged into probabl-ai:main on Apr 1, 2025

Conversation

@thomasloux (Contributor)

Following the corresponding issue, I changed the default behaviour of CrossValidationReport to aggregate fold metrics as mean and std, rather than reporting metrics for all folds.

I chose to use Ellipsis as the default value to distinguish it from aggregate=None, which outputs metrics for all folds.

I've also updated the tests: most changes are just column names, one concerns actual values, and two concern behaviour in case of interruption (you may want to check those, as I am less familiar with the expected behaviour of the object in this setting).

Docstring still needs to be changed.

@thomass-dev (Collaborator) commented Mar 17, 2025

Nitpick on form, and this is open to debate: I find it more explicit to use a dedicated sentinel, DEFAULT = object(); def foo([...], parameter=DEFAULT), instead of the ellipsis when None is a value that makes sense.
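
For readers less familiar with the pattern, here is a minimal sketch of such a sentinel default (the names are illustrative, not skore's actual implementation):

    # A module-level sentinel: its identity means "argument not passed",
    # which leaves None free to mean "do not aggregate".
    _DEFAULT = object()

    def report_metrics(aggregate=_DEFAULT):
        if aggregate is _DEFAULT:        # caller did not pass anything
            aggregate = ("mean", "std")  # fall back to the intended default
        elif aggregate is None:          # caller explicitly asked for per-split scores
            return "scores for each split"
        return f"scores aggregated with {aggregate}"

    report_metrics()                # scores aggregated with ('mean', 'std')
    report_metrics(aggregate=None)  # scores for each split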

@thomass-dev changed the title from "enh: Change the default of aggregate for cross-validation report" to "feat: Change the default of aggregate parameter for cross-validation report" on Mar 17, 2025
github-actions bot (Contributor) commented Mar 17, 2025

Documentation preview @ fa20358

@thomasloux (Contributor, Author)

I wasn't sure how to write this, so of course it can be changed!

@glemaitre (Member) commented Mar 17, 2025

I think we can pass the default aggregate=("mean", "std") in the signature. This means we can relax the type to accept a list or tuple of literals, or more generally a Sequence.
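
A minimal sketch of what the relaxed annotation could look like (the AggregateOption alias is only for illustration; the actual diff later in this thread spells the Literal out inline):

    from collections.abc import Sequence
    from typing import Literal, Optional, Union

    AggregateOption = Literal["mean", "std"]

    def report_metrics(
        *,
        aggregate: Optional[
            Union[AggregateOption, Sequence[AggregateOption]]
        ] = ("mean", "std"),
    ):
        ...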

@thomasloux (Contributor, Author)

Updated; I did not think of this simpler possibility. A Sequence of str also makes more sense, as more aggregation options exist in pandas than mean and std.
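
For context, that flexibility comes from pandas itself: DataFrame aggregation accepts any of pandas' reduction names, not only "mean" and "std". A small self-contained illustration (the split scores below are made up):

    import pandas as pd

    # Scores for three hypothetical cross-validation splits.
    scores = pd.DataFrame(
        {"accuracy": [0.94, 0.95, 0.93], "roc_auc": [0.98, 0.99, 0.97]},
        index=["Split #0", "Split #1", "Split #2"],
    )

    # Any pandas reduction name works, e.g. "median", "min", "max".
    print(scores.agg(["mean", "std", "median", "min", "max"]))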

@glemaitre (Member)

Now, we should amend a couple of things:

  • change the docstring of the metric to reflect the default
  • change the output of the "Examples" section of the metric
  • make a pass through the example gallery to make sure the discussions are aligned with what we compute: whenever possible, remove aggregate=["mean", "std"] since it is now the default, and where we relied on aggregate=None, use it explicitly.

@glemaitre (Member)

@thomasloux would you be able to make the latest requested changes?

@thomasloux (Contributor, Author) commented Mar 26, 2025

Hello, I've changed the docstring. I'm not sure what you mean by the "Examples" section if not the one in the docstring; you may want to check it briefly for the exact formatting (where to truncate results...). For all metrics, I've removed any argument linked with aggregate, and the reports now show the results with the new behaviour, i.e. mean and std. But you may want to provide one example with aggregate=None to show this specific behaviour.

Additionally, here is the new description of the argument:

aggregate : None or list of str, default=("mean", "std")
            Function to aggregate the scores across the cross-validation splits.
            None will return the scores for each split.

Let me know if you have suggestions on these changes!
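
For reference, a usage sketch of the two behaviours under discussion, assuming the top-level skore import and a synthetic dataset (output shapes are indicative, not exact doctest output):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from skore import CrossValidationReport

    X, y = make_classification(n_classes=2, random_state=42)
    classifier = LogisticRegression(max_iter=10_000)
    report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)

    # New default: one "mean" and one "std" column per estimator.
    report.metrics.report_metrics()

    # Explicit opt-out: one column per split ("Split #0", "Split #1", ...).
    report.metrics.report_metrics(aggregate=None)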

@glemaitre self-requested a review on March 27, 2025, 11:47
@glemaitre (Member) commented Mar 27, 2025

@thomasloux I went through the file. I did not want to push to your branch directly, so here is the diff of the changes I would propose:

diff --git a/examples/getting_started/plot_skore_getting_started.py b/examples/getting_started/plot_skore_getting_started.py
index 993555fd..1e45f320 100644
--- a/examples/getting_started/plot_skore_getting_started.py
+++ b/examples/getting_started/plot_skore_getting_started.py
@@ -129,16 +129,16 @@ cv_report = CrossValidationReport(rf, X, y, cv_splitter=5)
 cv_report.help()
 
 # %%
-# We display the metrics for each fold:
+# We display the mean and standard deviation for each metric:
 
 # %%
 cv_report.metrics.report_metrics(pos_label=1)
 
 # %%
-# or the aggregated results:
+# or by individual fold:
 
 # %%
-cv_report.metrics.report_metrics(aggregate=["mean", "std"], pos_label=1)
+cv_report.metrics.report_metrics(aggregate=None, pos_label=1)
 
 # %%
 # We display the ROC curves for each fold:
diff --git a/examples/technical_details/plot_cache_mechanism.py b/examples/technical_details/plot_cache_mechanism.py
index 65ba5ddb..04a0d12e 100644
--- a/examples/technical_details/plot_cache_mechanism.py
+++ b/examples/technical_details/plot_cache_mechanism.py
@@ -272,7 +272,7 @@ report.help()
 # exposed.
 # The first call will be slow because it computes the predictions for each fold.
 start = time.time()
-result = report.metrics.report_metrics(aggregate=["mean", "std"])
+result = report.metrics.report_metrics()
 end = time.time()
 result
 
@@ -283,7 +283,7 @@ print(f"Time taken: {end - start:.2f} seconds")
 #
 # But the subsequent calls are fast because the predictions are cached.
 start = time.time()
-result = report.metrics.report_metrics(aggregate=["mean", "std"])
+result = report.metrics.report_metrics()
 end = time.time()
 result
 
diff --git a/examples/use_cases/plot_employee_salaries.py b/examples/use_cases/plot_employee_salaries.py
index 60793db0..168576f6 100644
--- a/examples/use_cases/plot_employee_salaries.py
+++ b/examples/use_cases/plot_employee_salaries.py
@@ -198,7 +198,7 @@ my_project.put("Linear model report", report)
 
 # %%
 # We can now have a look at the performance of the model with some standard metrics.
-report.metrics.report_metrics(aggregate=["mean", "std"], indicator_favorability=True)
+report.metrics.report_metrics(indicator_favorability=True)
 
 # %%
 # Second model
@@ -241,7 +241,7 @@ my_project.put("HGBDT model report", report)
 # %%
 #
 # We can now have a look at the performance of the model with some standard metrics.
-report.metrics.report_metrics(aggregate=["mean", "std"])
+report.metrics.report_metrics()
 
 # %%
 # Investigating the models
@@ -264,8 +264,8 @@ import pandas as pd
 
 results = pd.concat(
     [
-        linear_model_report.metrics.report_metrics(aggregate=["mean", "std"]),
-        hgbdt_model_report.metrics.report_metrics(aggregate=["mean", "std"]),
+        linear_model_report.metrics.report_metrics(),
+        hgbdt_model_report.metrics.report_metrics(),
     ],
     axis=1,
 )
@@ -289,13 +289,11 @@ results = pd.concat(
             scoring=scoring,
             scoring_kwargs=scoring_kwargs,
             scoring_names=scoring_names,
-            aggregate=["mean", "std"],
         ),
         hgbdt_model_report.metrics.report_metrics(
             scoring=scoring,
             scoring_kwargs=scoring_kwargs,
             scoring_names=scoring_names,
-            aggregate=["mean", "std"],
         ),
     ],
     axis=1,
diff --git a/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py b/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
index c65ef2b8..abb4250e 100644
--- a/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
+++ b/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
@@ -60,7 +60,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         pos_label: Optional[Union[int, float, bool, str]] = None,
         indicator_favorability: bool = False,
         flat_index: bool = False,
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Report a set of metrics for our estimator.
 
@@ -99,7 +101,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             Whether to flatten the `MultiIndex` columns. Flat index will always be lower
             case, do not include spaces and remove the hash symbol to ease indexing.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -118,7 +120,6 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> report.metrics.report_metrics(
         ...     scoring=["precision", "recall"],
         ...     pos_label=1,
-        ...     aggregate=["mean", "std"],
         ...     indicator_favorability=True,
         ... )
                   LogisticRegression           Favorability
@@ -150,7 +151,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         report_metric_name: str,
         *,
         data_source: DataSource = "test",
-        aggregate: Optional[Sequence[str]] = None,
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
         **metric_kwargs: Any,
     ) -> pd.DataFrame:
         # build the cache key components to finally create a tuple that will be used
@@ -237,7 +240,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         self,
         *,
         data_source: DataSource = "test",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the accuracy score.
 
@@ -249,7 +254,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             - "test" : use the test set provided when creating the report.
             - "train" : use the train set provided when creating the report.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -266,10 +271,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.accuracy()
-                    LogisticRegression
-                                Split #0  Split #1
+                LogisticRegression
+                            mean      std
         Metric
-        Accuracy                0.94...   0.94...
+        Accuracy         0.94...  0.00...
         """
         return self.report_metrics(
             scoring=["accuracy"],
@@ -290,7 +295,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             Literal["binary", "macro", "micro", "weighted", "samples"]
         ] = None,
         pos_label: Optional[Union[int, float, bool, str]] = None,
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the precision score.
 
@@ -330,7 +337,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         pos_label : int, float, bool or str, default=None
             The positive class.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -347,11 +354,11 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.precision()
-                                    LogisticRegression
-                                                Split #0  Split #1
-        Metric         Label / Average
-        Precision     0                         0.96...   0.90...
-                      1                         0.93...   0.96...
+                                LogisticRegression
+                                                mean       std
+        Metric    Label / Average
+        Precision 0                          0.93...   0.04...
+                  1                          0.94...   0.02...
         """
         return self.report_metrics(
             scoring=["precision"],
@@ -374,7 +381,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             Literal["binary", "macro", "micro", "weighted", "samples"]
         ] = None,
         pos_label: Optional[Union[int, float, bool, str]] = None,
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the recall score.
 
@@ -415,7 +424,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         pos_label : int, float, bool or str, default=None
             The positive class.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -432,11 +441,11 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.recall()
-                                    LogisticRegression
-                                            Split #0  Split #1
-        Metric      Label / Average
-        Recall     0                         0.87...  0.94...
-                   1                         0.98...  0.94...
+                            LogisticRegression
+                                            mean       std
+        Metric Label / Average
+        Recall 0                         0.91...  0.046...
+               1                         0.96...  0.027...
         """
         return self.report_metrics(
             scoring=["recall"],
@@ -453,7 +462,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         self,
         *,
         data_source: DataSource = "test",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the Brier score.
 
@@ -465,7 +476,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             - "test" : use the test set provided when creating the report.
             - "train" : use the train set provided when creating the report.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -482,10 +493,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.brier_score()
-                        LogisticRegression
-                                Split #0  Split #1
+                    LogisticRegression
+                                mean       std
         Metric
-        Brier score             0.04...   0.04...
+        Brier score          0.04...   0.00...
         """
         return self.report_metrics(
             scoring=["brier_score"],
@@ -504,7 +515,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         data_source: DataSource = "test",
         average: Optional[Literal["macro", "micro", "weighted", "samples"]] = None,
         multi_class: Literal["raise", "ovr", "ovo"] = "ovr",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the ROC AUC score.
 
@@ -549,7 +562,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
               pairwise combinations of classes. Insensitive to class imbalance when
               `average == "macro"`.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -566,10 +579,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.roc_auc()
-                    LogisticRegression
-                            Split #0  Split #1
+                LogisticRegression
+                            mean       std
         Metric
-        ROC AUC             0.99...   0.98...
+        ROC AUC          0.98...   0.00...
         """
         return self.report_metrics(
             scoring=["roc_auc"],
@@ -587,7 +600,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         self,
         *,
         data_source: DataSource = "test",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the log loss.
 
@@ -599,7 +614,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             - "test" : use the test set provided when creating the report.
             - "train" : use the train set provided when creating the report.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -616,10 +631,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> classifier = LogisticRegression(max_iter=10_000)
         >>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
         >>> report.metrics.log_loss()
-                    LogisticRegression
-                                Split #0  Split #1
+                LogisticRegression
+                            mean       std
         Metric
-        Log loss                0.1...     0.1...
+        Log loss         0.14...   0.03...
         """
         return self.report_metrics(
             scoring=["log_loss"],
@@ -637,7 +652,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         *,
         data_source: DataSource = "test",
         multioutput: Literal["raw_values", "uniform_average"] = "raw_values",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the R² score.
 
@@ -659,7 +676,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
 
             By default, no averaging is done.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -676,10 +693,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> regressor = Ridge()
         >>> report = CrossValidationReport(regressor, X=X, y=y, cv_splitter=2)
         >>> report.metrics.r2()
-                    Ridge
-                Split #0  Split #1
+                Ridge
+                    mean       std
         Metric
-        R²      0.36...   0.39...
+        R²       0.37...   0.02...
         """
         return self.report_metrics(
             scoring=["r2"],
@@ -698,7 +715,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         *,
         data_source: DataSource = "test",
         multioutput: Literal["raw_values", "uniform_average"] = "raw_values",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
     ) -> pd.DataFrame:
         """Compute the root mean squared error.
 
@@ -720,7 +739,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
 
             By default, no averaging is done.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         Returns
@@ -738,9 +757,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         >>> report = CrossValidationReport(regressor, X=X, y=y, cv_splitter=2)
         >>> report.metrics.rmse()
                     Ridge
-                    Split #0   Split #1
+                    mean       std
         Metric
-        RMSE        59.9...    61.4...
+        RMSE     60.7...    1.0...
         """
         return self.report_metrics(
             scoring=["rmse"],
@@ -756,7 +775,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         *,
         metric_name: Optional[str] = None,
         data_source: DataSource = "test",
-        aggregate: Optional[Sequence[str]] = ("mean", "std"),
+        aggregate: Optional[
+            Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+        ] = ("mean", "std"),
         **kwargs,
     ) -> pd.DataFrame:
         """Compute a custom metric.
@@ -791,7 +812,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
             - "test" : use the test set provided when creating the report.
             - "train" : use the train set provided when creating the report.
 
-        aggregate : {"mean", "std"} or list of such str, default=None
+        aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.
 
         **kwargs : dict
@@ -816,10 +837,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
         ...     response_method="predict",
         ...     metric_name="MAE",
         ... )
-                    Ridge
-                Split #0   Split #1
+            Ridge
+                    mean       std
         Metric
-        MAE     50.1...   52.6...
+        MAE     51.4...   1.7...
         """
         # create a scorer with `greater_is_better=True` to not alter the output of
         # `metric_function`

They are mainly documentation changes.

NB: I corrected the typing from list[..] to Sequence[...]; there were some copy-pasting mistakes.

@thomasloux (Contributor, Author)

Hey, it seems I had forgotten to push a previous commit where I also changed the docstrings, sorry. To be sure: don't you want to emphasize aggregate=None in the docstring as well? I feel the current wording is slightly less clear:

aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.

@glemaitre (Member)

@thomasloux Indeed, it was the default before (and thus there was no need to specify it) and I forgot to add it explicitly:

aggregate : {"mean", "std"}, list of such str or None, default=("mean", "std")
             Function to aggregate the scores across the cross-validation splits.

github-actions bot (Contributor) commented Mar 29, 2025

Coverage

Coverage Report for backend: 3143 statements, 87 missed, 96% total coverage.

Tests: 793, skipped: 8, failures: 0, errors: 0, time: 59.079s

@thomasloux marked this pull request as ready for review on March 29, 2025, 22:36
@glemaitre self-requested a review on March 30, 2025, 09:33
@glemaitre previously approved these changes on Mar 30, 2025

@glemaitre (Member) left a comment:

Otherwise LGTM. Thanks @thomasloux

@glemaitre (Member)

I just pushed a small fix to your branch to resolve the conflict so that the tests can run. It looks good to me otherwise.

@thomasloux Next time, you might want to make a branch instead of submitting from your main branch.

@glemaitre merged commit 8041aa4 into probabl-ai:main on Apr 1, 2025
19 checks passed
@glemaitre (Member)

Thanks @thomasloux. The CI is happy, so am I.

Nice job.

Development

Successfully merging this pull request may close these issues.

enh: Change the default of aggregate for cross-validation report
4 participants