feat: Change the default of aggregate parameter for cross-validation report #1440
Conversation
Nitpick: on form, and this is open to debate, I find it more explicit to use a dedicated sentinel |
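(Not part of the PR — a minimal Python sketch of the dedicated-sentinel pattern mentioned above; the `_Default` class and `resolve_aggregate` helper are hypothetical names for illustration, not the skore API.)

```python
# Hypothetical sketch of a dedicated sentinel default; names are illustrative.
from typing import Optional, Sequence, Union


class _Default:
    """Sentinel meaning 'the caller did not pass the argument'."""

    def __repr__(self) -> str:
        return "<default>"


_DEFAULT = _Default()


def resolve_aggregate(
    aggregate: Union[_Default, None, str, Sequence[str]] = _DEFAULT,
) -> Optional[tuple]:
    if isinstance(aggregate, _Default):  # argument not passed: aggregate by default
        return ("mean", "std")
    if aggregate is None:                # explicit None: keep one column per split
        return None
    if isinstance(aggregate, str):       # allow a single string for convenience
        return (aggregate,)
    return tuple(aggregate)


print(resolve_aggregate())                # ('mean', 'std')
print(resolve_aggregate(aggregate=None))  # None -> per-split metrics
```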
I wasn't sure how to write this, so of course it can be changed! |
I think that we can pass the default |
Updated, I did not think about this simpler possibility. A Sequence of str also makes more sense, as more aggregation options exist in pandas than mean and std |
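(For context, not part of the diff: pandas accepts many aggregation names beyond "mean" and "std", for example:)

```python
# Illustrative only: pandas understands many aggregation names, not just mean/std.
import pandas as pd

scores = pd.DataFrame({"accuracy": [0.92, 0.95, 0.93], "r2": [0.36, 0.39, 0.41]})
print(scores.aggregate(["mean", "std", "median", "min", "max"]))
```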
Now, we should amend a couple of things:
|
@thomasloux would you be able to make the latest requested changes? |
Hello, I've changed the docstring. I'm not sure what you mean by the "Examples" section if not the one in the docstring. You may want to check it briefly to verify the exact formatting (where to truncate results...). For all metrics, I've removed any argument linked with aggregate. Additionally, here is the new description of the argument:
Let me know if you have suggestions on these changes! |
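(A usage sketch of the behaviour under discussion, based on the skore API shown in the doctests of the diff below; the dataset choice is illustrative.)

```python
# Sketch assuming the skore API shown in the diff below: with the new default,
# report_metrics() aggregates across splits, while aggregate=None keeps one
# column per split.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from skore import CrossValidationReport

X, y = load_breast_cancer(return_X_y=True)
report = CrossValidationReport(
    LogisticRegression(max_iter=10_000), X=X, y=y, cv_splitter=2
)

aggregated = report.metrics.report_metrics(pos_label=1)                 # mean/std columns
per_split = report.metrics.report_metrics(aggregate=None, pos_label=1)  # Split #0, Split #1, ...
```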
@thomasloux I went through the file. I did not want to push to your branch directly, so here is the diff of what I would propose as changes:

diff --git a/examples/getting_started/plot_skore_getting_started.py b/examples/getting_started/plot_skore_getting_started.py
index 993555fd..1e45f320 100644
--- a/examples/getting_started/plot_skore_getting_started.py
+++ b/examples/getting_started/plot_skore_getting_started.py
@@ -129,16 +129,16 @@ cv_report = CrossValidationReport(rf, X, y, cv_splitter=5)
cv_report.help()
# %%
-# We display the metrics for each fold:
+# We display the mean and standard deviation for each metric:
# %%
cv_report.metrics.report_metrics(pos_label=1)
# %%
-# or the aggregated results:
+# or by individual fold:
# %%
-cv_report.metrics.report_metrics(aggregate=["mean", "std"], pos_label=1)
+cv_report.metrics.report_metrics(aggregate=None, pos_label=1)
# %%
# We display the ROC curves for each fold:
diff --git a/examples/technical_details/plot_cache_mechanism.py b/examples/technical_details/plot_cache_mechanism.py
index 65ba5ddb..04a0d12e 100644
--- a/examples/technical_details/plot_cache_mechanism.py
+++ b/examples/technical_details/plot_cache_mechanism.py
@@ -272,7 +272,7 @@ report.help()
# exposed.
# The first call will be slow because it computes the predictions for each fold.
start = time.time()
-result = report.metrics.report_metrics(aggregate=["mean", "std"])
+result = report.metrics.report_metrics()
end = time.time()
result
@@ -283,7 +283,7 @@ print(f"Time taken: {end - start:.2f} seconds")
#
# But the subsequent calls are fast because the predictions are cached.
start = time.time()
-result = report.metrics.report_metrics(aggregate=["mean", "std"])
+result = report.metrics.report_metrics()
end = time.time()
result
diff --git a/examples/use_cases/plot_employee_salaries.py b/examples/use_cases/plot_employee_salaries.py
index 60793db0..168576f6 100644
--- a/examples/use_cases/plot_employee_salaries.py
+++ b/examples/use_cases/plot_employee_salaries.py
@@ -198,7 +198,7 @@ my_project.put("Linear model report", report)
# %%
# We can now have a look at the performance of the model with some standard metrics.
-report.metrics.report_metrics(aggregate=["mean", "std"], indicator_favorability=True)
+report.metrics.report_metrics(indicator_favorability=True)
# %%
# Second model
@@ -241,7 +241,7 @@ my_project.put("HGBDT model report", report)
# %%
#
# We can now have a look at the performance of the model with some standard metrics.
-report.metrics.report_metrics(aggregate=["mean", "std"])
+report.metrics.report_metrics()
# %%
# Investigating the models
@@ -264,8 +264,8 @@ import pandas as pd
results = pd.concat(
[
- linear_model_report.metrics.report_metrics(aggregate=["mean", "std"]),
- hgbdt_model_report.metrics.report_metrics(aggregate=["mean", "std"]),
+ linear_model_report.metrics.report_metrics(),
+ hgbdt_model_report.metrics.report_metrics(),
],
axis=1,
)
@@ -289,13 +289,11 @@ results = pd.concat(
scoring=scoring,
scoring_kwargs=scoring_kwargs,
scoring_names=scoring_names,
- aggregate=["mean", "std"],
),
hgbdt_model_report.metrics.report_metrics(
scoring=scoring,
scoring_kwargs=scoring_kwargs,
scoring_names=scoring_names,
- aggregate=["mean", "std"],
),
],
axis=1,
diff --git a/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py b/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
index c65ef2b8..abb4250e 100644
--- a/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
+++ b/skore/src/skore/sklearn/_cross_validation/metrics_accessor.py
@@ -60,7 +60,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
pos_label: Optional[Union[int, float, bool, str]] = None,
indicator_favorability: bool = False,
flat_index: bool = False,
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Report a set of metrics for our estimator.
@@ -99,7 +101,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
Whether to flatten the `MultiIndex` columns. Flat index will always be lower
case, do not include spaces and remove the hash symbol to ease indexing.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -118,7 +120,6 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> report.metrics.report_metrics(
... scoring=["precision", "recall"],
... pos_label=1,
- ... aggregate=["mean", "std"],
... indicator_favorability=True,
... )
LogisticRegression Favorability
@@ -150,7 +151,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
report_metric_name: str,
*,
data_source: DataSource = "test",
- aggregate: Optional[Sequence[str]] = None,
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
**metric_kwargs: Any,
) -> pd.DataFrame:
# build the cache key components to finally create a tuple that will be used
@@ -237,7 +240,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
self,
*,
data_source: DataSource = "test",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the accuracy score.
@@ -249,7 +254,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
- "test" : use the test set provided when creating the report.
- "train" : use the train set provided when creating the report.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -266,10 +271,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.accuracy()
- LogisticRegression
- Split #0 Split #1
+ LogisticRegression
+ mean std
Metric
- Accuracy 0.94... 0.94...
+ Accuracy 0.94... 0.00...
"""
return self.report_metrics(
scoring=["accuracy"],
@@ -290,7 +295,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
Literal["binary", "macro", "micro", "weighted", "samples"]
] = None,
pos_label: Optional[Union[int, float, bool, str]] = None,
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the precision score.
@@ -330,7 +337,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
pos_label : int, float, bool or str, default=None
The positive class.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -347,11 +354,11 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.precision()
- LogisticRegression
- Split #0 Split #1
- Metric Label / Average
- Precision 0 0.96... 0.90...
- 1 0.93... 0.96...
+ LogisticRegression
+ mean std
+ Metric Label / Average
+ Precision 0 0.93... 0.04...
+ 1 0.94... 0.02...
"""
return self.report_metrics(
scoring=["precision"],
@@ -374,7 +381,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
Literal["binary", "macro", "micro", "weighted", "samples"]
] = None,
pos_label: Optional[Union[int, float, bool, str]] = None,
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the recall score.
@@ -415,7 +424,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
pos_label : int, float, bool or str, default=None
The positive class.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -432,11 +441,11 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.recall()
- LogisticRegression
- Split #0 Split #1
- Metric Label / Average
- Recall 0 0.87... 0.94...
- 1 0.98... 0.94...
+ LogisticRegression
+ mean std
+ Metric Label / Average
+ Recall 0 0.91... 0.046...
+ 1 0.96... 0.027...
"""
return self.report_metrics(
scoring=["recall"],
@@ -453,7 +462,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
self,
*,
data_source: DataSource = "test",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the Brier score.
@@ -465,7 +476,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
- "test" : use the test set provided when creating the report.
- "train" : use the train set provided when creating the report.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -482,10 +493,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.brier_score()
- LogisticRegression
- Split #0 Split #1
+ LogisticRegression
+ mean std
Metric
- Brier score 0.04... 0.04...
+ Brier score 0.04... 0.00...
"""
return self.report_metrics(
scoring=["brier_score"],
@@ -504,7 +515,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
data_source: DataSource = "test",
average: Optional[Literal["macro", "micro", "weighted", "samples"]] = None,
multi_class: Literal["raise", "ovr", "ovo"] = "ovr",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the ROC AUC score.
@@ -549,7 +562,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
pairwise combinations of classes. Insensitive to class imbalance when
`average == "macro"`.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -566,10 +579,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.roc_auc()
- LogisticRegression
- Split #0 Split #1
+ LogisticRegression
+ mean std
Metric
- ROC AUC 0.99... 0.98...
+ ROC AUC 0.98... 0.00...
"""
return self.report_metrics(
scoring=["roc_auc"],
@@ -587,7 +600,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
self,
*,
data_source: DataSource = "test",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the log loss.
@@ -599,7 +614,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
- "test" : use the test set provided when creating the report.
- "train" : use the train set provided when creating the report.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -616,10 +631,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> classifier = LogisticRegression(max_iter=10_000)
>>> report = CrossValidationReport(classifier, X=X, y=y, cv_splitter=2)
>>> report.metrics.log_loss()
- LogisticRegression
- Split #0 Split #1
+ LogisticRegression
+ mean std
Metric
- Log loss 0.1... 0.1...
+ Log loss 0.14... 0.03...
"""
return self.report_metrics(
scoring=["log_loss"],
@@ -637,7 +652,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
*,
data_source: DataSource = "test",
multioutput: Literal["raw_values", "uniform_average"] = "raw_values",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the R² score.
@@ -659,7 +676,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
By default, no averaging is done.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -676,10 +693,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> regressor = Ridge()
>>> report = CrossValidationReport(regressor, X=X, y=y, cv_splitter=2)
>>> report.metrics.r2()
- Ridge
- Split #0 Split #1
+ Ridge
+ mean std
Metric
- R² 0.36... 0.39...
+ R² 0.37... 0.02...
"""
return self.report_metrics(
scoring=["r2"],
@@ -698,7 +715,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
*,
data_source: DataSource = "test",
multioutput: Literal["raw_values", "uniform_average"] = "raw_values",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
) -> pd.DataFrame:
"""Compute the root mean squared error.
@@ -720,7 +739,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
By default, no averaging is done.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
Returns
@@ -738,9 +757,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
>>> report = CrossValidationReport(regressor, X=X, y=y, cv_splitter=2)
>>> report.metrics.rmse()
Ridge
- Split #0 Split #1
+ mean std
Metric
- RMSE 59.9... 61.4...
+ RMSE 60.7... 1.0...
"""
return self.report_metrics(
scoring=["rmse"],
@@ -756,7 +775,9 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
*,
metric_name: Optional[str] = None,
data_source: DataSource = "test",
- aggregate: Optional[Sequence[str]] = ("mean", "std"),
+ aggregate: Optional[
+ Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
+ ] = ("mean", "std"),
**kwargs,
) -> pd.DataFrame:
"""Compute a custom metric.
@@ -791,7 +812,7 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
- "test" : use the test set provided when creating the report.
- "train" : use the train set provided when creating the report.
- aggregate : {"mean", "std"} or list of such str, default=None
+ aggregate : {"mean", "std"} or list of such str, default=("mean", "std")
Function to aggregate the scores across the cross-validation splits.
**kwargs : dict
@@ -816,10 +837,10 @@ class _MetricsAccessor(_BaseAccessor["CrossValidationReport"], DirNamesMixin):
... response_method="predict",
... metric_name="MAE",
... )
- Ridge
- Split #0 Split #1
+ Ridge
+ mean std
Metric
- MAE 50.1... 52.6...
+ MAE 51.4... 1.7...
"""
# create a scorer with `greater_is_better=True` to not alter the output of
# `metric_function`

They are mainly documentation changes. NB: I corrected the typing from |
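(For reference, the corrected annotation as it appears in the hunks above:)

```python
# Annotation fragment copied from the diff above (module-level for illustration).
from typing import Literal, Optional, Sequence, Union

aggregate: Optional[
    Union[Literal["mean", "std"], Sequence[Literal["mean", "std"]]]
] = ("mean", "std")
```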
Hey, it seems I forgot to push a previous commit where I also changed the docstrings, sorry. To be sure, don't you want to emphasize the
|
@thomasloux Indeed, it was the default (and thus there was no need to specify it) before, and I forgot to add it explicitly:
|
Otherwise LGTM. Thanks @thomasloux
I just pushed a small fix to your branch to resolve the conflict so that the tests can run. It looks good to me otherwise. @thomasloux Next time, you might want to make a branch instead of submitting from your |
Thanks @thomasloux. The CI is happy, so am I. Nice job. |
Following the corresponding issue, I changed the default behaviour of CrossValidationReport to aggregate fold metrics as mean and std, rather than report metrics for all folds.
I chose to use Ellipsis as the default value to distinguish it from
aggregate=None
which outputs metrics for all folds. I've also updated the tests: most changes are just column names, one touches actual values, and two concern the behaviour in case of interruption (you may want to check those, as I am less familiar with the expected behaviour of the object in this setting).
The docstring still needs to be changed.
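(A minimal sketch of the Ellipsis-as-default idea described above; the final implementation in the diff uses ("mean", "std") directly instead, and the helper name here is hypothetical.)

```python
# Hypothetical helper illustrating Ellipsis as a 'not passed' marker,
# distinct from an explicit aggregate=None.
def resolve_aggregate(aggregate=...):
    if aggregate is Ellipsis:  # argument not passed: aggregate with mean/std
        return ("mean", "std")
    if aggregate is None:      # explicit None: report metrics for every fold
        return None
    return tuple([aggregate] if isinstance(aggregate, str) else aggregate)


print(resolve_aggregate())      # ('mean', 'std')
print(resolve_aggregate(None))  # None
```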