MAINT: Cleaner design #7

HaoZeke · 2025-01-27T04:56:58Z

Also prettier outputs:

The pre-Polars stuff could even be upstreamed perhaps..

JuroOravec · 2025-05-30T04:52:54Z

src/asv_spyglass/_asv_ro.py

    api_version = 2

-    def __init__(self, benchmarks_file: Path, regex: Union[str, list[str]] = None):
+    def __init__(self, benchmarks_file: Path, regex: str | list[str] = None):


Missing None in the type

Suggested change

def __init__(self, benchmarks_file: Path, regex: str | list[str] = None):

def __init__(self, benchmarks_file: Path, regex: str | list[str] | None = None):

JuroOravec · 2025-05-30T05:08:50Z

src/asv_spyglass/_num.py

+    supports marking a ratio as insignificant.
+
+    Attributes:
+        _t1 (float | None): The first value (denominator).


Not related to the logic, but the documenting side of things - I use VSCode and I've never used PyCharm, so I don't know if I'm not maybe jsut missing out on something.

Do you use an IDE where the attributes / methods are documented when using this notation for parameters / methods?

As an example, since I use VSCode (as Github codespace in this case), I don't get hints on hover.

To get hints on hover on VSCode, the attributes / methods need to have their docstrings set:

Or like this

Or maybe there is a VSCode plugin I'm not familiar with?

JuroOravec · 2025-05-30T05:11:18Z

src/asv_spyglass/_num.py

+            - "~" followed by the formatted ratio if is_insignificant is True.
+            - A formatted string with 2 decimal places otherwise.
+
+        _is_invalid(self, t1, t2):


Also, just opinion, but documenting the private fields on the class docstring seems redundant - IMO makes more sense not to mention it here, so we have the freedom to remove _is_invalid as needed, without having peopele being reliant on it too much.

JuroOravec · 2025-05-30T05:22:38Z

src/asv_spyglass/_num.py

+        self.val = None
+
+        if self._is_invalid(t1, t2):
+            self.val = math.inf


Since the value of val is set only once and only at creation, maybe we could make this object immutable with a namedtuple or a dataclass? To prevent the possiblity of putting this object into invalid state. Since there is post-processing in the __init__ method, it would have to be a dataclass:

@dataclass(frozen=True) class Ratio: ...

In that case this if / else logic would be moved to __post_init__

JuroOravec · 2025-05-30T05:25:43Z

tests/test_num.py

+
+def test_ratio_is_insignificant():
+    ratio = Ratio(t1=10, t2=500)
+    ratio.is_insignificant = True


It would make sense to allow to set this at initialization

ratio = Ratio(t1=10, t2=500, is_insignificant=True)

JuroOravec · 2025-05-30T05:35:14Z

src/asv_spyglass/_num.py

+            calculation.
+    """
+
+    def __init__(self, t1: float | None, t2: float | None):


Btw more generally, does it make sense to support None? I see two uses of this class in production code:

https://github.com/airspeed-velocity/asv_spyglass/pull/7/files#diff-407e1ebccf4ba0fdc4dd19324ced7be40b749de1c6f6979459d18352b9938a16R171

https://github.com/airspeed-velocity/asv_spyglass/pull/7/files#diff-407e1ebccf4ba0fdc4dd19324ced7be40b749de1c6f6979459d18352b9938a16R268

In the first case Mypy says that the values will always be float (math.nan is the default).

Can't tell what the types are for the dataframe in the second case for the values "before" / "after". But in the case that even there the defaults are math.nan (or similar), and there are no None values, we could simplify the signature to:

def __init__(self, t1: float, t2: float):

JuroOravec · 2025-05-30T06:15:18Z

src/asv_spyglass/changes.py

+
+
+@dataclass
+class Incomparable(ASVChangeInfo):


I think in Python it makes more sense to not subclass ASVChangeInfo in this case, but rather to make instances of ASVChangeInfo. Because when you are subclassing in Python then you have no obligation to define all the attributes.

In other words, the language would allow even something like this, where we would define only a single attribute:

@dataclass class Incomparable(ASVChangeInfo): mark: ResultMark = ResultMark.INCOMPARABLE

But to ensure that we are forced to set all the fields, we should make instances of ASVChangeInfo instead:

def incomparable(): return ASVChangeInfo( mark=ResultMark.INCOMPARABLE, color=ResultColor.LIGHTGREY, description="Not comparable", state=AfterIs.LUKEWARM, before="", after="", )

Building on top of this, we can leverage Mypy even further to catch and prevent errors. Right now ResultMark has 7 states, but nothing is forcing us to define the ASVChangeInfo for all 7 states. We have to implicitly remember that.

Instead, we can make Mypy to raise error if we forget to define ASVChangeInfo for any of the states.

To do that, we can make a dictionary with dict[ResultMark, ASVChangeInfo] type:

CHANGE_INFO_TYPES: dict[ResultMark, ASVChangeInfo] = { ResultMark.INCOMPARABLE: ASVChangeInfo( mark=ResultMark.INCOMPARABLE, color=ResultColor.LIGHTGREY, description="Not comparable", state=AfterIs.LUKEWARM, before="", after="", ), ResultMark.FAILURE: ASVChangeInfo( mark=ResultMark.FAILURE, color=ResultColor.RED, description="Introduced a failure", state=AfterIs.WORSE, before=""Succeeded, after="Failed", ), ... }

And then, when accessing the values in get_change_info(), instead of:

# not comparable return Incomparable()

We do:

# not comparable return CHANGE_INFO_TYPES[ResultMark.INCOMPARABLE]

It's a bit more verbose, but reduces the mental load - we don't have to remember that we need to update the states when we update the ResultMark enum the next time. The IDE will highlight this error automatically.

Next step goes a bit more abstract. But to get even more type safety, even the _get_change_info() function could be refactored so that Mypy forces us to handle all cases explicitly.

Again we would use a dictionary with enum keys for that. But in this case the values would be the checks. It could look like this:

CHANGE_INFO_CHECKS: dict[ ResultMark, Callable[[ASVBench, ASVBench, float, bool], bool], ] = { ResultMark.INCOMPARABLE: lambda asv1, asv2, factor, use_stats: ( asv1.version is not None and asv2.version is not None and asv1.version != asv2.version ), ... # Catch all ResultMark.UNCHANGED: lambda *a: True, } def _get_change_info( asv1: ASVBench, asv2: ASVBench, factor, use_stats ) -> ASVChangeInfo: for key, predicate in CHANGE_INFO_CHECKS.items(): if predicate(asv1, asv2, factor, use_stats): return CHANGE_INFO_TYPES[key] # Sanity check - Should never be reached raise ValueError("No ResultMark matched")

And if we already had both CHANGE_INFO_CHECKS and CHANGE_INFO_TYPES, we could actually combine them:

class ChangeInfo(NamedTuple): predicate: Callable[[ASVBench, ASVBench, float, bool], bool] info: ASVChangeInfo CHANGE_INFO: dict[ResultMark, ChangeInfo] = { ResultMark.INCOMPARABLE: ChangeInfo( predicate=lambda asv1, asv2, factor, use_stats: ( asv1.version is not None and asv2.version is not None and asv1.version != asv2.version ), info=ASVChangeInfo( mark=ResultMark.INCOMPARABLE, color=ResultColor.LIGHTGREY, description="Not comparable", state=AfterIs.LUKEWARM, before="", after="", ), ), ResultMark.FAILURE: ChangeInfo(...), ... ResultMark.UNCHANGED: ChangeInfo(...), } def _get_change_info( asv1: ASVBench, asv2: ASVBench, factor, use_stats ) -> ASVChangeInfo: for info in CHANGE_INFO.values(): if info.predicate(asv1, asv2, factor, use_stats): return info.info # Sanity check - Should never be reached raise ValueError("No ResultMark matched")

JuroOravec · 2025-05-30T06:19:52Z

src/asv_spyglass/cli.py

+            row_style = ""
+
+            # Determine row style based on change_mark
+            if change_mark == ResultMark.BETTER:


Similar to what I mentioned in https://github.com/airspeed-velocity/asv_spyglass/pull/7/files#r2115211895.

If you decided to implement the CHANGE_INFO metadata object, then row_style could be another attribute of ChangeInfo.

That would ensure that if we add a new ResultMark enum value, then we would be forced to also add row_style for it.

JuroOravec · 2025-05-30T06:25:12Z

src/asv_spyglass/cli.py

+
+        # Print summary of worsened/improved status
+        if not split:
+            if (av_x := sum([x.value for x in table_data["states"]])) > 0:


I'd move the assignment of av_x to a separate line, for 2 reasons:

At first I didn't notice the > 0 at the end.

If an error occurs on this line, it will be easier to debug this. Because there's a lot that can go wrong on that line (1. maybe no .value attr, or no "states" key, or maybe sum() raises internally).

Suggested change

if (av_x := sum([x.value for x in table_data["states"]])) > 0:

av_x = sum([x.value for x in table_data["states"]]))

if av_x > 0:

JuroOravec · 2025-05-30T07:17:46Z

src/asv_spyglass/_num.py

+
+
+@dataclass
+class BenchNum:


BenchNum doesn't seem to be used anywhere, we can delete it.

JuroOravec · 2025-05-30T07:30:19Z

src/asv_spyglass/compare.py

-    res_1 = results.Results.load(b1)
-    res_2 = results.Results.load(b2)
+def _get_change_info(
+    asv1: ASVBench, asv2: ASVBench, factor, use_stats


Add type hints

Suggested change

asv1: ASVBench, asv2: ASVBench, factor, use_stats

asv1: ASVBench, asv2: ASVBench, factor: float, use_stats: bool

JuroOravec · 2025-05-30T07:40:30Z

src/asv_spyglass/results.py

+
+
+@dataclass
+class ASVBench:


Again, this could be made immutable so we can have higher confidence that things are the way we assume they are.

Frozen / immutable dataclass doesn't play nicely with __post_init__ (see stackoverflow), so in that case I would make __post_init__ into a class method:

@dataclass(frozen=True) class ASVBench: name: str time: float stats_n_samples: tuple err: float | None version: str | None unit: str | None @classmethod def from_prepared_result(cls, name: str, pr: PreparedResult): time = pr.results.get(name, math.nan) stats_n_samples = pr.stats.get(name, (None,)) err: float | None = None if name in pr.stats and stats_n_samples[0]: err = get_err(time, stats_n_samples[0]) return cls( name=name, time=time, stats_n_samples=stats_n_samples, err=err, version=pr.versions.get(name), unit=pr.units.get(name), )

JuroOravec · 2025-05-30T07:58:22Z

src/asv_spyglass/compare.py

+        else:
+            filtered_df = df
+
+        if not filtered_df.is_empty():


Minor point, but in cases like this where there is only one logic branch, I think it's better to format this a way to reduce identation.

So instead of:

for color in colors: ... if not filtered_df.is_empty(): # Rest of code...

We would write:

for color in colors: ... if filtered_df.is_empty(): continue # Rest of code...

You can see that in the second example the # Rest of code... has 1 less indentation level.

JuroOravec · 2025-05-30T08:01:51Z

src/asv_spyglass/compare.py

+def do_compare(
+    b1,
+    b2,
+    bdat,


What are these b1, b2, bdat?

JuroOravec · 2025-05-30T08:03:26Z

src/asv_spyglass/compare.py

+    sort="default",
+    machine=None,
+    env_spec=None,
+    use_stats=True,


machine, env_spec are not used, we can remove them.

use_stats is not passed in to do_compare() in this codebase, I believe we can remove that too?

JuroOravec · 2025-05-30T08:12:37Z

src/asv_spyglass/compare.py

-            color_print(titles[key])
-            color_print("")
+    # Initialize benchmarks
+    benchmarks = ReadOnlyASVBenchmarks(Path(bdat)).benchmarks


Outside of the scope of this PR, and I will make a separate PR / issue to discuss this, but I believe we can simplifyReadOnlyASVBenchmarks

For one anywhere where we use this class, we always only access the
benchmarks attribute, and nothng else.

So this could be simplified to a function:

benchmarks = load_asv_benchmarks( benchmarks_file: Path, regex: str | list[str] | None = None, api_version: int = 2, )

Classes are always a bit "riskier" then functions, because class instances hold state that could be modified and corrupted (setting nonsense values) between calls.

Secondly, after reading through ReadOnlyASVBenchmarks and it's file, it looks like it's purpose is to load benchmark data from directory.

So instead of calling it ReadOnlyASVBenchmarks, it would be clearer if name reflected what it does - load benchmarks (hence load_asv_benchmarks()).

Thus I would also rename the file _asv_ro.py to loader.py, because ro is cryptic.

JuroOravec · 2025-05-30T08:14:46Z

src/asv_spyglass/compare.py

-        else:
-            name_1 = ""
+    # Prepare results
+    preparer = ResultPreparer(benchmarks)


Again outside of the scope of this PR, but ReadOnlyASVBenchmarks is similar to ResultPreparer - it has only a single method, so it makes more sense to instead have a function than class:

pr1 = prepare_results(bencharks, res_1) pr2 = prepare_results(bencharks, res_2)

JuroOravec · 2025-05-30T08:15:42Z

src/asv_spyglass/compare.py

+
+    # Get commit names or use empty strings
+    name_1 = ""  # commit_names.get(hash_1, "")
+    name_2 = ""  # commit_names.get(hash_2, "")


What does the commit_names.get(hash_1, "") mean in this context?

JuroOravec · 2025-05-30T08:24:11Z

@HaoZeke Don't be alarmed by the number of comments. I don't know your background, so some things I mention might be obvious to you.

Most of the comments are on linting / typing / best practices.

The general flow / intent looks good to me! But also this is the first time I see the asv_spyglass repo, so I might miss some nuances. 😄

HaoZeke · 2025-05-30T09:14:06Z

@HaoZeke Don't be alarmed by the number of comments. I don't know your background, so some things I mention might be obvious to you.

Most of the comments are on linting / typing / best practices.

The general flow / intent looks good to me! But also this is the first time I see the asv_spyglass repo, so I might miss some nuances. 😄

No worries, even known things are best remembered in context (and all the suggestions are great) ^_^

I might get to this a bit later though (~couple of days).

HaoZeke added 17 commits January 26, 2025 17:01

ENH: Add rich-click for a nicer interface

1575ae1

GEN: Update lockfile

6126bd2

ENH: Use structures for the results

6904925

ENH: Start calculating ratios better

f9f0bf2

MAINT: Handle insignificance

867f584

ENH: Start generating result structures

ac8a7af

ENH: Start refactoring with ASVBench and use Rich

4d1310f

ENH: Use rich

831eb3f

MAINT: Lint changes

b617aaf

MAINT: Cleanup results more

6dde507

BUG: Fix typo and approval tests

37ed174

MAINT: Cleanup with dataclasses

387dc9f

TST: Fix output

fc60557

MAINT: Refactor further with spyglass changes

4d3a501

ENH: Switch to polars

abcc917

MAINT: Split do_compare

0df830d

MAINT: More verbose formatting

5be5d5f

HaoZeke requested review from JuroOravec and mattip May 29, 2025 20:48

HaoZeke mentioned this pull request May 29, 2025

Add built-in support to display only improved or regressed benchmarks airspeed-velocity/asv#1496

Open

JuroOravec reviewed May 30, 2025

View reviewed changes

src/asv_spyglass/_num.py

@dataclass

class BenchNum:

Copy link

JuroOravec May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BenchNum doesn't seem to be used anywhere, we can delete it.

JuroOravec reviewed May 30, 2025

View reviewed changes

src/asv_spyglass/compare.py

def do_compare(

b1,

b2,

bdat,

Copy link

JuroOravec May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these b1, b2, bdat?

JuroOravec reviewed May 30, 2025

View reviewed changes

JuroOravec mentioned this pull request May 30, 2025

MAINT: Cleanup suggestions #9

Draft

	def __init__(self, benchmarks_file: Path, regex: str \| list[str] = None):
	def __init__(self, benchmarks_file: Path, regex: str \| list[str] \| None = None):

	if (av_x := sum([x.value for x in table_data["states"]])) > 0:
	av_x = sum([x.value for x in table_data["states"]]))
	if av_x > 0:

	asv1: ASVBench, asv2: ASVBench, factor, use_stats
	asv1: ASVBench, asv2: ASVBench, factor: float, use_stats: bool

MAINT: Cleaner design #7

Are you sure you want to change the base?

MAINT: Cleaner design #7

Uh oh!

Conversation

HaoZeke commented Jan 27, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JuroOravec May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JuroOravec May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JuroOravec commented May 30, 2025

Uh oh!

HaoZeke commented May 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

JuroOravec May 30, 2025 •

edited

Loading

JuroOravec May 30, 2025 •

edited

Loading