Skip to content

Conversation

@HaoZeke
Copy link
Member

@HaoZeke HaoZeke commented Jan 27, 2025

Also prettier outputs:
image
The pre-Polars stuff could even be upstreamed perhaps..

api_version = 2

def __init__(self, benchmarks_file: Path, regex: Union[str, list[str]] = None):
def __init__(self, benchmarks_file: Path, regex: str | list[str] = None):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing None in the type

Suggested change
def __init__(self, benchmarks_file: Path, regex: str | list[str] = None):
def __init__(self, benchmarks_file: Path, regex: str | list[str] | None = None):

supports marking a ratio as insignificant.
Attributes:
_t1 (float | None): The first value (denominator).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to the logic, but the documenting side of things - I use VSCode and I've never used PyCharm, so I don't know if I'm not maybe jsut missing out on something.

Do you use an IDE where the attributes / methods are documented when using this notation for parameters / methods?

As an example, since I use VSCode (as Github codespace in this case), I don't get hints on hover.

Screenshot 2025-05-30 at 07 03 43

To get hints on hover on VSCode, the attributes / methods need to have their docstrings set:

Screenshot 2025-05-30 at 07 06 48

Or like this

Screenshot 2025-05-30 at 07 05 57

Or maybe there is a VSCode plugin I'm not familiar with?

- "~" followed by the formatted ratio if is_insignificant is True.
- A formatted string with 2 decimal places otherwise.
_is_invalid(self, t1, t2):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, just opinion, but documenting the private fields on the class docstring seems redundant - IMO makes more sense not to mention it here, so we have the freedom to remove _is_invalid as needed, without having peopele being reliant on it too much.

self.val = None

if self._is_invalid(t1, t2):
self.val = math.inf

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the value of val is set only once and only at creation, maybe we could make this object immutable with a namedtuple or a dataclass? To prevent the possiblity of putting this object into invalid state. Since there is post-processing in the __init__ method, it would have to be a dataclass:

@dataclass(frozen=True)
class Ratio:
    ...

In that case this if / else logic would be moved to __post_init__


def test_ratio_is_insignificant():
ratio = Ratio(t1=10, t2=500)
ratio.is_insignificant = True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to allow to set this at initialization

ratio = Ratio(t1=10, t2=500, is_insignificant=True)

calculation.
"""

def __init__(self, t1: float | None, t2: float | None):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw more generally, does it make sense to support None? I see two uses of this class in production code:

In the first case Mypy says that the values will always be float (math.nan is the default).

Can't tell what the types are for the dataframe in the second case for the values "before" / "after". But in the case that even there the defaults are math.nan (or similar), and there are no None values, we could simplify the signature to:

def __init__(self, t1: float, t2: float):



@dataclass
class Incomparable(ASVChangeInfo):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in Python it makes more sense to not subclass ASVChangeInfo in this case, but rather to make instances of ASVChangeInfo. Because when you are subclassing in Python then you have no obligation to define all the attributes.

In other words, the language would allow even something like this, where we would define only a single attribute:

@dataclass
class Incomparable(ASVChangeInfo):
    mark: ResultMark = ResultMark.INCOMPARABLE

But to ensure that we are forced to set all the fields, we should make instances of ASVChangeInfo instead:

def incomparable():
    return ASVChangeInfo(
        mark=ResultMark.INCOMPARABLE,
        color=ResultColor.LIGHTGREY,
        description="Not comparable",
        state=AfterIs.LUKEWARM,
        before="",
        after="",
    )

Building on top of this, we can leverage Mypy even further to catch and prevent errors. Right now ResultMark has 7 states, but nothing is forcing us to define the ASVChangeInfo for all 7 states. We have to implicitly remember that.

Instead, we can make Mypy to raise error if we forget to define ASVChangeInfo for any of the states.

To do that, we can make a dictionary with dict[ResultMark, ASVChangeInfo] type:

CHANGE_INFO_TYPES: dict[ResultMark, ASVChangeInfo] = {
    ResultMark.INCOMPARABLE: ASVChangeInfo(
        mark=ResultMark.INCOMPARABLE,
        color=ResultColor.LIGHTGREY,
        description="Not comparable",
        state=AfterIs.LUKEWARM,
        before="",
        after="",
    ),
    ResultMark.FAILURE: ASVChangeInfo(
        mark=ResultMark.FAILURE,
        color=ResultColor.RED,
        description="Introduced a failure",
        state=AfterIs.WORSE,
        before=""Succeeded,
        after="Failed",
    ),
    ...
}

And then, when accessing the values in get_change_info(), instead of:

# not comparable
return Incomparable()

We do:

# not comparable
return CHANGE_INFO_TYPES[ResultMark.INCOMPARABLE]

It's a bit more verbose, but reduces the mental load - we don't have to remember that we need to update the states when we update the ResultMark enum the next time. The IDE will highlight this error automatically.


Next step goes a bit more abstract. But to get even more type safety, even the _get_change_info() function could be refactored so that Mypy forces us to handle all cases explicitly.

Again we would use a dictionary with enum keys for that. But in this case the values would be the checks. It could look like this:

CHANGE_INFO_CHECKS: dict[
    ResultMark,
    Callable[[ASVBench, ASVBench, float, bool], bool],
] = {
    ResultMark.INCOMPARABLE: lambda asv1, asv2, factor, use_stats: (
        asv1.version is not None
        and asv2.version is not None
        and asv1.version != asv2.version
    ),
    ...
    # Catch all
    ResultMark.UNCHANGED: lambda *a: True,
}

def _get_change_info(
    asv1: ASVBench, asv2: ASVBench, factor, use_stats
) -> ASVChangeInfo:
    for key, predicate in CHANGE_INFO_CHECKS.items():
        if predicate(asv1, asv2, factor, use_stats):
            return CHANGE_INFO_TYPES[key]

    # Sanity check - Should never be reached
    raise ValueError("No ResultMark matched")

And if we already had both CHANGE_INFO_CHECKS and CHANGE_INFO_TYPES, we could actually combine them:

class ChangeInfo(NamedTuple):
    predicate: Callable[[ASVBench, ASVBench, float, bool], bool]
    info: ASVChangeInfo

CHANGE_INFO: dict[ResultMark, ChangeInfo] = {
    ResultMark.INCOMPARABLE: ChangeInfo(
        predicate=lambda asv1, asv2, factor, use_stats: (
            asv1.version is not None
            and asv2.version is not None
            and asv1.version != asv2.version
        ),
        info=ASVChangeInfo(
            mark=ResultMark.INCOMPARABLE,
            color=ResultColor.LIGHTGREY,
            description="Not comparable",
            state=AfterIs.LUKEWARM,
            before="",
            after="",
        ),
    ),
    ResultMark.FAILURE: ChangeInfo(...),
    ...
    ResultMark.UNCHANGED: ChangeInfo(...),
}

def _get_change_info(
    asv1: ASVBench, asv2: ASVBench, factor, use_stats
) -> ASVChangeInfo:
    for info in CHANGE_INFO.values():
        if info.predicate(asv1, asv2, factor, use_stats):
            return info.info

    # Sanity check - Should never be reached
    raise ValueError("No ResultMark matched")

row_style = ""

# Determine row style based on change_mark
if change_mark == ResultMark.BETTER:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to what I mentioned in https://github.com/airspeed-velocity/asv_spyglass/pull/7/files#r2115211895.

If you decided to implement the CHANGE_INFO metadata object, then row_style could be another attribute of ChangeInfo.

That would ensure that if we add a new ResultMark enum value, then we would be forced to also add row_style for it.


# Print summary of worsened/improved status
if not split:
if (av_x := sum([x.value for x in table_data["states"]])) > 0:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move the assignment of av_x to a separate line, for 2 reasons:

  1. At first I didn't notice the > 0 at the end.
  2. If an error occurs on this line, it will be easier to debug this. Because there's a lot that can go wrong on that line (1. maybe no .value attr, or no "states" key, or maybe sum() raises internally).
Suggested change
if (av_x := sum([x.value for x in table_data["states"]])) > 0:
av_x = sum([x.value for x in table_data["states"]]))
if av_x > 0:



@dataclass
class BenchNum:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BenchNum doesn't seem to be used anywhere, we can delete it.

res_1 = results.Results.load(b1)
res_2 = results.Results.load(b2)
def _get_change_info(
asv1: ASVBench, asv2: ASVBench, factor, use_stats

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add type hints

Suggested change
asv1: ASVBench, asv2: ASVBench, factor, use_stats
asv1: ASVBench, asv2: ASVBench, factor: float, use_stats: bool



@dataclass
class ASVBench:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this could be made immutable so we can have higher confidence that things are the way we assume they are.

Frozen / immutable dataclass doesn't play nicely with __post_init__ (see stackoverflow), so in that case I would make __post_init__ into a class method:

@dataclass(frozen=True)
class ASVBench:
    name: str
    time: float
    stats_n_samples: tuple
    err: float | None
    version: str | None
    unit: str | None

    @classmethod
    def from_prepared_result(cls, name: str, pr: PreparedResult):
        time = pr.results.get(name, math.nan)
        stats_n_samples = pr.stats.get(name, (None,))

        err: float | None = None
        if name in pr.stats and stats_n_samples[0]:
            err = get_err(time, stats_n_samples[0])

        return cls(
            name=name,
            time=time,
            stats_n_samples=stats_n_samples,
            err=err,
            version=pr.versions.get(name),
            unit=pr.units.get(name),
        )

else:
filtered_df = df

if not filtered_df.is_empty():
Copy link

@JuroOravec JuroOravec May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point, but in cases like this where there is only one logic branch, I think it's better to format this a way to reduce identation.

So instead of:

for color in colors:
    ...
    if not filtered_df.is_empty():
        # Rest of code...

We would write:

for color in colors:
    ...
    if filtered_df.is_empty():
        continue
    # Rest of code...

You can see that in the second example the # Rest of code... has 1 less indentation level.

def do_compare(
b1,
b2,
bdat,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these b1, b2, bdat?

sort="default",
machine=None,
env_spec=None,
use_stats=True,
Copy link

@JuroOravec JuroOravec May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

machine, env_spec are not used, we can remove them.

use_stats is not passed in to do_compare() in this codebase, I believe we can remove that too?

color_print(titles[key])
color_print("")
# Initialize benchmarks
benchmarks = ReadOnlyASVBenchmarks(Path(bdat)).benchmarks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outside of the scope of this PR, and I will make a separate PR / issue to discuss this, but I believe we can simplifyReadOnlyASVBenchmarks

  1. For one anywhere where we use this class, we always only access the
    benchmarks attribute, and nothng else.

    So this could be simplified to a function:

    benchmarks = load_asv_benchmarks(
        benchmarks_file: Path,
        regex: str | list[str] | None = None,
        api_version: int = 2,
    )

    Classes are always a bit "riskier" then functions, because class instances hold state that could be modified and corrupted (setting nonsense values) between calls.

  2. Secondly, after reading through ReadOnlyASVBenchmarks and it's file, it looks like it's purpose is to load benchmark data from directory.

    So instead of calling it ReadOnlyASVBenchmarks, it would be clearer if name reflected what it does - load benchmarks (hence load_asv_benchmarks()).

    Thus I would also rename the file _asv_ro.py to loader.py, because ro is cryptic.

else:
name_1 = ""
# Prepare results
preparer = ResultPreparer(benchmarks)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again outside of the scope of this PR, but ReadOnlyASVBenchmarks is similar to ResultPreparer - it has only a single method, so it makes more sense to instead have a function than class:

pr1 = prepare_results(bencharks, res_1)
pr2 = prepare_results(bencharks, res_2)


# Get commit names or use empty strings
name_1 = "" # commit_names.get(hash_1, "")
name_2 = "" # commit_names.get(hash_2, "")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the commit_names.get(hash_1, "") mean in this context?

@JuroOravec
Copy link

@HaoZeke Don't be alarmed by the number of comments. I don't know your background, so some things I mention might be obvious to you.

Most of the comments are on linting / typing / best practices.

The general flow / intent looks good to me! But also this is the first time I see the asv_spyglass repo, so I might miss some nuances. 😄

@HaoZeke
Copy link
Member Author

HaoZeke commented May 30, 2025

@HaoZeke Don't be alarmed by the number of comments. I don't know your background, so some things I mention might be obvious to you.

Most of the comments are on linting / typing / best practices.

The general flow / intent looks good to me! But also this is the first time I see the asv_spyglass repo, so I might miss some nuances. 😄

No worries, even known things are best remembered in context (and all the suggestions are great) ^_^

I might get to this a bit later though (~couple of days).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants