
Update to the Overloads chapter #1839

Open
wants to merge 39 commits into base: main
Conversation

@erictraut (Collaborator) commented Aug 13, 2024

  • Attempts to clearly define the algorithm for overload matching.
  • Describes checks for overload consistency, overlapping overloads, and implementation consistency.

python/typing-council#40

erictraut and others added 2 commits August 13, 2024 17:06
@hauntsaninja (Collaborator) left a comment

(two quick comments)

@carljm (Member) left a comment

Thanks for working on this tricky area! I haven't finished review yet, but may be called away soon, so I'm submitting the comments I have so far. (EDIT: I've now completed my review.)

@erictraut (Collaborator, Author) commented Aug 28, 2024

We typically wait for a proposed spec change to be accepted by the TC prior to writing conformance tests. In this case, I think it's advisable to write the conformance tests prior to acceptance. This will help us validate the proposed spec changes and tell us if (and to what extent) these changes will be disruptive for existing stubs and current type checker implementations.

I would normally volunteer to write the conformance tests, but in this case I think it would be preferable for someone else to write the tests based on their reading of the spec update. If I write the tests, there's a real possibility that they will match what's in my head but not accurately reflect the letter of the spec. There's also a possibility that I'll miss some important cases in the tests. If someone else writes the tests, they can help identify holes and ambiguities in the spec language.

Is there anyone willing to volunteer to write a draft set of conformance tests for this overload functionality? I'm thinking that there should be four new test files:

  1. overloads_definitions: Tests the rules defined in the "Invalid overload definitions" section
  2. overloads_consistency: Tests the rules defined in the "Implementation consistency" section
  3. overloads_overlap: Tests the rules defined in the "Overlapping overloads" section
  4. overloads_evaluation: Tests the rules defined in the "Overload call evaluation" section

If this is more work than any one person wants to volunteer for, we could split it up.

@carljm (Member) commented Aug 28, 2024

I am willing to work on conformance tests for this, but I probably can't get to it until the core dev sprint, Sept 23-27. I realize that implies a delay to moving forward with this PR. Happy for someone else to get to it first.

@carljm (Member) commented Jan 11, 2025

I believe I've completed the test suite, with reasonably good coverage of everything specified as a "should". I intentionally avoided adding tests either way for behaviors specified as a "may".

I also added the capability to have stub test files in the conformance suite, and added overloads_definitions_stub.pyi, since the rules for valid overload definition are significantly different in a stub file.

I aimed to write tests that reflect the specification as it currently exists in this PR, to help illuminate where type checkers currently do and don't conform to this spec. I commented inline on some points where I wonder if we should adjust the spec.

@erictraut (Collaborator, Author) commented

@carljm, thanks for doing this! I'll try to find time next week to review your test code and update the draft spec if your tests uncovered any areas of ambiguity or lack of clarity.

@erictraut (Collaborator, Author) commented

@carljm, thanks for writing the test suite. I reviewed it, and it looks good to me. One thought is that we may want to cover more cases for rule 6 in the overload evaluation algorithm. I know there are cases where mypy and pyright diverge here, and it would be good to suss out some of those divergences and determine the "correct" answer based on the wording in the proposed spec.

I didn't want to write these tests myself because I thought it would be beneficial for someone else to write the tests with a critical eye toward the proposed spec — to look for holes and areas of ambiguity. Other than the points you raised above, did you find any other areas of concern about the proposed spec? If not, would you be in favor of advancing it as a "final" draft for consideration by the full typing council?

@carljm (Member) commented Jan 24, 2025

I'm curious what specific additional scenarios for step 6 you'd want to test. I think I remember looking at it and not coming up with anything that wasn't already covered in the existing tests. But at that point I was probably fading, so I may well have missed something. Definitely not opposed to adding more tests; maybe you can add them if you have ideas?

I don't think I have areas of concern besides those mentioned above. I think the main thing I realized is that it would ideally be better for clarity of the overload spec if we had a clear spec for call-checking in the first place, since overloads depend on call-checking, and thus this spec currently implicitly depends on some assumptions about behavior of call-checking that aren't specified.

But I don't think that's a reason to delay landing this spec change; it's clearly an improvement on the status quo.

I also think that we should explicitly note in the spec text that separating step one (arity checking) and step two (full argument matching) into distinct elimination passes is recommended for performance reasons and may make the algorithm easier to understand, but makes no observable semantic difference (assuming that return types of erroring calls are treated as meaningless, as you advocated above). I think this clarification is useful for helping readers understand the spec. It was a point of confusion for me, and I only got clarity by working through example cases for a while.

@erictraut (Collaborator, Author) commented

this spec currently implicitly depends on some assumptions about behavior of call-checking that aren't specified.

Yes, I agree. Definitely more work to do still.

I also think that we should explicitly note in the spec text that the separation of step one (arity checking) and step two (full argument matching) into separate elimination passes is recommended for performance reasons

I'd prefer not to do this. Such a statement seems out of place for a spec. As you've pointed out before, the spec should describe the behaviors, not the implementation. In this case, the behavior is sufficiently complex that it requires us to use a multi-step process. If a type checker implementer wants to combine some of the steps when implementing this support and can prove that the resulting behavior is identical to the behavior described in the spec, that is fine. That's always the case for any behaviors described in the spec.

@erictraut erictraut marked this pull request as ready for review January 25, 2025 01:09
`argument-type-expansion`_ below.


Step 4: If the argument list is compatible with two or more overloads,
Member:

This step seems unsound and it's not clear to me that the behavior it enables is commonly needed. Does this situation come up regularly?

Collaborator (Author):

This behavior comes from mypy. There are various overloads in typeshed and in other popular stub libraries like pandas-stubs that rely on this behavior. For example, the gather function in asyncio.tasks uses a fallback overload signature (the last one in the list) that should be used when an indeterminate number of args are passed to the function.

Why do you think it's unsound? If it is unsound, can you think of a modification or variation that would make it sound?
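As an illustration of that fallback pattern, here is a simplified, hypothetical gather2 (the real asyncio.gather stub is generic over its argument types; this sketch hard-codes two known-arity overloads plus the *args fallback):

```python
import asyncio
from collections.abc import Awaitable
from typing import Any, overload

# Known-arity overloads first, then a *args fallback for indeterminate arity.
@overload
async def gather2(a: Awaitable[int], /) -> tuple[int]: ...
@overload
async def gather2(a: Awaitable[int], b: Awaitable[str], /) -> tuple[int, str]: ...
@overload
async def gather2(*aws: Awaitable[Any]) -> list[Any]: ...
async def gather2(*aws):
    return [await a for a in aws]

async def demo() -> None:
    async def one() -> int:
        return 1
    # An unpacked tuple of unknown length should select the fallback overload.
    tasks = tuple(one() for _ in range(3))
    results = await gather2(*tasks)
    assert results == [1, 1, 1]
```

Without a rule like step 4, a checker seeing `gather2(*tasks)` has no principled way to prefer the fallback over the first arity-compatible overload.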

Member:

Why do you think it's unsound?

See the test case (check_variadic in overloads_evaluation.py): if you call check_variadic([1]), variadic() will return a str, but type checkers will infer int.
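A reconstruction of the kind of case being described (names and the implementation body are hypothetical; the actual test lives in overloads_evaluation.py):

```python
from typing import overload

@overload
def variadic(x: int, /) -> str: ...
@overload
def variadic(x: int, y: int, /, *args: int) -> int: ...
def variadic(x, y=None, /, *args):
    # Hypothetical implementation: the arity-1 path returns a str,
    # while the arity >= 2 paths return an int.
    if y is None:
        return "single"
    return x + y + sum(args)

def check_variadic(args: list[int]) -> None:
    # Per step 4, the unpacked argument of indeterminate length selects the
    # overload with *args, so a type checker infers int for ret, even though
    # check_variadic([1]) actually produces the str "single" at runtime.
    ret = variadic(*args)
```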

If it is unsound, can you think of a modification or variation that would make it sound?

My naive answer would be that nothing special should happen here, so we will end up selecting all overloads that might match, combine their return types, and see what comes out. That would often be Any though. Even if we use a union instead, asyncio.gather would result in a very unwieldy type.

I'm fine with keeping this rule as a pragmatic way to get the desired behavior for gather-like functions, but I'd be happier if we found an approach that preserves soundness a bit more, or at least a narrower application of the rule.

@carljm (Member) commented Jan 29, 2025

I think a modification that would improve soundness would be to require that the overload with a variadic parameter return something that is assignable from the eliminated overload(s). In principle this is still compatible with the gather "fallback" scenario. In practice, for some reason the current gather stub returns a list rather than an indeterminate-length tuple in the fallback overload, even though it returns tuples in all the known-arity cases. Unless that were changed, it wouldn't be compatible with this tightening of the rules.

(This tightening would be equivalent to "just union the return types of all matching overloads" in the case of fully-static return types, when the rule is followed; it would behave a little differently for non-fully-static return types, where assignability of one to another doesn't imply that they will simplify out of a union.)

Collaborator (Author):
The reason will be displayed to describe this comment to others. Learn more.

My naive answer would be that nothing special should happen here, so we will end up selecting all overloads that might match, combine their return types, and see what comes out.

If we were to omit step 4 and follow the normal rules, we wouldn't "select all overloads that might match and combine their return types". The normal rules (in the absence of an argument that contains an Any) would choose the first matching overload, and I don't think that's what we want here. That means some special rule is needed to handle variadic parameters (or unpacked arguments of indeterminate length).

I see what you mean about unsoundness though, so maybe we need to reconsider the formulation of step 4. The variant that Carl suggests above seems like a reasonable proposal.

Here's another option to consider. Maybe we should focus on the unpacked arguments rather than variadic parameters. This highlights an area of ambiguity for (non-overloaded) calls. The spec is currently silent about what to do when an unpacked argument of indeterminate length is matched against a list of parameters. Consider the following example:

def func1(a: int): ...
def func2(*args: int): ...
def func3(a: int, b: str): ...
def func4(a: int, *args: str): ...

def test1(*args: int):
    func1(*args)  # OK
    func2(*args)  # OK
    func3(*args, "")  # Error
    func4(*args)  # Error

I made pyright agree with mypy in the above example, but this has always bothered me because I'm not convinced that mypy is correct (or consistent) in its behavior here. For example, if it accepts func1, why shouldn't it also accept func3 and func4? I've received numerous bug reports and questions regarding these cases.

If we were to specify the behavior of unpacked iterables within a call expression in a way that is "tighter" (i.e. eliminates more false negatives at the expense of more false positives), it might help us simplify the overload resolution rules.

Then again, maybe we don't want to tackle this additional area of the spec at the same time as overloads.

Thoughts?
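A quick runtime check, reusing two of the definitions above, shows why the "OK" verdicts are themselves optimistic: an unpacked tuple of the wrong length fails at the call site.

```python
def func1(a: int): ...
def func2(*args: int): ...

# func1(*args) is accepted by mypy and pyright, but it is only safe when
# the unpacked iterable happens to contain exactly one element.
args: tuple[int, ...] = (1, 2)
try:
    func1(*args)  # accepted statically, but the tuple has two elements here
except TypeError as exc:
    print("runtime failure:", exc)
func2(*args)  # always safe: *args absorbs any number of arguments
```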

Member:

for some reason the current gather stub returns a list rather than an indeterminate-length tuple in the fallback overload, even though it returns tuples in all the known-arity cases

The background for this is that gather() actually returns a list at runtime (in all cases). For the known-arity overloads, we currently lie and return a tuple because you can't have a heterogeneous list in the type system. It's an unfortunate situation without a great solution.

I'd need to think more about the other options discussed here. Regarding Eric's example about non-overload *args matching, I feel ideally the func1(*args) call should be an error, since it may fail at runtime. However, this might be unreasonably strict in practice.

Member:

The background for this is that gather() actually returns a list at runtime (in all cases). For the known-arity overloads, we currently lie and return a tuple because you can't have a heterogeneous list in the type system. It's an unfortunate situation without a great solution.

Yes, I realized that after I made my comment. If we are already lying and saying it returns a tuple, it seems like it wouldn't make the situation any worse to have it return an indeterminate-length tuple in the fallback case, too, which would make it conform to the assignability rule I proposed above.

I feel ideally the func1(*args) call should be an error

I feel the same (and mentioned it in another comment thread.) But it seems likely that this change would cause a lot of churn for existing users?

Then again, maybe we don't want to tackle this additional area of the spec at the same time as overloads.

I think better specification of call checking would be a good idea, and would make the overload spec less ambiguous. Doing it in this PR would delay landing this PR, and make it even larger for reviewers -- but if it adds clarity, that might be an OK tradeoff?

Contributor:

I agree that it would be good to specify call checking in more detail, but I don't think we need to delay this PR -- this content can be adjusted later on as needed.

@erictraut erictraut changed the title First draft of an update to the Overloads chapter (DRAFT: DO NOT MERGE) Update to the Overloads chapter Jan 29, 2025
@jorenham left a comment

It's good to see that this complicated matter is crystallizing into something solid!

After reading through this latest version, a couple of questions came to mind:

  • What is the type of an overloaded function? When can it be assigned to a Callable, and the other way around?
  • Is a Protocol with an overloaded __call__ method an overloaded function type, or does it follow different rules?
  • When assigning an overloaded function to some Callable[Tss, R], then what happens to the individual signatures? Do they live within the Tss paramspec, and does R become causally dependent on Tss, i.e. as a type-mapping? Or does this assignment cause the overloaded type to change into a different function type without overloads?
  • Do I understand correctly that this spec allows decorating an overloaded function in .pyi stubs, as well? Because this is currently not supported in (at least) pyright.

Type checkers may ignore the possibility of multiple inheritance or
intersections involving structural types for purposes of computing overlap.
In the following example, classes ``A`` and ``B`` could theoretically overlap
because there could be a common type ``C`` that derives from both ``A`` and


Never/NoReturn and Any are both assignable to both A and B.
Does this mean that A and B (and all other types) always overlap, or do they get special treatment?

Member:

The wording above is "If two overloads can accept the same set of arguments, they are said to 'partially overlap'." Perhaps we should clarify what a "set of arguments" means. If it means "a set of argument types", then one could say "what about an argument typed as Never?" But if by "set of arguments" we mean " some possible set of objects (inhabitants of types), one for each argument", the issue disappears: if any argument is typed as Never, that call corresponds to zero possible sets of arguments, and so plays no role in determining overlap.

I think the latter definition is the correct one for determining whether a pair of overloads should be considered to overlap.

Collaborator (Author):

Good suggestion. That's what I meant to say in the original wording, but I can see how my wording could be misinterpreted. This change would eliminate the ambiguity.

Reply:

" some possible set of objects (inhabitants of types), one for each argument",

Yeah, that makes sense to me. But this would still include Never, i.e. the empty set of objects. Stating it as "non-empty sets of objects" would avoid that.

@carljm (Member) commented Feb 1, 2025

No, that's not what I meant, sorry -- there's still a confusion here over two uses of the word "set". It is used here to mean "one per argument", like a "collect one of each!" kind of set. We should probably avoid using "set" in this way entirely here.

What we mean to say is "is there any possible concrete call that could match more than one overload? If so, those overloads partially overlap" where by "concrete call" we mean "exactly one object per argument". A "concrete call", in this sense, does not involve any types (no "sets of objects"), just objects.
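A minimal illustration of that definition (f is a hypothetical function, not from the spec): a single concrete call can match two overloads even though neither parameter list is assignable to the other.

```python
from typing import overload

@overload
def f(x: int, y: object) -> int: ...
@overload
def f(x: object, y: int) -> str: ...
def f(x, y):
    return x if isinstance(x, int) else str(y)

# Neither overload's parameter list is a subtype of the other's, yet the
# single concrete call f(1, 2) -- exactly one object per argument -- matches
# both signatures, so these two overloads partially overlap.
```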

for all remaining overloads are :term:`equivalent`, proceed to step 6.

If the return types are not equivalent, overload matching is ambiguous. In
this case, assume a return type of ``Any`` and stop.


Existing stubs (including typeshed, numpy, pandas, and others) assume that the result will be Any in this case, so I don't think this is something we can change at this point.

I maintain the stubs of NumPy and SciPy, and I know those stubs better than I care to admit, but this doesn't ring a bell for me. Do you have an example?
Because if so, then I'd be more than willing to change that. I'd like to avoid having to deal with Any if I can (and not only because set("Any") < set("Annoying")).

docs/spec/overload.rst Show resolved Hide resolved
docs/spec/overload.rst Show resolved Hide resolved
@JukkaL (Contributor) left a comment

Thanks for specifying overloads in more detail! Overall this looks good; just a few comments. This is important for making stubs and third-party libraries behave consistently across type checkers; without standardization, it may be impossible to provide library definitions that work the same way everywhere.

Step 5: For each argument, determine whether all possible
:term:`materializations <materialize>` of the argument's type are assignable to
the corresponding parameter type for each of the remaining overloads. If so,
eliminate all of the subsequent remaining overloads.
Contributor:

This paragraph felt unclear. I had to read it multiple times to understand the intent. Maybe reword this, or include some motivation why we have this rule, so that this can be understood easily without peeking at the following paragraph?

In the following example, classes ``A`` and ``B`` could theoretically overlap
because there could be a common type ``C`` that derives from both ``A`` and
``B``, but type checkers may choose not to flag this as an overlapping
overload::
Contributor:

The "may choose not to flag" wording implies that it's reasonable for a type checker to flag an overloaded function definition such as this example, various conformance suite tests, or even common __getitem__ definitions in typeshed stubs, as invalid. It seems a little odd that examples in the specification or widely accepted definitions in typeshed could be invalid unless a type checker chooses to implement an optional rule. Could we make it a recommendation not to flag overloads like these:

@overload
def func(x: int) -> int: ...
@overload
def func(x: str) -> str: ...

Since this is symmetric with the example below, it would likely imply that the example below probably shouldn't be flagged either. Note that my example above is always fine at runtime, since it's not possible to have a subclass of both int and str, but this can't be represented in the type system.

Another idea would be to say something like "type checkers may choose to flag this as an overlapping overload, if there is evidence that some class inherits from both A and B", but this is pretty vague.

Member:

I think we should be reluctant (that is, we should have a very strong rationale) before we require by spec that type checkers implement something that can easily give wrong results for reasonable code.

I think that overloads that are unsound in the face of multiple inheritance are a hole that we should aim to fix in the Python type system, so I would rather not specify that type checkers are required to leave this hole open.

it's not possible to have a subclass of both int and str, but this can't be represented in the type system.

I think it can be represented in the type system, because it can be framed as a rule about __slots__ compatibility (that's what it is at runtime). __slots__ is visible on Python classes, and can also be included in stubs for C extension types. To do this without special-casing builtin types would require adding __slots__ definitions to some builtin types in typeshed, but this seems doable.

if there is evidence that some class inherits from both A and B

I'm not sure what this would mean for a library defining API functions that are called by clients. When type checking the library, there is no way to have evidence one way or another about which classes exist in the client.

The way to ensure that there is no such class is to mark either A or B as final, or use __slots__ to ensure they can't be inherited together (if type checkers implement an understanding of how __slots__ restricts multiple inheritance.)
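The __slots__ restriction mentioned here is observable at runtime today; a small sketch:

```python
class A:
    __slots__ = ("x",)

class B:
    __slots__ = ("y",)

# Each base contributes its own slot layout, so CPython raises TypeError
# ("instance lay-out conflict") for any attempted common subclass. If type
# checkers modeled this rule, overloads taking A and B could be proven
# non-overlapping without marking either class final.
try:
    class C(A, B):
        __slots__ = ()
except TypeError as exc:
    print(type(exc).__name__)
```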


assert_type(ret1, int)

ret2 = example4(v2, 1)
assert_type(ret2, Any)
Contributor:

Would it make sense to add more test cases for ambiguous overload resolution resulting in an Any return type? I couldn't see other tests for this.

Member:

Are there particular distinct scenarios that you think should also be tested?

Collaborator (Author):

You might want to look at the unit tests in pyright and mypy. I wouldn't recommend copying test code verbatim, but they can serve as a source of inspiration for additional conformance tests.

Here are some relevant test cases in the pyright test suite:

Member:

I don't expect to contribute more test cases to this PR myself soon, but I have no objection to anyone building on the existing test cases who sees gaps they would like to fill.

parameter of the generic class ``typing.IO`` is constrained (only
``IO[str]``, ``IO[bytes]`` and ``IO[Any]`` are valid)::

class IO(Generic[AnyStr]): ...


Invalid overload definitions
Collaborator:

Should having a non-empty body for an @overload-decorated definition be an error? E.g.,

@overload
def f(x: int) -> int:
    return x
@overload
def f(x: str) -> str:
    return x
def f(x):
    return x

The first two implementations are pointless and potentially misleading.

Collaborator (Author):

Traditionally, it hasn't been flagged as an error. It will not lead to unsoundness or a runtime exception, so my inclination is to not mandate that type checkers flag it. Arguably, this is more appropriate for a linter. I guess I don't feel that strongly about it one way or another.

I'll note that the typing spec currently doesn't clearly define what "empty body" means, and mypy and pyright differ slightly on their definitions of this. This is on my list of items to discuss and unify, but it's been lower priority.
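For context, the body forms conventionally treated as "empty" by both tools look roughly like this (the precise definition is, as noted, not yet specified):

```python
from typing import overload

@overload
def f(x: int) -> int: ...   # bare ellipsis body: the common convention
@overload
def f(x: str) -> str:
    """A docstring-only body is also typically treated as empty."""
def f(x):
    return x
```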

Collaborator:

Fair enough! I don't feel strongly about this either, just a passing thought.

Comment on lines +254 to +274
Step 1: Examine the argument list to determine the number of
positional and keyword arguments. Use this information to eliminate any
overload candidates that are not plausible based on their
input signatures.

- If no candidate overloads remain, generate an error and stop.
- If only one candidate overload remains, it is the winning match. Evaluate
it as if it were a non-overloaded function call and stop.
- If two or more candidate overloads remain, proceed to step 2.


Step 2: Evaluate each remaining overload as a regular (non-overloaded)
call to determine whether it is compatible with the supplied
argument list. Unlike step 1, this step considers the types of the parameters
and arguments. During this step, do not generate any user-visible errors.
Simply record which of the overloads result in evaluation errors.

- If all overloads result in errors, proceed to step 3.
- If only one overload evaluates without error, it is the winning match.
Evaluate it as if it were a non-overloaded function call and stop.
- If two or more candidate overloads remain, proceed to step 4.
Collaborator:

For what it's worth, I would also find a one-step version easier to understand, but I don't have any strong objections to describing the two-step version, especially if we're not sure that they lead to the same result and the two-step one is more performant.

@erictraut (Collaborator, Author) commented

What is the type of an overloaded function? When can it be assigned to a Callable, and the other way around?

This is already covered in the typing spec here.

Is a Protocol with an overloaded __call__ method an overloaded function type, or does it follow different rules?

This is already covered in the typing spec here.

When assigning an overloaded function to some Callable[Tss, R], then what happens to the individual signatures? Do they live within the Tss paramspec, and does R become causally dependent on Tss, i.e. as a type-mapping? Or does this assignment cause the overloaded type to change into a different function type without overloads?

This is not currently specified. It's out of scope for this PR.

Do I understand correctly that this spec allows decorating an overloaded function in .pyi stubs, as well? Because this is currently not supported in (at least) pyright.

This is already covered in the typing spec here. The short answer is, no, arbitrary decorators should not be used in stub files.

@jorenham

jorenham commented Feb 1, 2025

When assigning an overloaded function to some Callable[Tss, R], then what happens to the individual signatures? Do they live within the Tss paramspec, and does R become causally dependent on Tss, i.e. as a type-mapping? Or does this assignment cause the overloaded type to change into a different function type without overloads?

This is not currently specified. It's out of scope for this PR.

Ah ok. The scope of this PR wasn't all that clear to me after reading the PR description. So before I ask anything else: what exactly is within the scope of this PR?
It touches on multiple topics, which are of course all related, but it isn't at all obvious to me why it isn't required to also include typing.Callable or callable Protocol types when specifying the precise mechanics of overloads. They will likely also be directly affected by the choices made here. So I don't see why we shouldn't at least talk about those, so that we can avoid any unwanted (and currently unknown) potential consequences there.

@jorenham

jorenham commented Feb 1, 2025

What is the type of an overloaded function? When can it be assigned to a Callable, and the other way around?

This is already covered in the typing spec here.

Is a Protocol with an overloaded call method an overloaded function type, or does it follow different rules?

This is already covered in the typing spec here.

Ok I wasn't aware of these sections. Maybe it could help to link to them from this spec?

@erictraut

The scope of this PR is to describe:

  1. The way to perform overload matching — how to evaluate call expressions that target an overloaded function
  2. The way to check for overload consistency, overlapping overloads, and implementation consistency

These new sections rely upon definitions and concepts described elsewhere in the typing spec including assignability, materialization, type equivalency, enums, tuples, etc.

We try hard not to duplicate concept definitions in the spec because duplicate definitions will inevitably get out of sync and cause confusion. This section shouldn't need to say anything about assignability rules for callables or protocols because those are discussed elsewhere in the spec.

We've also endeavored to make concepts in the typing spec as orthogonal and composable as possible. If you see cases where concepts are not composing, those are cases that we should discuss.


def is_one(x: int) -> bool:
    return x == 1


I realize that conformance tests are intended to be very basic, but this seems slightly too basic. Can we add a case with at least two arguments and a case with default values? E.g. mypy stopped flagging the following as an error a while ago (probably due to a regression; those signatures do "partially overlap" according to the definition in the spec and have incompatible return types):

@overload
def fn(x: str = ...) -> str: ...
@overload
def fn(x: int = ...) -> int: ...
def fn(x: int | str = 0) -> int | str:
    return x

^^^^^^^^^^^^^^^^^^^^^

If two overloads can accept the same set of arguments, they are said
to "partially overlap". If two overloads partially overlap, the return type

@JelleZijlstra, @JukkaL, I want to confirm that you are OK with the proposed definition of "partially overlapping overloads" in light of the response @ilevkivskyi posted to this mypy issue. I think Ivan's point is worth considering, although I'm not sure how to write the specification for "cases that could fool the type checker into choosing the wrong overload". Intuitively, I know what he means here, but a precise spec might be difficult to formulate. It looks from this historical comment that mypy contains quite a few rules and various exceptions to those rules.


How does pyright currently check overlapping overloads?

The text already contains a number of special cases (__get__, ignoring multiple inheritance, etc.). Also, it seems to me that overlap is a somewhat isolated issue; the rules we set out for evaluating overload calls still work even if some overloads overlap. It might be better then to leave the spec vague, saying that type checkers may choose to flag some unsafe overlaps, without trying to come up with precise rules.

@erictraut commented Feb 6, 2025

How does pyright currently check overlapping overloads?

The implementation is complicated. It leverages assignability rules but internally sets a flag that changes some of the normal assignability behaviors. For example, normally Any is assignable to every other type, but when this flag is set, Any is not assignable to any type (even itself). There are a number of other changes in the algorithm, but they attempt to implement the behavior described in this PR.
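A toy model of the flag described above (illustrative only, not pyright's actual implementation): a tiny assignability relation that, while checking for overlap, treats `Any` as assignable to nothing, not even itself.

```python
ANY = "Any"

def assignable(src, dst, *, checking_overlap=False):
    if src == ANY or dst == ANY:
        # Normal mode: Any is assignable in both directions.
        # Overlap-checking mode: Any is assignable to nothing.
        return not checking_overlap
    return src == dst  # stand-in for the real subtyping rules

print(assignable(ANY, "int"))                       # True normally
print(assignable(ANY, ANY, checking_overlap=True))  # False under the flag
```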

If we don't have agreement on the desired behavior, then I agree that it's best to keep the spec loose here. Actually, I'm not sure it makes sense to discuss it at all in the spec if we are not in agreement on the intended behavior. So maybe we should just delete that entire section and the corresponding conformance tests.

@samwgoldman

How should subtyping overloaded functions behave?

I think there is a simple intuition, which is that f is a subtype of g if every call to f is also a valid call to g. If we follow that intuition, then I would expect the following examples to pass:

Example 1: Subtyping between overloaded callback Protocol and Callable type

from typing import Protocol, Callable, overload, assert_type

class P(Protocol):
  @overload
  def __call__(self, x: int, y: str, z: int) -> str: ...
  @overload
  def __call__(self, x: int, y: int, z: int) -> int: ...

def check_expand_union_callable(v: int | str, f: P) -> None:
    g: Callable[[int, int | str, int], int | str] = f
    ret = g(1, v, 1)
    assert_type(ret, int | str)

Example 2: Subclass/sub-protocol consistency

from typing import Protocol, overload, assert_type

class Base(Protocol):
    def m(self, x: int, y: int | str, z: int) -> int | str: ...

class Derived(Base, Protocol):
    @overload
    def m(self, x: int, y: str, z: int) -> str: ...
    @overload
    def m(self, x: int, y: int, z: int) -> int: ...

def check_expand_protocol(v: int | str, base: Base, derived: Derived) -> None:
    ret_base = base.m(1, v, 1)
    assert_type(ret_base, int | str)
    
    ret_derived = derived.m(1, v, 1)
    assert_type(ret_derived, int | str)

For what it's worth, I think it's reasonable to restrict the union expansion to calls. Union expansion is complicated and has concerning performance implications. My understanding is that it is included in the spec because people need it, but if type checkers do not currently implement these rules for subtyping purposes, maybe that's evidence that this need does not extend to subtyping, and we can forgo the rule there entirely?

@erictraut

I agree that type expansion shouldn't affect (infect) subtyping rules. Assignability rules for overloaded callables are already defined in the spec here.

What you're pointing out here is a small inconsistency between call evaluation behavior and the subtyping behavior. This was also recently pointed out by @hauntsaninja in this pyright issue.

We could look at amending the assignability rules to eliminate this inconsistency, but this would come at a big price — in terms of complexity and performance. My sense is that it wouldn't be a good tradeoff.

I was careful in this PR to talk about type expansion only in the context of "argument types". Arguments are applicable only to calls.

@samwgoldman

While looking at overload resolution for Pyre, I noticed an ambiguity in the spec for Step 2. Specifically, what should we do with call argument expressions which have "internal errors"? That is, evaluating the expression leads to an error, but the expression still results in a type. For example f(g(x)) where g(x) is an error, but returns a type compatible with f.

A worked example:

from typing import overload, assert_type

@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:
    return x

def h(x: str) -> str:
    return ""

def g(x: str) -> int:
    return 0


# Call with error, returns int, select first overload
assert_type(f(g(0)), int)

x = g(0)
assert_type(f(x), int)


# Call with error, returns str, select second overload
assert_type(f(h(0)), str)

y = h(0)
assert_type(f(y), str)

Should we specify that overload selection in Step 2 is determined by errors in between the arguments and parameters of the overload signature -- not simply the presence/absence of errors on the call overall?

@erictraut

I noticed an ambiguity in the spec

When evaluating a call to an overloaded function, I think the behavior for "internal errors" should be the same as with calls to non-overloaded functions. That is, if there are type errors detected when evaluating argument expressions, those errors shouldn't affect the evaluation of the call expression itself. I think that's consistent with what you mean by "errors in between the arguments and parameters".

I'm not sure this matters though. Perhaps we can just leave this unspecified. My view is that in cases where a type error is detected and reported by a type checker, any downstream type evaluations that depend on that error are not covered by the spec. Type checkers will generally want to "limit the collateral damage" and reduce downstream false positives once the first error is detected, but I don't think we should try to mandate specific behaviors here. Ultimately, it's up to the user to fix the "inner error" type violation; once that's fixed, then the type checker can guarantee conformant behavior for dependent type evaluations.

If you can think of a situation where an "inner error" is detected but not reported to the user during overloaded call evaluation, this would be a bigger concern because it would effectively change the results of the overloaded call without the user realizing it.
