
Should dict() constructor overloads be more permissive?

Open dimaqq opened this issue 1 year ago • 6 comments

Current constructor overloads are here:

https://github.com/python/typeshed/blob/e80ad6b2bce7ef6b2a20aec2d79a672859b31864/stdlib/builtins.pyi#L1041-L1046

Someone took care to allow a specific positive case and to block a specific negative case, but many potentially valid uses have fallen through the cracks:

dict([[1, 2]])
dict("AT TA GC CG".split())
dict([["str", b"bytes"]])

My understanding of the issue is that there's no way to specify a sequence's length in a type hint (other than with a tuple); that is, there's no syntax to distinguish ["a", "b"] from ["a", "b", "c"].
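To illustrate the gap, here is a minimal sketch (the names `Pair`, `pairs_as_tuples`, and `pairs_as_lists` are made up for this example). A tuple annotation carries its length, while a list annotation does not, so a checker cannot tell a two-item list from a three-item one:

```python
# Hypothetical names, for illustration only.
Pair = tuple[str, str]  # the length (2) is part of the type

pairs_as_tuples: list[Pair] = [("A", "T"), ("G", "C")]
pairs_as_lists: list[list[int]] = [[1, 2], [3, 4]]  # inner length is not expressible

# Both calls succeed at run time. Only the first matches the tuple-based
# overload; per this issue, type checkers reject the second.
from_tuples = dict(pairs_as_tuples)
from_lists = dict(pairs_as_lists)
```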

This brings up a philosophical question: which side should typeshed err on when type hint syntax is not precise enough?

  • type whatever Python may accept at run time, or
  • restrict users to what Python is guaranteed to accept at run time?

It was mentioned at https://github.com/microsoft/pyright/issues/7382 that type checkers trust typeshed, and thus the question belongs here.

My personal preference would be for permissive type hints in these cases.

dimaqq avatar Mar 05 '24 14:03 dimaqq

Previously discussed in #2287. This is due to the lack of fixed-length sequences, see python/typing#592.

srittau avatar Mar 05 '24 14:03 srittau

Indeed, thank you for the reference to the canonical upstream ticket.

Come to think of it, there seems to be a precedent for a looser overload spec already:

https://github.com/python/typeshed/blob/e80ad6b2bce7ef6b2a20aec2d79a672859b31864/stdlib/builtins.pyi#L1043-L1044

The typing machinery allows both of the calls below, although only the first is allowed at runtime:

dict([["a", "b"]])
dict([["a", "b", "c"]])
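The runtime difference between the two calls can be checked directly. This is a small sketch; the variable names `ok` and `error` are just for the example:

```python
# A two-item inner list is accepted at run time.
ok = dict([["a", "b"]])

# A three-item inner list is rejected at run time with a ValueError,
# even though the list[str] overload lets it pass type checking.
try:
    dict([["a", "b", "c"]])
except ValueError as exc:
    error = str(exc)
else:
    error = None
```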

I suppose the wider discussion should involve type checker folks and maybe some proxy for user code, but I'm not sure where it should take place: in this repo's issue tracker, or someplace else?

dimaqq avatar Mar 06 '24 00:03 dimaqq

This repo is the right place to make that decision. If you make a PR with a change, the mypy-primer tool will test its effect on a sample of open-source code.

JelleZijlstra avatar Mar 06 '24 01:03 JelleZijlstra

There is more discussion about this in #10013.

Akuli avatar Mar 06 '24 20:03 Akuli

Note that mypy_primer is much better at measuring the effect of changes that produce false positives than of changes that produce false negatives. You're subject to survivorship bias [insert aeroplane meme here] and reliant on unused ignores.

To your philosophical question, typeshed will usually prefer false negatives over false positives. That said, as the existence of the str splitting overloads indicates, we often try to be pragmatic.

With that in mind, and given that three issues about this in six years isn't the worst thing, I'd like to not have the fully general Iterable[Iterable[T]] overload. In particular, I'd like to keep the true positives in cases where str being a Sequence[str] comes up. Maybe something like def __init__(self: dict[T, T], __iterable: Iterable[list[T]]) -> None: ... would work? What is the specific real-world false positive you're running into?
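The shape of that suggestion can be sketched outside of typeshed. In the snippet below, `make_dict` is a hypothetical stand-in for dict.__init__ (it is not typeshed's actual stub), with one tuple-based overload and one Iterable[list[T]] overload as proposed above:

```python
from collections.abc import Iterable
from typing import TypeVar, overload

K = TypeVar("K")
V = TypeVar("V")
T = TypeVar("T")


# Hypothetical stand-in for dict.__init__; not typeshed's actual stub.
@overload
def make_dict(iterable: Iterable[tuple[K, V]], /) -> dict[K, V]: ...
@overload
def make_dict(iterable: Iterable[list[T]], /) -> dict[T, T]: ...
def make_dict(iterable, /):
    # At run time this simply defers to the real constructor.
    return dict(iterable)


# Expected to match the Iterable[list[T]] overload.
pairs = make_dict([[1, 2], [3, 4]])

# A checker would be expected to keep flagging this call, because the
# items are str rather than list[T], so that true positive is preserved:
# make_dict("AT TA GC CG".split())
```

Unlike a fully general Iterable[Iterable[T]] overload, this version does not accept an iterable of plain strings.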

hauntsaninja avatar Mar 06 '24 21:03 hauntsaninja

Code

A summary of the original code (since rewritten to pass type checks):

class CatClass: ...
class DogProvider: ...

REGISTRY: list[list[str|object]] = []

# some submodule
REGISTRY.extend((
    ["cats", CatClass],
    ["dogs", DogProvider],
    # ...
))

# another submodule
REGISTRY.extend((
    # ...
))

# somewhere else
d = dict(REGISTRY)

# or, simpler, like this
d = dict([
    ["cats", CatClass],
    ["dogs", DogProvider],
])
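For comparison, here is one way such a registry might be written so that it matches the tuple-based overload. This is only a sketch of a possible rewrite, not necessarily the change that was actually made:

```python
class CatClass: ...
class DogProvider: ...

# Tuples make the pair length part of the type.
REGISTRY: list[tuple[str, object]] = []

REGISTRY.extend((
    ("cats", CatClass),
    ("dogs", DogProvider),
))

# Matches the Iterable[tuple[K, V]] overload.
d = dict(REGISTRY)
```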

Pyright

The former dict call fails with:

  • No overloads for "__init__" match the provided arguments
  • Argument of type "list[list[str | object]]" cannot be assigned to parameter "iterable" of type "Iterable[list[bytes]]" in function "__init__"

The latter with:

  • No overloads for "__init__" match the provided arguments
  • Argument of type "list[list[str | type[CatClass]] | list[str | type[DogProvider]]]" cannot be assigned to parameter "iterable" of type "Iterable[list[bytes]]" in function "__init__"

Extra points for the confusing "expectation" of "Iterable[list[bytes]]"... Like, where did bytes even come from? There's not a single byte in the code.

MyPy

The former:

  • Argument 1 to "dict" has incompatible type "list[list[str | object]]"; expected "Iterable[tuple[Never, Never]]"

The latter:

  • List item 0 has incompatible type "list[object]"; expected "tuple[Any, Any]"

Extra points for "Iterable[tuple[Never, Never]]"... Like, what does Never even mean?

dimaqq avatar Aug 29 '24 08:08 dimaqq