mypy icon indicating copy to clipboard operation
mypy copied to clipboard

False positive: No overload variant of "asdict" matches argument type "Type[DataclassInstance]" [call-overload]

Open brianmedigate opened this issue 1 year ago • 13 comments

Bug Report

Mypy 1.11.0 gives me a false positive that 1.10.1 didn't.

Looks like it thinks that is_dataclass() narrows to Type[DataclassInstance], whereas in reality it should probably be something more like Type[DataclassInstance] | DataclassInstance.

Not sure if this is a typeshed issue or a mypy issue, but it might be related to the recent overload changes.

To Reproduce

https://mypy-play.net/?mypy=latest&python=3.8&gist=a561e9787a7710733b09df23d2fc5c04

from dataclasses import dataclass, asdict, is_dataclass
from typing import Any


@dataclass
class Foo:
    a: int


foo: Any = Foo(a=1)

if is_dataclass(foo):
    asdict(foo)

Expected Behavior

No errors

Actual Behavior

main.py:13: error: No overload variant of "asdict" matches argument type "Type[DataclassInstance]"  [call-overload]
main.py:13: note: Possible overload variants:
main.py:13: note:     def asdict(obj: DataclassInstance) -> Dict[str, Any]
main.py:13: note:     def [_T] asdict(obj: DataclassInstance, *, dict_factory: Callable[[List[Tuple[str, Any]]], _T]) -> _T
Found 1 error in 1 file (checked 1 source file)

Your Environment

  • Mypy version used: 1.11.0
  • Python version used: 3.8.19

brianmedigate avatar Jul 21 '24 07:07 brianmedigate

I also encountered this on Friday so thank you for filing this. We are using python 3.11. mypy 1.10 was fine.

Code is at https://github.com/lsst/daf_butler/blob/main/python/lsst/daf/butler/formatters/json.py#L135-L136

Example action failure: https://github.com/lsst/daf_butler/actions/runs/10014302029/job/27683746998

timj avatar Jul 21 '24 20:07 timj

I'm not sure it's a false positive. If is_dataclass narrows to Type[DataclassInstance] | DataclassInstance, then you need to ensure it's not Type[DataclassInstance] before passing to asdict. From your example, it would be fixed by

if is_dataclass(foo) and not isinstance(foo, type):
    asdict(foo)

cbornet avatar Jul 22 '24 10:07 cbornet

Hmm, good point. The error message confused me because it said argument type "Type[DataclassInstance]".

Thanks!

brianmedigate avatar Jul 22 '24 11:07 brianmedigate

I tried changing my code to:

137 if dataclasses.is_dataclass(inMemoryDataset) and not isinstance(inMemoryDataset, type):
138     inMemoryDataset = dataclasses.asdict(inMemoryDataset)

but whilst the code does work fine mypy is now complaining about a different thing:

python/lsst/daf/butler/formatters/json.py:138: error: Statement is unreachable  [unreachable]

so it looks like mypy doesn't really understand that that isinstance check filters out types.

timj avatar Jul 22 '24 17:07 timj

@timj not sure what your new problem is without more context. For me the isinstance did the job. BTW, inMemoryDataset = dataclasses.asdict(inMemoryDataset) will probably not please mypy as you're trying to assign a dict to a variable that mypy has typed to dataclass.

cbornet avatar Jul 22 '24 21:07 cbornet

Sorry. The context was the same lines from my previous comment pointing at my repo. To be concrete:

import dataclasses
from typing import Any


def to_dict(in_memory_dataset: Any) -> dict:
    as_dict: dict[str, Any]
    if dataclasses.is_dataclass(in_memory_dataset) and not isinstance(in_memory_dataset, type):
        as_dict = dataclasses.asdict(in_memory_dataset)
    elif hasattr(in_memory_dataset, "_asdict"):
        as_dict = in_memory_dataset._asdict()
    else:
        as_dict = {}
    return as_dict

gives me an error:

test.py:8: error: Statement is unreachable  [unreachable]
Found 1 error in 1 file (checked 1 source file)

Note that my mypy.ini has warn_unreachable = True in it (which doesn't seem to be the default).

timj avatar Jul 22 '24 22:07 timj

hmm, you're right. Here's a playground reproducing it: https://mypy-play.net/?mypy=latest&python=3.12&flags=warn-unreachable&gist=9453a339138852b03bf0d8f16286baf4

If I add a reveal_type() after just the dataclasses.is_dataclass(in_memory_dataset) check, it does think the type is main.py:8: note: Revealed type is "type[_typeshed.DataclassInstance]", not Union[DataclassInstance, type[DataclassInstance]] or whatever like I would expect: https://mypy-play.net/?mypy=latest&python=3.12&flags=warn-unreachable&gist=0948b2c430cab1d09fa0a378e772172b

Maybe this is still an issue.

brianmedigate avatar Jul 23 '24 05:07 brianmedigate

bump. encountered it too after an update to mypy 1.11.0 in python 3.12.4. # type: ignore'ed for now to get on with prod.

i very much appreciate the warning about the is_dataclass narrowing to instance or class. didn't know about the latter.

lczyk avatar Jul 23 '24 09:07 lczyk

Thanks for the mypy playground. It shows that in mypy 1.11 the type from the is_dataclass test is:

main.py:8: note: Revealed type is "type[_typeshed.DataclassInstance]"

but the type from 1.10.1 is still Any.

The bug seems to be that after is_dataclass the type should be a type[DataclassInstance] | DataclassInstance.

timj avatar Jul 23 '24 17:07 timj

related: https://github.com/python/typeshed/issues/12401

Looks like it really did show up in this version of mypy because of the overload changes. Also looks like it might be a problem without an easy fix, because it's not clear there's a better alternative than the current implementation.

@timj your case can be fixed by changing the order of the checks from if dataclasses.is_dataclass(in_memory_dataset) and not isinstance(in_memory_dataset, type): to if not isinstance(in_memory_dataset, type) and dataclasses.is_dataclass(in_memory_dataset): because in the first case it matches the first overload, and then the isinstance-of-type check narrows it to unreachable, but in the second case we first narrow the type to knowing that it isn't a type, then it matches the second overload and correctly narrows to DataclassInstance

brianmedigate avatar Jul 23 '24 20:07 brianmedigate

@brianmedigate, thanks. Changing the order of the checks does fix my unreachable statement problem.

timj avatar Jul 23 '24 20:07 timj

I believe the relevant change included in mypy 1.11 is https://github.com/python/typeshed/pull/11929

hauntsaninja avatar Jul 24 '24 00:07 hauntsaninja

I just came across this problem in my own code. Not sure what the most correct resolution is, but here is a complete implementation of is_dataclass_instance for python 3.12 that appears to work. Any feedback is appreciated!

from typing_extensions import TypeIs # Will be included in python3.13.


class DataclassInstance(Protocol):
  'Copied from typeshed/stdlib/_typeshed/__init__.py.'
  __dataclass_fields__: ClassVar[dict[str, Field[Any]]]


def is_dataclass_instance(obj:Any) -> TypeIs[DataclassInstance]:
  '''
  Returns `True` if `obj` is a dataclass instance. Returns `False` for all type objects.
  It is often more correct to use this function rather than `is_dataclass`
  it is easy to forget that `is_dataclass` returns `True` for types
  and many use cases do not intend to let type objects through.
  This function is annotated as returning `TypeIs[DataclassInstance]`
  to aid the type checker in narrowing the type of `obj`.
  '''
  return not isinstance(obj, type) and is_dataclass(obj)

gwk avatar Aug 27 '24 22:08 gwk

I ran into a slight variation and found a different workaround. I'm doing some (fairly sketchy) metaprogramming with dataclasses.fields():

import dataclasses
from typing import Any

def create_any_dataclass[T](ty: type[T]) -> T:
    assert dataclasses.is_dataclass(ty)

    params: dict[str, Any] = {}
    for field in dataclasses.fields(ty):
        params[field.name] = 'asdf' # something more sophisticated checking field.type
    
    return ty(**params)

But this workaround works because just ty2 is narrowed, while ty is unchanged.

import dataclasses
from typing import Any

def create_any_dataclass[T](ty: type[T]) -> T:
    ty2 = ty
    assert dataclasses.is_dataclass(ty2)

    params: dict[str, Any] = {}
    for field in dataclasses.fields(ty2):
        params[field.name] = 'asdf' # something more sophisticated checking field.type
    
    return ty(**params)

danieldulaney avatar Nov 07 '24 06:11 danieldulaney