typeshed icon indicating copy to clipboard operation
typeshed copied to clipboard

make xml module easier to work with

Open tungol opened this issue 5 months ago • 3 comments

Changes:

  • Removed the default value of _Tag, so Element is now Element[Any] instead of Element[str].
  • clean up small inconsistencies caused by the default type of _Tag obscuring where things were missed
  • The TypeAlias _AnyTag isn't strictly needed, but may be convenient for interested end users
  • ElementTree is generic over _Tag now instead of TypeVar(_Root, bound=Element[Any] | None). The None case is marginal and can be ignored. The variable represents the type of the tag attribute of elements within the tree.
  • TreeBuilder is also generic over _Tag. Ideally, it should be TreeBuilder[str] when both insert_comments and insert_pis are False, and TreeBuilder[_AnyTag] when either is true, but it wasn't possible to do something like this based on the __init__ method last time I tried. I don't know if this is worth doing without that.

XMLParser is unchanged, but worth noting because I'm not happy with it still. I don't know if it can be improved given how much duck typing is at work there, but I might take another crack at it later.

@hterik if you're able to try these stubs out against your code, can you let me know what pain points remain? You can clone this branch of typeshed down to your local machine, and tell mypy to use it with mypy --custom-typeshed-dir path/to/typeshed/repo

I don't have a lot of experience using the xml module myself, so your input is very useful.

I am also open to the idea that all of this typevar stuff was a bad idea in the first place - the main thing it's trying to do is save users from needing to guard against ET.Comment and ET.ProcessingInstruction when working with Elements that only contain string tags. It'd save a lot of complexity in the stubs if we bite the bullet on that and just type Element.tag as str | _ElementCallable universally. My intuition was that that would be a frequent papercut annoyance, but maybe the problems introduced by trying to be fancy about it are worse, actually.

tungol avatar Sep 04 '25 08:09 tungol

Pyright is failing on xml.etree.ElementTree users in our third party stubs that now are lacking type arguments for Element. I can fix those up easily enough, but I'll wait for opinions about the direction of this MR before I do that.

tungol avatar Sep 04 '25 08:09 tungol

Diff from mypy_primer, showing the effect of this PR on open source code:

openlibrary (https://github.com/internetarchive/openlibrary)
+ openlibrary/i18n/test_po_files.py: note: In function "test_html_format":
+ openlibrary/i18n/test_po_files.py:82: error: Need type annotation for "id_tree"  [var-annotated]
+ openlibrary/i18n/test_po_files.py:83: error: Need type annotation for "str_tree"  [var-annotated]

colour (https://github.com/colour-science/colour)
+ colour/io/tm2714.py:1690: error: Need type annotation for "tree"  [var-annotated]

sphinx (https://github.com/sphinx-doc/sphinx)
+ sphinx/testing/util.py: note: In function "etree_parse":
+ sphinx/testing/util.py:79:50: error: Missing type parameters for generic type "ElementTree"  [type-arg]
+ sphinx/ext/graphviz.py: note: In function "fix_svg_relative_paths":
+ sphinx/ext/graphviz.py:242:12: error: Need type annotation for "tree"  [var-annotated]

materialize (https://github.com/MaterializeInc/materialize)
- misc/python/materialize/cli/gen-chroma-syntax.py:56: error: Incompatible types in assignment (expression has type "Element[str] | None", variable has type "Element[str]")  [assignment]
+ misc/python/materialize/cli/gen-chroma-syntax.py:56: error: Incompatible types in assignment (expression has type "Element[Any] | None", variable has type "Element[Any]")  [assignment]
+ misc/python/materialize/cli/gen-chroma-syntax.py:70: error: Need type annotation for "tree"  [var-annotated]

boostedblob (https://github.com/hauntsaninja/boostedblob)
+ boostedblob/xml.py:15: error: Missing type parameters for generic type "Element"  [type-arg]
+ boostedblob/azure_auth.py:286: error: Need type annotation for "result"  [var-annotated]
+ boostedblob/azure_auth.py:573: error: Need type annotation for "result"  [var-annotated]
+ boostedblob/request.py:271: error: Missing type parameters for generic type "Element"  [type-arg]
+ boostedblob/request.py:287: error: Need type annotation for "result"  [var-annotated]
+ boostedblob/listing.py:394: error: Missing type parameters for generic type "Element"  [type-arg]
+ boostedblob/write.py:355: error: Need type annotation for "result"  [var-annotated]
+ boostedblob/write.py:446: error: Need type annotation for "result"  [var-annotated]
+ boostedblob/_recover.py:20: error: Missing type parameters for generic type "Element"  [type-arg]

python-chess (https://github.com/niklasf/python-chess)
+ chess/svg.py:151: error: Missing type parameters for generic type "Element"  [type-arg]
+ chess/svg.py:185: error: Missing type parameters for generic type "Element"  [type-arg]

mkosi (https://github.com/systemd/mkosi)
+ mkosi/distributions/opensuse.py:259:12: error: Need type annotation for "root"  [var-annotated]

pytest (https://github.com/pytest-dev/pytest)
+ src/_pytest/junitxml.py:94: error: Missing type parameters for generic type "Element"  [type-arg]
+ src/_pytest/junitxml.py:97: error: Missing type parameters for generic type "Element"  [type-arg]
+ src/_pytest/junitxml.py:107: error: Missing type parameters for generic type "Element"  [type-arg]
+ src/_pytest/junitxml.py:146: error: Missing type parameters for generic type "Element"  [type-arg]
+ src/_pytest/junitxml.py:685: error: Missing type parameters for generic type "Element"  [type-arg]

cloud-init (https://github.com/canonical/cloud-init)
+ cloudinit/sources/helpers/azure.py:382: error: Need type annotation for "root"  [var-annotated]
+ cloudinit/sources/helpers/azure.py:1026: error: Need type annotation for "root"  [var-annotated]

meson (https://github.com/mesonbuild/meson)
+ mesonbuild/modules/_qt.py:309:20: error: Need type annotation for "tree"  [var-annotated]

discord.py (https://github.com/Rapptz/discord.py)
- discord/ext/commands/hybrid.py:508: error: Overlap between argument names and ** TypedDict items: "description", "name"  [misc]
+ discord/ext/commands/hybrid.py:508: error: Overlap between argument names and ** TypedDict items: "name", "description"  [misc]
- discord/ext/commands/hybrid.py:629: error: Overlap between argument names and ** TypedDict items: "description", "name"  [misc]
+ discord/ext/commands/hybrid.py:629: error: Overlap between argument names and ** TypedDict items: "name", "description"  [misc]
- discord/ext/commands/hybrid.py:834: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/hybrid.py:834: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]
- discord/ext/commands/hybrid.py:858: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/hybrid.py:858: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]
- discord/ext/commands/hybrid.py:883: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/hybrid.py:883: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]
- discord/ext/commands/hybrid.py:935: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/hybrid.py:935: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]
- discord/ext/commands/bot.py:290: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/bot.py:290: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]
- discord/ext/commands/bot.py:314: error: Overlap between argument names and ** TypedDict items: "with_app_command", "name"  [misc]
+ discord/ext/commands/bot.py:314: error: Overlap between argument names and ** TypedDict items: "name", "with_app_command"  [misc]

github-actions[bot] avatar Sep 04 '25 08:09 github-actions[bot]

See #14678 for what it looks like if we commit to the type of Element.tag being a union of a bunch of stuff

tungol avatar Sep 05 '25 12:09 tungol