stub packages with overlapping namespaces overwrite their METADATA.toml
Hello,
While working on the python3-typeshed package for Debian I noticed that if one installs two types- packages involving the same namespace, the METADATA.toml files of the first package will be overwritten by the second package:
```
$ python3 -m venv typeshed-venv
$ . typeshed-venv/bin/activate
$ pip install types_google_cloud_ndb
Collecting types_google_cloud_ndb
Obtaining dependency information for types_google_cloud_ndb from https://files.pythonhosted.org/packages/09/d8/70b6b36b0e82095a43b2ff7cfe0a55f12fa53bdc70e4d2768538aae646fa/types_google_cloud_ndb-2.2.0.1-py3-none-any.whl.metadata
Downloading types_google_cloud_ndb-2.2.0.1-py3-none-any.whl.metadata (1.6 kB)
Downloading types_google_cloud_ndb-2.2.0.1-py3-none-any.whl (16 kB)
Installing collected packages: types_google_cloud_ndb
Successfully installed types_google_cloud_ndb-2.2.0.1
$ cat typeshed-venv/lib/python3.11/site-packages/google-stubs/METADATA.toml
version = "2.2.*"
upstream_repository = "https://github.com/googleapis/python-ndb"
partial_stub = true
[tool.stubtest]
stubtest_requirements = ["protobuf==3.20.2", "six"]
ignore_missing_stub = true
$ pip install types_protobuf
Collecting types_protobuf
Obtaining dependency information for types_protobuf from https://files.pythonhosted.org/packages/72/03/f7dd2f1ec9712c4242f04b7cb0f7e88605a98ee2695f0e98d72a277580aa/types_protobuf-4.24.0.4-py3-none-any.whl.metadata
Downloading types_protobuf-4.24.0.4-py3-none-any.whl.metadata (1.9 kB)
Downloading types_protobuf-4.24.0.4-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.1/62.1 kB 1.2 MB/s eta 0:00:00
Installing collected packages: types_protobuf
Successfully installed types_protobuf-4.24.0.4
$ cat typeshed-venv/lib/python3.11/site-packages/google-stubs/METADATA.toml
version = "4.24.*"
upstream_repository = "https://github.com/protocolbuffers/protobuf"
extra_description = "Generated using [mypy-protobuf==3.5.0](https://github.com/nipunn1313/mypy-protobuf/tree/v3.5.0) on protobuf==4.21.8"
partial_stub = true
[tool.stubtest]
ignore_missing_stub = true
```
From a packaging system perspective, this makes me uncomfortable.
- Does METADATA.toml need to be installed in the top-level namespace of stub packages?
- Would it be a problem to exclude it from the Debian packages, instead of randomly choosing which one is kept?
- If keeping these files is important, perhaps they could be prefixed, so they can exist alongside each other: METADATA.google_cloud_ndb.toml / METADATA.protobuf.toml perhaps
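For anyone who wants to check whether an environment is affected, here is a minimal sketch that lists files claimed by more than one installed distribution. It relies only on `importlib.metadata` from the standard library; the function name `find_shared_files` is made up for this example, and it assumes each distribution's RECORD is accurate.

```python
from collections import defaultdict
from importlib.metadata import distributions


def find_shared_files(dists=None):
    """Return {path: [distribution names]} for paths listed by 2+ distributions."""
    owners = defaultdict(set)
    for dist in (distributions() if dists is None else dists):
        name = dist.metadata["Name"] or "<unknown>"
        # dist.files is None when the distribution ships no RECORD file.
        for path in dist.files or []:
            owners[str(path)].add(name)
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    # Run inside the venv to inspect; prints one line per shared path.
    for path, names in sorted(find_shared_files().items()):
        print(f"{path}: {', '.join(names)}")
```

Run inside the venv from the report above, this should flag google-stubs/METADATA.toml as shared between the two stub distributions, provided both RECORD files list it.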
If I remember correctly, the METADATA files are mainly included for reference or potential use by tooling and have no runtime impact apart from that.
I see a few options, but I'd be interested in what others think.
- As suggested by @mr-c, rename the metadata files to include the package name. For consistency's sake, we should do so for all type packages, not only namespace packages.
- For namespace packages, copy the metadata file into all non-namespace packages below the namespace package. (E.g. into google-stubs/cloud/ndb/METADATA.toml instead of google-stubs/METADATA.toml.)
- Skip copying the metadata file for namespace packages.
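To make the first two options concrete, here is a rough sketch of how the target locations could be computed. This is not typeshed's actual build code. Both function names are hypothetical, and the sketch assumes that a directory is a regular (non-namespace) package exactly when it contains an `__init__.pyi`.

```python
from pathlib import Path


def prefixed_metadata_name(distribution: str) -> str:
    """Option 1: per-distribution file name, e.g. METADATA.protobuf.toml."""
    return f"METADATA.{distribution.replace('-', '_')}.toml"


def non_namespace_packages(stubs_root: Path) -> list[Path]:
    """Option 2: topmost directories below stubs_root that hold an __init__.pyi.

    stubs_root itself is assumed to be a namespace package directory.
    """
    found = []
    for child in sorted(stubs_root.iterdir()):
        if not child.is_dir():
            continue
        if (child / "__init__.pyi").exists():
            found.append(child)  # regular package: the metadata copy would go here
        else:
            found.extend(non_namespace_packages(child))  # still a namespace level
    return found
```

With option 1 the two files from the report would become METADATA.google_cloud_ndb.toml and METADATA.protobuf.toml. With option 2 the copies would land in directories such as google-stubs/cloud/ndb.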
> If I remember correctly, the METADATA files are mainly included for reference or potential use by tooling and have no runtime impact apart from that.
I believe this is correct: no external tool that I know of uses the METADATA.toml file, so it shouldn't be the worst thing in the world if you excluded them from the Debian packages @mr-c. The idea is that users should be able to inspect them if they want to, but I'm not sure if anybody actually does. (typeshed-stats looks at the METADATA.toml files, but it grabs them directly from GitHub rather than downloading the built packages from PyPI.)
But, I agree that we should also fix this in typeshed so that it isn't an issue in the first place. I like @srittau's option (1); it seems simplest to me.
I feel like a more principled solution might be to include the data from the METADATA.toml in the dist-info directory somehow, instead of in the stubs directory. After all, the METADATA.toml conceptually applies to the whole distribution, not to an individual directory.
I looked into that, but according to the packaging docs, this may not be allowed:
> This .dist-info directory may contain the following files, described in detail below:
I understand this to mean that the list is exhaustive, although we could ask the PyPA.
~In practice, pip doesn't like if you add extra files to dist info~ actually what I'm remembering may only be true of things not in RECORD
> I understand this to mean that the list is exhaustive, although we could ask the PyPA.
https://discuss.python.org/t/extra-files-in-dist-info/39418
It seems that, while adding extra files to .dist-info is not officially supported, it is tolerated. I don't think we should name the file METADATA.toml, though, considering possible confusion with the existing METADATA file. Maybe name it TYPESHED.toml or _TYPESHED.toml?
This leaves the technical aspect of adding the file to the .tar.gz and .whl files. This could prove tricky, as it seems that .whl files include a checksum. But I haven't looked into this.
The wheel library makes it easy to manage checksums; see e.g. https://github.com/hauntsaninja/change_wheel_version
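For reference, the checksum bookkeeping can also be sketched with the standard library alone. The example below copies a wheel, adds one file to its .dist-info directory, and appends a matching RECORD row (path, URL-safe base64 SHA-256 without padding, size). The function names are made up, the file name TYPESHED.toml is only the suggestion floated above, and the sketch assumes no path in RECORD contains a comma. Treat it as an illustration, not a tested build step.

```python
import base64
import hashlib
import zipfile


def record_line(path: str, data: bytes) -> str:
    """Build a RECORD row: path, urlsafe-base64 sha256 without padding, size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


def add_dist_info_file(src_whl: str, dst_whl: str, filename: str, data: bytes) -> None:
    """Copy src_whl to dst_whl, adding <dist-info>/<filename> plus its RECORD row."""
    with zipfile.ZipFile(src_whl) as src, \
            zipfile.ZipFile(dst_whl, "w", zipfile.ZIP_DEFLATED) as dst:
        record_path = next(n for n in src.namelist() if n.endswith(".dist-info/RECORD"))
        new_path = f"{record_path.rsplit('/', 1)[0]}/{filename}"
        for info in src.infolist():
            if info.filename != record_path:  # RECORD is rewritten below
                dst.writestr(info, src.read(info))
        dst.writestr(new_path, data)
        rows = src.read(record_path).decode("utf-8").splitlines()
        # Drop RECORD's own hash-less row, then re-append it after the new row.
        rows = [r for r in rows if r and not r.startswith(record_path + ",")]
        rows += [record_line(new_path, data), f"{record_path},,"]
        dst.writestr(record_path, "\n".join(rows) + "\n")
```

The sdist (.tar.gz) side is not covered here.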