metadata_parser: support Python 3.13

Importing metadata_parser in a Python 3.13 runtime raises:

2024-11-06 14:14:26     import metadata_parser
2024-11-06 14:14:26   File "/usr/local/lib/python3.13/site-packages/metadata_parser/__init__.py", line 4, in <module>
2024-11-06 14:14:26     import cgi  # noqa: I100,I201
2024-11-06 14:14:26     ^^^^^^^^^^
2024-11-06 14:14:26 ModuleNotFoundError: No module named 'cgi'

As of Python 3.13, the cgi module has been removed from the standard library.

https://docs.python.org/3.11/library/cgi.html
https://peps.python.org/pep-0594/#cgi
https://discuss.python.org/t/alternative-function-for-deprecated-cgi/21960

Nov 06 '24 13:11 Jan-Jasek

As a quick workaround, I found (via the docs) a package named legacy-cgi: https://pypi.org/project/legacy-cgi/

https://docs.python.org/3.13/library/cgi.html#module-cgi

(I'll look into making a PR with a real fix.)
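To check whether a given environment is affected before applying the workaround, something like the following might help. It is only a sketch: the function names `cgi_importable` and `explain_cgi_status` are made up for illustration and are not part of metadata_parser or legacy-cgi.

```python
import importlib.util
import sys


def cgi_importable() -> bool:
    """Return True if `import cgi` would find a module (without importing it)."""
    return importlib.util.find_spec("cgi") is not None


def explain_cgi_status() -> str:
    """Hypothetical helper: describe the situation in plain words."""
    if cgi_importable():
        # Either the stdlib module (Python <= 3.12) or a drop-in
        # replacement such as legacy-cgi is installed.
        return "cgi is importable"
    if sys.version_info >= (3, 13):
        return "cgi is gone from the stdlib; try: pip install legacy-cgi"
    return "cgi is unexpectedly missing on this Python version"


if __name__ == "__main__":
    print(explain_cgi_status())
```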

Jan 22 '25 14:01 Dryusdan

I've released a new version on pypi with the above hotfix.

I'm leaving this issue open to explore a way to remove the new dependency in the future.

Jan 22 '25 16:01 jvanasco

Yay o/

I've found a way for now: https://github.com/Dryusdan/metadata_parser/tree/dev/remove-cgi but it's quite ugly at the moment (I'll trim it down tonight or tomorrow).

Just a question: why do you need the encoding charset?

Jan 22 '25 17:01 Dryusdan

> Just a question: why do you need the encoding charset?

The charset will be declared either in the response headers or in a meta tag.

Without detecting it, the page content can't be properly decoded, and fatal exceptions will often be raised. While ASCII/UTF-8 dominate English and European internet content, the global internet is significantly more diverse. Some languages require UTF-16 or UTF-32; some sites use non-Unicode encodings as well, such as various ISO encodings or some language-specific sets.
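To illustrate the idea without the cgi module, here is a minimal sketch using only the standard library. It is not the code metadata_parser actually uses; the function names (`charset_from_header`, `charset_from_meta`, `decode_page`), the 4096-byte scan window, and the UTF-8 fallback are all assumptions made for the example.

```python
import codecs
import re
from email.message import Message

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent


def charset_from_meta(body, scan_bytes=4096):
    """Look for a charset declaration in the first bytes of the page."""
    match = _META_CHARSET.search(body[:scan_bytes])
    return match.group(1).decode("ascii").lower() if match else None


def decode_page(body, content_type=None):
    """Decode raw bytes, preferring the header, then the meta tag."""
    candidates = []
    if content_type:
        candidates.append(charset_from_header(content_type))
    candidates.append(charset_from_meta(body))
    for name in candidates:
        if not name:
            continue
        try:
            codecs.lookup(name)  # skip names Python does not know
        except LookupError:
            continue
        return body.decode(name, errors="replace")
    # Assumed fallback when nothing usable was declared.
    return body.decode("utf-8", errors="replace")
```

For example, `decode_page(raw_bytes, "text/html; charset=ISO-8859-1")` decodes with the header's charset, while `decode_page(raw_bytes)` falls back to the meta tag and then to UTF-8.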

Jan 22 '25 17:01 jvanasco

Fixed in PR #44 \o/

Jan 23 '25 13:01 Dryusdan

I'm closing this issue as the legacy-cgi fix is sufficient.

I'm still considering the PR to drop it, but there are more important needs on this project right now.

May 24 '25 01:05 jvanasco