
parse_requirements fails on requirements.txt files with multiple line continuations.

Open thisfred opened this issue 5 years ago • 5 comments

Requirements files with hashes, as generated for instance by the pip-compile command in pip-tools, are currently not parseable by parse_requirements in pkg_resources/__init__.py. The issue appears to be multiple line continuations, as in this example:

certifi==2020.6.20 \
    --hash=sha256:5930595817496dd21bb8dc35dad090f1c2cd0adfaf21204bf6732ca5d8ee34d3 \
    --hash=sha256:8fc0819f1f30ba15bdb34cceffb9ef04d99f420f68eb75d901e9560b8749fc41 \
    # via sentry-sdk

I think this:

https://github.com/pypa/setuptools/blob/master/pkg_resources/__init__.py#L3074

should be

while line.endswith('\\'):
    ...

rather than if.
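
To illustrate the difference, here is a minimal, self-contained sketch of the joining step only. It is not the pkg_resources implementation: join_continuations is a hypothetical helper, and treating a comment line as the end of a continuation is an assumption of this sketch. It also does not try to parse the --hash options themselves.

```python
def join_continuations(text):
    """Yield logical lines, merging any run of backslash-continued lines.

    Hypothetical helper for illustration; not part of pkg_resources.
    """
    lines = iter(text.splitlines())
    for line in lines:
        line = line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith('#'):
            continue
        # `while` rather than `if`: keep consuming lines for as long as
        # the merged line still ends with a continuation backslash.
        while line.endswith('\\'):
            line = line[:-1].strip()
            try:
                following = next(lines).strip()
            except StopIteration:
                break  # trailing backslash on the last line of the file
            if following.startswith('#'):
                break  # assumption: a comment line ends the continuation
            line += ' ' + following
        yield line


example = (
    "certifi==2020.6.20 \\\n"
    "    --hash=sha256:aaa \\\n"
    "    --hash=sha256:bbb \\\n"
    "    # via sentry-sdk\n"
)
for logical_line in join_continuations(example):
    print(logical_line)
```

With an `if` in place of the `while`, only the first continuation would be merged and the second --hash line would be handed to the parser as a requirement of its own.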

I may have missed some subtle reason why this is not already the case, but it solved the issue I was running into locally where python setup.py [any command] was failing because I had a requirements.txt with hashes generated by pip-tools. (I think it's a fairly unusual edge case because we also use pbr, which I think is what's triggering the parsing of the requirements to happen when calling setup.py).

I'd like to get a reading on how likely a patch is to be accepted before I write up the PR and tests, but I'd be happy to do so.

If it's not something that's likely to get in, I'll probably dump pbr as a dependency instead.

thisfred avatar Oct 14 '20 17:10 thisfred

I'd like to get a reading on how likely a patch is to be accepted

Thanks for the offer and taking the time to get an early read... and your caution is warranted.

Although historically this project and many of the Python packaging tools would develop de facto standards, often in isolation from the other tools, these days we're aiming to avoid the pitfalls that creates.

What that means for a problem like the one you've described is that rather than accept the output of pip-compile as correct, we'd instead want to develop and drive to acceptance a standard for requirements files on which all tools can rely.

In the case of parse_requirements, PEP 440 seems to be the closest specification we have, and in my brief scan of it, it doesn't appear to address a format with line continuations.

If we can find some reference that line continuations are a supported format, then by all means Setuptools should support it. If such a reference doesn't exist, then we have a bigger barrier ahead, which is to create that standard.

A change of this magnitude may not require a PEP (a documented reference is sufficient) or it may require a PEP or modification to one.

@pradyunsg Do you know who owns the requirements.txt format spec and are you aware of any documentation that specifies the syntax?

jaraco avatar Oct 17 '20 17:10 jaraco

Nope, there isn't really a spec for it. It's purely implementation defined.

There have been some efforts toward designing a "requirements 2.0" that'll make things nicer on that front -- but that's not happened yet.

pradyunsg avatar Oct 17 '20 19:10 pradyunsg

That's what I thought might be the case, since nothing related to packaging is ever simple. ;)

I could argue that since a) pip itself accepts this format (and checks the multiple hashes), and b) a single line continuation is accepted, but not a double one, it feels more like an oversight, rather than a conscious decision not to allow continuations, but I definitely do understand it's best to be very cautious in making any changes to something so central.

thisfred avatar Oct 26 '20 16:10 thisfred

According to pip's requirements file format documentation: "A line ending in an unescaped \ is treated as a line continuation and the newline following it is effectively ignored." It is useful in Hash-Checking Mode. Is that enough to justify accepting a patch for parsing multiple line continuations in requirements?

jwygoda avatar Dec 30 '20 19:12 jwygoda

Encountered this same issue.

Until this issue is addressed, setuptools should gracefully handle the Exception behind the error message from the vendored pyproject_hooks package, with a message something like:

To pip-tools/pip-compile users: generate_hashes was supported and no longer is as of setuptools 69.0.x. Set generate_hashes to false, then try again.

pyproject.toml can be parsed successfully:

validate-pyproject pyproject.toml

But the build fails, often swallowing the Exception. Since the backend runs in a subprocess, the Exception isn't propagated.

python -m build

Save the Exception

I added a hack to log the Exception. Here is the original. Again, this is a hack to diagnose the issue; otherwise the log file would be written to a sane location.

pyproject_hooks/_in_process/_in_process.py

def main():
    if len(sys.argv) < 3:
        sys.exit("Needs args: hook_name, control_dir")
    hook_name = sys.argv[1]
    control_dir = sys.argv[2]
    if hook_name not in HOOK_NAMES:
        sys.exit("Unknown hook: %s" % hook_name)
    hook = globals()[hook_name]

    hook_input = read_json(pjoin(control_dir, 'input.json'))

    json_out = {'unsupported': False, 'return_val': None}

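    # begin: added (diagnostic hack)
    import io
    import traceback
    from pathlib import Path

    # Log next to this module; a real fix would pick a saner location.
    path_log_file = Path(__file__).parent / "log.txt"

    with io.StringIO() as f_out:
        try:
            json_out['return_val'] = hook(**hook_input['kwargs'])
        except Exception as e:
            is_except = True
            # Record the message and traceback that the subprocess
            # boundary would otherwise swallow.
            f_out.write(f"Exception: {e}\n")
            traceback.print_tb(e.__traceback__, file=f_out)
            path_log_file.write_text(f_out.getvalue())
        else:
            is_except = False

    if is_except:
        # end: added
        # The hook is run a second time so the original handlers still apply.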
        try:
            json_out['return_val'] = hook(**hook_input['kwargs'])
        except BackendUnavailable as e:
            json_out['no_backend'] = True
            json_out['traceback'] = e.traceback
        except BackendInvalid as e:
            json_out['backend_invalid'] = True
            json_out['backend_error'] = e.message
        except GotUnsupportedOperation as e:
            json_out['unsupported'] = True
            json_out['traceback'] = e.traceback
        except HookMissing as e:
            json_out['hook_missing'] = True
            json_out['missing_hook_name'] = e.hook_name or hook_name
    else:  # pragma: no cover
        pass

    write_json(json_out, pjoin(control_dir, 'output.json'), indent=2)
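
The same idea can be sketched without touching the hook dispatch logic. The helper below is hypothetical (call_and_log is not part of pyproject_hooks), and it assumes the caller picks the log path: it writes the full traceback to a file and then re-raises, so the existing handlers still see the exception and the hook only runs once.

```python
import traceback
from pathlib import Path


def call_and_log(func, log_path, *args, **kwargs):
    """Call func; on any exception, save the traceback to log_path and re-raise.

    Hypothetical diagnostic helper, not part of pyproject_hooks.
    """
    try:
        return func(*args, **kwargs)
    except Exception:
        # format_exc() includes the exception type, message and traceback.
        Path(log_path).write_text(traceback.format_exc())
        raise
```

In the hack above, this would wrap the call to hook(**hook_input['kwargs']) inside the original try block.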

Turn off the hashes and regenerate requirements.txt and requirements-[group].txt; python -m build then works. Afterwards, revert pyproject_hooks/_in_process/_in_process.py to remove the hack.

[tool.pip-tools]
no_header = true
resolver = "backtracking"
no_allow_unsafe = true
generate_hashes = false  # <-- Set to false

msftcangoblowm avatar Dec 12 '23 03:12 msftcangoblowm