cli: renku session start error: nano_cpus > 4300?
Describe the bug
A local Renku Docker session fails to start. It looks like an issue with requesting too many 'nano_cpus', though I'm not sure what that is.
Renku version: 1.8.1
OS: Linux (#1 SMP PREEMPT_DYNAMIC Wed Oct 26 15:55:21 UTC 2022)
Python: 3.10.7
Traceback
Traceback (most recent call last):
  File "[...]/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "[...]/renku/ui/cli/session.py", line 222, in start
    session_start_command()
  File "[...]/renku/command/command_builder/command.py", line 252, in execute
    output = self._operation(*args, **kwargs) # type: ignore
  File "[...]/renku/core/session/session.py", line 107, in session_start
    session_name = provider_api.session_start(
  File "[...]/renku/core/session/docker.py", line 150, in session_start
    resource_requests["nano_cpus"] = int(cpu_request * 10**9)
ValueError: Exceeds the limit (4300) for integer string conversion: value has 1000000000 digits
Additional context
If you go to line 150 in docker.py and change:
resource_requests["nano_cpus"] = int(cpu_request * 10**9)
to
resource_requests["nano_cpus"] = int(4300)
the container starts up fine and seems to work without issue. Obviously this hack is not a long-term solution, though.
Did you pass the --cpu flag when running this? Otherwise there might be a cpu_request set in your $HOME/.renku/renku.ini or in your project's .renku/renku.ini, and the value is taken from there.
I think we erroneously read that value as a string from the config. So when we read it, let's say it's "1", the expression cpu_request * 10**9 in Python does not produce the number 1000000000, but instead a string "111111111[...]111111" (the "1" repeated 10**9 times), which is too long for integer conversion.
Removing that from your config probably makes this work. Nonetheless, it's something we'll fix.
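To illustrate the string-repetition behaviour described above, here is a minimal sketch. It assumes the config value arrives as the string "1" (I have not checked how the config is actually parsed) and uses a small factor instead of 10**9 so it runs cheaply:

```python
# Assumption: the value read from renku.ini arrives as a string, not a number.
cpu_request = "1"

# str * int repeats the string; it does not multiply numerically.
# A factor of 10 stands in for 10**9 to keep the demo small.
repeated = cpu_request * 10
print(repeated)  # 1111111111

# Converting to a number first gives the intended arithmetic.
nano_cpus = int(float(cpu_request) * 10**9)
print(nano_cpus)  # 1000000000
```

With the real factor of 10**9 the repeated string has a billion digits, which is what trips the 4300-digit limit on int() string conversion reported in the traceback.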
Ah, I think it may be related to the use of fractional values in the cpu_request field; maybe that causes it to be parsed as a string rather than a numerical value.
I think I had 0.5 set in my .renku/renku.ini file, so it can probably be fixed with a quick type coercion somewhere.
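A rough sketch of what such a type coercion could look like. The helper name to_nano_cpus is hypothetical and not part of renku; the only thing taken from the traceback is that nano_cpus is the CPU request multiplied by 10**9:

```python
def to_nano_cpus(cpu_request):
    """Hypothetical helper: convert a CPU request (str, int or float) to nano CPUs."""
    cpus = float(cpu_request)  # raises ValueError for non-numeric strings
    if cpus <= 0:
        raise ValueError(f"cpu_request must be positive, got {cpu_request!r}")
    # round() guards against float artefacts before truncating to int
    return int(round(cpus * 10**9))


print(to_nano_cpus("0.5"))  # 500000000
print(to_nano_cpus("1"))    # 1000000000
print(to_nano_cpus(2))      # 2000000000
```

Coercing with float() handles both whole and fractional values, whether they come from the --cpu flag or from a config file. Where exactly this should be applied in renku is a guess on my part.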