`NotGitRepository` error when installing multiple packages from one git repository
- Poetry version: 1.2.2
- Python version: 3.10.8
- OS version and name: macOS 13.0
- pyproject.toml: https://gist.github.com/gnuletik/8d876426a36b9bfefee4327823c1459b
- [x] I am on the latest stable Poetry version, installed using a recommended method.
- [x] I have searched the issues of this repo and believe that this is not a duplicate.
- [x] I have consulted the FAQ and blog for any relevant entries or release notes.
- [x] If an exception occurs when executing a command, I executed it again in debug mode (`-vvv` option) and have included the output below.
Issue
It seems that a race condition occurs when installing two packages:
- from the same git repository
- with a different subdirectory
- on a non-default git branch
Repro:
cd /tmp
git clone https://github.com/gnuletik/poetry-lib-monorepo-issue
cd poetry-lib-monorepo-issue
poetry install
It fails with
Package operations: 2 installs, 0 updates, 0 removals
• Installing package1 (0.1.0 c6f487b): Failed
NotGitRepository
No git repository was found at /private/tmp/test-poetry/.venv/src/poetry-multipackages-example
at /opt/homebrew/Cellar/poetry/1.2.2/libexec/lib/python3.10/site-packages/dulwich/repo.py:1090 in __init__
1086│ elif (os.path.isdir(os.path.join(root, OBJECTDIR))
1087│ and os.path.isdir(os.path.join(root, REFSDIR))):
1088│ bare = True
1089│ else:
→ 1090│ raise NotGitRepository(
1091│ "No git repository was found at %(path)s" % dict(path=root)
1092│ )
1093│
1094│ self.bare = bare
The following error occurred when trying to handle this error:
NB: output of poetry install -vvv can be found here: https://gist.github.com/gnuletik/ddcb05ff3467f022f9d3540f379763df
Please note that subsequent calls may succeed but a fresh install (after a poetry env remove --all) always fails.
Based on the error message you provided, it looks like the package you are trying to install requires a git repository, but the installation process is unable to find one at the specified location: /private/tmp/test-poetry/.venv/src/poetry-multipackages-example.
To fix this error, you will need to first determine the root cause of the problem. This may involve examining the package's code, as well as the installation process, to identify any issues. It may also be helpful to consult the documentation for the package, or seek help from the package's maintainers or the community.
Once you have determined the cause of the error, you can then take the appropriate steps to fix it. This may involve modifying the package's code, changing the way it is installed, or taking some other action.
@pneb In this case, the fault lies with Poetry; the diagnosis in the original issue appears correct to me. Related: #7113.
We are also seeing this issue with a docker build that depends on multiple packages from the same git repository.
I suspect this will come up more often as more people adopt the monorepo strategy, which is now quite well supported by Poetry.
None of the workarounds presented here worked for us; we had to manually serialize the installation of the packages to avoid the race condition.
@danieldanciu can you elaborate on the following?
we had to manually serialize the installation of the packages to avoid the race condition
Did you run a pip install (in your venv) before running poetry install?
Are there any workarounds for this? I have multiple misc modules in a utilities repo and I'd really like to use a few of them in other projects.
The issue is pretty annoying because it's hard to pinpoint the exact problem, especially when the installation seems to work locally but then randomly fails in CI or in a Docker container, and works again after a retry.
I have the same issue with Poetry 1.3.2, 1.4.2, and 1.5.1.
@pdarulewski I think the root issue is in the way Poetry clones multiple dependencies in parallel.
The fix could be something that disables parallel installation for dependencies that come from the same repository.
https://github.com/python-poetry/poetry/blob/6e942983dff1bcc6d307c7704e8159df0c959a16/src/poetry/installation/executor.py#L71-L77
You could try to totally disable parallel installer with:
poetry config installer.parallel false
as stated here https://github.com/python-poetry/poetry/issues/7949#issue-1716659814
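The idea of serializing only the installs that share a repository, rather than disabling parallelism globally, could be sketched roughly as below. This is not Poetry's executor code: `install_all`, `lock_for`, and the `install_one` callback are hypothetical names, and a dependency is reduced to a `(name, url)` pair for illustration.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# One lock per repository URL (hypothetical helper, not part of Poetry).
_locks_guard = threading.Lock()
_repo_locks = defaultdict(threading.Lock)


def lock_for(url):
    """Return the lock shared by every dependency that comes from `url`."""
    with _locks_guard:  # guard the defaultdict against concurrent inserts
        return _repo_locks[url]


def install_all(deps, install_one, max_workers=4):
    """Install (name, url) pairs in parallel, but never two from one URL at once."""

    def task(dep):
        name, url = dep
        with lock_for(url):  # installs from the same repository run one at a time
            return install_one(name, url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, deps))
```

With this shape, unrelated dependencies still install concurrently, and only packages that share a repository wait for each other.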
@gnuletik yes, I think so too, I guess I've had other errors related to the .git directory of the monorepo inside the project's virtualenv directory.
Setting installer.parallel to false seems to work, although, as expected, the installation is much slower. It's fine for now, thanks for the hint.
This would be a great fix! We also use monorepos to handle private Python packages and end up with this issue. Turning parallelism off can increase the build time by 10x for a large project...
@gnuletik
Setting the parallel to false didn't work in my case.
Please note that subsequent calls may succeed but a fresh install (after a poetry env remove --all) always fails.
Does anyone have any ideas on how to reproduce this more consistently? I can reproduce it sometimes locally, but not always, which is making it a pain to fix. @gnuletik I was able to reproduce it a few times with your repos, but not every time (even after deleting the environment).
Edit: I seem to be able to reproduce it more consistently by running poetry install with this repo: https://github.com/JonathanRayner/some_other_repo
I see a few possible ways forward, but can I ask: what is the expected behavior?
Suppose the following monorepo structure:
monorepo/pkg_1/pyproject.toml
monorepo/pkg_2/pyproject.toml
and another repo that wants to use pkg_1 and pkg_2 as git dependencies:
some_repo/pyproject.toml
which is
[tool.poetry]
name = "some_repo"
version = "0.1.0"
description = ""
authors = ["my name <[email protected]>"]
[tool.poetry.dependencies]
python = "^3.10 <3.13"
pkg_1 = {git = "[email protected]:MyOrg/monorepo.git", subdirectory = "pkg_1"}
pkg_2 = {git = "[email protected]:MyOrg/monorepo.git", subdirectory = "pkg_2"}
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
When the user installs some_repo, there are a few possibilities for what should happen:
- The repo monorepo is cloned once and reused to install pkg_1 and pkg_2. This is advantageous for large repos. We would need to either throw an error if pkg_1 and pkg_2 point to different branches/revs or allow for reverting to two separate clones if this is the case.
- The repo monorepo is cloned twice, completely independently for pkg_1 and pkg_2.
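A middle ground between the two options is to clone once per distinct (repository, revision) pair, so packages on the same revision share a clone and packages on different revisions get separate ones. The following is a minimal sketch of that grouping only; `GitDependency`, `plan_clones`, and the example URL are hypothetical and do not reflect Poetry's internals.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GitDependency:
    """Hypothetical stand-in for a git dependency entry in pyproject.toml."""

    name: str
    url: str
    rev: Optional[str] = None
    subdirectory: Optional[str] = None


def plan_clones(deps):
    """Group package names by (url, rev): each key needs exactly one clone."""
    plan = defaultdict(list)
    for dep in deps:
        plan[(dep.url, dep.rev)].append(dep.name)
    return dict(plan)


# Placeholder URL, used for illustration only.
URL = "https://example.com/MyOrg/monorepo.git"
deps = [
    GitDependency("pkg_1", URL, rev="v1.0", subdirectory="pkg_1"),
    GitDependency("pkg_2", URL, rev="v1.0", subdirectory="pkg_2"),
    GitDependency("pkg_3", URL, rev="v2.0", subdirectory="pkg_3"),
]
for (url, rev), names in plan_clones(deps).items():
    print(url, rev, names)
```

Here pkg_1 and pkg_2 would share one clone, while pkg_3 would get its own because it points at a different revision.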
The first option, if it throws an error, would probably be a breaking change for us. We use the monorepo approach to store microservice APIs. In other projects, we then combine package releases (tags) based on the deployment. If an error were thrown, the monorepo approach would no longer be suitable for us.
Fair! It sounds like a separate clone per parallel install is a sensible default then? I.e., each package is completely separate. Perhaps people with very large monorepos use other tooling to reduce the redundancy of multiple clones anyway?
Maybe git worktree can solve both problems?
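To make the git worktree suggestion concrete, here is a rough sketch that only builds the git commands for one shared bare clone plus one detached worktree per package revision. It is not something Poetry does today: `worktree_commands`, the `shared.git` directory name, and the example URL and tags are all hypothetical.

```python
from pathlib import Path


def worktree_commands(url, base_dir, checkouts):
    """Build git commands: one shared bare clone, then one worktree per package.

    `checkouts` maps a directory name to the revision to check out there.
    `base_dir` should be absolute, because `git -C` changes the directory
    that relative worktree paths are resolved against.
    """
    base = Path(base_dir)
    shared = base / "shared.git"  # hypothetical location for the shared clone
    commands = [["git", "clone", "--bare", url, str(shared)]]
    for name, rev in checkouts.items():
        commands.append(
            ["git", "-C", str(shared), "worktree", "add", "--detach", str(base / name), rev]
        )
    return commands


# Placeholder URL and tags, used for illustration only.
for cmd in worktree_commands(
    "https://example.com/MyOrg/monorepo.git",
    "/tmp/src",
    {"pkg_1": "pkg_1-v1.0", "pkg_2": "pkg_2-v2.3"},
):
    print(" ".join(cmd))
```

The object store is downloaded once, while each package still gets its own working directory at its own revision, which would cover both the large-repo case and the different-tags case. The worktree steps would still need to run one at a time or under a lock, since they all write to the shared repository.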