pygit2

repo.walk in thread hangs the main thread

Open timxx opened this issue 5 years ago • 2 comments

When repo.walk is called in a new thread, it blocks the main thread until it has finished.

See the example code below. When it is run against a huge repo (e.g. qt5 or chromium), the main thread doesn't print any message until repo.walk has finished. (Using iter behaves the same.)

It seems that repo.diff has the same problem.

from pygit2 import Repository, GIT_SORT_TOPOLOGICAL
from threading import Thread

import sys
import time


def thread_func(repo_dir):
    repo = Repository(repo_dir)

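    print(">>>>>>>> begin walk")
    # Materialize the whole history in a single list() call.
    commits = list(repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL))
    # Alternative: iterate with a plain for loop.
    # for commit in repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL):
    #     continue
    print(">>>>>>>> end walk")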


def test(repo_dir):
    t = Thread(target=thread_func, args=[repo_dir])
    t.start()

    while t.is_alive():
        print("main thread...")
        time.sleep(0.01)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print(">>>>>>>> Invalid argument")
        sys.exit(-1)

    test(sys.argv[1])

timxx • Apr 29 '20 05:04

That's not the behaviour I observe. If you replace list(...) with the for loop, you will see many prints. In other words, it's list that is blocking, not pygit2. That's expected in my opinion; read about Python's GIL (Global Interpreter Lock): list is a single call, so the GIL won't allow any other thread to run until it returns.
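A minimal sketch of the for-loop variant, with one assumption added on top of what is described above: an explicit `time.sleep(0)` every few items, which releases the GIL and gives other threads a chance to run. The helper names `consume_with_yield` and `run_in_thread` and the `yield_every` parameter are hypothetical; `items` stands in for whatever `repo.walk(...)` returns.

```python
import threading
import time


def consume_with_yield(items, handle, yield_every=100):
    """Consume `items` with a plain for loop, pausing every `yield_every` items."""
    count = 0
    for item in items:
        handle(item)
        count += 1
        if count % yield_every == 0:
            # time.sleep releases the GIL while it waits, so other
            # threads get a chance to run at this point.
            time.sleep(0)
    return count


def run_in_thread(items, handle):
    """Run the loop in a worker thread while the main thread keeps printing."""
    result = {}

    def target():
        result["count"] = consume_with_yield(items, handle)

    worker = threading.Thread(target=target)
    worker.start()
    while worker.is_alive():
        print("main thread...")
        time.sleep(0.01)
    worker.join()
    return result["count"]
```

With pygit2 this could be called as `run_in_thread(repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL), commits.append)`. How often the main thread actually gets scheduled still depends on the interpreter and the platform.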

You can either write the code differently, using a for loop, or use multiprocessing.
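The multiprocessing route could look roughly like the sketch below, which uses `concurrent.futures.ProcessPoolExecutor` from the standard library. The walk then runs in a separate process with its own interpreter and its own GIL. The names `count_commits` and `run_in_background` and the `executor_cls` parameter are hypothetical, and returning only a commit count (instead of commit objects) is an assumption made to keep the result cheap to send between processes.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def count_commits(repo_dir):
    """Walk the history of the repository at `repo_dir` and count commits."""
    # Imported inside the worker so only the child process loads pygit2.
    from pygit2 import Repository, GIT_SORT_TOPOLOGICAL

    repo = Repository(repo_dir)
    return sum(1 for _ in repo.walk(repo.head.target, GIT_SORT_TOPOLOGICAL))


def run_in_background(func, *args, poll_interval=0.01,
                      executor_cls=ProcessPoolExecutor):
    """Run func(*args) in a single worker and poll until it is done.

    `executor_cls` is a hypothetical hook: any concurrent.futures
    executor class that accepts `max_workers` can be passed in.
    """
    with executor_cls(max_workers=1) as pool:
        future = pool.submit(func, *args)
        while not future.done():
            print("main thread...")
            time.sleep(poll_interval)
        # Re-raises here if the worker raised an exception.
        return future.result()
```

From a script, call `run_in_background(count_commits, repo_dir)` under an `if __name__ == "__main__":` guard, which the process start methods that re-import the main module require. Starting a process has a fixed cost, so for small repos the extra process may not pay off.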

jdavid • May 03 '20 07:05

As I mentioned, the for loop behaves the same here. My project also uses a for loop, but it still hangs the GUI thread. On Windows it is even worse than on Linux.

I will try multiprocessing to see whether it performs well when walking a small repo.

timxx • May 05 '20 01:05