PyInstaller-packaged Petals binary fails to load/download on `AutoDistributedModelForCausalLM.from_pretrained`
I'm using PyInstaller to package a script that uses Petals; here's what my components look like:
main.py:
```python
import multiprocessing
import os
import torch
from petals import AutoDistributedModelForCausalLM
# Imported explicitly so PyInstaller bundles these dynamically-loaded deps
import multihash
import multiaddr
import multiaddr.codecs.fspath

model = "petals-team/StableBeluga2"
kwargs = {}  # from_pretrained keyword arguments (not shown in this snippet)
print("GOING TO START NOW")

def download_model() -> None:
    print(os)
    _ = AutoDistributedModelForCausalLM.from_pretrained(model, **kwargs)

if __name__ == "__main__":
    multiprocessing.freeze_support()  # required in a frozen executable
    download_model()
```
requirements.txt:
```
fastapi==0.95.0
uvicorn==0.21.1
pytest==7.2.2
requests==2.28.2
tqdm==4.65.0
httpx==0.23.3
python-dotenv==1.0.0
tenacity==8.2.2
petals==2.2.0
```
For installing the dependencies and setting up the environment:

```bash
#!/bin/bash
PWD_PATH="$(pwd)"
REQUIREMENTS_FILE_PATH="${PWD_PATH}/petals/requirements.txt"

# Create and activate a Python virtual environment
virtualenv -p python3.11 venv
source ./venv/bin/activate

# Install the necessary packages
pip install -r "$REQUIREMENTS_FILE_PATH" pyinstaller
```
Note: there seems to be an issue when using TorchScript with PyInstaller, and hivemind uses it for a gelu function, so above that function definition I've put `torch.jit.script = torch.jit.script_if_tracing`, which makes it work with PyInstaller.
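For reference, a minimal sketch of that workaround: since the `@torch.jit.script` decorator runs at import time, the assignment has to execute before hivemind's gelu module is first imported (I placed it above the function definition in hivemind's source; patching before the first petals/hivemind import should be equivalent):

```python
import torch

# Replace torch.jit.script so hivemind's module-level @torch.jit.script
# usage (e.g. the gelu helper) compiles only under tracing and is
# otherwise a no-op, which avoids the TorchScript issue when frozen.
torch.jit.script = torch.jit.script_if_tracing

from petals import AutoDistributedModelForCausalLM  # must come after the patch
```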
And for packaging with PyInstaller I have this script (`package_petals.sh`):

```bash
#!/bin/bash
# Set the paths to the Python script and the build script
PWD_PATH="$(pwd)"
PYTHON_SCRIPT_PATH="${PWD_PATH}/petals/main.py"
BUILD_SIDE_SCRIPT_PATH="${PWD_PATH}/petals/package_petals.sh"
DIST_PATH="${PWD_PATH}/bin/python"

pyinstaller --onefile --distpath "$DIST_PATH" --clean \
  --hidden-import=torch --collect-data torch --copy-metadata torch \
  --hidden-import=transformers --collect-data=transformers --copy-metadata=transformers \
  --copy-metadata tqdm --copy-metadata regex --copy-metadata requests \
  --copy-metadata packaging --copy-metadata filelock --copy-metadata numpy \
  --copy-metadata tokenizers --copy-metadata importlib_metadata \
  --copy-metadata huggingface-hub --copy-metadata safetensors \
  --copy-metadata pyyaml --copy-metadata petals --copy-metadata hivemind \
  --hidden-import multiprocessing \
  --hidden-import multiprocessing.BufferTooShort \
  --hidden-import multiprocessing.AuthenticationError \
  --hidden-import multiprocessing.get_context \
  --hidden-import multiprocessing.TimeoutError \
  --hidden-import multiprocessing.set_start_method \
  --hidden-import multiprocessing.get_start_method \
  --hidden-import multiprocessing.Queue \
  --hidden-import multiprocessing.Process \
  --hidden-import multiprocessing.Pipe \
  --hidden-import multiprocessing.cpu_count \
  --hidden-import multiprocessing.RLock \
  --hidden-import multiprocessing.Pool \
  --hidden-import torch.multiprocessing._prctl_pr_set_pdeathsig \
  --hidden-import torch.distributed._tensor._collective_utils \
  --hidden-import hivemind \
  --hidden-import hivemind.dht.schema \
  --hidden-import scipy.linalg._basic \
  --hidden-import scipy.linalg._special_matrices \
  --hidden-import scipy.sparse._dok \
  --hidden-import filelock._windows \
  --name=cht-petals-aarch64-apple-darwin \
  --paths ./venv/lib/python3.11/site-packages \
  --paths "${PWD_PATH}/petals" \
  "$PYTHON_SCRIPT_PATH"
```
Now when I run `./bin/python/cht-petals-aarch64-apple-darwin`, it throws a `FileNotFoundError` (it also doesn't download the shards from hf_hub unless I explicitly pass `force_download=True`, but even after downloading, the model doesn't load and fails with the same error):
```
Oct 18 23:18:54.719 [INFO] Make sure you follow the LLaMA's terms of use: https://bit.ly/llama2-license for LLaMA 2, https://bit.ly/llama-license for LLaMA 1
Oct 18 23:18:54.719 [INFO] Using DHT prefix: StableBeluga2-hf
Traceback (most recent call last):
  File "download.py", line 39, in <module>
  File "download.py", line 34, in download_model
  File "petals/utils/auto_config.py", line 78, in from_pretrained
  File "petals/utils/auto_config.py", line 51, in from_pretrained
  File "petals/client/from_pretrained.py", line 37, in from_pretrained
  File "transformers/modeling_utils.py", line 3085, in from_pretrained
  File "petals/models/llama/model.py", line 132, in __init__
  File "petals/models/llama/model.py", line 34, in __init__
  File "petals/client/remote_sequential.py", line 47, in __init__
  File "petals/client/routing/sequence_manager.py", line 89, in __init__
  File "hivemind/dht/dht.py", line 88, in __init__
  File "hivemind/dht/dht.py", line 148, in run_in_background
  File "hivemind/dht/dht.py", line 151, in wait_until_ready
  File "hivemind/utils/mpfuture.py", line 262, in result
  File "concurrent/futures/_base.py", line 445, in result
  File "concurrent/futures/_base.py", line 390, in __get_result
FileNotFoundError: [Errno 2] No such file or directory
```
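For context on the last three frames: an exception raised in a worker process gets captured and re-raised in the parent when `.result()` is called on the future, which is why the traceback ends inside `concurrent/futures` instead of at the code that actually failed. A generic sketch of that mechanism (not hivemind's actual code):

```python
from concurrent.futures import ProcessPoolExecutor

def boom() -> None:
    # Simulate a child process failing the way the DHT child appears to
    raise FileNotFoundError(2, "No such file or directory")

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        fut = pool.submit(boom)
        # Re-raises FileNotFoundError here in the parent, with the
        # traceback pointing at concurrent/futures/_base.py
        fut.result()
```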
I also came across #468, which has the exact same stack trace as mine (though I haven't validated #468, as my use case specifically requires a single binary file, so it isn't directly applicable; hence PyInstaller).
**Other things I tried**
When I do something like this:

```python
try:
    _ = AutoDistributedModelForCausalLM.from_pretrained(model, **kwargs)
except FileNotFoundError as e:
    os.system("python")  # drop into an interactive shell on failure
```

it pops up a new Python shell, but now from inside the binary and using the binary's bundled Python. If I import all the modules and run the `.from_pretrained` call manually inside that shell, it works, but only inside this new shell that the binary spawns.
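Concretely, what succeeds inside that spawned shell is roughly the following (a sketch mirroring main.py, with the same TorchScript patch applied first):

```python
import torch
torch.jit.script = torch.jit.script_if_tracing  # same workaround as above

from petals import AutoDistributedModelForCausalLM

# This succeeds here, while the identical call fails in the frozen entry point
model = AutoDistributedModelForCausalLM.from_pretrained("petals-team/StableBeluga2")
```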
Also, dropping Petals entirely and using the equivalent plain transformers modules causes no issue at all: the model downloads and loads up correctly when I follow the same steps above.
My hunch is that something specific is happening because of the dynamic (not static) elements at startup in Petals x hivemind: judging by the traceback, the `FileNotFoundError` seems to originate in hivemind's DHT background process and is only re-raised in the parent through the `MPFuture`, i.e. in these frames:
File "petals/client/remote_sequential.py", line 47, in __init__
File "petals/client/routing/sequence_manager.py", line 89, in __init__
File "hivemind/dht/dht.py", line 88, in __init__
File "hivemind/dht/dht.py", line 148, in run_in_background
File "hivemind/dht/dht.py", line 151, in wait_until_ready
File "hivemind/utils/mpfuture.py", line 262, in result
My machine specifications:
Apple M2, 16 GB RAM, macOS 13.5.1