AssertionError: Torch not compiled with CUDA enabled
Unless there's something special I'm missing, the normal quickstart install doesn't work.
I had the same issue. The quickstart seems to install the CPU-only version of PyTorch by default, but you need the CUDA-enabled version. Use pip/conda to uninstall the version of PyTorch you have, then install the CUDA version using the instructions here.
Before downloading, double-check which version of CUDA you have installed so you pick the right torch build. You can do this by running nvcc --version from the command line.
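For reference, a typical reinstall sequence looks something like this (a sketch assuming pip and CUDA 11.7; the install selector on pytorch.org generates the exact command for your setup):

nvcc --version
# should report the installed CUDA toolkit, e.g. "release 11.7"
pip uninstall torch
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117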
Good luck!
I have the same issue on a MacBook Pro with an AMD graphics card. I don't think installing a CUDA-enabled version of PyTorch is an option in my case.
CUDA 11.7 with a GPU is already installed. I could use an Anaconda environment, but I don't have much experience with that, and it still doesn't work.
Has anyone found a workaround for this?
Is the CPU-only version also installed? If so, try uninstalling it. Otherwise, it sounds like an environment issue, and I would make a new conda/venv environment. Both are relatively easy to set up; here's a good place to start, and a minimal venv setup is sketched below.
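A minimal sketch of the venv option (assuming Windows, going by the paths later in this thread; on Linux/macOS the activation command is source galai-env/bin/activate):

python -m venv galai-env
galai-env\Scripts\activate
# install the CUDA build of torch first, then galai (cu117 assumed; match your CUDA version)
pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
pip install galai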
Another attempt with the Hugging Face transformers library worked. It was maybe a bit complicated, and I also had to use a CPU version of one package.
Hi @Naugustogi, can you check if you still experience the issues with galai version 1.1.0? You should be able to use the model on CPU with load_model(..., num_gpus=0).
num_gpus=0 doesn't work either:
AssertionError: Torch not compiled with CUDA enabled
@Naugustogi any chance you can provide the full stack trace?
It happened after I started the program normally with inference:
import galai as gal
model = gal.load_model(name="mini", num_gpus=0)
model.generate("Scaled dot product attention:\n\n\\[")
I just use the CPU version.
┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
│ F:\galai-1.0.0\start.py:2 in <module> │
│ │
│ 1 import galai as gal │
│ > 2 model = gal.load_model(name = 'mini',num_gpus=0) │
│ 3 model.generate("Scaled dot product attention:\n\n\\[") │
│ │
│ F:\galai-1.0.0\galai\__init__.py:40 │
│ in load_model │
│ │
│ 37 │ model = Model(name=name, dtype=dtype, num_gpus=num_gpus) │
│ 38 │ model._set_tokenizer(tokenizer_path=get_tokenizer_path()) │
│ 39 │ if name in ['mini', 'base']: │
│ > 40 │ │ model._load_checkpoint(checkpoint_path=get_checkpoint_path(name)) │
│ 41 │ else: │
│ 42 │ │ model._load_checkpoint(checkpoint_path=get_checkpoint_path(name)) │
│ 43 │
│ │
│ F:\galai-1.0.0\galai\model.py:63 in │
│ _load_checkpoint │
│ │
│ 60 │ │ if 'mini' in checkpoint_path or 'base' in checkpoint_path: │
│ 61 │ │ │ checkpoint_path = checkpoint_path + '/pytorch_model.bin' │
│ 62 │ │ │
│ > 63 │ │ load_checkpoint_and_dispatch( │
│ 64 │ │ │ self.model.model, │
│ 65 │ │ │ checkpoint_path, │
│ 66 │ │ │ device_map=device_map, │
│ │
│ C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\big_modeling. │
│ py:366 in load_checkpoint_and_dispatch │
│ │
│ 363 │ │ ) │
│ 364 │ if offload_state_dict is None and "disk" in device_map.values(): │
│ 365 │ │ offload_state_dict = True │
│ > 366 │ load_checkpoint_in_model( │
│ 367 │ │ model, │
│ 368 │ │ checkpoint, │
│ 369 │ │ device_map=device_map, │
│ │
│ C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\modelin │
│ g.py:701 in load_checkpoint_in_model │
│ │
│ 698 │ │ │ │ │ set_module_tensor_to_device(model, param_name, "meta") │
│ 699 │ │ │ │ │ offload_weight(param, param_name, state_dict_folder, index=state_dic │
│ 700 │ │ │ │ else: │
│ > 701 │ │ │ │ │ set_module_tensor_to_device(model, param_name, param_device, value=p │
│ 702 │ │ │
│ 703 │ │ # Force Python to clean up. │
│ 704 │ │ del checkpoint │
│ │
│ C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\modelin │
│ g.py:124 in set_module_tensor_to_device │
│ │
│ 121 │ │ if value is None: │
│ 122 │ │ │ new_value = old_value.to(device) │
│ 123 │ │ elif isinstance(value, torch.Tensor): │
│ > 124 │ │ │ new_value = value.to(device) │
│ 125 │ │ else: │
│ 126 │ │ │ new_value = torch.tensor(value, device=device) │
│ 127 │
│ │
│ C:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py:2 │
│ 21 in _lazy_init │
│ │
│ 218 │ │ │ │ "Cannot re-initialize CUDA in forked subprocess. To use CUDA with " │
│ 219 │ │ │ │ "multiprocessing, you must use the 'spawn' start method") │
│ 220 │ │ if not hasattr(torch._C, '_cuda_getDeviceCount'): │
│ > 221 │ │ │ raise AssertionError("Torch not compiled with CUDA enabled") │
│ 222 │ │ if _cudart is None: │
│ 223 │ │ │ raise AssertionError( │
│ 224 │ │ │ │ "libcudart functions unavailable. It looks like you have a broken build? │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
AssertionError: Torch not compiled with CUDA enabled
Thanks @Naugustogi. The traceback shows galai 1.0.0. Can you try with 1.1.2?
I'm not sure where to get that; in this repo, it's just version 1.0.0 (3 weeks ago).
@Naugustogi You can install it with pip or clone the main git branch (currently at 1.1.2; you can verify by inspecting the setup.py file in your installation). For example:
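Either of these should work (a sketch; assuming the package is published on PyPI under galai and that github.com/paperswithcode/galai is this repo):

pip install galai==1.1.2
# or install straight from the main branch
pip install git+https://github.com/paperswithcode/galai.git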
Alright, 1.1.2 doesn't work either. It won't even show me any error; after starting, it returns the main folder.
it returns the main folder
What do you mean? If you are running it as a script, you need to wrap the last line in print(). For example:
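A minimal version of the same three-line script with the output printed:

import galai as gal

model = gal.load_model(name="mini", num_gpus=0)
# in a script, generate()'s return value is discarded unless you print it;
# only the interactive REPL echoes it automatically
print(model.generate("Scaled dot product attention:\n\n\\["))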
OK, it worked!