RuntimeError: CUDA error: invalid device ordinal
When I load the model, I get this error.
Traceback (most recent call last):
File "
Trying this with
model = galai.load_model("base")
it looks like there is a device map that expects 8 GPUs, if I'm seeing this right:
{'decoder.embed_tokens': 0,
'decoder.embed_positions': 0,
'decoder.layer_norm': 0,
'decoder.layers.0': 0,
'decoder.layers.1': 0,
'decoder.layers.2': 0,
'decoder.layers.3': 1,
'decoder.layers.4': 1,
'decoder.layers.5': 1,
'decoder.layers.6': 2,
'decoder.layers.7': 2,
'decoder.layers.8': 2,
'decoder.layers.9': 3,
'decoder.layers.10': 3,
'decoder.layers.11': 3,
'decoder.layers.12': 4,
'decoder.layers.13': 4,
'decoder.layers.14': 4,
'decoder.layers.15': 5,
'decoder.layers.16': 5,
'decoder.layers.17': 5,
'decoder.layers.18': 6,
'decoder.layers.19': 6,
'decoder.layers.20': 6,
'decoder.layers.21': 7,
'decoder.layers.22': 7,
'decoder.layers.23': 7}
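A quick sanity check (my addition, plain PyTorch, not part of galai): the map above spreads the decoder across GPU ordinals 0 through 7, so it can only work on a machine where PyTorch actually sees 8 devices.

```python
import torch

# The device map above references ordinals 0..7; if this prints fewer
# than 8, any layer mapped to a missing ordinal raises
# "CUDA error: invalid device ordinal".
print(torch.cuda.device_count())
```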
If you have fewer than the default number of GPUs (8), you have to specify how many when you load the model. Try:
model = gal.load_model(name='base', num_gpus=1)
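For anyone landing here later, a minimal end-to-end sketch of that workaround; I'm assuming galai is imported as gal (as in the snippet above) and using the generate method from the galai README:

```python
import galai as gal

# Load the 1.3B "base" model onto a single GPU instead of the default 8.
model = gal.load_model(name="base", num_gpus=1)

# Hypothetical prompt, just to confirm the model actually runs.
print(model.generate("The Transformer architecture"))
```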
Thanks @dcruiz01, that worked like a charm. Unsure if it deserves a mention in the README, but much appreciated for letting us know! We can probably close this issue.
Confirmed. I had the same error and num_gpus=1 resolved it.
Please mention that in your documentation/README.
A model size between base and standard would be nice; standard just barely doesn't fit on my RTX 3090, I think.
Do you offer 8-bit versions/compatibility, like BLOOM?
I see, dtype='float16' does the job, sorry. Please mention this in the README; many folks will want to try it on a local GPU as well.
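A sketch of that half-precision load, in case it helps other single-GPU users. The memory numbers are my rough estimates, not from this thread: "standard" is the 6.7B checkpoint, roughly 13 GB of weights in float16, which should fit a 24 GB RTX 3090.

```python
import galai as gal

# float16 halves weight memory relative to the default float32, which is
# what lets the 6.7B "standard" model fit on a 24 GB card.
model = gal.load_model(name="standard", num_gpus=1, dtype="float16")
```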
Hmm... 8-bit would still be handy for playing with larger models. Is that possible?
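galai itself doesn't do 8-bit yet (see the maintainer's reply at the end of this thread), but the Galactica checkpoints are also published on the Hugging Face Hub, so one possible route is bitsandbytes int8 through transformers. A sketch, assuming the facebook/galactica-6.7b Hub checkpoint and a transformers/accelerate/bitsandbytes stack recent enough to accept load_in_8bit:

```python
from transformers import AutoTokenizer, OPTForCausalLM

# int8 quantization via bitsandbytes (pip install accelerate bitsandbytes);
# roughly halves memory again compared to float16.
tokenizer = AutoTokenizer.from_pretrained("facebook/galactica-6.7b")
model = OPTForCausalLM.from_pretrained(
    "facebook/galactica-6.7b",
    device_map="auto",
    load_in_8bit=True,
)
```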
num_gpus defaults to None.
> If you have fewer than the default number of GPUs (8)
Who has a default number of 8 GPUs?
People who work at Meta AI, probably XD
> If you have fewer than the default number of GPUs (8), you have to specify how many when you load the model. Try:
> model = gal.load_model(name='base', num_gpus=1)
Why isn't this written on the main page?
galai 1.1.0 uses all available GPUs by default, which should fix the issue. You can still manually specify the number of GPUs using the num_gpus parameter. Setting num_gpus=0 (or keeping the default None if no GPUs are available) will load the model to RAM. 8-bit inference is not supported yet. Please reopen if you still experience any issues.
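For reference, a sketch of the 1.1.0 behavior described above, using the parameter names from this thread:

```python
import galai as gal

# galai >= 1.1.0: all visible GPUs are used by default.
model = gal.load_model("base")

# num_gpus=0 (or the default None with no GPUs present) loads the
# weights to RAM for CPU inference.
cpu_model = gal.load_model("base", num_gpus=0)
```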