FlexGen
Unable to run the benchmark
Hi,
I'm trying to run the benchmark bench_30b_1x4.sh with one change (N_GPUS=2), but it fails with the following Python exception:
rank #1: TypeError: sequence item 6: expected str instance, NoneType found
Traceback (most recent call last):
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/fungiboletus/flexgen/flexgen/dist_flex_opt.py", line 694, in <module>
    raise e
  File "/home/fungiboletus/flexgen/flexgen/dist_flex_opt.py", line 690, in <module>
    run_flexgen_dist(args)
  File "/home/fungiboletus/flexgen/flexgen/dist_flex_opt.py", line 620, in run_flexgen_dist
    outputs = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3432, in batch_decode
    return [
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3433, in <listcomp>
    self.decode(
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3471, in decode
    return self._decode(
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 949, in _decode
    sub_texts.append(self.convert_tokens_to_string(current_sub_text))
  File "/home/fungiboletus/miniconda3/envs/flexgen/lib/python3.10/site-packages/transformers/models/gpt2/tokenization_gpt2.py", line 316, in convert_tokens_to_string
    text = "".join(tokens)
TypeError: sequence item 6: expected str instance, NoneType found
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. ...
I use Python 3.10.9 with PyTorch 1.13.1, CUDA 11.7, and mpirun 2.1.1.
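
One possible reading of the traceback is that at least one generated token id has no entry in the tokenizer vocabulary, so the id-to-token lookup yields None and `"".join(tokens)` fails. This is an assumption, not something the log confirms. A minimal sketch for checking it follows; `find_unmapped_ids` and `toy_vocab` are hypothetical names introduced here for illustration and are not part of FlexGen or transformers.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_unmapped_ids(
    output_ids: Iterable[Iterable[int]],
    id_to_token: Callable[[int], Optional[str]],
) -> List[Tuple[int, int, int]]:
    """Return (row, position, token_id) for every id whose lookup gives None."""
    unmapped = []
    for row, ids in enumerate(output_ids):
        for pos, token_id in enumerate(ids):
            # int() also handles NumPy/torch integer scalars.
            if id_to_token(int(token_id)) is None:
                unmapped.append((row, pos, int(token_id)))
    return unmapped


if __name__ == "__main__":
    # Toy stand-in for a tokenizer vocabulary (hypothetical data).
    toy_vocab = {0: "Hello", 1: "Ġworld", 2: "!"}
    # Id 7 is not in the toy vocabulary, so it is reported.
    print(find_unmapped_ids([[0, 1, 2], [0, 7, 1]], toy_vocab.get))
```

With the real tokenizer, `tokenizer.convert_ids_to_tokens` could presumably be passed as `id_to_token` just before the `batch_decode` call. If ids are reported, comparing them against the tokenizer's vocabulary size may help tell whether the generated ids are out of range.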