Evaluate EncoderDecoderModels
When I try to evaluate a RoBERTa-based EncoderDecoderModel after fixing the "openai" import (https://github.com/huggingface/lighteval/issues/175), the following command fails:
accelerate launch --num_processes=1 run_evals_accelerate.py \
--model_args "pretrained=Bachstelze/instructionRoberta-base" \
--tasks "lighteval|truthfulqa:mc|0|0" \
--override_batch_size 1 \
--output_dir="./evals/"
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2024-05-03 07:14:21.100922: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-03 07:14:21.100992: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-03 07:14:21.102501: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-03 07:14:22.196482: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
tokenizer_config.json: 100% 52.0/52.0 [00:00<00:00, 271kB/s]
config.json: 100% 729/729 [00:00<00:00, 4.43MB/s]
vocab.json: 100% 899k/899k [00:00<00:00, 3.40MB/s]
merges.txt: 100% 456k/456k [00:00<00:00, 1.73MB/s]
pytorch_model.bin: 100% 1.62G/1.62G [00:11<00:00, 144MB/s]
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
WARNING:lighteval.logging.hierarchical_logger:main: (0, Namespace(model_config_path=None, model_args='pretrained=Bachstelze/instructionRoberta-base', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=False, system_prompt=None, dataset_loading_processes=1, custom_tasks=None, tasks='lighteval|truthfulqa:mc|0|0', num_fewshot_seeds=1)), {
WARNING:lighteval.logging.hierarchical_logger: Test all gather {
WARNING:lighteval.logging.hierarchical_logger: Test gather tensor
WARNING:lighteval.logging.hierarchical_logger: gathered_tensor tensor([0], device='cuda:0'), should be [0]
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.005201]
WARNING:lighteval.logging.hierarchical_logger: Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.000031]
WARNING:lighteval.logging.hierarchical_logger: Model loading {
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
config.json: 100% 4.79k/4.79k [00:00<00:00, 23.9MB/s]
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.206942]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:00.214687]
Traceback (most recent call last):
File "/content/lighteval/run_evals_accelerate.py", line 82, in <module>
main(args)
File "/usr/local/lib/python3.10/dist-packages/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/lighteval/main_accelerate.py", line 77, in main
model, model_info = load_model(config=model_config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 83, in load_model
return load_model_with_accelerate_or_default(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 125, in load_model_with_accelerate_or_default
model = BaseModel(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 76, in __init__
self._max_length = self._init_max_length(config.max_length)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 269, in _init_max_length
if hasattr(self.tokenizer, "model_max_length"):
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 103, in tokenizer
return self._tokenizer
AttributeError: 'BaseModel' object has no attribute '_tokenizer'. Did you mean: 'tokenizer'?
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 46, in main
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1075, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 681, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'run_evals_accelerate.py', '--model_args', 'pretrained=Bachstelze/instructionRoberta-base', '--tasks', 'lighteval|truthfulqa:mc|0|0', '--override_batch_size', '1', '--output_dir=./evals/']' returned non-zero exit status 1.
And after changing `_tokenizer` to `tokenizer`:
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2024-05-03 07:22:04.016177: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-03 07:22:04.016231: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-03 07:22:04.017617: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-03 07:22:05.084129: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
tokenizer_config.json: 100% 52.0/52.0 [00:00<00:00, 351kB/s]
config.json: 100% 729/729 [00:00<00:00, 5.36MB/s]
vocab.json: 100% 899k/899k [00:00<00:00, 27.2MB/s]
merges.txt: 100% 456k/456k [00:00<00:00, 6.55MB/s]
pytorch_model.bin: 100% 1.62G/1.62G [00:33<00:00, 48.7MB/s]
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
WARNING:lighteval.logging.hierarchical_logger:main: (0, Namespace(model_config_path=None, model_args='pretrained=Bachstelze/instructionRoberta-base', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=False, system_prompt=None, dataset_loading_processes=1, custom_tasks=None, tasks='lighteval|truthfulqa:mc|0|0', num_fewshot_seeds=1)), {
WARNING:lighteval.logging.hierarchical_logger: Test all gather {
WARNING:lighteval.logging.hierarchical_logger: Test gather tensor
WARNING:lighteval.logging.hierarchical_logger: gathered_tensor tensor([0], device='cuda:0'), should be [0]
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.005309]
WARNING:lighteval.logging.hierarchical_logger: Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.000020]
WARNING:lighteval.logging.hierarchical_logger: Model loading {
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
config.json: 100% 4.79k/4.79k [00:00<00:00, 24.7MB/s]
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.181156]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:00.188934]
Traceback (most recent call last):
File "/content/lighteval/run_evals_accelerate.py", line 82, in <module>
main(args)
File "/usr/local/lib/python3.10/dist-packages/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/lighteval/main_accelerate.py", line 77, in main
model, model_info = load_model(config=model_config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 83, in load_model
return load_model_with_accelerate_or_default(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 125, in load_model_with_accelerate_or_default
model = BaseModel(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 76, in __init__
self._max_length = self._init_max_length(config.max_length)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 269, in _init_max_length
if hasattr(self.tokenizer, "model_max_length"):
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 103, in tokenizer
return self.tokenizer
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 103, in tokenizer
return self.tokenizer
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 103, in tokenizer
return self.tokenizer
[Previous line repeated 989 more times]
RecursionError: maximum recursion depth exceeded
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 46, in main
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1075, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 681, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'run_evals_accelerate.py', '--model_args', 'pretrained=Bachstelze/instructionRoberta-base', '--tasks', 'lighteval|truthfulqa:mc|0|0', '--override_batch_size', '1', '--output_dir=./evals/']' returned non-zero exit status 1.
System-Info:
- transformers version: 4.40.1
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.3
- Accelerate version: 0.29.3
- Accelerate config: not found
- PyTorch version (GPU?): 2.2.1+cu121 (True)
- Tensorflow version (GPU?): 2.15.0 (True)
- Flax version (CPU?/GPU?/TPU?): 0.8.3 (gpu)
- Jax version: 0.4.26
- JaxLib version: 0.4.26
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
It also confuses me that the loaded model is much bigger than my model on the Hugging Face Hub.
Hi! The first issue you encountered happens because the model does not have a `sequence_length` in its model config; this bug is being fixed in #185.
The second issue is an infinite recursion caused by changing `_tokenizer` to `tokenizer`: `tokenizer` is a property, so returning `self.tokenizer` re-enters the property itself.
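For illustration, this is the shape of the bug (a minimal sketch, not the actual lighteval source):

```python
# Minimal sketch of the recursion, not the actual lighteval source.
class BaseModel:
    def __init__(self, tokenizer):
        self._tokenizer = tokenizer  # private backing attribute

    @property
    def tokenizer(self):
        # return self.tokenizer  -> re-enters this property: RecursionError
        return self._tokenizer  # correct: read the backing attribute
```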
A quick fix would be to simply return a default `sequence_length` value in the `_init_max_length` function.
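A minimal sketch of such a fallback on `BaseModel`, assuming the attribute names visible in the traceback (`self._config`, `self.tokenizer`); the actual patch in #185 may differ:

```python
# Hypothetical fallback for lighteval/models/base_model.py, not the
# actual patch from #185. The 2048 default matches the value later
# logged by the fixed version ("max_length set to 2048").
def _init_max_length(self, max_length) -> int:
    if max_length is not None:
        return int(max_length)
    # Common model-config attributes that encode a maximum sequence length.
    for attr in ("n_positions", "max_position_embeddings", "n_ctx"):
        if hasattr(self._config, attr):
            return getattr(self._config, attr)
    if hasattr(self.tokenizer, "model_max_length"):
        return self.tokenizer.model_max_length
    return 2048  # default when neither model config nor tokenizer has one
```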
I cloned the repository from @gucci-j and get this `ValueError: Unrecognized configuration class`:
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_machines` was set to a value of `1`
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2024-05-13 08:59:06.982659: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-05-13 08:59:06.982704: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-05-13 08:59:06.984096: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-05-13 08:59:08.011446: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
tokenizer_config.json: 100% 52.0/52.0 [00:00<00:00, 397kB/s]
config.json: 100% 729/729 [00:00<00:00, 5.31MB/s]
vocab.json: 100% 899k/899k [00:00<00:00, 12.3MB/s]
merges.txt: 100% 456k/456k [00:00<00:00, 2.38MB/s]
pytorch_model.bin: 100% 1.62G/1.62G [00:16<00:00, 95.8MB/s]
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
Could not initialize the JudgeOpenAI model:
name 'OpenAI' is not defined
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
WARNING:lighteval.logging.hierarchical_logger:main: (0, Namespace(model_config_path=None, model_args='pretrained=Bachstelze/instructionRoberta-base', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=False, system_prompt=None, dataset_loading_processes=1, custom_tasks=None, tasks='lighteval|truthfulqa:mc|0|0', num_fewshot_seeds=1)), {
WARNING:lighteval.logging.hierarchical_logger: Test all gather {
WARNING:lighteval.logging.hierarchical_logger: Test gather tensor
WARNING:lighteval.logging.hierarchical_logger: gathered_tensor tensor([0], device='cuda:0'), should be [0]
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.005391]
WARNING:lighteval.logging.hierarchical_logger: Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger: } [0:00:00.000021]
WARNING:lighteval.logging.hierarchical_logger: Model loading {
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
config.json: 100% 4.79k/4.79k [00:00<00:00, 25.7MB/s]
WARNING:lighteval.logging.hierarchical_logger: No max length config setting is found in the model or tokenizer. max_length set to 2048.
tokenizer_config.json: 100% 1.22k/1.22k [00:00<00:00, 9.84MB/s]
vocab.json: 100% 798k/798k [00:00<00:00, 3.27MB/s]
merges.txt: 100% 456k/456k [00:00<00:00, 61.2MB/s]
tokenizer.json: 100% 2.11M/2.11M [00:00<00:00, 28.1MB/s]
special_tokens_map.json: 100% 280/280 [00:00<00:00, 1.87MB/s]
WARNING:lighteval.logging.hierarchical_logger: Tokenizer truncation and padding size set to the left side.
WARNING:lighteval.logging.hierarchical_logger: We are not in a distributed setting. Setting model_parallel to False.
WARNING:lighteval.logging.hierarchical_logger: Model parallel was set to False, max memory set to None and device map to None
WARNING:lighteval.logging.hierarchical_logger: } [0:00:01.846038]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:01.854152]
Traceback (most recent call last):
File "/content/lighteval/run_evals_accelerate.py", line 82, in <module>
main(args)
File "/usr/local/lib/python3.10/dist-packages/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/lighteval/main_accelerate.py", line 77, in main
model, model_info = load_model(config=model_config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 83, in load_model
return load_model_with_accelerate_or_default(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/model_loader.py", line 125, in load_model_with_accelerate_or_default
model = BaseModel(config=config, env_config=env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 82, in __init__
self.model = self._create_auto_model(config, env_config)
File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 176, in _create_auto_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.encoder_decoder.configuration_encoder_decoder.EncoderDecoderConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CohereConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DbrxConfig, ElectraConfig, ErnieConfig, FalconConfig, FuyuConfig, GemmaConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, JambaConfig, LlamaConfig, MambaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MptConfig, MusicgenConfig, MusicgenMelodyConfig, MvpConfig, OlmoConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, Qwen2MoeConfig, RecurrentGemmaConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, Starcoder2Config, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.
Traceback (most recent call last):
File "/usr/local/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 46, in main
args.func(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1082, in launch_command
simple_launcher(args)
File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 688, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'run_evals_accelerate.py', '--model_args', 'pretrained=Bachstelze/instructionRoberta-base', '--tasks', 'lighteval|truthfulqa:mc|0|0', '--override_batch_size', '1', '--output_dir=./evals/']' returned non-zero exit status 1.
Hi! At the moment we only support `AutoModelForCausalLM` models (autoregressive models), not `AutoModelForSeq2SeqLM` models (seq2seq models), as they are considerably less used. We might add them in the future!
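If you need generations in the meantime, the checkpoint can still be loaded and run directly with transformers outside lighteval (a rough sketch, not a lighteval integration; the prompt and generation settings are illustrative, and the model may additionally need `decoder_start_token_id` set in its config):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

name = "Bachstelze/instructionRoberta-base"  # model from the issue
tokenizer = AutoTokenizer.from_pretrained(name)
# Load via the encoder-decoder class that AutoModelForCausalLM rejects.
model = EncoderDecoderModel.from_pretrained(name)

inputs = tokenizer("What is the capital of France?", return_tensors="pt")
# generate() on an EncoderDecoderModel requires decoder_start_token_id
# to be set in the model config; set it manually if the checkpoint lacks it.
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```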