FastChat doesn't support CohereForAI/c4ai-command-r-plus-4bit
Some weights of the model checkpoint at CohereForAI/c4ai-command-r-plus-4bit were not used when initializing CohereForCausalLM: the `self_attn.q_norm.weight` and `self_attn.k_norm.weight` tensors of every layer, `model.layers.0` through `model.layers.63`.
- This IS expected if you are initializing CohereForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CohereForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-04-16 11:23:21 | WARNING | accelerate.big_modeling | You shouldn't move a model that is dispatched using accelerate hooks.
2024-04-16 11:23:21 | ERROR | stderr | Traceback (most recent call last):
2024-04-16 11:23:21 | ERROR | stderr | File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
2024-04-16 11:23:21 | ERROR | stderr | return _run_code(code, main_globals, None,
2024-04-16 11:23:21 | ERROR | stderr | File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
2024-04-16 11:23:21 | ERROR | stderr | exec(code, run_globals)
2024-04-16 11:23:21 | ERROR | stderr | File "/opt/FastChat/fastchat/serve/model_worker.py", line 414, in <module>
2024-04-16 11:23:21 | ERROR | stderr | args, worker = create_model_worker()
2024-04-16 11:23:21 | ERROR | stderr | File "/opt/FastChat/fastchat/serve/model_worker.py", line 385, in create_model_worker
2024-04-16 11:23:21 | ERROR | stderr | worker = ModelWorker(
2024-04-16 11:23:21 | ERROR | stderr | File "/opt/FastChat/fastchat/serve/model_worker.py", line 77, in __init__
2024-04-16 11:23:21 | ERROR | stderr | self.model, self.tokenizer = load_model(
2024-04-16 11:23:21 | ERROR | stderr | File "/opt/FastChat/fastchat/model/model_adapter.py", line 376, in load_model
2024-04-16 11:23:21 | ERROR | stderr | model.to(device)
2024-04-16 11:23:21 | ERROR | stderr | File "/usr/local/lib/python3.9/dist-packages/accelerate/big_modeling.py", line 456, in wrapper
2024-04-16 11:23:21 | ERROR | stderr | return fn(*args, **kwargs)
2024-04-16 11:23:21 | ERROR | stderr | File "/usr/local/lib/python3.9/dist-packages/transformers/modeling_utils.py", line 2554, in to
2024-04-16 11:23:21 | ERROR | stderr | raise ValueError(
2024-04-16 11:23:21 | ERROR | stderr | ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.
Could you please explain how you specify the --conv-template?
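In case it helps anyone landing here: as far as I understand, `--conv-template` is passed on the worker command line and names a prompt template registered in `fastchat/conversation.py`. The exact template name to use for this model may differ by FastChat version — `one_shot` below is only an illustrative choice:

```shell
# Illustrative invocation; verify the flags and template name
# against your installed FastChat version.
python3 -m fastchat.serve.model_worker \
    --model-path CohereForAI/c4ai-command-r-plus-4bit \
    --conv-template one_shot
```

Passing an unregistered template name should fail with an error, which is a quick way to discover what your version accepts.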