AttributeError: 'NoneType' object has no attribute 'to'
I am trying to prune with
python main.py \
    --model mistralai/Mistral-7B-Instruct-v0.2 \
    --prune_method wanda \
    --sparsity_ratio 0.5 \
    --sparsity_type unstructured \
    --save out/mistral_7b/unstructured/wanda/
and the output is as below.
torch 2.3.0
transformers 4.41.0.dev0
accelerate 0.31.0.dev0
# of gpus: 2
loading llm model mistralai/Mistral-7B-Instruct-v0.2
Loading checkpoint shards: 67%|██████▋ | 2/3 [00:46<00:21, ...] (progress output truncated)
use device cuda:0
pruning starts
loading calibdation data
dataset loading complete
Traceback (most recent call last):
File "/mnt/parscratch/users/acq22stk/teamproject/wanda/main.py", line 110, in
It seems you are using the Mistral model. For a model with a different definition file (https://github.com/huggingface/transformers/blob/v4.40.0/src/transformers/models/mistral/modeling_mistral.py), the codebase needs to be adapted.
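For context, the traceback appears to point at the place where the pruning code caches the attention_mask / position_ids captured from the first decoder layer and later moves them to another GPU: with recent transformers versions, Mistral's decoder layers can receive attention_mask=None, so the cached value is None and the later .to(dev) call raises exactly the AttributeError in the title. Below is a minimal sketch of a defensive adaptation, assuming the repo's prepare_calibration_input / prune_wanda structure (the exact lines may differ in your checkout):

# Sketch (assumption: `cache` mirrors the dict filled by the Catcher module in
# prepare_calibration_input, which records the kwargs seen by the first decoder layer).
attention_mask = cache.get("attention_mask")   # can be None for Mistral on newer transformers
position_ids = cache.get("position_ids")

# When layers are spread over several GPUs, only move tensors that actually exist:
if f"model.layers.{i}" in model.hf_device_map:
    dev = model.hf_device_map[f"model.layers.{i}"]
    inps, outs = inps.to(dev), outs.to(dev)
    if attention_mask is not None:
        attention_mask = attention_mask.to(dev)
    if position_ids is not None:
        position_ids = position_ids.to(dev)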
I could see magnitude pruning working on the same Mistral model.
I am hitting the same issue. Where should we change the code to accommodate models with different structures?
Hi @Shinning-Zhou and @kast424,
Did you figure out how to work around this issue? I am facing a similar error when working with llama-2-7b-chat (#67).
I removed the attention_mask variable because llama doesn't have it
Thanks, that resolved the issue :)
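For anyone hitting this later: "removing the attention_mask variable" can also be done non-destructively, by only forwarding the mask to the layer when it is not None. A rough sketch of the per-layer forward call in prune_wanda (names follow the repo; treat the exact signature as an assumption for your transformers version):

for j in range(args.nsamples):
    with torch.no_grad():
        # Build kwargs dynamically so a None attention_mask is simply omitted.
        layer_kwargs = {"position_ids": position_ids}
        if attention_mask is not None:
            layer_kwargs["attention_mask"] = attention_mask
        outs[j] = layer(inps[j].unsqueeze(0), **layer_kwargs)[0]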
Hi @Shinning-Zhou, @kast424
I am trying to run this repo on Mistral but am facing a different error. Is there a fix for this yet, i.e., can we prune Mistral with SparseGPT and Wanda?
Here's the issue in more detail - #68