[report bug] TypeError when running inference with Mistral models

Open shrango opened this issue 1 year ago • 0 comments

I hit the following error when running inference with Mistral-7b-Instruct-v0.2 as the base_model:

  File "~/Medusa/medusa/model/modeling_mistral_kv.py", line 74, in _make_sliding_window_causal_mask
    mask = torch.triu(mask, diagonal=-sliding_window)
                                     ^^^^^^^^^^^^^^^
TypeError: bad operand type for unary -: 'NoneType'

I think I have found the cause: sliding_window is null (None once loaded in Python) for both Mistral-7b-Instruct-v0.2 and Mistral-7b-Instruct-v0.3, so negating it with the unary minus raises the TypeError above.
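
The failure can be reproduced without loading a model. This is a minimal sketch, assuming the value arrives as a JSON null; the one-field config string below is a hypothetical excerpt for illustration, not the models' full config.

```python
import json

# Hypothetical one-field excerpt; a real model config has many more fields.
config = json.loads('{"sliding_window": null}')
sliding_window = config["sliding_window"]  # JSON null becomes Python None

try:
    diagonal = -sliding_window  # same unary minus as in the failing line
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The printed message should match the last line of the traceback, which points at the None value rather than at torch.triu itself.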

To fix it, I suggest the author guard the call at line 74 of "medusa/model/modeling_mistral_kv.py" so that the band is only applied when a window is set. That is, change

mask = torch.triu(mask, diagonal=-sliding_window)

into

if sliding_window is not None:
    mask = torch.triu(mask, diagonal=-sliding_window)
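
To show the effect of the guard, here is a simplified sketch. banded_causal_mask is a hypothetical helper written for this issue, not the actual _make_sliding_window_causal_mask from Medusa, and it returns a plain 0/1 mask instead of the additive mask the model uses.

```python
from typing import Optional

import torch


def banded_causal_mask(seq_len: int, sliding_window: Optional[int] = None) -> torch.Tensor:
    """Hypothetical helper: 1 where query position i may attend to key position j."""
    # Causal part: position i attends to positions j <= i.
    mask = torch.tril(torch.ones((seq_len, seq_len), dtype=torch.long))
    # Band part: also require i - j <= sliding_window; skipped when no window is set.
    if sliding_window is not None:
        mask = torch.triu(mask, diagonal=-sliding_window)
    return mask


full = banded_causal_mask(4)                      # sliding_window=None, plain causal mask
banded = banded_causal_mask(4, sliding_window=2)  # each row keeps at most 3 positions
print(full.tolist())
print(banded.tolist())
```

With sliding_window=None the function falls back to an ordinary causal mask, which is what I would expect for a model that does not use sliding-window attention.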

shrango avatar Oct 10 '24 21:10 shrango