TechxGenus

11 comments by TechxGenus

Thanks for your reply. Closing this issue.

This issue still seems to be unresolved. Inference with the AWQ model is back to normal now, but errors still occur when trying to quantize the Llama or Gemma models.
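For context, a minimal sketch of the standard AutoAWQ quantization flow I mean; the model path and quant config below are example values, not the exact ones from my run:

```
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

# Example model and config; substitute the Llama/Gemma checkpoint that fails for you.
model_path = "meta-llama/Meta-Llama-3-8B"
quant_path = "llama-3-8b-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the FP16 model and tokenizer.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantization is the step that errors out in my environment.
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized weights and tokenizer.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```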

I found the same problem. It occurs when n (the number of returned sequences) is set greater than 1, and it happens more often when GPU memory is limited. A simple solution...
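A sketch of the kind of setup that triggers it, assuming the n in question is vLLM's SamplingParams.n (the model name and GPU memory fraction below are placeholders):

```
from vllm import LLM, SamplingParams

# Placeholder model; the key points are n > 1 and a tight GPU memory budget.
llm = LLM(model="meta-llama/Llama-2-7b-hf", gpu_memory_utilization=0.5)

# n > 1 (multiple returned sequences per prompt) is the condition under which I see it.
params = SamplingParams(n=4, temperature=0.8, max_tokens=256)

outputs = llm.generate(["def max(arr):"], params)
for out in outputs:
    for seq in out.outputs:
        print(seq.text)
```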

I checked its architecture, and implementing basic quantization shouldn't be very hard. However, its position encoding is special (LongRoPE), so implementing the fused layers may need more work.
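For reference, the position-encoding setup is easy to inspect from the config via transformers; a quick sketch (the model id below is just an example of a model using LongRoPE-style scaling, not necessarily the one discussed here):

```
from transformers import AutoConfig

# Example: Phi-3 is one model family that uses LongRoPE-style rope scaling.
config = AutoConfig.from_pretrained("microsoft/Phi-3-mini-128k-instruct", trust_remote_code=True)

# rope_scaling carries the long/short factors used by the special position encoding.
print(config.rope_scaling)
print(config.max_position_embeddings)
```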

I think this is the best way to scale for Cohere:

```
from .base import BaseAWQForCausalLM
from transformers.models.cohere.modeling_cohere import (
    CohereDecoderLayer as OldCohereDecoderLayer,
    CohereForCausalLM as OldCohereForCausalLM,
)

class CohereAWQForCausalLM(BaseAWQForCausalLM):
    layer_type...
```
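To make the shape of the full definition clearer, here is a rough sketch of how the class could continue (reusing the imports above), following the pattern of the other AutoAWQ model files; the method names and scaling groups are my assumptions and should be checked against the actual code:

```
class CohereAWQForCausalLM(BaseAWQForCausalLM):
    layer_type = "CohereDecoderLayer"
    max_seq_len_key = "max_position_embeddings"

    @staticmethod
    def get_model_layers(model: OldCohereForCausalLM):
        return model.model.layers

    @staticmethod
    def get_act_for_scaling(module: OldCohereDecoderLayer):
        return dict(is_scalable=False)

    @staticmethod
    def move_embed(model: OldCohereForCausalLM, device: str):
        model.model.embed_tokens = model.model.embed_tokens.to(device)

    @staticmethod
    def get_layers_for_scaling(module: OldCohereDecoderLayer, input_feat, module_kwargs):
        layers = []

        # Cohere runs attention and MLP in parallel off a single input_layernorm
        # (there is no post_attention_layernorm), so the attention input projections
        # and the MLP input projections share one scaling group.
        layers.append(
            dict(
                prev_op=module.input_layernorm,
                layers=[
                    module.self_attn.q_proj,
                    module.self_attn.k_proj,
                    module.self_attn.v_proj,
                    module.mlp.gate_proj,
                    module.mlp.up_proj,
                ],
                inp=input_feat["self_attn.q_proj"],
                module_kwargs=module_kwargs,
            )
        )

        # Attention output projection.
        layers.append(
            dict(
                prev_op=module.self_attn.v_proj,
                layers=[module.self_attn.o_proj],
                inp=input_feat["self_attn.o_proj"],
            )
        )

        # MLP output projection.
        layers.append(
            dict(
                prev_op=module.mlp.up_proj,
                layers=[module.mlp.down_proj],
                inp=input_feat["mlp.down_proj"],
            )
        )

        return layers
```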

Hi @casper-hansen, I tested it with transformers and it works well. It is marked as a draft because the fused-layer implementation is still missing. I don't have enough hardware to write and test it at the moment. Maybe...
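For anyone who wants to repeat the transformers-side check, a minimal sketch (the checkpoint path is a placeholder for a locally saved AWQ-quantized model):

```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path to an AWQ-quantized checkpoint.
# Requires autoawq to be installed so transformers can load the AWQ weights.
quant_path = "cohere-awq"

tokenizer = AutoTokenizer.from_pretrained(quant_path)
model = AutoModelForCausalLM.from_pretrained(quant_path, device_map="auto")

inputs = tokenizer("def max(arr):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```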

Hi @casper-hansen, this PR has been tested in both modes and is ready to merge.

Amazing work! I initially tested Jamba-v0.1 on a machine with 500G RAM and it worked great!

```
./main -m ./Jamba-v0.1-hf-00001-of-00024.gguf -n 120 --prompt "def max(arr):" --temp 0
Log start
main:...
```

Although this issue has little impact on the training results, it significantly affects the reproducibility of experiments across different hardware configurations. I hope it can be resolved together with gradient accumulation. I...