TechxGenus

Results: 12 issues by TechxGenus

Congratulations on coming up with such an excellent quantization algorithm! I'm trying to use AQLM to quantize Deepseek-Coder and Starcoder2, but the repository doesn't seem to have direct support. Are...

Congratulations to DeepSeek for the wonderful work. I wonder if there is a script for fine-tuning DeepSeek-VL? Thanks!

Does the DeepSeek-VL series support multiple images as input? This doesn't seem to be stated in the paper, but the `images` field in the example script is a `list`, which suggests it is supported.

**Is your feature request related to a problem? Please describe.**

Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. It delivers throughput gains over traditional Transformer-based models, while outperforming or matching the...

enhancement

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [x] I am running the latest code. Development is very rapid so there are no tagged...

enhancement
model

Add support for Jamba. The fusion module is not currently implemented. Its Mamba layer, attention layer, and MoE layer all differ from the implementations in the existing CUDA kernels and...

When loading starcoder2-AWQ using transformers, I received a confusing error:

```py
model = AutoModelForCausalLM.from_pretrained(
    "TechxGenus/starcoder2-3b-AWQ",
    torch_dtype=torch.float16,
    device_map="auto",
)
```

get:

> Some weights of the model checkpoint at TechxGenus/starcoder2-3b-AWQ were...

### Model introduction

We've fine-tuned Gemma-2b with an additional 0.7 billion high-quality, code-related tokens for 3 epochs. This model operates using the Alpaca instruction format (excluding the system prompt).

###...

model eval

### Model introduction

We've fine-tuned Gemma-2b with an additional 0.7 billion high-quality, code-related tokens for 3 epochs. This model operates using the Alpaca instruction format (excluding the system prompt).

###...

model eval