TechxGenus

Results: 12 issues by TechxGenus

Congratulations on coming up with such an excellent quantization algorithm! I'm trying to use AQLM to quantize Deepseek-Coder and Starcoder2, but the repository doesn't seem to have direct support. Are...

Congratulations to DeepSeek for the wonderful work. I wonder if there is a script for fine-tuning DeepSeek-VL? Thanks!

Does the DeepSeek-VL series support multiple images as input? This doesn't seem to be stated in the paper, but the `images` field in the example script is a `list`, which suggests it is supported.

**Is your feature request related to a problem? Please describe.**

Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. It delivers throughput gains over traditional Transformer-based models, while outperforming or matching the...

enhancement

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [x] I am running the latest code. Development is very rapid so there are no tagged...

enhancement
model

Add support for Jamba. The fusion module is not currently implemented. Its Mamba layer, attention layer, and MoE layer all differ from the implementations in the existing CUDA kernels and...

When loading starcoder2-AWQ using transformers, I received a confusing error:

```py
model = AutoModelForCausalLM.from_pretrained(
    "TechxGenus/starcoder2-3b-AWQ",
    torch_dtype=torch.float16,
    device_map="auto",
)
```

get:

> Some weights of the model checkpoint at TechxGenus/starcoder2-3b-AWQ were...

### Model introduction

We've fine-tuned Gemma-2b with an additional 0.7 billion high-quality, code-related tokens for 3 epochs. This model operates using the Alpaca instruction format (excluding the system prompt).

###...

model eval

### Model introduction

We've fine-tuned Gemma-2b with an additional 0.7 billion high-quality, code-related tokens for 3 epochs. This model operates using the Alpaca instruction format (excluding the system prompt).

###...

model eval