DewEfresh
I'm having an issue trying to get mamba running on 3x P40s. The model loads into VRAM but then crashes with "RuntimeError: CUDA error: no kernel image is available for...
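A minimal diagnostic sketch, assuming the "no kernel image is available" error comes from the custom CUDA kernels not being built for the P40's compute capability (sm_61); the rebuild flag at the end is a guess at a fix, not something the maintainers have confirmed:

```python
# Check which architectures the installed PyTorch build targets vs. the GPUs present.
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"GPU {i}: {torch.cuda.get_device_name(i)} -> sm_{major}{minor}")

# Architectures the current PyTorch build was compiled for.
print("Compiled arch list:", torch.cuda.get_arch_list())

# If sm_61 is missing above, the extension kernels likely need rebuilding, e.g.
#   TORCH_CUDA_ARCH_LIST="6.1" pip install --no-build-isolation -e .
# (assumed workaround, not verified against this repo)
```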
I made a Colab ( https://colab.research.google.com/drive/1Z3NdoT0WS8KXnSUS3_xxT39NBZD6eGcN?usp=sharing ) to test and ran into an issue: GemmaModel.forward() got an unexpected keyword argument 'cache_position'. I had to change some of main.py to...
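A sketch of one possible workaround, assuming the error comes from a newer `transformers` generation loop passing `cache_position` to a forward() that doesn't accept it; dropping the kwarg (or pinning an older `transformers` release) is my assumption, not necessarily the change made in main.py:

```python
import inspect

def call_forward_compat(model, **kwargs):
    """Call the model with only the kwargs its forward() signature accepts."""
    accepted = set(inspect.signature(model.forward).parameters)
    filtered = {k: v for k, v in kwargs.items() if k in accepted}
    return model(**filtered)

# Alternatively, pinning a transformers version from before `cache_position`
# was introduced avoids the mismatch entirely.
```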
I was testing in Colab, and when I ran model.model.layers[0].mlp.gate_proj.weight I received very different results from yours. You got: Parameter containing: tensor([[ 0.0032, -0.0339, 0.0150, ..., 0.0041, -0.0048, 0.0061], [-0.0105,...
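A small reproducibility sketch, assuming the difference might come from the load path (e.g. a randomly initialised model vs. the same pretrained checkpoint) rather than the library itself; "path_or_repo_id" is a placeholder for whichever checkpoint the Colab actually uses:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the same checkpoint twice and compare the tensor directly.
a = AutoModelForCausalLM.from_pretrained("path_or_repo_id", torch_dtype=torch.float32)
b = AutoModelForCausalLM.from_pretrained("path_or_repo_id", torch_dtype=torch.float32)

w_a = a.model.layers[0].mlp.gate_proj.weight
w_b = b.model.layers[0].mlp.gate_proj.weight
print(torch.allclose(w_a, w_b))  # True -> the checkpoint loads deterministically
```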
Has any work been done with state space models? I'd be curious how they would perform with this framework applied.
What are your thoughts about adding BitLinear?
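For context, a minimal BitLinear-style sketch in the BitNet b1.58 spirit (ternary {-1, 0, +1} weights with an absmean scale and a straight-through estimator); this is a generic illustration of the idea, not a layer this repo ships:

```python
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    def forward(self, x):
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-5)
        # Ternarise the weights, then rescale back.
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator so gradients flow to the full-precision weights.
        w_q = w + (w_q - w).detach()
        return F.linear(x, w_q, self.bias)
```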
I am trying to condense a model by 1/4. I want to merge the 4th layer over the previous 3 layers. When I try this I get 0 layers on...
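A sketch of one way this could be done, guessing at the intent: fold every 4th decoder layer into the layer before it by averaging parameters, keeping 3 out of every 4 layers. The helper name and the averaging scheme are my assumptions; forgetting to update num_hidden_layers afterwards is one common way to end up with 0 layers:

```python
import copy
import torch
import torch.nn as nn

def merge_every_fourth(layers: nn.ModuleList) -> nn.ModuleList:
    kept = []
    for i, layer in enumerate(layers):
        if (i + 1) % 4 == 0 and kept:
            # Average this layer's parameters into the previously kept layer.
            target = kept[-1]
            with torch.no_grad():
                for p_t, p_s in zip(target.parameters(), layer.parameters()):
                    p_t.copy_((p_t + p_s) / 2)
        else:
            kept.append(copy.deepcopy(layer))
    return nn.ModuleList(kept)

# model.model.layers = merge_every_fourth(model.model.layers)
# model.config.num_hidden_layers = len(model.model.layers)
```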
I came across another repo, https://github.com/astramind-ai/BitMat/tree/main. They convert existing models to 1.58-bit. Would something similar be possible with matmulfreellm?
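A rough sketch of what such a conversion could look like: walk an existing model and swap each nn.Linear for a ternary replacement (e.g. the BitLinear sketch above), copying the pretrained weights over. Whether this preserves quality for matmulfreellm without retraining is exactly the open question, not something this shows:

```python
import torch.nn as nn

def convert_linears(model: nn.Module, bitlinear_cls) -> nn.Module:
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            new = bitlinear_cls(child.in_features, child.out_features,
                                bias=child.bias is not None)
            new.weight.data.copy_(child.weight.data)
            if child.bias is not None:
                new.bias.data.copy_(child.bias.data)
            setattr(model, name, new)
        else:
            convert_linears(child, bitlinear_cls)
    return model
```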