DewEfresh
I'm having an issue trying to get mamba running on 3x P40s. The model loads into VRAM but then crashes with "RuntimeError: CUDA error: no kernel image is available for...
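A minimal diagnostic sketch, assuming the "no kernel image is available" error comes from the custom CUDA kernels not being built for the P40's compute capability (sm_61); the rebuild flag at the end is a guess at a fix, not something the maintainers have confirmed:

```python
# Check which architectures the installed PyTorch build targets vs. the GPUs present.
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"GPU {i}: {torch.cuda.get_device_name(i)} -> sm_{major}{minor}")

# Architectures the current PyTorch build was compiled for.
print("Compiled arch list:", torch.cuda.get_arch_list())

# If sm_61 is missing above, the extension kernels likely need rebuilding, e.g.
#   TORCH_CUDA_ARCH_LIST="6.1" pip install --no-build-isolation -e .
# (assumed workaround, not verified against this repo)
```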
I made a Colab ( https://colab.research.google.com/drive/1Z3NdoT0WS8KXnSUS3_xxT39NBZD6eGcN?usp=sharing ) to test and ran into an issue: GemmaModel.forward() got an unexpected keyword argument 'cache_position'. I had to change some of main.py to...
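A sketch of one possible workaround, assuming the error comes from a newer `transformers` generation loop passing `cache_position` to a forward() that doesn't accept it; dropping the kwarg (or pinning an older `transformers` release) is my assumption, not necessarily the change made in main.py:

```python
import inspect

def call_forward_compat(model, **kwargs):
    """Call the model with only the kwargs its forward() signature accepts."""
    accepted = set(inspect.signature(model.forward).parameters)
    filtered = {k: v for k, v in kwargs.items() if k in accepted}
    return model(**filtered)

# Alternatively, pinning a transformers version from before `cache_position`
# was introduced avoids the mismatch entirely.
```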
I was testing in Colab, and when I ran model.model.layers[0].mlp.gate_proj.weight I received very different results from yours. You got: Parameter containing: tensor([[ 0.0032, -0.0339, 0.0150, ..., 0.0041, -0.0048, 0.0061], [-0.0105,...
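A small reproducibility sketch, assuming the difference might come from the load path (e.g. a randomly initialised model vs. the same pretrained checkpoint) rather than the library itself; "path_or_repo_id" is a placeholder for whichever checkpoint the Colab actually uses:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the same checkpoint twice and compare the tensor directly.
a = AutoModelForCausalLM.from_pretrained("path_or_repo_id", torch_dtype=torch.float32)
b = AutoModelForCausalLM.from_pretrained("path_or_repo_id", torch_dtype=torch.float32)

w_a = a.model.layers[0].mlp.gate_proj.weight
w_b = b.model.layers[0].mlp.gate_proj.weight
print(torch.allclose(w_a, w_b))  # True -> the checkpoint loads deterministically
```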
Has any work been done with state space models? I'd be curious how they would perform with this framework applied.
What are your thoughts about adding BitLinear?
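For context, a minimal BitLinear-style sketch in the BitNet b1.58 spirit (ternary {-1, 0, +1} weights with an absmean scale and a straight-through estimator); this is a generic illustration of the idea, not a layer this repo ships:

```python
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    def forward(self, x):
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-5)
        # Ternarise the weights, then rescale back.
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator so gradients flow to the full-precision weights.
        w_q = w + (w_q - w).detach()
        return F.linear(x, w_q, self.bias)
```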
I am trying to condense a model by 1/4. I want to merge the 4th layer over the previous 3 layers. When I try this I get 0 layers on...
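A sketch of one way this could be done, guessing at the intent: fold every 4th decoder layer into the layer before it by averaging parameters, keeping 3 out of every 4 layers. The helper name and the averaging scheme are my assumptions; forgetting to update num_hidden_layers afterwards is one common way to end up with 0 layers:

```python
import copy
import torch
import torch.nn as nn

def merge_every_fourth(layers: nn.ModuleList) -> nn.ModuleList:
    kept = []
    for i, layer in enumerate(layers):
        if (i + 1) % 4 == 0 and kept:
            # Average this layer's parameters into the previously kept layer.
            target = kept[-1]
            with torch.no_grad():
                for p_t, p_s in zip(target.parameters(), layer.parameters()):
                    p_t.copy_((p_t + p_s) / 2)
        else:
            kept.append(copy.deepcopy(layer))
    return nn.ModuleList(kept)

# model.model.layers = merge_every_fourth(model.model.layers)
# model.config.num_hidden_layers = len(model.model.layers)
```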
I came across another repo, https://github.com/astramind-ai/BitMat/tree/main. They convert existing models to 1.58-bit. Would something similar be possible with matmulfreellm?
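A rough sketch of what such a conversion could look like: walk an existing model and swap each nn.Linear for a ternary replacement (e.g. the BitLinear sketch above), copying the pretrained weights over. Whether this preserves quality for matmulfreellm without retraining is exactly the open question, not something this shows:

```python
import torch.nn as nn

def convert_linears(model: nn.Module, bitlinear_cls) -> nn.Module:
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            new = bitlinear_cls(child.in_features, child.out_features,
                                bias=child.bias is not None)
            new.weight.data.copy_(child.weight.data)
            if child.bias is not None:
                new.bias.data.copy_(child.bias.data)
            setattr(model, name, new)
        else:
            convert_linears(child, bitlinear_cls)
    return model
```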