Ar57m
Before, I tried slicing the tensors to match the smaller base model's size, but that seems very unstable even at low weights (like 0.01, 0.005), so in this branch, breaking_math, I...
> This is really interesting! I'll definitely have to play with it a bit. > > I'm a little surprised this kind of interpolation isn't more immediately harmful to the...
I did Sheared-LLaMA-2.7B-ShareGPT + SanjiWatsuki/Silicon-Maid-7B at 10%, if you guys wanna test it: [Aryanne/sheared-silicon10p](https://huggingface.co/Aryanne/sheared-silicon10p). If I did everything right (and didn't break anything), this should work better than before. But I'm...
here's some code that shows what bilinear/linear is doing:
```
import torch
import torch.nn.functional as F
torch.manual_seed(42)
x = torch.tensor([[1., 3.],[5., 7.],[9.,11.],[13.,15.]])
y = torch.tensor([[2.0, 4.0],[6.0,8.0],[10.,12.]])
print("X :")
print(x)
print("\nBase...
```
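For anyone without torch handy, here's a rough pure-Python sketch of the same idea: linearly resample the rows of the bigger tensor down to the base tensor's row count, then blend at a small weight. The helper names `resize_rows_linear` and `blend` are made up for illustration, the sampling aligns the endpoints (which is not the default behaviour of `F.interpolate`), and the blend formula is my assumption about how a "10%" weight would be applied, not necessarily what the branch does.

```python
def resize_rows_linear(rows, new_len):
    """Linearly resample a list of equal-length rows to new_len rows.

    Assumption: first and last rows are kept fixed (endpoint-aligned sampling).
    """
    old_len = len(rows)
    if new_len == 1 or old_len == 1:
        return [list(rows[0]) for _ in range(new_len)]
    out = []
    for i in range(new_len):
        pos = i * (old_len - 1) / (new_len - 1)  # fractional source row index
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        # weighted average of the two neighbouring source rows
        out.append([(1 - frac) * a + frac * b for a, b in zip(rows[lo], rows[hi])])
    return out


def blend(base, other, weight):
    """Hypothetical merge step: (1 - weight) * base + weight * other, elementwise."""
    return [
        [(1 - weight) * b + weight * o for b, o in zip(base_row, other_row)]
        for base_row, other_row in zip(base, other)
    ]


# Same toy values as the torch snippet: x is 4x2 (bigger), y is 3x2 (base).
x = [[1., 3.], [5., 7.], [9., 11.], [13., 15.]]
y = [[2., 4.], [6., 8.], [10., 12.]]

resized = resize_rows_linear(x, len(y))  # 4 rows -> 3 rows
merged = blend(y, resized, 0.1)          # 10% of the resized tensor

print("resized:", resized)
print("merged :", merged)
```

The middle row of `resized` is the average of rows 1 and 2 of `x`, so values that never existed in the bigger model end up in the merge, which may be part of why this degrades things.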
> check thanks for the check. This branch seems to degrade the base model a lot (I submitted some trials using this branch and the other, weighting_interp, to the open-llm-leaderboard,...
idk, but this doesn't seem very promising at the moment, don't know if there's a good way of merging models with different sizes. btw @cg123, I've been messing with...
@david565656 hey, this PR is what I was trying in the breaking_math and weighting_interp branches. [Here](https://huggingface.co/Aryanne/9b_plus_tie13/tree/main) I merged a 13b into a 9b, but it probably made the model invent words,...
> Hi, I am trying to merge two models following this [post](https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54). Here is my config: > > ``` > yaml_config = """ > slices: > - sources: > -...
> @cg123 I set `clone_tensors=True` in `MergeOptions` class and still got the same error > > It seems you wrote `clone_tensor`, but it's `clone_tensors`.
> I'm trying to merge fine tuned safetensors into a model so I can save it as a gguf format. The directory contains adapter_config.json and adapter_model.safetensors, but the error message...