SAM-Adapter-PyTorch

How to use other versions of SAM2, such as the tiny version

Open Deepleen opened this issue 9 months ago • 7 comments

Hello, I see that your work targets the large version of SAM2, but I need to experiment with the tiny version. I modified hieradet.py and image_encoder.py by referring to the tiny configuration in the official SAM2 code, and the experiment runs; however, while the loss for the large version is small from the start, the loss for the tiny version starts at more than 200. What other work do I need to do to load the tiny version of SAM2 into the SAM2 Adapter? I would appreciate it if you could answer me.

Here is the loss curve for the SAM2 Adapter loaded with the tiny version of SAM2: [image]

Here is the loss curve for the SAM2 Adapter loaded with the large version of SAM2: [image]

Deepleen avatar Apr 20 '25 03:04 Deepleen


From your question, it seems you successfully loaded the SAM2 checkpoint and started training, but I'm stuck at that step. Even though I downloaded the code from the SAM2 Adapter branch, I noticed that the model name in the official YAML file is still "sam". Did you make any changes to the official configuration file? I'm sorry that I can't answer your question and have ended up asking you one instead. If you can help me solve my problem, I'll do my best to help you as well! Thank you!

APushingBoy avatar Apr 30 '25 08:04 APushingBoy


OK. You can refer to the parameters in the configuration files of the official SAM2 code (e.g. sam2_hiera_s.yaml) to modify hieradet.py and image_encoder.py in the SAM2 Adapter. Since the SAM2 Adapter is built around the large version, you can compare sam2_hiera_l.yaml and sam2_hiera_s.yaml to find the parameters that differ and change those two Python files accordingly. I ran into problems with training after modifying those two files, so I don't know whether my modifications are wrong or whether other things also have to be changed, but you can try this method first.
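For reference, here is a rough side-by-side of the Hiera trunk parameters that differ between the large and tiny configs. The values are from my memory of the official sam2_hiera_l.yaml and sam2_hiera_t.yaml, so please double-check them against your copy of the SAM2 repo:

```python
# Hiera trunk settings that differ between SAM2 model sizes (recalled from the
# official configs -- verify against sam2_hiera_l.yaml / sam2_hiera_t.yaml).
hiera_large = dict(
    embed_dim=144,
    num_heads=2,
    stages=(2, 6, 36, 4),            # number of blocks in each of the 4 stages
    global_att_blocks=(23, 33, 43),
    window_spec=(8, 4, 16, 8),
)
hiera_tiny = dict(
    embed_dim=96,
    num_heads=1,
    stages=(1, 2, 7, 2),
    global_att_blocks=(5, 7, 9),
    # window_spec is left at the Hiera default in the tiny config, as I recall
)
# The FPN neck's backbone_channel_list also follows embed_dim:
#   large: [1152, 576, 288, 144]    tiny: [768, 384, 192, 96]
# (For the small model, I recall stages=(1, 2, 11, 2) and
#  global_att_blocks=(7, 10, 13), with the same embed_dim=96 as tiny.)
```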

Deepleen avatar May 06 '25 01:05 Deepleen


Thank you for your reply! In the meantime, I read the SAM2 paper and code carefully, and I found that Meta's developers indeed still use the name "sam" in places, but the SAM2 model is actually assembled by a function called build_sam2, which is easy to miss. I also noticed that the difference between SAM2 models of different sizes lies in the number of layers in each module, rather than in the internal settings of each layer (e.g. the kernel size or stride of the convolutional layers), which simplifies my problem.
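In case it helps anyone else reading this, a minimal sketch of loading the tiny model through the official builder looks something like the following (the config and checkpoint names are from the initial SAM2 release; adjust them for your setup):

```python
# Minimal sketch: building the tiny SAM2 model with the official build_sam2.
# Config/checkpoint names are from the initial SAM2 release; adjust as needed.
import torch
from sam2.build_sam import build_sam2

device = "cuda" if torch.cuda.is_available() else "cpu"
sam2_tiny = build_sam2("sam2_hiera_t.yaml", "checkpoints/sam2_hiera_tiny.pt", device=device)

# Sanity check: count parameters to confirm you really got the ~39M-parameter
# tiny model rather than the ~224M-parameter large one.
print(sum(p.numel() for p in sam2_tiny.parameters()))
```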

Next, I will try to modify the SAM2 adaptor based on the SAM2 code itself, and I will also try the tiny-sized SAM2. I will keep you updated on any progress I make!

APushingBoy avatar May 07 '25 01:05 APushingBoy

Hello, it's me again. I've been working on adapting the Hiera structure for the tiny version of SAM2 and am currently encountering difficulties with certain hard-coded parameters within the hieradet.py file.

I understand that you may not be able to share your modified code files directly. However, could you point me to where in hieradet.py I should look to make the necessary modifications for the tiny model variant? Pointing me to the relevant sections or types of parameters would be greatly appreciated.

Thank you very much for your help!


APushingBoy avatar May 08 '25 04:05 APushingBoy


Hello, about an hour after I wrote my last comment, I managed to change the hard-coded parameters in the original SAM2 Adapter so that they suit the tiny version of SAM2.

I am currently training the tiny version of SAM2, using a configuration file based on the official demo.yaml. Although training is not finished yet, my loss value is already very small (around 1.3), unlike yours, which started as high as over 200. Here is my console output:

model_grad_params:3860760
model_total_params:34581976
epoch 1/20, train G: loss=1.3063, val: sm=0.5457, val: em=0.4971, val: wfm=0.2160, val: mae=0.3367, 2.2m 2.2m/43.0m
epoch 2/20, train G: loss=1.2524, val: sm=0.4837, val: em=0.4186, val: wfm=0.1816, val: mae=0.4359, 2.2m 4.3m/43.3m
epoch 3/20, train G: loss=1.2545, val: sm=0.5442, val: em=0.4923, val: wfm=0.2203, val: mae=0.3396, 2.2m 6.5m/43.3m
epoch 4/20, train G: loss=1.2375, val: sm=0.5545, val: em=0.5216, val: wfm=0.2401, val: mae=0.3098, 2.2m 8.7m/43.3m
train:   5%|███▎                                                                | 24/500 [00:05<01:15,  6.33it/s]

I think your problem might lie in the modifications made to hieradet.py. Although I got the training code running, I am not sure whether the way I modified it is optimal. In the second, third, and fourth of the tiny version's four stages, I added a prompt generator before every layer except the first layer of each stage, because I kept hitting errors like RuntimeError: mat1 and mat2 shapes cannot be multiplied (131072x96 and 192x6), which had been a major source of frustration for me. The input dimension of the first layer in stages two, three, and four does not match the input dimension required by the corresponding embedding_generator, so I skipped the first layer of those stages.
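To make the dimension issue concrete, here is a small stand-alone sketch of the bookkeeping I mean (the "attach adapter" rule is just the workaround I described, and the names are placeholders rather than the exact names in the repo):

```python
# Stand-alone sketch of the channel-dim bookkeeping in the tiny Hiera trunk.
# The "attach adapter" rule below is just my workaround; the module names in
# the actual SAM2 Adapter code may differ.
embed_dim = 96            # tiny (the large model uses 144)
stages = (1, 2, 7, 2)     # blocks per stage for tiny (large uses (2, 6, 36, 4))

# The channel dim doubles at the start of every stage after the first, so the
# per-block output dims are 96, 192, 192, 384, ..., 768.
dims, d = [], embed_dim
for stage_idx, depth in enumerate(stages):
    if stage_idx > 0:
        d *= 2
    dims += [d] * depth

# The first block of stages 2-4 still receives the previous stage's dim
# (e.g. 96-dim features entering the 192-dim stage). Feeding that 96-dim input
# into an adapter built for 192 dims is exactly the
# "mat1 and mat2 shapes cannot be multiplied (131072x96 and 192x6)" error.
for block_idx, dim_out in enumerate(dims):
    dim_in = dims[block_idx - 1] if block_idx > 0 else embed_dim
    attach_adapter = dim_in == dim_out   # skip the first block of each new stage
    print(f"block {block_idx:2d}: in={dim_in:4d} out={dim_out:4d} adapter={attach_adapter}")
```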

Furthermore, based on your screenshot, I noticed that your model_grad_params value differs from mine, possibly because we modified the model in different ways. I'm not saying your modification is wrong; I just want to offer it as a point for consideration. I hope this helps.

If you have any questions or thoughts, feel free to discuss them with me!

APushingBoy avatar May 08 '25 06:05 APushingBoy


Hello, I've also been running into difficulties with the config settings in demo.yaml. Could you tell me how to change the hard-coded parameters in the original SAM2 Adapter so that they suit the tiny version of SAM2? Thanks a lot!

LiYufengzz avatar May 15 '25 09:05 LiYufengzz


Judging from your name, you are probably Chinese, so I'll just use Chinese directly. From what I understand so far, this project only modifies SAM2's image encoder: it adds many prompt generators (i.e. the adapters) inside many of the layers of Hiera (which is the image encoder). The gap between SAM2 models of different sizes is mainly a difference in the number of network layers, which is mentioned in the original SAM2 paper. Could you describe your problem more specifically, for example the exact error message you get?
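In the meantime, one quick way to see which hard-coded numbers need to change is to load the official tiny model and print the per-block dims of its Hiera trunk. A rough sketch (the attribute names image_encoder.trunk, blk.dim, blk.dim_out are from my memory of the official SAM2 code, so double-check them in your copy):

```python
# Inspect the official tiny SAM2's Hiera trunk to see where the channel dim
# doubles -- those are the places whose hard-coded sizes differ from the large
# model. Attribute names are recalled from the official SAM2 code; adjust them
# if they differ in your copy.
from sam2.build_sam import build_sam2

model = build_sam2("sam2_hiera_t.yaml", "checkpoints/sam2_hiera_tiny.pt", device="cpu")
trunk = model.image_encoder.trunk
for i, blk in enumerate(trunk.blocks):
    print(i, blk.dim, blk.dim_out)
```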

APushingBoy avatar May 15 '25 09:05 APushingBoy