
Batch size limitations for 3D/video training

Open ff98li opened this issue 1 year ago • 2 comments

Thank you for sharing the awesome work!

I'm trying to fine-tune it for 3D/video segmentation but ran into an issue. The code seems to have the batch size hard-coded to 1: https://github.com/SuperMedIntel/Medical-SAM2/blob/534a6d808b2efcf7222c149be06211205e5eb053/func_3d/dataset/__init__.py#L29-L44

When I tried bumping it up to 2, it threw an error.

Quick question: In your 3D/video model experiments, did you also train with just one video per batch? Would love to know if this is expected behavior or if I might be missing something.
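In case it clarifies what I'm after: if the pipeline fundamentally assumes one video per batch, I could also live with keeping batch_size=1 in the DataLoader and emulating a larger effective batch via gradient accumulation, roughly like the sketch below (placeholder names such as accum_steps and loss_fn are mine; this is not the repo's actual train_sam loop):

def train_one_epoch(net, loader, optimizer, loss_fn, accum_steps=2, device="cuda"):
    """Emulate an effective batch of `accum_steps` videos while the
    DataLoader still yields one video at a time (placeholder loop,
    not the repo's train_sam)."""
    net.train()
    optimizer.zero_grad()
    for step, (video, mask) in enumerate(loader):
        video, mask = video.to(device), mask.to(device)
        pred = net(video)                          # forward pass on a single video
        loss = loss_fn(pred, mask) / accum_steps   # scale so accumulated gradients average out
        loss.backward()                            # accumulate gradients across videos
        if (step + 1) % accum_steps == 0:          # update once every accum_steps videos
            optimizer.step()
            optimizer.zero_grad()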

Thanks in advance for any insights!

ff98li · Dec 10 '24 12:12

Traceback (most recent call last):
  File "/root/Medical-SAM2/train_3d.py", line 112, in <module>
    main()
  File "/root/Medical-SAM2/train_3d.py", line 95, in main
    loss, prompt_loss, non_prompt_loss = function.train_sam(args, net, optimizer1, optimizer2, nice_train_loader, epoch)
  File "/root/Medical-SAM2/func_3d/function.py", line 115, in train_sam
    _, _, _ = net.train_add_new_bbox(
  File "/root/Medical-SAM2/sam2_train/sam2_video_predictor.py", line 439, in train_add_new_bbox
    out_frame_idx, out_obj_ids, out_mask_logits = self.train_add_new_points(
  File "/root/Medical-SAM2/sam2_train/sam2_video_predictor.py", line 523, in train_add_new_points
    current_out, _ = self._run_single_frame_inference(
  File "/root/Medical-SAM2/sam2_train/sam2_video_predictor.py", line 1351, in _run_single_frame_inference
    pred_masks_gpu = fill_holes_in_mask_scores(
  File "/root/Medical-SAM2/sam2_train/utils/misc.py", line 255, in fill_holes_in_mask_scores
    is_hole = (labels > 0) & (areas <= max_area)
RuntimeError: CUDA error: no kernel image is available for execution on the device
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Hello, did you also run into the error above while adjusting the code? If so, how did you handle it?
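In case it's relevant: as far as I understand, "no kernel image is available for execution on the device" usually means the installed PyTorch wheel was not built with kernels for the GPU's compute capability, rather than anything specific to the batch size change. A quick sanity check I'm running (a minimal sketch in plain PyTorch, nothing Medical-SAM2 specific):

import torch

# Compare the GPU's compute capability against the architectures the
# installed PyTorch build ships kernels for; a mismatch here would explain
# "no kernel image is available for execution on the device".
print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
print("device:", torch.cuda.get_device_name(0))
print("compute capability:", torch.cuda.get_device_capability(0))
print("compiled arch list:", torch.cuda.get_arch_list())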

CodeHarcourt · Jan 14 '25 01:01


Thank you for sharing the awesome work!

I'm trying to fine-tune it for 3D/video segmentation but ran into an issue. The code seems to have the batch size hard-coded to 1:

Medical-SAM2/func_3d/dataset/__init__.py

Lines 29 to 44 in 534a6d8

if args.dataset == 'btcv':
    '''btcv data'''
    btcv_train_dataset = BTCV(args, args.data_path, transform = None, transform_msk= None, mode = 'Training', prompt=args.prompt)
    btcv_test_dataset = BTCV(args, args.data_path, transform = None, transform_msk= None, mode = 'Test', prompt=args.prompt)

    nice_train_loader = DataLoader(btcv_train_dataset, batch_size=1, shuffle=True, num_workers=8, pin_memory=True)
    nice_test_loader = DataLoader(btcv_test_dataset, batch_size=1, shuffle=False, num_workers=1, pin_memory=True)
    '''end'''

elif args.dataset == 'amos':
    '''amos data'''
    amos_train_dataset = AMOS(args, args.data_path, transform = None, transform_msk= None, mode = 'Training', prompt=args.prompt)
    amos_test_dataset = AMOS(args, args.data_path, transform = None, transform_msk= None, mode = 'Test', prompt=args.prompt)

    nice_train_loader = DataLoader(amos_train_dataset, batch_size=1, shuffle=True, num_workers=8, pin_memory=True)
    nice_test_loader = DataLoader(amos_test_dataset, batch_size=1, shuffle=False, num_workers=1, pin_memory=True)
    '''end'''

When I tried bumping it up to 2, it threw an error.

Quick question: In your 3D/video model experiments, did you also train with just one video per batch? Would love to know if this is expected behavior or if I might be missing something.

Thanks in advance for any insights!

I also have a similar problem. How did you solve it? If I modify the batch size, it seems that pos_embed also needs to be modified, right? Thank you!
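For reference, by modifying the batch size I only mean exposing the hard-coded value in the loader setup, roughly like the sketch below (build_video_loaders and video_batch are names I made up locally, not existing repo options); the downstream video predictor still appears to expect one video at a time:

from torch.utils.data import DataLoader

def build_video_loaders(train_dataset, test_dataset, video_batch=1):
    """Same loader setup as func_3d/dataset/__init__.py, but with the
    training batch size passed in instead of being hard-coded to 1."""
    train_loader = DataLoader(train_dataset, batch_size=video_batch,
                              shuffle=True, num_workers=8, pin_memory=True)
    test_loader = DataLoader(test_dataset, batch_size=1,
                             shuffle=False, num_workers=1, pin_memory=True)
    return train_loader, test_loader

Setting video_batch to 2 is exactly where things break for me as well.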

yyy1998-i · Feb 24 '25 02:02