SAM example code does not work
System Info
- `transformers` version: 4.29.0.dev0
- Platform: Linux-3.10.0-957.12.2.el7.x86_64-x86_64-with-glibc2.10
- Python version: 3.8.3
- Huggingface_hub version: 0.13.4
- Safetensors version: not installed
- PyTorch version (GPU?): 1.5.0 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
No response
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
img_url = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png" raw_image = Image.open(requests.get(img_url, stream=True).raw).convert("RGB") input_points = [[[450, 600]]] # 2D location of a window in the image
inputs = processor(raw_image, input_points=input_points, return_tensors="pt").to(device) outputs = model(**inputs)
masks = processor.image_processor.post_process_masks( outputs.pred_masks.cpu(), inputs["original_sizes"].cpu(), inputs["reshaped_input_sizes"].cpu() ) scores = outputs.iou_scores
Expected behavior
```
RuntimeError                              Traceback (most recent call last)
~/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/transformers/models/sam/modeling_sam.py in forward(self, pixel_values, input_points, input_labels, input_boxes, input_masks, image_embeddings, multimask_output, output_attentions, output_hidden_states, return_dict, **kwargs)
   1331         )
   1332
--> 1333         sparse_embeddings, dense_embeddings = self.prompt_encoder(
   1334             input_points=input_points,
   1335             input_labels=input_labels,

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/transformers/models/sam/modeling_sam.py in forward(self, input_points, input_labels, input_boxes, input_masks)
    669         if input_labels is None:
    670             raise ValueError("If points are provided, labels must also be provided.")
--> 671         point_embeddings = self._embed_points(input_points, input_labels, pad=(input_boxes is None))
    672         sparse_embeddings = torch.empty((batch_size, point_batch_size, 0, self.hidden_size), device=target_device)
    673         sparse_embeddings = torch.cat([sparse_embeddings, point_embeddings], dim=2)

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/transformers/models/sam/modeling_sam.py in _embed_points(self, points, labels, pad)
    619             padding_point = torch.zeros(target_point_shape, device=points.device)
    620             padding_label = -torch.ones(target_labels_shape, device=labels.device)
--> 621             points = torch.cat([points, padding_point], dim=2)
    622             labels = torch.cat([labels, padding_label], dim=2)
    623             input_shape = (self.input_image_size, self.input_image_size)

RuntimeError: Expected object of scalar type double but got scalar type float for sequence element 1.
```
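For context, here is a minimal sketch of what the traceback points at (my reconstruction, under the assumption that the points tensor arrives as float64 while the padding created inside `_embed_points` is float32): old torch versions refuse to concatenate tensors of mixed floating-point dtypes, which matches the failure on `torch==1.5.0` and, per the replies below, the absence of the failure on `torch>=1.9`.

```python
# Minimal sketch of the suspected failure mode (an assumption, not the
# library's actual code): torch.cat with mixed float dtypes on old torch.
import torch

points = torch.zeros(1, 1, 1, 2, dtype=torch.float64)  # double, e.g. via numpy
padding_point = torch.zeros(1, 1, 1, 2)                 # float32 by default

# On torch 1.5 this raises:
#   RuntimeError: Expected object of scalar type double but got scalar type
#   float for sequence element 1.
# On the torch versions tried below (1.9.1, 1.13.1) the dtypes are promoted
# and the call succeeds.
points = torch.cat([points, padding_point], dim=2)
print(points.dtype)  # torch.float64 after promotion
```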
Hello @YubinXie
Thanks for the issue!
I was not able to reproduce your issue with `torch==1.13.1`; here is the snippet I used:
```python
from PIL import Image
import requests
import torch

from transformers import AutoModel, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModel.from_pretrained("facebook/sam-vit-base").to(device)
processor = AutoProcessor.from_pretrained("facebook/sam-vit-base")

img_url = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert("RGB")
input_points = [[[450, 600]]]  # 2D location of a window in the image

inputs = processor(raw_image, input_points=input_points, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)
```
I can see that you are using `torch==1.5.x`. Note that `transformers` requires `torch >= 1.9`: https://github.com/huggingface/transformers/blob/main/setup.py#L180. I also ran the script above with `torch==1.9.1` and did not encounter the issue, so I strongly recommend upgrading to at least `torch` 1.9. Could you try updating `torch` and let us know if you still face the issue?
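As a quick fail-fast guard (my addition, not part of the original snippet), you can check the installed version before running anything:

```python
# Sanity check (not from the thread): fail fast if torch predates the minimum
# version that transformers supports.
import torch
from packaging import version

if version.parse(torch.__version__) < version.parse("1.9"):
    raise RuntimeError(f"transformers requires torch >= 1.9, found {torch.__version__}")
```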
Hi @younesbelkada Thank you for your response. I updated my torch and now the model works! However, I got another error in the post-processing step:
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-abdc2d7068b8> in <module>
      6 outputs = model(**inputs)
      7
----> 8 masks = processor.image_processor.post_process_masks(
      9     outputs.pred_masks.cpu(), inputs["original_sizes"].cpu(), inputs["reshaped_input_sizes"].cpu()
     10 )

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/transformers/models/sam/image_processing_sam.py in post_process_masks(self, masks, original_sizes, reshaped_input_sizes, mask_threshold, binarize, pad_size)
    404             interpolated_mask = F.interpolate(masks[i], target_image_size, mode="bilinear", align_corners=False)
    405             interpolated_mask = interpolated_mask[..., : reshaped_input_sizes[i][0], : reshaped_input_sizes[i][1]]
--> 406             interpolated_mask = F.interpolate(interpolated_mask, original_size, mode="bilinear", align_corners=False)
    407             if binarize:
    408                 interpolated_mask = interpolated_mask > mask_threshold

~/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/functional.py in interpolate(input, size, scale_factor, mode, align_corners, recompute_scale_factor, antialias)
   3957         if antialias:
   3958             return torch._C._nn._upsample_bilinear2d_aa(input, output_size, align_corners, scale_factors)
--> 3959         return torch._C._nn.upsample_bilinear2d(input, output_size, align_corners, scale_factors)
   3960     if input.dim() == 5 and mode == "trilinear":
   3961         assert align_corners is not None

TypeError: upsample_bilinear2d() received an invalid combination of arguments - got (Tensor, list, bool, NoneType), but expected one of:
 * (Tensor input, tuple of ints output_size, bool align_corners, tuple of floats scale_factors)
      didn't match because some of the arguments have invalid types: (Tensor, list of [Tensor, Tensor], bool, NoneType)
 * (Tensor input, tuple of ints output_size, bool align_corners, float scales_h, float scales_w, *, Tensor out)
```
The code is from the Hugging Face SAM page. I wonder whether this is an issue with the example code or with another package.
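For what it's worth, here is a minimal sketch of what that traceback suggests (my reconstruction with a hypothetical image size, not the library's actual code): `F.interpolate` receives the target size as 0-dim tensors rather than plain Python ints, which this torch version rejects; casting to ints avoids the `TypeError`.

```python
# Minimal sketch of the suspected post-processing failure (an assumption based
# on the traceback): the output size reaches F.interpolate as tensors.
import torch
import torch.nn.functional as F

mask = torch.rand(1, 1, 256, 256)
original_size = torch.tensor([1764, 2646])  # hypothetical original image size

# Fails with the TypeError above on some torch versions, because the size is
# a list of 0-dim tensors rather than ints:
# F.interpolate(mask, [original_size[0], original_size[1]], mode="bilinear", align_corners=False)

# Casting to plain Python ints side-steps the problem:
target = tuple(int(d) for d in original_size)
resized = F.interpolate(mask, target, mode="bilinear", align_corners=False)
print(resized.shape)  # torch.Size([1, 1, 1764, 2646])
```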
Hi @YubinXie
Thanks for iterating! This seems to be a duplicate of https://github.com/huggingface/transformers/issues/22904
Could you try to uninstall transformers and re-install it from source?
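If it helps, the standard commands for that are:

```bash
pip uninstall -y transformers
pip install git+https://github.com/huggingface/transformers.git
```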
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.