
RuntimeError: "compute_indices_weights_cubic" not implemented for 'Half'

Open 1170300814 opened this issue 3 years ago • 9 comments

Hi, I met this problem when running beit3. I checked that CUDA is available and the GPU has 32 GB of memory. I searched online, and people said this error is caused by the model running on the CPU. Can you give me some advice? Thank you!

1170300814 avatar Apr 10 '23 07:04 1170300814

Could you provide more information about how you use beit3?

wenhui0924 avatar Apr 10 '23 09:04 wenhui0924

The command I ran is: python run_beit3_finetuning.py --model beit3_base_patch16_384 --input_size 384 --task coco_retrieval --batch_size 16 --sentencepiece_model cocomodel/beit3.spm --finetune cocomodel/beit3_base_itc_patch16_224.pth --data_path coco/ --eval --dist_eval

Here is the full output:

Not using distributed mode
Namespace(aa='rand-m9-mstd0.5-inc1', auto_resume=True, batch_size=16, captioning_mask_prob=0.6, checkpoint_activations=None, clip_grad=None, color_jitter=0.4, crop_pct=None, cutmix=0, cutmix_minmax=None, data_path='coco/', device='cuda', dist_eval=True, dist_on_itp=False, dist_url='env://', distributed=False, drop_path=0.1, drop_worst_after=12000, drop_worst_ratio=0.2, enable_deepspeed=False, epochs=20, eval=True, eval_batch_size=None, finetune='cocomodel/beit3_base_itc_patch16_224.pth', initial_scale_power=16, input_size=384, label_smoothing=0.1, layer_decay=0.9, length_penalty=0.6, local_rank=-1, log_dir=None, lr=0.0005, min_lr=1e-06, mixup=0, mixup_mode='batch', mixup_prob=1.0, mixup_switch_prob=0.5, model='beit3_base_patch16_384', model_ema=False, model_ema_decay=0.9999, model_ema_force_cpu=False, model_key='model|module', model_prefix='', momentum=0.9, nb_classes=1000, num_beams=3, num_max_bpe_tokens=64, num_workers=10, opt='adamw', opt_betas=[0.9, 0.999], opt_eps=1e-08, output_dir='', pin_mem=True, randaug=False, recount=1, remode='pixel', reprob=0.25, resplit=False, resume='', save_ckpt=True, save_ckpt_freq=5, seed=0, sentencepiece_model='cocomodel/beit3.spm', smoothing=0.1, start_epoch=0, task='coco_retrieval', task_cache_path='', task_head_lr_weight=0, train_interpolation='bicubic', update_freq=1, vocab_size=64010, warmup_epochs=5, warmup_lr=1e-06, warmup_steps=-1, weight_decay=0.05, world_size=1, zero_stage=0)
True
Load 566747 image-text pairs from coco/coco_retrieval.train.jsonl.
Load 25010 image-text pairs from coco/coco_retrieval.val.jsonl.
model_config = beit3_base_patch16_384_retrieval
Load ckpt from cocomodel/beit3_base_itc_patch16_224.pth
Load state_dict by model_key = model
Position interpolate from 14x14 to 24x24
Traceback (most recent call last):
  File "run_beit3_finetuning.py", line 448, in <module>
    main(opts, ds_init)
  File "run_beit3_finetuning.py", line 267, in main
    utils.load_model_and_may_interpolate(args.finetune, model, args.model_key, args.model_prefix)
  File "/unilm2/unilm/beit3/utils.py", line 577, in load_model_and_may_interpolate
    pos_tokens = torch.nn.functional.interpolate(
  File "/root/anaconda3/envs/beit3/lib/python3.8/site-packages/torch/nn/functional.py", line 3946, in interpolate
    return torch._C._nn.upsample_bicubic2d(input, output_size, align_corners, scale_factors)
RuntimeError: "compute_indices_weights_cubic" not implemented for 'Half'

1170300814 avatar Apr 10 '23 09:04 1170300814

Could you provide more information about how you use beit3?

Hi, I've updated the comment above with more info.

1170300814 avatar Apr 10 '23 09:04 1170300814

I also do not know what caused this problem. It seems to be related to interpolating the image position embedding from 14x14 to 24x24.
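
For reference, a minimal sketch that reproduces the error outside beit3 (the tensor shape is illustrative; the point is that bicubic interpolation of a float16 tensor on the CPU is not implemented in the PyTorch version shown in the traceback):

import torch

# A 14x14 grid of fp16 "position embeddings" on the CPU (illustrative shape).
x = torch.randn(1, 768, 14, 14, dtype=torch.float16)

# Bicubic upsampling has no CPU kernel for Half, so this raises:
# RuntimeError: "compute_indices_weights_cubic" not implemented for 'Half'
torch.nn.functional.interpolate(x, size=(24, 24), mode='bicubic', align_corners=False)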

wenhui0924 avatar Apr 10 '23 12:04 wenhui0924

I solved it! I had downloaded the wrong model: you should click the link to download it, not the text before the link.

1170300814 avatar Apr 11 '23 02:04 1170300814

I also ran into the same problem when fine-tuning. It seems that when training without enable_deepspeed, the dtype of pos_tokens is float16. However, half-precision support on the CPU is very limited and won't be extended. I don't know whether this problem can be solved.
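
A quick way to confirm this is to inspect the checkpoint directly; a sketch, assuming the checkpoint path and the "model" key from the log above (parameter names vary across checkpoints, so it just scans every tensor):

import torch

# Load the checkpoint on the CPU and collect the dtype of every tensor.
ckpt = torch.load("cocomodel/beit3_base_itc_patch16_224.pth", map_location="cpu")
state_dict = ckpt["model"]  # the log says: Load state_dict by model_key = model
dtypes = {v.dtype for v in state_dict.values() if torch.is_tensor(v)}
print(dtypes)  # {torch.float16} would confirm a half-precision checkpoint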

liuxuannan avatar Apr 24 '23 03:04 liuxuannan

I just converted the tensor at this line https://github.com/microsoft/unilm/blob/9102ed91f8e56baa31d7ae7e09e0ec98e77d779c/beit3/utils.py#L574 to float:

pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2).float()
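
Put in context, a self-contained sketch of that workaround (shapes follow the 14x14 to 24x24 case from the log; casting back to the original dtype at the end is my own addition, not from the repo):

import torch

# Illustrative fp16 position embeddings: a 14x14 grid, embedding dim 768.
orig_size, new_size, embedding_size = 14, 24, 768
pos_tokens = torch.randn(1, orig_size * orig_size, embedding_size, dtype=torch.float16)

orig_dtype = pos_tokens.dtype
# Cast to fp32 before bicubic interpolation; Half has no CPU kernel for it.
pos_tokens = pos_tokens.reshape(-1, orig_size, orig_size, embedding_size).permute(0, 3, 1, 2).float()
pos_tokens = torch.nn.functional.interpolate(
    pos_tokens, size=(new_size, new_size), mode='bicubic', align_corners=False)
# Restore the token layout and (assumption) the original dtype.
pos_tokens = pos_tokens.permute(0, 2, 3, 1).flatten(1, 2).to(orig_dtype)
print(pos_tokens.shape)  # torch.Size([1, 576, 768])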

parkersell avatar Apr 24 '23 15:04 parkersell

Yes, it's probably caused by the CPU not supporting some dtype! Does that conversion solve the problem?

1170300814 avatar Apr 25 '23 02:04 1170300814

Hey, open the file "stable-diffusion-webui/extensions/deforum-for-automatic1111-webui/scripts/deforum_helpers/depth.py" and, in the class MidasModel, find the function "_initialize".

In its signature, change half_precision=True:

def _initialize(self, models_path, device, half_precision=True, keep_in_vram=False, use_zoe_depth=False, Width=512, Height=512):

to half_precision=False:

def _initialize(self, models_path, device, half_precision=False, keep_in_vram=False, use_zoe_depth=False, Width=512, Height=512):

It worked for me :)

MarySueXLsD avatar May 03 '23 21:05 MarySueXLsD