Minxiangliu comments

Results: 20 comments of Minxiangliu

No response after scanning the QR code

Can the current version 1.9.8 log in with a QR code? Logging in through the beanfun website works normally, but the login tool does not....

Could not use GPU for pycox

Hi @havakv , In my case, I can see that I have set up the GPU correctly, and then during model training, the GPU memory is being used, but the...

I know why I was wrong. Old code:
```
class getDataset(Dataset):
    def __init__(self, datasets):
        self.dataset=datasets
        self._trans=transforms.Compose(....)
    def __getitem__(self, index):
        return self._trans(self.dataset[index])
......
dataset_train = getDataset(....)
dl_train = DataLoader(dataset_train, ...)
```
new...

no loss for training with small batch size,

Hi @yikuanli , I also want to know.

2 node speed is not faster than 1 node

Hi @lmolhw5252 , I currently have one A100 (40GB) GPU and I am training using your recommendations. However, I encounter an issue where it ultimately displays `exits with return code...

2 node speed is not faster than 1 node

> you may check the resource like memory, or cpu, you can set batch=1

Are you suggesting setting both `per_device_train_batch_size` and `per_device_eval_batch_size` to 1?

torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

I also encounter a similar issue when fine-tuning llama, and I hope someone can assist in answering it!
```
/root/miniconda3/envs/vicuna/lib/python3.10/site-packages/transformers/training_args.py:1388: FutureWarning: using `--fsdp_transformer_layer_cls_to_wrap` is deprecated. Use fsdp_config instead
  warnings.warn(
Loading...
```