Arij-Aladel
How to make it transparent in this case?
Facing the same issue @pacman100
> ```python
> if not tensor.is_contiguous():
>     tensor = tensor.contiguous()
> ```

This is my accelerate config

```
compute_environment: LOCAL_MACHINE
deepspeed_config:
  offload_optimizer_device: cpu
  gradient_clipping: 1.0
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16:...
```
@ofirpress Sorry, but correct me if I am wrong: the positional encoding is needed only in self-attention; we do not need it in cross-attention. I am referring to [T5...
@Nagoudi @elmadany @mageed Thanks for this great work. Can we have access to the datasets?