Fei

6 comments by Fei

@tfboyd Thanks! Do you have the run script for multi-node without using Horovod? Also, can you let me know what was fixed for Eval? I doubt if I have the...

I ran into the same problem. Have you solved it?

> It looks like TensorFlow Grappler has some trouble creating clusters, and the error is coming from [here](https://github.com/tensorflow/tensorflow/blob/e9db4aec6714173c1e556b701feda06cc5203380/tensorflow/core/grappler/clusters/virtual_cluster.cc#L50). This step should happen during compilation; maybe XLA can skip these optimizations...

Update: I modified the above script to use `torch.multiprocessing.start_processes` to launch the processes. Here is the script:

```
import os
import json
import argparse
import torch
import torch_xla.distributed.xla_multiprocessing as xmp
...
```

@ppwwyyxx Just want to double-check that I understand your idea correctly. Do you suggest putting images with the same aspect ratio (either portrait or landscape) into the same...
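To make the question concrete, here is a minimal sketch of how I understand the grouping: images are bucketed by orientation, and each batch is drawn from a single bucket. The function name `group_by_orientation`, the `(width, height)` input format, and the rule that square images count as landscape are my own assumptions for illustration, not anything taken from the library being discussed.

```python
from collections import defaultdict


def group_by_orientation(sizes, batch_size):
    """Group image indices into batches that share one orientation.

    sizes: list of (width, height) tuples, one per image (assumed format).
    Returns a list of batches; each batch is a list of image indices that
    are either all landscape or all portrait.
    """
    buckets = defaultdict(list)
    for idx, (width, height) in enumerate(sizes):
        # Assumption: square images are treated as landscape.
        key = "landscape" if width >= height else "portrait"
        buckets[key].append(idx)

    batches = []
    for key in ("landscape", "portrait"):
        indices = buckets[key]
        # Slice each bucket into chunks of at most batch_size.
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])
    return batches


if __name__ == "__main__":
    sizes = [(640, 480), (480, 640), (800, 600), (600, 800), (1024, 768)]
    print(group_by_orientation(sizes, batch_size=2))
```

The last batch of each bucket may be smaller than `batch_size`; a real data loader would probably also shuffle within each bucket, which this sketch leaves out.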

Hi @SunMarc, not sure what you meant by basic mixed-precision training. For PyTorch, both autocast and FSDP require model weights to stay in fp32. I think 2 should be working to...