Pierre-Yves

Results: 10 comments by Pierre-Yves

Hey @adamantike, Fargate is not supported at the moment. We'll evaluate adding it. Happy to take contributions on this one.

@sean-smith ready now?

Still draft or shall we merge?

You could link to the AWS doc:

- https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-NVIDIA-GPU.html
- https://docs.nvidia.com/deploy/xid-errors/index.html
- https://aws.amazon.com/blogs/compute/capturing-gpu-telemetry-on-the-amazon-ec2-accelerated-computing-instances/
- https://repost.aws/knowledge-center/ec2-linux-troubleshoot-xid-errors

Cancel, or do we move forward with it?

@mhuguesaws how about CloudWatch or profilers like Nsight?

Approaching 2 months, shall we close, @bkulnik-auvaria?

Hey @nithiyn, you'll want to amend the [readme](https://github.com/aws-samples/awsome-distributed-training/tree/main/3.test_cases/10.FSDP#0-prerequisites) file. For example, you could add a subsection in **3.Launch Training** to show how to run this new case (example...

@awsankur @KeitaW are we good on this?

Add digits for the directory number?