Saaketh Narayan
@rishabhm12 @smilenaderi Are both of you using pre-processing functions for each sample/batch before training? Also, how big are your batch sizes and epoch sizes?
@miguelalba96 In the past, we've seen that treating a GCSFuse mount as "local" can be slow. Have you tried treating it as remote instead, or moving your data to local disk?
@miguelalba96 Some things you could try, given that local disk works well where the FUSE mount doesn't (rough sketch below):
* Increase the prefetch factor on the dataloader, and `predownload` on the dataset
* Set `remote` to be...
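A minimal sketch of that setup, assuming the data lives in a GCS bucket; the bucket path, cache directory, batch size, and worker counts below are illustrative placeholders:

```python
from torch.utils.data import DataLoader
from streaming import StreamingDataset

# Treat GCS as the remote and point `local` at fast local disk,
# rather than at the GCSFuse mount. Paths and sizes are placeholders.
dataset = StreamingDataset(
    remote='gs://my-bucket/my-dataset',   # hypothetical bucket path
    local='/tmp/streaming_cache',         # local disk cache, not the FUSE mount
    batch_size=32,
    predownload=8 * 32,                   # predownload ahead of the dataloader
    shuffle=True,
)

loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=8,
    prefetch_factor=4,                    # increase prefetch on the dataloader
    persistent_workers=True,
)
```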
I'm curious: what launchers are people using? I have reproduced the issue of low utilization between epochs when using TorchDistributor, but the issue goes away with the Composer launcher.
Hey @rishabhm12 @Matagi1996 @miguelalba96 @smilenaderi -- @XiaohanZhangCMU was able to root cause and fix the hangs between epochs. In internal testing, this has resolved inter-epoch hangs and has improved overall...
Hey, this would be great! What did you have in mind regarding the implementation -- what should be done on Streaming's side?
Hey @lhoestq, @orionw added support for storing MDS datasets on the Hugging Face Hub. The relevant section in the docs is [here](https://docs.mosaicml.com/projects/streaming/en/stable/how_to_guides/configure_cloud_storage_credentials.html#huggingface-datasets). Will ask internally about posting on socials! @orionw provided this simple...
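Not the exact snippet referenced above, but a rough sketch of what streaming an MDS dataset from the Hugging Face Hub can look like, assuming an `hf://datasets/...` remote path as covered in the linked docs; the repo path, token handling, and cache directory are illustrative:

```python
import os

from streaming import StreamingDataset

# Only needed for private/gated repos; see the linked credentials docs.
os.environ['HF_TOKEN'] = '<your-token>'

# Stream an MDS dataset hosted on the Hugging Face Hub.
# The repo path below is a placeholder.
dataset = StreamingDataset(
    remote='hf://datasets/my-org/my-mds-dataset',
    local='/tmp/hf_mds_cache',
    batch_size=32,
)

sample = dataset[0]
```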
@lhoestq we tweeted here: https://x.com/DbrxMosaicAI/status/1818407826852921833 thanks!
Hey, we have seen `index.json` load times be slow. I think this is because we download the index file on every single rank, rather than downloading it on just...
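For reference, a sketch of the kind of once-per-node download being described, assuming a standard torchrun-style `LOCAL_RANK` environment variable; `download_fn` is a hypothetical stand-in for Streaming's internal download routine, not the actual implementation:

```python
import os

import torch.distributed as dist


def fetch_index_once_per_node(download_fn, remote_index: str, local_index: str) -> None:
    """Download index.json on local rank 0 only, then let other ranks read it.

    `download_fn` is a hypothetical callable; in practice Streaming's own
    download helpers would be used here.
    """
    local_rank = int(os.environ.get('LOCAL_RANK', 0))
    if local_rank == 0 and not os.path.exists(local_index):
        download_fn(remote_index, local_index)
    if dist.is_available() and dist.is_initialized():
        dist.barrier()  # other ranks wait, then read the shared local copy
```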
Hey @smspillaz and @jarnoseppanen-sc, thanks for raising this issue with us! The PR above (#672) addresses this bug by ensuring that the shard size limit can never go above 2**32...
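For context, a rough sketch of the kind of guard the PR describes, assuming the bound comes from 32-bit addressing within a shard; the names here are illustrative, not the actual code from #672:

```python
# Illustrative guard: the configured shard size limit must never exceed 2**32 bytes.
MAX_SHARD_SIZE_LIMIT = 2 ** 32  # bytes


def validate_size_limit(size_limit: int) -> int:
    """Raise if the requested shard size limit exceeds the 2**32-byte bound."""
    if size_limit > MAX_SHARD_SIZE_LIMIT:
        raise ValueError(
            f'size_limit={size_limit} exceeds the maximum shard size limit of '
            f'{MAX_SHARD_SIZE_LIMIT} bytes (2**32).')
    return size_limit
```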