Rohit Gupta
@mariosasko, @lhoestq, @albertvillanova hey guys! can anyone help, or suggest who could help with this?
> if dataset.n_shards % world_size != 0 then all the nodes will read/stream the full dataset in order (possibly reading/streaming the same data multiple times), BUT will only yield one...
what if the number of samples in that shard % num_nodes != 0? will it break/get stuck? or is the data repeated in that case for gradient sync?
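To make the question concrete, here's a minimal sketch (not the actual `datasets` implementation) of the round-robin fallback described in the quote: when shards don't divide evenly across nodes, every node streams the full dataset and keeps every `world_size`-th example offset by its rank, so nodes can end up with different sample counts.

```python
# Hedged sketch of round-robin example distribution across nodes.
# `node_examples` is a hypothetical helper, not a `datasets` API.
def node_examples(stream, rank, world_size):
    for idx, example in enumerate(stream):
        if idx % world_size == rank:
            yield example

samples = list(range(10))  # 10 samples across 3 nodes -> uneven split
per_node = [list(node_examples(samples, r, 3)) for r in range(3)]

# node 0 gets 4 samples, nodes 1 and 2 get 3 each; the length mismatch
# is exactly what can hang gradient sync unless samples are trimmed
# or repeated on the shorter nodes.
print([len(p) for p in per_node])  # -> [4, 3, 3]
```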
hey @FlorentMeyer, mind checking the file you uploaded? it looks like it's too big and there might be some redundant stuff in it. could you clean it up?
is it possible to configure the same for discussions as well? we have labels there.
if it's on images, I would say no... it's open source.
@tchaton do you think adding an additional `'validation'` interval would be a good idea? can't think of any configuration to support it with `'step'|'epoch'`. Although there are 2 cases in...
```py
def test_multiple_dataloaders_logging(tmpdir):
    class TestModel(BoringModel):
        def validation_step(self, batch, batch_idx, dataloader_idx):
            self.log("value_1", dataloader_idx, add_dataloader_idx=False)
```
isn't this incorrect behavior since we have a single `ResultCollection` instance handling all the keys but...
@tchaton I believe `add_dataloader_idx` is meant to distinguish the metrics stored internally by appending the dataloader index to the key. The flow with multiple dataloaders is like this: ``` complete...
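A minimal sketch of the key-suffixing behavior I mean (not Lightning's actual `ResultCollection` code, and the exact suffix format is an assumption here): with `add_dataloader_idx=True` each dataloader gets its own metric key, while `add_dataloader_idx=False` makes every dataloader write to the same key, which is how the collision in the test above can arise.

```python
# Hypothetical helper illustrating add_dataloader_idx; the suffix
# format "/dataloader_idx_{i}" is an assumption for illustration.
def logged_key(name, dataloader_idx, add_dataloader_idx):
    if add_dataloader_idx and dataloader_idx is not None:
        # per-dataloader key: no collision between dataloaders
        return f"{name}/dataloader_idx_{dataloader_idx}"
    # shared key: all dataloaders log into the same entry
    return name

print(logged_key("value_1", 0, True))   # -> value_1/dataloader_idx_0
print(logged_key("value_1", 1, True))   # -> value_1/dataloader_idx_1
print(logged_key("value_1", 1, False))  # -> value_1 (keys collide)
```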
IMO, it's a good thing to support cross-reduction across dataloaders, but I'd argue from the user's point of view that what we have on master is good enough right now...