
Caching doesn't work with cycle

Open Modexus opened this issue 2 years ago • 0 comments

🐛 Describe the bug

When cycle is followed by on-disk caching, the pipeline works the very first time, but on later runs it crashes inside demux: demux keeps searching the endless stream for todo files (files that still need to be cached), but every file is already cached.

from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(["test"])
dp = dp.cycle()
dp = dp.on_disk_cache(filepath_fn=lambda x: f"./{x}")
dp = dp.map(lambda x: (x, x))
dp = dp.end_caching(mode="t", same_filepath_fn=True)

next(iter(dp))
next(iter(dp))

This is caused by how demux works, so the same problem occurs when cycle is combined with demux directly:

from torchdata.datapipes.iter import IterableWrapper

dp = IterableWrapper(["test"])
dp = dp.cycle()
dp0, dp1 = dp.demux(2, lambda x: 1)

next(iter(dp0))
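
To illustrate the mechanism without torchdata, here is a minimal pure-Python sketch. ToyDemux is a hypothetical stand-in, not torchdata's implementation. It assumes a demux that buffers items destined for other children and raises BufferError once the buffer exceeds buffer_size. When the classifier never routes anything to child 0 and the source is infinite, asking child 0 for an item can only end in that overflow.

```python
from collections import deque
from itertools import cycle


class ToyDemux:
    """Hypothetical, simplified demux: routes items from one source to n children."""

    def __init__(self, source, num_children, classifier, buffer_size=1000):
        self.source = iter(source)
        self.classifier = classifier
        self.buffer_size = buffer_size
        self.buffers = [deque() for _ in range(num_children)]

    def next_for(self, child):
        # Serve from this child's buffer if something is already waiting.
        if self.buffers[child]:
            return self.buffers[child].popleft()
        # Otherwise keep pulling from the source until an item for this child appears.
        for item in self.source:
            target = self.classifier(item)
            if target == child:
                return item
            # Items for other children are buffered; the buffer is bounded.
            self.buffers[target].append(item)
            if sum(len(b) for b in self.buffers) > self.buffer_size:
                raise BufferError(
                    f"buffer overflow while looking for an item for child {child}"
                )
        raise StopIteration


def reproduce(buffer_size=1000):
    # Infinite source, and the classifier sends every item to child 1,
    # mirroring cycle() followed by demux(2, lambda x: 1).
    demux = ToyDemux(cycle(["test"]), 2, lambda x: 1, buffer_size=buffer_size)
    try:
        demux.next_for(0)
    except BufferError as exc:
        return str(exc), len(demux.buffers[1])
```

In this sketch child 1 can still be read normally; only the child that never receives an item runs into the overflow, which matches the cached case where no todo files remain.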

Versions

torchdata.__version__ == '0.7.0a0+deeacb4'

Modexus, May 18 '23 19:05