feat: add `withIterator`
Fixes #40
it isn't a true iterator of paths, since we still read each directory as a batch under the hood. but it does mean we only ever hold at most one directory's worth of entries in memory
we could one day follow up on this by switching to opendir and reading entries one by one
if this implementation seems of any interest, let me know and i can review with you
@thecodrr @SuperchupuDev
Notable changes
Tests use an execute helper
An execute helper has been extracted so we can run the tests against all types without caring whether it's an iterator or not.
Now in a test, you just `await execute(api, type);` and it'll handle normalising the result.
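To illustrate the idea, here is a minimal sketch of what such a helper could look like. The `ApiType` union, the `Runnable` interface and the exact signature are assumptions made for illustration; the helper in the PR may differ.

```typescript
// Hypothetical shapes: the real builder API and test helper may differ.
type ApiType = "sync" | "withPromise" | "withIterator";

interface Runnable<T> {
  sync(): T[];
  withPromise(): Promise<T[]>;
  withIterator(): AsyncIterable<T>;
}

// Runs the crawl with the requested API type and always resolves to an array,
// so a test can assert on the result without knowing how it was produced.
async function execute<T>(api: Runnable<T>, type: ApiType): Promise<T[]> {
  switch (type) {
    case "sync":
      return api.sync();
    case "withPromise":
      return api.withPromise();
    case "withIterator": {
      // Drain the iterator into an array so it compares like the other types.
      const collected: T[] = [];
      for await (const item of api.withIterator()) {
        collected.push(item);
      }
      return collected;
    }
  }
}
```

A test can then loop over every type and make the same assertion against the array it gets back.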
onlyCounts tests disabled for withIterator type
It wouldn't throw if we tried, but onlyCounts doesn't make any sense since there's no result when using withIterator. It'd be an empty iteration. So we just don't test this in that case (as there'd be type errors too i suspect).
group-files now exposes a pushGroup parameter
The file grouping mechanism decides whether to group or not, then goes ahead and pushes a group if it needs to.
With an iterator, we need a way to hook into the underlying push, not the outer groupFiles function.
For that reason, i've introduced a parameter, `pushGroup`, with the signature `(item, arr) => void`.
This means we can pass a custom pushGroup in iterator mode which doesn't actually push to arr, but feeds it into the iterable. In all other cases, we use a default pushGroup which just pushes to arr.
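As a rough sketch of this pattern: the `Group` shape, the `groupFiles` signature and the "skip empty directories" rule below are all assumptions for illustration, not the actual code in the PR.

```typescript
// Hypothetical shapes; the real Group type and groupFiles signature may differ.
type Group = { directory: string; files: string[] };
type PushGroup = (item: Group, arr: Group[]) => void;

// Default behaviour: append the group to the result array.
const defaultPushGroup: PushGroup = (item, arr) => {
  arr.push(item);
};

function groupFiles(
  groups: Group[],
  directory: string,
  files: string[],
  pushGroup: PushGroup = defaultPushGroup
): void {
  // Illustrative grouping decision: skip directories with no files.
  if (files.length === 0) return;
  pushGroup({ directory, files }, groups);
}

// Iterator mode: ignore `arr` and hand the group to whatever feeds the iterable.
const emitted: Group[] = [];
const iteratorPushGroup: PushGroup = (item) => {
  emitted.push(item); // stand-in for "feed it into the iterable"
};

const results: Group[] = [];
groupFiles(results, "src", ["a.ts", "b.ts"]);
groupFiles(results, "lib", ["c.ts"], iteratorPushGroup);
```

After these two calls, `results` holds only the `src` group, while the `lib` group went to the custom sink instead.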
pushFile and pushDirectory also have a pushPath parameter
For the same reason as groupFiles, both pushFile and pushDirectory now take a `pushPath` parameter.
This is used when we want to push a path, whether it be a directory or not, to the result set. In iterator mode, we can use this to feed it into the iterable. Otherwise, we push to this.state.paths (arr).
Counts#directories takes filters into account
The directories count used to be all directories we have visited, rather than directories we have matched. This meant filters and such weren't taken into account (e.g. if we don't include dirs, we still counted them, and if we filter dirs, we sometimes counted the ones we filtered out).
Meanwhile, files count was always the files we have matched.
In iterator mode, we need a way to know how many paths (files or directories) we have pushed so far, as we don't have a result set anywhere to check the length of. This would be as simple as directories + files if not for the inconsistency mentioned above.
So i have fixed the inconsistency, only incrementing directories for matched directories (i.e. those that end up in the result).
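A small sketch of the resulting invariant, combined with the `pushPath` parameter described earlier. The `Counts` shape, the function signatures and the `emittedSoFar` helper are hypothetical and only illustrate the behaviour.

```typescript
// Hypothetical shapes; the real fdir types and signatures may differ.
interface Counts {
  files: number;
  directories: number;
}
type PushPath = (path: string, arr: string[]) => void;
type DirectoryFilter = (path: string) => boolean;

// Default sink: push to the result array (this.state.paths in the real walker).
const pushToArray: PushPath = (path, arr) => {
  arr.push(path);
};

function pushDirectory(
  path: string,
  paths: string[],
  counts: Counts,
  matches: DirectoryFilter,
  pushPath: PushPath = pushToArray
): void {
  // A visited directory that fails the filter is neither emitted nor counted.
  if (!matches(path)) return;
  pushPath(path, paths);
  counts.directories++;
}

function pushFile(
  path: string,
  paths: string[],
  counts: Counts,
  pushPath: PushPath = pushToArray
): void {
  pushPath(path, paths);
  counts.files++;
}

// With consistent counts, the number of emitted paths is just the sum,
// even when pushPath never touches the array (iterator mode).
function emittedSoFar(counts: Counts): number {
  return counts.files + counts.directories;
}
```

Because both counters now track matched paths only, `emittedSoFar` stays correct without needing a result array to measure.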
WalkerIterator
A basic iterator that essentially loops until the async callback of the existing Walker has been called.
On each iteration, it waits until we either push a new path or we complete (so it isn't just mindlessly looping).
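The general pattern can be sketched as an async generator wrapped around a callback-based producer. This is not the PR's `WalkerIterator`; `iterateWalk`, `start`, `push` and `done` are hypothetical names used to show the wait-until-pushed-or-complete loop.

```typescript
// Hypothetical bridge from a callback-driven walker to an async generator.
// `start` stands in for kicking off the existing Walker: it receives `push`
// (called once per path) and `done` (called once when the walk completes).
async function* iterateWalk<T>(
  start: (push: (item: T) => void, done: (err?: Error) => void) => void
): AsyncGenerator<T> {
  const queue: T[] = [];
  let finished = false;
  let failure: Error | undefined;
  let wake: (() => void) | undefined;

  // Resolves the pending wait, if the generator is currently waiting.
  const notify = (): void => {
    const resume = wake;
    wake = undefined;
    resume?.();
  };

  start(
    (item) => {
      queue.push(item);
      notify();
    },
    (err) => {
      finished = true;
      failure = err;
      notify();
    }
  );

  while (true) {
    // Drain everything queued so far. Items pushed while the consumer is
    // handling a yielded value are picked up by the next length check.
    while (queue.length > 0) {
      yield queue.shift() as T;
    }
    if (failure) throw failure;
    if (finished) return;
    // Nothing to yield and not complete: sleep until push or done wakes us.
    await new Promise<void>((resolve) => {
      wake = resolve;
    });
  }
}
```

The emptiness check and the registration of `wake` happen without an intervening `await`, so a push cannot slip in between them and be missed.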
can you expand on what parts you think are hacky?
the outer part is just a generator, whatever way we implement this, that'll have to exist somewhere
the way we hook into which paths have been emitted is the way it is since i didn't want to change how we read directories under the hood.
if we change the way reading directories works, we could trigger the next read only when the next iteration happens i imagine.
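for reference, a pull-driven version could look roughly like the sketch below. it assumes Node's `fs.promises.opendir`, whose `Dir` handle is async iterable; `walkLazily` is a hypothetical name, and it yields files only, with no filters, groups or symlink handling.

```typescript
import { opendir } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical lazy walker: entries are read one at a time, and the next read
// only happens when the consumer asks for the next value.
async function* walkLazily(root: string): AsyncGenerator<string> {
  const pending: string[] = [root];
  while (pending.length > 0) {
    const current = pending.pop() as string;
    // Dir is async iterable and is closed automatically when iteration ends.
    const dir = await opendir(current);
    for await (const entry of dir) {
      const fullPath = join(current, entry.name);
      if (entry.isDirectory()) {
        // Defer subdirectories; they are opened only once we get to them.
        pending.push(fullPath);
      } else {
        yield fullPath;
      }
    }
  }
}
```

with this shape, memory use is bounded by the list of pending directories rather than by the contents of any single directory.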
> can you expand on what parts you think are hacky?
The way the pushPath and pushGroup functions are passed down is not the most elegant. I also think an iterator should be different from the normal walker. There's no need to support onlyCounts and withGroups in an iterator. The WalkerIterator itself looks very confusing since it is basically overriding an existing implementation and I am sure there are edge cases it doesn't handle (e.g. what happens to items that enter the #queue while the for..loop is running?)
> the way we hook into which paths have been emitted is the way it is since i didn't want to change how we read directories under the hood.
> if we change the way reading directories works, we could trigger the next read only when the next iteration happens i imagine.
It wouldn't be a bad idea to write separate directory walking logic for the iterator. We can reuse whatever parts we can.
> There's no need to support onlyCounts and withGroups in an iterator
to be clear, it doesn't support onlyCounts. it does support withGroups
> The WalkerIterator itself looks very confusing since it is basically overriding an existing implementation and I am sure there are edge cases it doesn't handle (e.g. what happens to items that enter the #queue while the for..loop is running?)
i was just trying to provide an iterator around what already exists. it can be written better if we don't need to use the existing implementation. worth noting that this is a regular pattern for implementing an iterator over a mixture of async/sync work though. i haven't done anything unusual here, it's a fairly normal solution if we want to use the existing implementation.
it seems clear that you'd rather just use a new implementation, so i will revisit this some day when i get time.