Add multi-sample feature with test cases
Before submitting
- [x] Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Did you make sure to update the docs?
- [x] Did you write any new necessary tests?
What does this PR do?
Fixes #317
PR review
Added support for multi-sample items.
This PR adds a `sample_count` parameter. Given a single transform function, the dataset produces a batch of `sample_count` sub-samples for each sample, and the transform receives the sub-sample index as `sample_idx`.
Note:
Multi-sample behavior applies only when the transform is passed to the
StreamingDataset constructor (i.e., via the `transform` argument),
and not when overriding `__init__` in this subclass.
Sample code:
from litdata import StreamingDataset

def transform_fn_sq(x, sample_idx, *args, **kwargs):
    """A simple transform function that scales the input by the sample index."""
    return x * sample_idx

# data_dir and cache_dir are assumed to be defined by the caller.
dataset = StreamingDataset(
    data_dir,
    cache_dir=str(cache_dir),
    shuffle=False,
    transform=[transform_fn_sq],
    sample_count=3,
)
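To make the intended semantics concrete, here is a minimal pure-Python sketch of how the sub-samples could be generated. It does not use the library: `expand_samples` is a hypothetical helper written only for illustration, and it assumes that `sample_idx` runs from `0` to `sample_count - 1` and that the sub-samples of one item are emitted consecutively. The actual indexing and ordering in this PR may differ.

```python
from typing import Any, Callable, Iterable, Iterator, List


def transform_fn_sq(x, sample_idx, *args, **kwargs):
    """Scale the input by the sub-sample index."""
    return x * sample_idx


def expand_samples(
    items: Iterable[Any],
    transforms: List[Callable[..., Any]],
    sample_count: int,
) -> Iterator[Any]:
    """Hypothetical helper: yield `sample_count` sub-samples per item.

    Assumption: sample_idx starts at 0 and each transform in the list
    is applied in order to the same item.
    """
    for item in items:
        for sample_idx in range(sample_count):
            out = item
            for fn in transforms:
                out = fn(out, sample_idx)
            yield out


result = list(expand_samples([1, 2], [transform_fn_sq], sample_count=3))
print(result)  # [0, 1, 2, 0, 2, 4]
```

Under these assumptions, a dataset with N items yields N * `sample_count` outputs in total.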
Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃
@tchaton @deependujha @bhimrazy Can you verify the approach once? I can then make changes to the README.
Codecov Report
:x: Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 80%. Comparing base (b070032) to head (229ff5b).
Additional details and impacted files
@@           Coverage Diff           @@
##           main   #740   +/-   ##
===================================
- Coverage    80%    80%   -0%
===================================
  Files        52     52
  Lines      7343   7357   +14
===================================
- Hits       5885   5876    -9
- Misses     1458   1481   +23