Dispatch for drawing multiples
Closes #1984
Implementation
- [x] Dispatch for
MixtureModel - [x] Dispatch for
Truncated
Checkpoints
- [x] Tests are passing
- [x] Docs are updated -- discussed that docstrings are placed to dispatched functions for consistency (#1986)
Sanity checks
Speed
For mixture model,
julia> using Distributions
julia> using BenchmarkTools
julia> d = MixtureModel([Normal(1,0.1), Normal(2,0.1), Normal(3,0.1)], [0.3,0.3,0.4])
julia> @btime rand($d, 10_000);
73.708 μs (7 allocations: 125.67 KiB)
julia> @btime [rand($d) for _ in 1:10_000];
124.125 μs (3 allocations: 96.06 KiB)
For truncated
t = truncated(Normal(), 0, 3)
@btime rand($t, 10_000); # 114.625 μs (6 allocations: 256.12 KiB)
@btime [rand($t) for _ in 1:10_000]; # 169.875 μs (3 allocations: 96.06 KiB)
Visual
d = MixtureModel([Normal(1,0.1), Normal(2,0.1), Normal(3,0.1)], [0.3,0.3,0.4])
s1 = [rand(d) for _ in 1:10000]
s2 = rand(d, 10000)
let
plot();
stephist!(s1, bins=range(0, 4, 100), lab="rand(d) n times")
stephist!(s2, bins=range(0, 4, 100), lab="rand(d,n)")
end
t = truncated(Normal(), 0, 3)
r1 = [rand(t) for _ in 1:10000]
r2 = rand(t, 10000)
let
plot();
stephist!(t1, bins=range(0, 4, 100), lab="rand(d) n times")
stephist!(t2, bins=range(0, 4, 100), lab="rand(d,n)")
end
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 86.26%. Comparing base (
abb151c) to head (97e3b25).
Additional details and impacted files
@@ Coverage Diff @@
## master #1985 +/- ##
==========================================
- Coverage 86.28% 86.26% -0.03%
==========================================
Files 146 146
Lines 8787 8847 +60
==========================================
+ Hits 7582 7632 +50
- Misses 1205 1215 +10
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@devmotion thanks a lot for the review.
- [x] Questions should be addressed.
- [x] truncated of small mass: I've fixed broken pipeline for large truncated of small remaining volume - now, it falls back to
rand(n)if needs to generate more than 10M, this cut value is somewhat arbitrary, so suggestions are welcome. - [x] docstrings are removed.
- [x] made sanity checks of comparing samples of
rand(d)andrand(d,n)
No docs are added
The last item that I think about is the 10M cap for switching between algorithm. It's bad to switch between algorithms in general for predictability, so the design of Truncated with a switch is not ideal. But we will not address it in this MR.
I'm thinking to proceed with a similar if/else of 0.25% of distribution. Similar (not-ideal) patterns will be easy to identify in feature and update.
@devmotion thanks for the last comments.
- the truncated was easy to fix,
- I've modified the code
rand + resizestrategy for the mixture model
Re-evaluated benchmarks in the header, the rand-n version is faster for both models
Comments are resolved, this PR is ready for review
As a general comment before any detailed review:
the rand-n version is faster for both models
Can we test different number of samples? n = 10000 is a bit extreme. Can we check n = 1, n = 5, n = 10, n = 50, n = 100, n = 500, n = 1000 etc. as well?