Use dot-notation: quantile.(d, X)

Open ron-wolf opened this issue 5 years ago • 5 comments

StatsBase.iqr() calls Distributions.quantile().

https://github.com/JuliaStats/StatsBase.jl/blob/54cc6fe2e2f709623b9c75275884f9b295fdbd35/src/scalarstats.jl#L340

However, calling iqr() emits a warning, reproduced by the following code.

using StatsBase, Statistics, Distributions

prices = 0:20:600
probs = pweights([100-99, 99-99, 99-99, 99-99, 99-99, 99-98, 98-94, 94-90, 90-81, 81-74, 74-72, 72-69, 69-67, 67-64, 64-61, 61-58, 58-55, 55-52, 52-48, 48-44, 44-40, 40-36, 36-31, 31-26, 26-21, 21-15, 15-9, 9-3, 3-0, 0-0, 0-0] / 100)

avg_price = mean(prices, probs)
dev_price = std(prices, probs, mean=avg_price)

price_dist = Normal(avg_price, dev_price)
range = iqr(price_dist)
┌ Warning: `quantile(d::UnivariateDistribution, X::AbstractArray)` is deprecated, use `quantile.(d, X)` instead.
│   caller = iqr(::Normal{Float64}) at scalarstats.jl:340
└ @ StatsBase ~/.julia/packages/StatsBase/548SN/src/scalarstats.jl:340

The warning stems from a line in Distributions.
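Until `iqr()` is updated, the same value can be computed without triggering the deprecated method. The following is a minimal sketch that follows the warning's own suggestion; the parameters of `d` are hypothetical stand-ins for the fitted `price_dist`, and the variable names are illustrative only.

```julia
using Distributions

# Hypothetical stand-in for `price_dist`; any UnivariateDistribution
# should behave the same way here.
d = Normal(300.0, 120.0)

# Broadcast `quantile` over the two probabilities, as the warning suggests.
q25, q75 = quantile.(d, [0.25, 0.75])
iqr_broadcast = q75 - q25

# Equivalent scalar form, which avoids building an array at all.
iqr_scalar = quantile(d, 0.75) - quantile(d, 0.25)

println(iqr_broadcast, " ", iqr_scalar)
```

Both forms call only the scalar `quantile(d, p)` method, so neither should hit the deprecated array method.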

ron-wolf • Jun 22 '20 04:06

As a fun side note, I had a lot of difficulty finding this line. I tried getting at it using methods(), but I could not find the returned file on GitHub.

using Distributions

methods(quantile, (UnivariateDistribution, AbstractArray))
# 1 method for generic function "quantile":
[1] quantile(d::Distribution{Univariate,S} where S<:ValueSupport, X::AbstractArray) in Distributions at deprecated.jl:65
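One plausible reason the file was hard to find: methods generated by `@deprecate` typically record the location of the macro's body in Julia's own `base/deprecated.jl`, not the package line that invoked the macro. That is an assumption about the version in use, not something confirmed in this thread. The sketch below uses hypothetical functions `old_f` and `new_f` and prints the recorded location so it can be checked locally.

```julia
# Hypothetical replacement function, used only for illustration.
new_f(x) = x + 1

# Define a deprecated alias; the trailing `false` skips exporting `old_f`.
@deprecate old_f(x) new_f(x) false

# Inspect where Julia thinks the deprecated method was defined.
m = first(methods(old_f))
println(m.file, ":", m.line)
```

If the printed file is `deprecated.jl`, the actual `@deprecate` call has to be found by searching the package source for the function name instead.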

The documentation for quantile() is available online, but from the Statistics library, begging the question of why documentation only exists for the re-exported name. Even so, the source code link has been broken as of Julia 1.0. Compare the “source code” link in the v1.0 documentation and the v0.7 documentation.

I’m not familiar enough with Julia’s development to ascertain where this additional bug should be reported, so I’m attaching it here so I’m not the only soul who knows about it. I also wasn’t sure what package versions I’m using, so if those are needed, any guidance would be appreciated.
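For reference, a short sketch of one way to collect version information for a report like this, assuming the packages are installed in the currently active environment:

```julia
using Pkg, InteractiveUtils

# Julia version and platform details.
versioninfo()

# Packages and their versions in the active environment.
Pkg.status()
```

The same package listing is available from the REPL's package mode with `status`.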

ron-wolf • Jun 22 '20 05:06

Good catch. Can you also change nquantile?

> The documentation for quantile() is available online, but from the Statistics library, begging the question of why documentation only exists for the re-exported name. Even so, the source code link has been broken as of Julia 1.0. Compare the “source code” link in the v1.0 documentation and the v0.7 documentation.
>
> I’m not familiar enough with Julia’s development to ascertain where this additional bug should be reported, so I’m attaching it here so I’m not the only soul who knows about it. I also wasn’t sure what package versions I’m using, so if those are needed, any guidance would be appreciated.

Yes, that's an unfortunate consequence of the fact that Statistics.jl has been moved to a separate repository. At some point the manual will probably be changed to point to the separate repository too.

nalimilan • Jul 05 '20 20:07

Sorry, I just realized that the current form is faster when the input is an array, as it sorts the data only once, while the proposed form would do it for each requested quantile. I think it would make sense to undeprecate that method in Distributions for consistency with Statistics. Can you file an issue there to discuss it?
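The performance point can be sketched for plain data vectors with `Statistics.quantile`. The sample `data` and the probabilities `ps` below are hypothetical; the example only shows that the two call forms agree, not how large the speed difference is.

```julia
using Statistics

data = rand(10_000)        # hypothetical sample
ps = [0.25, 0.5, 0.75]

# Array form: the data are sorted once and every quantile is read from that result.
q_array = quantile(data, ps)

# Broadcast form: each call sorts its own copy of the data.
# Ref(data) keeps broadcasting from iterating over the data elements.
q_bcast = quantile.(Ref(data), ps)

println(q_array ≈ q_bcast)
```

For a distribution object there is no data to sort, which is why the trade-off differs between `Statistics` and `Distributions`.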

nalimilan • Jul 05 '20 20:07

@nalimilan

> Yes, that's an unfortunate consequence of the fact that Statistics.jl has been moved to a separate repository. At some point the manual will probably be changed to point to the separate repository too.

I’m not sure I completely understand. I get that the code has been separated out, while the docs have yet to be ported. But what used to be the case: did Statistics use to be under Distributions?

> Can you file an issue there to discuss it?

Yes, will do! Does that mean this PR should be closed?

ron-wolf • Jul 24 '20 20:07

> I’m not sure I completely understand. I get that the code has been separated out, while the docs have yet to be ported. But what used to be the case: did Statistics use to be under Distributions?

No. What used to be the case is that Statistics lived in the Julia repository.

> Yes, will do! Does that mean this PR should be closed?

We can keep it open for now.

nalimilan • Jul 26 '20 14:07