Computation of median for discrete data: non-standard computation
Using (require math), we can compute (median < data), where data is a list of numbers. When the length of data is even, the median is normally computed by averaging the two central elements after sorting. The problem is that the following code returns 2, whereas typical references (e.g. the Wikipedia article on the median) give 2.5:

    #lang racket
    (require math)
    (define data (list 1 2 3 4))
    (median < data)

Could we add a parameter to the function median (and perhaps also to quantile) so that it computes the average of the two central elements when the number of elements is even?
Thank you and congratulations for all your great work.
On Fri, 30 Aug 2019 at 23:06, E. Comer wrote:
> Using (require math), we can compute (median < data), where data is a list of numbers. When data length is even, then normally the median is computed by averaging the two central elements, after ordering.
> ...
> Could we maybe add some parameterization to the function median (and maybe also to quantile) in order to compute the average of the central two elements, when the number of elements is even?

Adding a keyword argument to median and quantile to choose the convention is a great idea.
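To make the idea concrete, here is a minimal sketch of what such a keyword argument might look like. Note that median* and #:average? are hypothetical names used only for illustration; they are not part of the math library, and the helper below is written from scratch rather than on top of the library's median. With #:average? set to #f it returns the lower of the two central elements, which matches the result 2 reported above for (list 1 2 3 4).

```racket
#lang racket

;; Hypothetical helper, NOT part of the math library.
;; data      : non-empty list of real numbers
;; average?  : when #t (assumed default), average the two central
;;             elements for even-length data; when #f, return the
;;             lower central element, so the result is always one
;;             of the observations.
(define (median* data #:average? [average? #t])
  (when (null? data)
    (raise-argument-error 'median* "non-empty list of reals" data))
  (define sorted (sort data <))
  (define n (length sorted))
  (define mid (quotient n 2))
  (cond
    ;; Odd length: there is a single central element.
    [(odd? n) (list-ref sorted mid)]
    ;; Even length, averaging convention.
    [average? (/ (+ (list-ref sorted (sub1 mid))
                    (list-ref sorted mid))
                 2)]
    ;; Even length, "one of the observations" convention.
    [else (list-ref sorted (sub1 mid))]))

(median* (list 1 2 3 4))                ; exact rational 5/2, i.e. 2.5
(median* (list 1 2 3 4) #:average? #f)  ; 2
```

Because the inputs are exact integers, the averaged result is the exact rational 5/2 rather than the flonum 2.5; wrapping it in exact->inexact would give 2.5.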
I remember the first time I taught medians and quantiles - I was very surprised to find that the conventions for defining the median and quantiles differ from country to country. Here (Denmark) we also use the "average of the two central elements" method, but our computer programs of choice didn't. For real examples where the sample size is large, this is no problem - but for practice problems it is somewhat annoying.
I believe the rationale for avoiding the averaging is that without it, the median is always one of the observations.
Now, two different conventions are annoying, but according to MathWorld [1], there are 9 (nine!) different conventions for the quantile "in common use".
[1] http://mathworld.wolfram.com/Quantile.html
/Jens Axel