cpp-sort icon indicating copy to clipboard operation
cpp-sort copied to clipboard

Better complexity for measures of presortedness

Open Morwenn opened this issue 9 years ago • 2 comments

According to Computing and ranking measures of presortedness by J. Chen, the complexity of some of our measures of presortedness simply outrightly suck. With better algorithms, we could change them as follows, (quotes are from the paper):

  • [ ] « Exc(X) is also linear computable (see , The Art of Computer Programming, Vol. 3: Sorting and Searching, Ex. 5.2.2-2) and we cannot do better. »
  • [x] « Altman and Chlebus [Sorting roughly sorted sequences in parallel] proved that Par(X) can be computed with O(n log Par(X)) comparisons », which means that Par(X) can be computed with a O(n log n) algorithm since 0 <= Par(X) <= n - 1.
  • [x] « Osc(X) can be computed in O(n log(Osc(X)/n)) time [Adaptive Sorting by O. Pertersson]. », which is roughly equivalent to O(n log n) considering that 0 <= Osc(X) <= (n * (n - 2) - 1) / 2. However Adaptive Sorting mentions that computing Osc(X) is an open problem, but that it is easy to compute it in O(n log n).
  • [x] Our implementation of Enc is O(n ²) while it could be reduced to O(n log n) by using a binary search during the construction of the encroaching lists.
  • [ ] T. M. Chan and M. Pătraşcu give a O(n sqrt log n) algorithm to compite Inv(X) in Counting Inversions, Offline Orthogonal Range Counting, and Related Problems.

Also, some of the comments make me doubt the implementation of the of the measures of presortedness. Did I really understand every definition, or did I take some mental shortcuts? It's left to investigate... I'll gladly accept any help if anyone knows how to implement these algorithms with the complexities above.

Morwenn avatar Oct 29 '16 11:10 Morwenn

As explained in this lengthy comment, the measure of presortedness Radius(X) described in by T. Altman and Y. Igarashi Roughly Sorting: Sequential and Parallel Approach is pretty much the same as Par(X) described by V. Estivill-Castro and D. Wood in A New Measure of Presortedness, which means that we won't implement a probe::radius in cpp-sort.

However the good news is that the Roughly Sorting papers gives a linear method to compute Radius(X), which means that we can surely change our Par(X) implementation. The described method has the following properties:

  • O(n) time instead of O(n² log n)
  • O(n) space instead of O(1)
  • Works on bidirectional iterators instead of random-access

Having an O(n² log n) algorithm is kind of unsightly, but as can be seen in the benchmarks, the current implementation of Par(X) is a sneaky beast with a high enough short-circuiting power that it manages to run faster than O(n) algorithms when it doesn't encounter an adverse pattern. It would be nice if we could take advantage of the short-circuits while running in O(n) time, but I fear that it isn't possible.

Morwenn avatar Mar 03 '21 17:03 Morwenn

Commit 77c4c756d1cff0ce3f2566b4a942fa10fbad9270 changes probe::osc to run in O(n log n) time and O(n) space. Thanks a lot to @Control55 from @The-Studio-Discord for coming up with the new algorithm.

Morwenn avatar Aug 03 '21 11:08 Morwenn