Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ranges::sort is slow #406

Open
ericniebler opened this issue Jun 14, 2016 · 5 comments
Open

ranges::sort is slow #406

ericniebler opened this issue Jun 14, 2016 · 5 comments

Comments

@ericniebler
Copy link
Owner

The current implementations of ranges::sort and friends were not chosen for their speed, but for their clarity. That was fine when range-v3 was more of a proof-of-concept. Now that range-v3 has significant adoption, we should think about replacing its implementation with something a bit snappier.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jun 15, 2016

Disclaimer: the following is an unstructured dump from my notes on the sorting patterns, so it is basically a collection of random thoughts on work that could be done to improve sort/stable_sort performance in range-v3.

  • Clean/complete the sorting benchmarks (in the perf/ folder):
    • some patterns from libc++ test suite are missing,
    • incorporate Boost.Sort benchmarks,
    • consider strings, float, and more complex objects in the benchmarks.
  • Do we win anything by hiding sorting "primitives" in algorithm/aux and ranges::aux? These are useful, maybe we should expose them more directly.
  • Once sorting primitives are more exposed it might be worth considering relaxing sort/stable_sort RandomAccessRange requirements or providing a relaxed_sort/relaxed_stable_sort that does so. A sorting algorithm on ForwardRanges might be slower than copying the range into a std::vector and using sort, but we should provide an "I hope you know what you are doing" interface that allows the user to avoid this if desired.
  • Extend the API of stable_sort with a memory allocator to bypass std::get_temporary_buffer and allow e.g. reusing a buffer between algorithm calls. Expose in-place stable sort more directly.
  • Implement/update the sort/stable_sort primitives, benchmark them, and consider integrating them into the sort/stable_sort algorithms for specific ranges, range value types, and comparison function objects:
    • copy the recently improved pattern-adaptive intro-sort that libc++ uses,
    • copy hybrid radix from Boost.Sort for integers, floats, and strings,
    • copy spreadsort from Boost.Sort,
    • copy Timsort (from python/java).
    • implement GrailSort for in-place stable sort.
  • Since this is a ton of work, consider making this a Google Summer of Code project (if we find a mentor we could apply for the summer of 2017).

@Morwenn
Copy link

Morwenn commented Jun 27, 2016

I've already got the following algorithms here with full support for move-only types, projections and proxy iterators:

  • Block sort (from WikiSort, better than Grail sort in my benchmarks)
  • Counting sort (trivial)
  • Grail sort (adapted from the C code), augmented so that it can take a "buffer provider" and turned into its cousin SqrtSort (originally from the same guy who implemented Grail sort)
  • Heapsort (from the standard library)
  • Insertion sort (trivial, linear and binary)
  • Merge insertion sort (Ford-Johnson algorithm, mostly useless)
  • Mergesort (top-down, adaptative in the implementation details, works with forward iterators)
  • pdqsort (from Orson Peters)
  • Poplar sort (too slow)
  • Quicksort (median of 9, not kek at all but works with forward iterators)
  • Selection sort (trivial)
  • Smoothsort (from Keith Schwartz, too slow)
  • Spreadsort (from Boost, handles projections)
  • Timsort (from Goro Fuji)
  • Vergesort (works with bidirectional iterators)

There are also a few sorting networks and some other fixed-size sorting algorithms for shit and giggles. The complexity of every sorting algorithm is documented, along with the memory complexity and the category of iterators the algorithms work with. There are also notes about specific details.

There were plans to add super scalar sample sort, QuickMergeSort and BlockQuickSort too (license problems for the latter since it's GPL, but it seems really good). I couldn't get my hands on a license-compatible BurstSort implementation. The algorithms are generally more generic in the implementation details than under the common interface. There are some benchmarks too, but I can't guarantee that they are good. Also I can't guarantee the quality my implementations of quicksort & mergesort; I believe that there are better ones around.

Anyway, all the code is licensed either under the Boost, MIT, zlib or apparently compatible permissive licenses (see at the top of each header for details). If you aren't sure about the licenses, you can reuse whatever is explicitly from me if needed without asking, and I know that Orson Peters would also be glad to have his pdqsort used everywhere. If you think that it's a good idea, don't hesitate to pick whatever you want from the library.

There are some benchmarks in the project, but I don't know whether they are good or not. At least they seem consistent most of the time. They are also MIT or zlib licensed.

I don't think I can be fluent enough in range-v3 implementation to contribute, but you can use whatever you want to. Hope that helps :)

@ericniebler
Copy link
Owner Author

That's great! Do they handle sentinels, when the end is not an iterator? I find that's the trickiest part of porting algorithms to range-v3.

@Morwenn
Copy link

Morwenn commented Jun 27, 2016

Unfortunately not. Since I have dedicated sorters objects and tutorials about how to make them, I came to the conclusion that having sorter writers implement sentinels without concepts would be too cumbersome (I don't have macros not concept checks in the library, so... raw std::enable_if_t SFINAE all over the place). I decided to wait for concepts to be supported by both g++ and clang++ before starting a branch with concepts and thus sentinels.

What's the trickiest part of implementing sentinels by the way? From the looks of it, I thought that it would only be a matter of consistently using std::advance and friends instead of operators. In my experience, correctly handling proxy iterators with move-only types and algorithms that allocate and reuse memory buffers was the trickiest part so far.

Projections were rather easy to handle though (with some code borrowed from range-v3 to be honest), and I had the pleasant surprise to realize that they could be useful even when comparisons weren't handled (spreadsort being the canonical example of such a case).


On a side note, if you want a fast in-place unstable sorting algorithm, the new version of pattern-defeating quicksort is quite promising, with execution times typically twice as fast as libstdc++'s std::sort. It is quite recent but looks more than decent. It's licensed under zlib so it's compatible (and proposed for integration in Boost.Sort; older versions of the algorithm were proposed to libstdc++ and libc++).

@MikeGitb
Copy link
Contributor

Not sure, if this is still an issue, but if it is, can't you just implement ranges::sort in terms of std::sort?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants