persistent-vector icon indicating copy to clipboard operation
persistent-vector copied to clipboard

Use SmallArray# instead of Array#

Open treeowl opened this issue 5 years ago • 4 comments

The arrays are fairly small, and, just as importantly, they're used in a "mostly-pure" fashion. So performance will almost certainly be better with SmallArray# than Array#.

treeowl avatar Oct 17 '20 00:10 treeowl

Good idea - I think SmallArray# was introduced after this code was started, but I never re-evaluated. I suspect this would help some of the unfortunate GC behavior I've observed.

travitch avatar Oct 17 '20 00:10 travitch

What unfortunate behavior have you observed? SmallArray is much more about trying to make the mutator fast; Array tries to avoid certain pathological GC perf problems that just probably don't show up here.

treeowl avatar Oct 17 '20 00:10 treeowl

It has been difficult to get a detailed breakdown of where the GC spends its time, but my benchmarks have just spent way more time than I expected collecting garbage. I think it was a combination of small nursery sizes (in older GHC versions) and each modification re-allocating multiple array chunks (to rebuild the spine of the tree).

The documentation for SmallArray indicates that it is more efficient during GC as long as your card table would have only had a single entry (i.e., < 128), which should apply in this case. I'm not sure how much that will help, of course.

travitch avatar Oct 17 '20 01:10 travitch

Oh, interesting. I'd never read that. But I can't imagine an unnecessary card table check would explain particularly bad GC performance. Probably something else going on. Of course, performance building really big vectors is basically guaranteed to be shit regardless, just because it violates the generational hypothesis. We can only do what we can do.

treeowl avatar Oct 17 '20 01:10 treeowl