bytestring icon indicating copy to clipboard operation
bytestring copied to clipboard

Remove internal uses of `withForeignPtr`

Open clyring opened this issue 3 years ago • 5 comments

As discussed at #535, withForeignPtr will be much slower in at least GHC-9.2.4 and GHC-9.4.1. So I made a quick pass at removing all internal uses of this function. To do this, I added variants of the create family of functions whose "action" arguments take ForeignPtr Word8 instead of Ptr Word8, allowing the caller to decide whether to use withForeignPtr or unsafeWithForeignPtr. Then I used those ForeignPtr variants everywhere.

I haven't yet had time to investigate whether this actually helps performance with 9.2.4 and obviously for a release I should need to do this against the 0.11 branch. But I thought I should push this for review now anyway.

clyring avatar Jul 29 '22 23:07 clyring

My machine is giving me a very hard time, so I might not be able to continue work on this for a while. The timing is unfortunate. Maintainers, please feel free to push review suggestions to my branch as time is scarce.

A few other thoughts:

  • I'm not attached to the names of the helper functions I introduced.
  • Do we want to export these helper functions from D.B.Internal? This is always awkward. Anything we don't want to export from there can be moved to a new module like D.B.Internal.Utils which {-# SOURCE #-} imports D.B.Internal.
  • There are a few bytestring-internal uses of create that can't be safely replaced by create_but_with_unsafeWithForeignPtr. One of these is scanl, already handled correctly in this patch. It looks like I missed filter in my first pass. Are there others?

clyring avatar Jul 30 '22 10:07 clyring

Sorry, I’m AFK atm, will take a look tomorrow night.

Bodigrim avatar Jul 30 '22 22:07 Bodigrim

  • I'm not attached to the names of the helper functions I introduced.

  • Do we want to export these helper functions from D.B.Internal? This is always awkward. Anything we don't want to export from there can be moved to a new module like D.B.Internal.Utils which {-# SOURCE #-} imports D.B.Internal.

Yeah, I'd prefer to hide these helpers from users, D.B.Internal.Utils is a good location. And hiding them from users would save us bikeshedding of names.

Bodigrim avatar Jul 31 '22 21:07 Bodigrim

I was finally able to work on this again tonight. The diff is rather large, but the majority of it is a very mechanical translation from code that uses Ptr to code that uses ForeignPtr via the new helper functions. So I hope it is easy to review despite its size.

  • I have moved the new helper functions into Data.ByteString.Internal.Utils, so there is no change to the exposed API.
  • I audited every use of unsafeWithForeignPtr in the library, and adjusted every use site that didn't look "obviously always terminating."
  • I briefly looked into backporting this to the 0.11 branch; there is a conflict since #443 has not been backported. But it's probably easy to fix, and we could also just backport #443 first.

clyring avatar Aug 07 '22 02:08 clyring

Thank you for doing this, @clyring. For what it's worth, I agree with @Bodigrim regarding naming. The lower-case fp and f suffixes make for names that are very hard to visually parse.

bgamari avatar Aug 07 '22 18:08 bgamari

ping @sjakobi

clyring avatar Aug 22 '22 01:08 clyring

Apologies for the delay, @clyring! I intend to get back to this at the end of this week.

BTW, has anyone investigated the perf impact already?

I haven't yet had time to investigate whether this actually helps performance with 9.2.4 and obviously for a release I should need to do this against the 0.11 branch.

Otherwise I'll do that when I'm back.

sjakobi avatar Aug 22 '22 19:08 sjakobi

BTW, has anyone investigated the perf impact already?

I haven't yet had time to investigate whether this actually helps performance with 9.2.4 and obviously for a release I should need to do this against the 0.11 branch.

Otherwise I'll do that when I'm back.

I tried measuring the effect on runtime via our benchmarks but the noise in my attempts was greater than the effect size.

I also just tried running our entire testsuite with --quickcheck-replay=953718 +RTS -s and found this branch consistently gets 0.6% fewer "bytes allocated in the heap" than does master. That seems promising, but I'm sure it's not a complete picture.

clyring avatar Aug 22 '22 22:08 clyring

I've verified that the improvement in runtime allocations is about 10% for the relevant GHC perf test. That test also allocates a bunch of list objects, so the improvement when looking at "only" bytestring stuff is probably a fair bit larger. (But I don't plan to investigate this any further.)

PS: Thanks sjakobi for bringing the performance question back up.

clyring avatar Aug 26 '22 02:08 clyring

I've started comparing the benchmark results with this patch and GHC 9.2.4 vs. master with GHC 9.2.3. On first sight, this patch did not seem to recover a 2x slowdown in some of the Builder benchmarks. I'll take a closer look tomorrow or on Thursday.

sjakobi avatar Aug 30 '22 11:08 sjakobi

Here are some examples of persistent slowdowns (comparing master with 9.2.3 (OLD) vs. this patch with 9.2.4 (NEW)):

Name,Old,New,Ratio
All.Data.ByteString.Builder.Encoding wrappers.foldMap word8 (10000),61623738,113407945,1.84033
All.Data.ByteString.Builder.ByteString insertion.foldMap byteStringInsert byteStringChunks4Data (10000),266916778,323701687,1.21274
All.Data.ByteString.Builder.ByteString insertion.foldMap byteString byteStringChunks4Data (10000),50016262,99267230,1.9847
All.Data.ByteString.Builder.ByteString insertion.foldMap byteStringCopy byteStringChunks4Data (10000),50212852,97953266,1.95076

sjakobi avatar Aug 30 '22 11:08 sjakobi

(Off topic: @clyring would you mind to drop me an email? I have a question to discuss, but cannot find your contacts)

Bodigrim avatar Sep 03 '22 23:09 Bodigrim

Not sure what else we can do here. If this branch is better than status quo, let’s merge as is? I’d like to push out 0.11.4.0 soon.

Bodigrim avatar Sep 17 '22 16:09 Bodigrim

I'd like to understand the Builder regressions a bit better. But they seem likely unrelated to the expected keepAlive# regressions this patch aims to fix, and shouldn't block it.

Later today I will try to reproduce the 9.2.4 Builder regressions and inquire about them at GHC's issue tracker.

clyring avatar Sep 18 '22 12:09 clyring

I was able to reproduce; it's rather terrible. byteStringHex gets 10x slower for me. That particular regression seems to happen because GHC is no longer eta-expanding something it probably should. I created this GHC issue on the matter.

clyring avatar Sep 18 '22 22:09 clyring