iabtcf-java icon indicating copy to clipboard operation
iabtcf-java copied to clipboard

perf(decoder): avoid lots of calls to ensure data and field length/offset lookups

Open mkjois opened this issue 2 years ago • 12 comments

I've made three changes (three commits) to reduce the number of calls to ensureReadable() and to the LengthOffsetCache internal maps. We've seen 2x speed improvement in TCString.decode() and 2% overall CPU usage improvement in our production workload, which at high scale results in tangible cost savings.

Just to note, our use case doesn't benefit much from lazy decoding, since we need to decode the vendor consent, vendor legitimate interest, and publisher restrictions for almost all requests, and those three seem to dominate the decoding time in our profiling measurements when TCStringV2.hashCode() is invoked upon eager decoding.

mkjois avatar Mar 23 '23 16:03 mkjois

Tagging some common contributors to help review: @laktech @imayankmishra @iabmayank @srinivas81

mkjois avatar Mar 23 '23 16:03 mkjois

thanks! i'll be able to review in a few days.

laktech avatar Mar 23 '23 23:03 laktech

I also have another change on top of this where we make sure to use primitive int in decoding everywhere instead of Integer. Looks like that can squeeze out another 0.5% overall on our production workload. I'll submit that as another PR.

mkjois avatar Mar 24 '23 11:03 mkjois

@laktech As promised, I've made a couple more PRs on top of this to test even further optimizations:

  1. (https://github.com/mkjois/iabtcf-java/pull/1) Avoids any boxed integers for a ~25% speedup.
  2. (https://github.com/mkjois/iabtcf-java/pull/2) A more speculative change to avoid lots of invariant checking in BitSet for a ~15% speedup.

Let me know what you think, when you're ready.

mkjois avatar Mar 24 '23 16:03 mkjois

hey, are you able to share the benchmark that you're running?

laktech avatar Apr 10 '23 15:04 laktech

@laktech It wasn't a proper offline benchmark like with JMH or anything. I basically just tried it with a production-like workload as part of a much larger code path, and viewed some flame graph profiles. If you have any better JMH or other benchmarks set up, please feel free to test this out.

mkjois avatar Apr 10 '23 21:04 mkjois

@laktech Any progress in reviewing?

mkjois avatar Apr 19 '23 22:04 mkjois

Nice

ChristopherWirt avatar May 25 '23 20:05 ChristopherWirt

Hey, any progress on reviewing this?

lukhar avatar May 25 '23 20:05 lukhar

@laktech Another bump 🙏

mkjois avatar Jul 11 '23 13:07 mkjois

sorry i stepped away from this for a few months. i'll be reviewing this once again.

laktech avatar Aug 28 '23 19:08 laktech

Overall, this looks great. I just have some concerns around the two areas I commented above.

laktech avatar Sep 05 '23 17:09 laktech