cats-collections icon indicating copy to clipboard operation
cats-collections copied to clipboard

Make Predicate stack-safe using Eval

Open bbjubjub2494 opened this issue 5 years ago • 7 comments

Fixes last comment in #142, Also adds contramap() ~~to compensate for the loss of inheritance from Function1.~~

Benchmarks

pre-existing 0.9.0 stack-unsafe implementation

(at 992770b8685ee0287df19b068a4d1f14818d27b7)

Benchmark                                               (n)   Mode  Cnt        Score       Error  Units
ChainedPredicateBench.catsCollectionsPredicateUnravel    10  thrpt   25  8168148.133 ± 83522.583  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel   100  thrpt   25   635545.444 ±  3674.601  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel  1000  thrpt   25    51776.603 ±   335.972  ops/s

(SOE at n = 10,000)

this PR (mark 3)

Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  3092451.360 ± 20246.653  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   311739.000 ±  3842.162  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    27788.238 ±   226.262  ops/s
ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     2626.175 ±    20.365  ops/s

That's roughly -50% at n = 100 and -45% at n = 1,000. It was a convoluted intensive example with all the stress on the internals, so I would consider it a worst case scenario. Most real world intensional sets probably put most of their complexity in the basic predicates rather than in thousands of combinations. Hence I would say -50% is the lower bound of the performance degradation in general. Is that acceptable?

bbjubjub2494 avatar Mar 05 '20 17:03 bbjubjub2494

There are other ways to make Predicate stack safe without giving up all the performance (see what was done in cats): https://github.com/typelevel/cats/blob/master/core/src/main/scala/cats/data/AndThen.scala

Will get back to you on that. I think a mix between using AndThen where possible and Eval otherwise might work.

bbjubjub2494 avatar Mar 05 '20 19:03 bbjubjub2494

Codecov Report

Merging #283 into master will decrease coverage by 0.08%. The diff coverage is 86.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage    91.5%   91.42%   -0.09%     
==========================================
  Files          24       24              
  Lines        1625     1632       +7     
  Branches      214      219       +5     
==========================================
+ Hits         1487     1492       +5     
- Misses        138      140       +2
Impacted Files Coverage Δ
...re/src/main/scala/cats/collections/Predicate.scala 79.31% <86.66%> (+2.03%) :arrow_up:
core/src/main/scala/cats/collections/BitSet.scala 97.14% <0%> (-0.22%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0a98846...1c4a746. Read the comment docs.

codecov-io avatar Mar 05 '20 19:03 codecov-io

The PR stabilized and finalized. @johnynek would you mind reviewing again or pinging somebody else?

bbjubjub2494 avatar Apr 08 '20 21:04 bbjubjub2494

Ok there we go. Kleisli is much nicer to work with. We have AST simplification, a more complete scalacheck generator, and the Boolean algebra. I haven't run final benchmarks yet but I think performance profile is the same as 2 days ago.

bbjubjub2494 avatar Jul 30 '20 22:07 bbjubjub2494

Codecov Report

Merging #283 into master will decrease coverage by 0.01%. The diff coverage is 94.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage   91.50%   91.49%   -0.02%     
==========================================
  Files          24       24              
  Lines        1625     1646      +21     
  Branches      214      207       -7     
==========================================
+ Hits         1487     1506      +19     
- Misses        138      140       +2     
Impacted Files Coverage Δ
...re/src/main/scala/cats/collections/Predicate.scala 85.71% <93.75%> (+8.44%) :arrow_up:
core/src/main/scala/cats/collections/Set.scala 88.79% <100.00%> (+0.09%) :arrow_up:
...ats/collections/arbitrary/ArbitraryPredicate.scala 100.00% <100.00%> (ø)
core/src/main/scala/cats/collections/BitSet.scala 97.14% <0.00%> (-0.22%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 51d6479...94604fb. Read the comment docs.

codecov-commenter avatar Jul 30 '20 22:07 codecov-commenter

Benchmarks

tip of PR

[info] Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  1441476.494 ± 16104.839  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   150259.432 ±   447.757  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    15139.120 ±    36.946  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     1509.519 ±    15.303  ops/s

same benchmark against on master

[info] Benchmark                                               (n)   Mode  Cnt         Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    10  thrpt   25  20347254.412 ± 17631.518  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   100  thrpt   25   2097695.179 ± 16756.560  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  1000  thrpt   25    188526.011 ±   276.839  ops/s

Woah that's not good is it? 13x loss, how did that happen?

bbjubjub2494 avatar Jul 31 '20 10:07 bbjubjub2494

benchmark without Kleisli

[info] Benchmark                                                (n)   Mode  Cnt        Score       Error  Units
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel     10  thrpt   25  4396333.243 ± 64479.117  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel    100  thrpt   25   520604.540 ±  2450.074  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel   1000  thrpt   25    48156.396 ±   125.049  ops/s
[info] ChainedPredicateBench.catsCollectionsPredicateUnravel  10000  thrpt   25     4519.274 ±     7.830  ops/s

still 5x

bbjubjub2494 avatar Jul 31 '20 11:07 bbjubjub2494