Faster equality in generic contexts
What is this?
This is the first part of this effort to resurrect the awesome work on compiler performance by @manofstick.
How is this different?
Many things have changed, among other things we now have a bigger team & community to review things and a different release cadence allowing us to dogfood things a bit.
More importantly, the original PRs often contained multiple ideas (fixes, breaking, optimizations, experiments, ...), whereas I want to narrow PRs as much as possible to make things easier to review and control.
Is this PR breaking?
A tiny bit. This is mostly an optimization.
Here is an example of code where the behavior does change:
[<Struct; CustomEquality; NoComparison>]
type Blah =
interface IEquatable<Blah> with
member lhs.Equals rhs = failwith "bang"
let eq x y =
printfn "hello" // long code insuring no inlining
...
x = y // generic equality context
let result = eq (Blah()) (Blah()) // currently false, will be exception
We agreed that it's OK here.
What's the source of inspiration here?
This is essentially part of the #5112, with the following changes:
- Optimizer is not touched (there wasn't a consensus on that)
- tail calls are not touched (same, also partially was there to work around JIT issues that were resolved later)
- => hence IL is not changed
- some comparison optimizations are removed (I wasn't convinced by them)
- refactoring is removed (might be done in a followup)
- => hence diff is minimized
- benchmarks are added
So where does this improve things?
More theory and motivation is in this document - link will be updated once the PR merged.
TL;DR: this improves things when HashIdentity.Structural<'T> comparison is used in non-inlined code.
Benchmarks
Main targets: structs, enums, floats, and special generic types:
Structs and enums
Before:
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Struct | 301.8 us | 14.77 us | 43.56 us | 68.3594 | - | - | 420.61 KB |
| Enum | 299.4 us | 19.55 us | 57.63 us | 22.9492 | - | - | 140.65 KB |
After:
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Struct | 143.72 us | 7.325 us | 21.598 us | 0.1221 | - | - | 792 B |
| Enum | 25.93 us | 1.803 us | 5.289 us | 0.0458 | - | - | 336 B |
2x and 12x speed improvements for structs and enums respectively and nearly zero allocations now in both cases.
Value tuples
Before:
| Method | Mean | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|
| ValueTuple3 | 673.4 us | 1.00 | 61.5234 | 15.8691 | - | 378.13 KB |
| ValueTuple4 | 812.2 us | 1.22 | 69.0918 | 19.7754 | - | 424.98 KB |
| ValueTuple5 | 1,004.2 us | 1.50 | 84.9609 | 24.4141 | - | 523.63 KB |
| ValueTuple6 | 1,100.7 us | 1.65 | 92.7734 | 23.4375 | - | 570.48 KB |
| ValueTuple7 | 1,324.9 us | 1.97 | 117.1875 | 57.6172 | 29.2969 | 669.14 KB |
| ValueTuple8 | 1,461.9 us | 2.20 | 117.1875 | 58.1055 | 29.2969 | 762.85 KB |
After:
| Method | Mean | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|
| ValueTuple3 | 173.0 us | 1.00 | 28.5645 | 9.3994 | - | 175.11 KB |
| ValueTuple4 | 174.9 us | 1.03 | 28.5645 | 9.4604 | - | 175.11 KB |
| ValueTuple5 | 208.9 us | 1.22 | 34.4238 | 11.3525 | - | 211.29 KB |
| ValueTuple6 | 217.0 us | 1.26 | 34.4238 | 11.3525 | - | 211.29 KB |
| ValueTuple7 | 293.7 us | 1.73 | 29.2969 | 29.2969 | 29.2969 | 247.48 KB |
| ValueTuple8 | 293.8 us | 1.73 | 29.2969 | 29.2969 | 29.2969 | 247.48 KB |
~80% in speed and ~50% in memory reduction, also much steeper ratios' increase for both.
Options and co
Before:
| Method | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|
| Option | 165.0 us | 16.3574 | 3.1738 | - | 101.74 KB |
| ValueOption | 157.1 us | 28.8086 | 4.3945 | - | 177.02 KB |
| Result | 186.3 us | 40.4053 | 10.0098 | - | 248.25 KB |
After:
| Method | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|
| Option | 82.13 us | 12.6953 | 3.0518 | - | 78.33 KB |
| ValueOption | 55.09 us | 9.7656 | 1.5869 | - | 59.98 KB |
| Result | 75.92 us | 22.5830 | 5.6152 | - | 138.93 KB |
50-75% speed and 25-75% memory improvements.
Nullable
Before:
| Method | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|
| Nullable | 443.7 us | 24.9023 | 3.9063 | - | 153.66 KB |
After:
| Method | Mean | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|
| Nullable | 60.16 us | 9.7656 | 1.5869 | - | 59.94 KB |
About 7x speed and 3x memory improvements.
Floats
Before:
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| FloatER | 51.47 us | 2.228 us | 6.570 us | 7.0801 | 0.3662 | - | 43.68 KB |
| Float32ER | 56.39 us | 2.566 us | 7.525 us | 7.1106 | 0.3662 | - | 43.68 KB |
| FloatPER | 149.49 us | 2.952 us | 7.513 us | 38.3301 | 2.6855 | - | 231.34 KB |
| Float32PER | 136.09 us | 5.584 us | 16.201 us | 37.5977 | 2.4414 | - | 227.01 KB |
After:
| Method | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| FloatER | 15.26 us | 0.487 us | 1.397 us | 15.38 us | 3.2806 | 0.1678 | - | 20.1 KB |
| Float32ER | 15.80 us | 0.316 us | 0.666 us | 15.82 us | 3.2806 | 0.1678 | - | 20.1 KB |
| FloatPER | 82.88 us | 5.580 us | 16.452 us | 93.36 us | 14.9536 | 1.2817 | - | 90.46 KB |
| Float32PER | 94.22 us | 1.851 us | 3.904 us | 93.97 us | 14.1602 | 0.9766 | - | 86.43 KB |
PER comparison still takes more time and memory but still the improvements are 2-3 times in all cases.
Also (positively) affected: basic types, arrays, reference types - due to shorter call chains and less casting:
Arrays
Before:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Int32 | 974.3 us | 73.48 us | 214.3 us |
| Int64 | 1,090.7 us | 58.65 us | 172.9 us |
| Byte | 1,075.3 us | 41.56 us | 121.9 us |
| Obj | 1,451.8 us | 43.91 us | 128.8 us |
After:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Int32 | 253.3 us | 18.09 us | 52.78 us |
| Int64 | 312.3 us | 14.06 us | 41.45 us |
| Byte | 246.5 us | 6.11 us | 17.82 us |
| Obj | 489.3 us | 17.69 us | 51.61 us |
About 3x faster.
F# basic types
Before (countBy):
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Bool | 39.06 us | 1.936 us | 5.709 us |
| SByte | 55.23 us | 2.032 us | 5.992 us |
| Byte | 50.62 us | 1.617 us | 4.766 us |
| Int16 | 85.46 us | 4.668 us | 13.764 us |
| UInt16 | 86.66 us | 3.189 us | 9.351 us |
| Int32 | 86.21 us | 4.690 us | 13.827 us |
| UInt32 | 87.69 us | 4.911 us | 14.480 us |
| Int64 | 112.81 us | 3.962 us | 11.681 us |
| UInt64 | 112.66 us | 4.003 us | 11.550 us |
| IntPtr | 114.61 us | 3.430 us | 10.114 us |
| UIntPtr | 109.40 us | 3.322 us | 9.796 us |
| Char | 98.99 us | 2.825 us | 8.330 us |
| String | 214.52 us | 5.968 us | 17.503 us |
| Decimal | 315.84 us | 12.545 us | 36.988 us |
After (countBy):
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Bool | 29.19 us | 1.608 us | 4.740 us |
| SByte | 50.72 us | 2.167 us | 6.390 us |
| Byte | 47.40 us | 1.719 us | 5.069 us |
| Int16 | 83.50 us | 3.508 us | 10.342 us |
| UInt16 | 84.48 us | 2.949 us | 8.649 us |
| Int32 | 87.24 us | 2.670 us | 7.832 us |
| UInt32 | 86.17 us | 3.630 us | 10.703 us |
| Int64 | 95.78 us | 4.763 us | 14.044 us |
| UInt64 | 113.86 us | 4.278 us | 12.479 us |
| IntPtr | 110.92 us | 3.192 us | 9.412 us |
| UIntPtr | 105.11 us | 3.219 us | 9.440 us |
| Char | 91.03 us | 4.164 us | 12.211 us |
| String | 214.23 us | 5.859 us | 17.276 us |
| Decimal | 150.39 us | 3.703 us | 10.683 us |
Before (distinct):
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Bool | 9.489 us | 0.7345 us | 2.142 us |
| SByte | 12.568 us | 0.4049 us | 1.168 us |
| Byte | 12.393 us | 0.4176 us | 1.231 us |
| Int16 | 23.539 us | 0.8906 us | 2.626 us |
| UInt16 | 22.311 us | 0.8351 us | 2.449 us |
| Int32 | 22.302 us | 0.4448 us | 1.180 us |
| UInt32 | 22.092 us | 0.4760 us | 1.319 us |
| Int64 | 25.567 us | 0.8679 us | 2.518 us |
| UInt64 | 26.200 us | 1.4150 us | 4.172 us |
| IntPtr | 25.528 us | 1.4250 us | 4.202 us |
| UIntPtr | 25.000 us | 0.8785 us | 2.590 us |
| Char | 22.883 us | 0.8124 us | 2.383 us |
| String | 47.659 us | 1.6451 us | 4.799 us |
| Decimal | 58.086 us | 4.7662 us | 13.828 us |
After:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Bool | 9.408 us | 0.9121 us | 2.689 us |
| SByte | 11.795 us | 0.3516 us | 1.020 us |
| Byte | 11.225 us | 0.4078 us | 1.170 us |
| Int16 | 23.211 us | 0.7982 us | 2.354 us |
| UInt16 | 21.682 us | 1.0798 us | 3.184 us |
| Int32 | 20.035 us | 0.6128 us | 1.787 us |
| UInt32 | 20.123 us | 0.8883 us | 2.591 us |
| Int64 | 22.967 us | 0.7908 us | 2.319 us |
| UInt64 | 24.757 us | 1.3208 us | 3.832 us |
| IntPtr | 26.448 us | 1.3528 us | 3.989 us |
| UIntPtr | 24.596 us | 1.1292 us | 3.330 us |
| Char | 22.670 us | 0.8961 us | 2.642 us |
| String | 40.480 us | 1.3767 us | 3.994 us |
| Decimal | 35.401 us | 1.4948 us | 4.289 us |
Note that decimals also show improvement in memory allocations:
Before (countBy)
| Method | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| Decimal | 149.9 us | 13.45 us | 39.01 us | 136.1 us | 38.4521 | 12.8174 | - | 237.55 KB |
After (countBy)
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Decimal | 65.17 us | 5.013 us | 13.807 us | 28.5645 | 9.4604 | - | 175.09 KB |
Before (distinct)
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Decimal | 57.83 us | 6.097 us | 17.977 us | 26.3062 | 6.5308 | - | 162.35 KB |
After (distinct)
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Decimal | 34.79 us | 1.572 us | 4.585 us | 21.2708 | 5.3101 | - | 131.1 KB |
These mostly stay the same (as expected), apart from decimal, which show ~50% speed and ~25% alloc improvements.
Records
Before:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Record | 157.8 us | 8.98 us | 25.61 us |
After:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| Record | 143.1 us | 7.13 us | 20.00 us |
Which is about 10% improvement.
Generic unions
Before:
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| GenericUnion | 439.8 us | 17.85 us | 51.78 us | 41.9922 | 12.2070 | - | 260 KB |
After:
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| GenericUnion | 167.1 us | 3.32 us | 7.22 us | 26.9775 | 8.9722 | - | 166.3 KB |
About 60% and 30% improvements in speed and allocs.
Some (positive) implications for F#.Core.
F# core functions in question
Before:
| Method | Mean | Error | StdDev | Median |
|---|---|---|---|---|
| ArrayCountBy | 230.20 us | 7.098 us | 20.929 us | 230.07 us |
| ArrayGroupBy | 125.80 us | 5.344 us | 15.674 us | 127.24 us |
| ArrayDistinct | 121.08 us | 8.648 us | 25.500 us | 127.72 us |
| ArrayDistinctBy | 113.45 us | 2.636 us | 7.732 us | 114.15 us |
| ArrayExcept | 92.28 us | 1.835 us | 5.354 us | 92.66 us |
| ListCountBy | 225.45 us | 11.730 us | 34.585 us | 230.79 us |
| ListGroupBy | 167.40 us | 12.814 us | 37.783 us | 151.49 us |
| ListDistinct | 125.20 us | 4.192 us | 12.229 us | 127.01 us |
| ListDistinctBy | 109.52 us | 3.480 us | 10.097 us | 110.94 us |
| ListExcept | 194.14 us | 20.467 us | 60.347 us | 164.68 us |
| SeqCountBy | 460.13 us | 13.224 us | 38.575 us | 459.04 us |
| SeqGroupBy | 354.81 us | 15.000 us | 43.992 us | 348.92 us |
| SeqDistinct | 359.59 us | 12.339 us | 36.381 us | 361.93 us |
| SeqDistinctBy | 128.8 us | 5.70 us | 15.97 us | 127.2 us |
| SeqExcept | 127.8 us | 5.03 us | 14.74 us | 128.1 us |
After:
| Method | Mean | Error | StdDev | Median |
|---|---|---|---|---|
| ArrayCountBy | 137.30 us | 5.008 us | 14.767 us | 137.01 us |
| ArrayGroupBy | 78.80 us | 4.673 us | 13.778 us | 80.87 us |
| ArrayDistinct | 71.63 us | 3.698 us | 10.904 us | 73.49 us |
| ArrayDistinctBy | 69.93 us | 2.148 us | 6.299 us | 70.02 us |
| ArrayExcept | 68.39 us | 3.071 us | 9.008 us | 69.59 us |
| ListCountBy | 138.17 us | 6.708 us | 19.566 us | 142.05 us |
| ListGroupBy | 129.57 us | 12.765 us | 37.639 us | 110.78 us |
| ListDistinct | 85.99 us | 1.662 us | 3.683 us | 85.73 us |
| ListDistinctBy | 67.36 us | 3.001 us | 8.707 us | 67.95 us |
| ListExcept | 173.01 us | 21.536 us | 63.499 us | 138.14 us |
| SeqCountBy | 289.70 us | 12.109 us | 35.514 us | 295.43 us |
| SeqGroupBy | 245.97 us | 6.969 us | 20.329 us | 246.79 us |
| SeqDistinct | 254.17 us | 14.776 us | 43.336 us | 257.62 us |
| SeqDistinctBy | 101.7 us | 5.35 us | 14.91 us | 102.4 us |
| SeqExcept | 105.9 us | 5.20 us | 15.35 us | 102.1 us |
The improvement varies 20-40% in speed here.
Other considerations.
>64 bit value types
Before:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| BigStruct | 47.43 ms | 3.136 ms | 9.098 ms |
After:
| Method | Mean | Error | StdDev |
|---|---|---|---|
| BigStruct | 44.57 ms | 3.243 ms | 9.562 ms |
This became marginally faster - but more importantly, it's here to address concerns about 64 bit JIT. So likely the underlying JIT problem got fixed meanwhile.
TODO
- [ ] AOT tests to see if startup performance is not too affected
- [ ] Finish the design doc on equality
- [ ] Describe the changed code flow
Followups
- Revive equality devirtualization
- Add optimizations for recently added primitive types, as suggested here
:heavy_exclamation_mark: Release notes required
:white_check_mark: Found changes and release notes in following paths:
Change path Release notes path Description src/FSharp.Coredocs/release-notes/.FSharp.Core/8.0.300.md
Benchmark results are weird, numbers before and after are within the statistical error. What about allocations? We shouldn't be boxing as much now?
@vzarytovskii yeah those are not the right benchmarks for this PR. We had a session with Don today to figure out the right ones and I am already getting some 25-30% improvements there, will post soon - stay tuned.
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Is this PR breaking? No. This is (supposed to be just) an optimization.
Well... It'll be calling IEquatable.Equals<>, not object.Equals, which is hence a breaking change - albeit I feel in a fashion that should be part of an evolution...
...and really should be paired with the optimizer change (@TIHan I believe worked on one at some stage?) Otherwise you have the somewhat bizarre situation where using an comparison of an external struct type such as a NodaTime.Instant in a generic context doesn't box and is fast, vs in a non-generic context where it does box is an is slower... Unless that has been resolved already?
Anyway, glad to see this moving forward. My beard is somewhat grayer now that when I first started trying to improve comparison/equality in F#!!
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Marking as ready to review since the CI is green finally :D
And glad to see you here, @manofstick :) BTW thanks for the clear commits and comments to the commits in your original PRs, that really helped me a lot.
As for this one, well, it's not breaking in terms of NaN behavior and in terms of IL. So far. That's probably what we care about the most.
Since it touches an important part of the language, I would like to have three things:
-
What changed, in the form of Given (generic) code: ... How it worked before: ... How does it work now: ...
Unless it exists somewhere already. It should help everyone in understanding this change in future in case of any issues.
-
Some AOT compilation tests involving changed equality. It can be a new project, like the trimming one we have. We should start having more AOT testing, and this is a good place to start, since we know AOT was working with equality before, it's crucial it's still working after the change.
-
Can we have some compiler perf comparison/profiling of using compiler+fslib before and after changes on something like FCS? It would be nice to have it as reference.
Thanks all for the reviews and the feedback.
This needs a thorough rereview with @dsyme, meanwhile I will be improving the description, clarifying the code and adding tests.
...and really should be paired with the optimizer change (@TIHan I believe worked on one at some stage?) Otherwise you have the somewhat bizarre situation where using an comparison of an external struct type such as a NodaTime.Instant in a generic context doesn't box and is fast, vs in a non-generic context where it does box is an is slower... Unless that has been resolved already?
Agreed that in principle these should go together, though can be done in a separate PR.
Well... It'll be calling IEquatable.Equals<>, not object.Equals, which is hence a breaking change - albeit I feel in a fashion that should be part of an evolution...
I think this specific change is OK - use IEquatable.Equals if it exists.
Well... It'll be calling IEquatable.Equals<>, not object.Equals, which is hence a breaking change - albeit I feel in a fashion that should be part of an evolution...
@manofstick Could you remind me of the specific thing that causes the call to IEquatable.Equals<, and which types it applies to. I assume it comes from EqualityComparer<'T>.Default and thus only applies to types that pass canUseDefaultEqualityComparer?
I think that is OK as a change. It is vanishingly rare to explicitly implement IEquatable.Equals<> on struct types in F# code, (except the automatic implementations provided by the compiler) - and even then it would always be the desired behaviour to have that implementation invoked
Also, does this need an addition to spec (rfc), since it changes how we do equality? Or rather shall the doc from the other PR live in the design repo?
@vzarytovskii I think it's a good idea to have that doc (from the other PR) in the design repo. This PR is meant to be an optimization and hence focuses on the implementation. I am looking at the spec while working on this and my impression so far is that the spec doesn't go that deep in the implementation to make this change anyhow misalign with it. If we discover anything of that nature, we can make the RFC or further discuss it.
I added benchmarks for the value tuples, options and that stuff - the results are very convincing so far. More to come soon!
@manofstick Could you remind me of the specific thing that causes the call to IEquatable.Equals<, and which types it applies to. I assume it comes from EqualityComparer<'T>.Default and thus only applies to types that pass canUseDefaultEqualityComparer?
Yep, simple as that.
I think that is OK as a change. It is vanishingly rare to explicitly implement IEquatable.Equals<> on struct types in F# code, (except the automatic implementations provided by the compiler) - and even then it would always be the desired behaviour to have that implementation invoked
Oh, completely agree. It was just that this was (as far as I understand it) the main reason why this change never went through in the past.
And just confirming there, given your comment, this change, as implemented, would be all types (that pass the check) using IEquatable<>.Equals in preference to object.Equals not just value types (obviously struct types get the most benefit, as they avoid the boxing step, but ref types can also get a slight improvement as they, don't need to cast input). But the check includes IsSealed, hence negating any potential inheritance issues. (Once again, from memory, don't have a compiler here to test, but F# records are created as Sealed always)
@manofstick just to be sure we're on the same page - which input cast do you mean? :)
@psfinaki
...forgive my github comment coding, I think memory serves, but it's more than enough for the gist of things if it's incorrect...
[<Sealed>] // only for sealed
type Blah(...) =
...
override lhs.Equals (rhsObj:obj) = // (1)
if obj.ReferenceEquals (lhs, rhsObj) then
true
else
match rhsObj with
| :? Blah as rhs -> // (2)
(lhs:>IEquatable<Blah>).Equals rhs // (4)
| _ -> false
interface IEquatable<Blah> with
member lhs.Equals rhs = // (3)
// actually do the check
Well (1) would of been the entry point, where at (2) would of been the cast (well type check), but EqualityComparer<Blah>.Default will call (3) directly.
Most likely the actually equality check in these cases will be the significant component of the cost, so savings is minimal, but as mentioned, this is just for complete understanding of what's going on here...
(4) ... just because I'm showing off, the IL that F# used to generate actually caused this operation to have a cost, but way back at the dawn of history I fixed that... ;-)
@psfinaki I believe it should be possible to write a test case that detects the change, e.g.
[<Sealed, Struct>]
type Blah() =
override lhs.Equals (rhsObj:obj) = true
interface IEquatable<Blah> with
member lhs.Equals rhs = failwith "bang"
let eq x y =
printfn "hello" // do this enough times to make sure no inlining
printfn "hello" // do this enough times to make sure no inlining
printfn "hello" // do this enough times to make sure no inlining
printfn "hello" // do this enough times to make sure no inlining
(x = y) // generic equality context
eq (Blah()) (Blah()) // true prior to this, exception now
I don't really think this needs a runtime library switch to disable, and we don't currently have such a mechanism to do those kinds of flags anyway
The equality spec should probably be updated to mention this case
Yes, correct. For reference, I guess this is the minimal code to demo the behavior difference:
[<Struct; CustomEquality; NoComparison>]
type Blah =
interface IEquatable<Blah> with
member lhs.Equals rhs = failwith "bang"
let eq x y =
printfn "hello" // do this enough times to make sure no inlining
...
x = y // generic equality context
let result = eq (Blah()) (Blah()) // false prior to this, exception now
Alright, so we've identified what breaks here and blessed it at the same time. Indeed, this is quite an esoteric scenario.
Good stuff, I will update the spec and the description to reflect this.
Non-inlined stacktrace examples before and after could be helpful for reviewers as well, could you add it pelase, if it's not too much work.
Yep I am going to add it to the PR description.
Okay so hopefully getting to the finish line here, I think I have added enough micro benchmarks and grouped them in the PR description.
Next week will add AOT testing, refresh the equality design doc, finalize the PR description and give this a final review with Don :)
/azp run
Azure Pipelines successfully started running 2 pipeline(s).
Nice
64 bit value types This became marginally faster
I'm just relying on your words here, as I have not built this, but they should be significantly faster, and non boxing (I'm not talking about my concern re tail calls here). I'm guessing that because they implement 'IStructuralEquality' they are using that path, and hence not being sped up. I thought in my original PR I handled these too, maybe I didn't, or maybe you didn't carry that code across, I don't know.
Anyway, what I would suggest is that the code for 'canUseDefault...' is used at compile time, and if default can be used then the 'IStructuralEquals' interface isn't implemented. This, once again, is a breaking change, but I think it's even lesser than the one introduced by this PR.
Another follow up, if it hasn't been started yet, would be used user Compare<>.Default for inequalities....
Yes, I think we will get to those as well. Next bigger one will be devirt equality, then we'll probably look into comparison. There is a lot of useful stuff to pull out of your contributions :)