go-cmp icon indicating copy to clipboard operation
go-cmp copied to clipboard

cmp.Diff() switching from property diffs to "new object" diff when object changes significantly

Open jgalliers opened this issue 2 years ago • 1 comments

I am using cmp.Diff() with a reporter to manage some complex tests. The library is excellent but I have encountered confusing behaviour which also seems to be undocumented, unless I have missed it.

When a large portion of a slice struct changes, the slice is reported as an entirely new object. From what I can tell this happens when more than half of the properties change.

I cannot find this behaviour documented anywhere except a passing reference (possibly?) in the searchBudget variable .

https://github.com/google/go-cmp/blob/c3ad8435e7bef96af35732bc0789e5a2278c6d5f/cmp/internal/diff/diff.go#L185-L188

My tests assume that when I change a property of an object with 6 properties, I receive the associated diff, rather than arbitrarily seeing (for example) 3 changes, and then "a new object" when 4 changes are present.

I appreciate this may be unique to me but it caught me by surprise and took a considerable amount of time to troubleshoot.

I have posted a minimum reproducible example here - https://github.com/jgalliers/go-cmp-repro

go run . will show a diff with 3 changes followed by changing a property (bringing the total changes to 4) which converts the diff output to a single change, which is an entirely new object.

3 diffs:

[]*main.Parent{
        &{
                Id: "app",
                Slice: []main.Child{
                        {
-                               Src:              "src",
+                               Src:              "src2",
-                               Dst:              "dst",
+                               Dst:              "dst2",
-                               DstVersion:       "v1",
+                               DstVersion:       "v2",
                                Routes:           {"test1", "test2"},
                                DeployDependency: false,
                        },
                },
        },
  }

4 diffs:

[]*main.Parent{
        &{
                Id: "app",
                Slice: []main.Child{
-                       {
-                               Src:              "src",
-                               Dst:              "dst",
-                               DstVersion:       "v1",
-                               Routes:           []string{"test1", "test2"},
-                               DeployDependency: true,
-                       },
+                       {Src: "src2", Dst: "dst2", DstVersion: "v2", Routes: []string{"test1", "test2"}},
                },
        },
  }

Is this desired or expected behaviour? Is it possible to disable this behaviour?

I would vastly prefer to receive a full property-wise diff, each property of which is reportable via the vx/vy objects rather than suddenly receiving a vy object with IsValid() = false due to the sudden conversion to a null pointer.

jgalliers avatar Oct 12 '23 04:10 jgalliers

Hi, assuming the searchBudget is the cause, we can adjust the value a little but not remove it. The search budget is necessary to keep the diffing algorithm as O(N), while the theoretical optimal diffing algorithm is O(N^2).

Assuming your repro is representative of what you usually do, then we could probably remove the searchBudget constraint if N is below a certain number since N^2 is still not that big if N is small.

dsnet avatar Jan 12 '24 18:01 dsnet