Imperfect match with deletions in the diff causes an incorrect patch (C#)

Open derammo opened this issue 5 years ago • 1 comments

Here is an example that fails:

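    // Requires: using System.Collections.Generic; using System.Diagnostics; using DiffMatchPatch;
    private static void TestBrokenImperfectMatch()
    {
        string referenceInput = "diff matching patching";
        string referenceOutput = "diff match patch";
        // Differs from referenceInput ("patching" -> "pthing"), so the patch matches only imperfectly.
        string imperfectInput = "diff matching pthing";
        diff_match_patch googleDiff = new diff_match_patch();
        // Build patches that turn referenceInput into referenceOutput.
        List<Diff> diffs = googleDiff.diff_main(referenceInput, referenceOutput);
        googleDiff.diff_cleanupSemantic(diffs);
        List<Patch> patches = googleDiff.patch_make(diffs);
        Debug.WriteLine(googleDiff.patch_toText(patches));
        // patch_apply returns { patched text, per-patch success flags }; take the text.
        string patched = (string)googleDiff.patch_apply(patches, imperfectInput)[0];
        Debug.WriteLine(patched);
        // Expected result; this assertion fails.
        Debug.Assert(patched == "diff match pth");
    }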

The root cause is that, while processing the imperfect match, the offsets into the reference text and the actual text (text1 and text2) become inaccurate after the first deletion ("a"), so the index calculation gives the wrong result when removing the second "ing": it ends up removing "ng" instead. I could not fix this while still using index2 as both the offset into the string being edited ("text") and the offset into the unedited input ("text2"), so I introduced a separate offset between the two.
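The general idea can be shown with a simplified sketch. This is not the library's code and not the fix from my fork: `OffsetSketch`, `DeleteNaive` and `DeleteWithOffset` are hypothetical names, and the sketch assumes the deletions are given as (start, length) pairs in coordinates of the unedited input, sorted ascending and non-overlapping. It only illustrates why one index cannot serve both the unedited input and the string being edited once something has been removed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not taken from diff-match-patch.
public static class OffsetSketch
{
    // Reuses each start index, which is relative to the UNEDITED input,
    // as an index into the string being edited. After the first removal
    // the edited string is shorter, so later indices point too far right.
    public static string DeleteNaive(string input, IEnumerable<(int Start, int Length)> deletions)
    {
        string text = input;
        foreach (var (start, length) in deletions)
        {
            text = text.Remove(start, length);
        }
        return text;
    }

    // Tracks how many characters have already been removed and shifts
    // each start index left by that amount before editing.
    // Assumes deletions are sorted by Start and do not overlap.
    public static string DeleteWithOffset(string input, IEnumerable<(int Start, int Length)> deletions)
    {
        string text = input;
        int offset = 0; // difference between unedited and edited coordinates
        foreach (var (start, length) in deletions)
        {
            text = text.Remove(start - offset, length);
            offset += length;
        }
        return text;
    }
}
```

For example, `OffsetSketch.DeleteWithOffset("diff matching pthing", new[] { (10, 3), (17, 3) })` removes both occurrences of "ing" and returns "diff match pth". `DeleteNaive` with the same arguments applies the second removal at a stale index, so it removes the wrong characters or, as with this input, runs past the end of the string and throws.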

A fix is here: https://github.com/derammo/diff-match-patch/commit/0f5aa41edb2b312d2c8c3b6effb51fe237028cde. I won't make a PR, though, because I don't have code for the other languages and I have basically no test coverage.

derammo, Mar 21 '20 02:03

Looks like this repo is dead / unsupported at this point. I will just work in my fork forever.

derammo, Jul 14 '20 14:07