Imperfect match with deletions in the diff causes an incorrect patch (C#)
Here is an example that fails:
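    // Needs: using System.Collections.Generic; using System.Diagnostics; using DiffMatchPatch;
    private static void TestBrokenImperfectMatch()
    {
        string referenceInput = "diff matching patching";
        string referenceOutput = "diff match patch";
        // Differs from referenceInput ("pthing" vs. "patching"), forcing an imperfect match.
        string imperfectInput = "diff matching pthing";

        diff_match_patch googleDiff = new diff_match_patch();

        // Build the patch from the reference pair.
        List<Diff> diffs = googleDiff.diff_main(referenceInput, referenceOutput);
        googleDiff.diff_cleanupSemantic(diffs);
        List<Patch> patches = googleDiff.patch_make(diffs);
        Debug.WriteLine(googleDiff.patch_toText(patches));

        // Apply the patch to the imperfect input; element [0] is the patched text.
        string patched = (string)googleDiff.patch_apply(patches, imperfectInput)[0];
        Debug.WriteLine(patched);

        // Expected result; this assertion fails with the current code.
        Debug.Assert(patched == "diff match pth");
    }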
The root cause is that, while an imperfect match is being processed, the offsets into the reference text and the actual text (text1 and text2) become inaccurate after the first deletion ("a"), so the index calculation produces the wrong result for the removal of the second "ing". It ends up removing "ng" instead. I could not fix this while still using index2 as both the offset into the string being edited ("text") and the offset into the unedited input ("text2"), so I introduced a separate offset between the two.
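To illustrate the general kind of drift described above, here is a minimal standalone sketch. It does not call diff_match_patch and it is not the code from the fix; OffsetDriftSketch, ApplyNaive, ApplyWithOffset and the sample input are hypothetical names made up for this illustration. It only shows why positions computed against an unedited string need a running offset once deletions have already been applied to the string being edited.

```csharp
using System;
using System.Collections.Generic;

public static class OffsetDriftSketch
{
    // Each deletion is (Start, Length), with Start measured in the
    // ORIGINAL, unedited input string.

    // Hypothetical buggy variant: reuses the unedited positions directly,
    // even though `text` shrinks after every removal.
    public static string ApplyNaive(string input, List<(int Start, int Length)> deletions)
    {
        string text = input;
        foreach (var (start, length) in deletions)
        {
            text = text.Remove(start, length);
        }
        return text;
    }

    // Hypothetical corrected variant: tracks how far positions in `text`
    // have shifted relative to `input` and applies that shift.
    public static string ApplyWithOffset(string input, List<(int Start, int Length)> deletions)
    {
        string text = input;
        int offset = 0;
        foreach (var (start, length) in deletions)
        {
            text = text.Remove(start + offset, length);
            offset -= length; // everything after this point moved left
        }
        return text;
    }

    public static void Main()
    {
        string input = "diff matching patching text";
        // Positions of the two "ing" runs in the unedited input.
        var deletions = new List<(int Start, int Length)> { (10, 3), (19, 3) };

        Console.WriteLine(ApplyNaive(input, deletions));      // diff match patchingxt
        Console.WriteLine(ApplyWithOffset(input, deletions)); // diff match patch text
    }
}
```

In the naive variant the second removal lands three characters too far to the right, because the first removal already shortened the string. The actual bug in patch_apply involves mapping between text1 and text2 as well, so this is only an analogy for the offset problem, not a reproduction of it.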
A fix is here: https://github.com/derammo/diff-match-patch/commit/0f5aa41edb2b312d2c8c3b6effb51fe237028cde. I won't open a PR, though, because I don't have code for the other languages and I have basically no test coverage.
Looks like this repo is dead / unsupported at this point. I will just work in my fork forever.