Octodiff
Adler32RollingChecksumV2 seems to give bad results
Description
When using the V2 rolling checksum algorithm, files that are identical or differ only very slightly produce huge deltas: the whole new file gets added to the delta.
Environment
- repo freshly cloned from the current master branch (commit d87ee313dfd8e48fe96dc66a6b08af12751e06c1)
- VS2022 17.2.6 on Windows 10 x64
I had to make a small code change so the command line app would use the V2 algorithm by default:
diff --git a/source/Octodiff/Core/SupportedAlgorithms.cs b/source/Octodiff/Core/SupportedAlgorithms.cs
index 2cc2aa5..5552f13 100644
--- a/source/Octodiff/Core/SupportedAlgorithms.cs
+++ b/source/Octodiff/Core/SupportedAlgorithms.cs
@@ -52,7 +52,7 @@ namespace Octodiff.Core
         public virtual IRollingChecksum Default()
         {
-            return Adler32Rolling();
+            return Adler32Rolling(true);
         }
         public virtual IRollingChecksum Create(string algorithm)
Steps to reproduce
- grab a random binary file; my test was kernel32.dll from windows\system32
- create 2 copies of it: copy1.dll and copy2.dll
- modify copy2.dll very slightly; I simply changed the first byte from 'M' to 'A'
- run octodiff to create the deltas:
  - Octodiff.exe signature kernel32.dll signature.bin
  - Octodiff.exe delta signature.bin copy1.dll delta1.bin
  - Octodiff.exe delta signature.bin copy2.dll delta2.bin
- observe how the delta files are very "not delta-y"
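For anyone who wants to script the steps above, here is a minimal Python sketch. The helper names (`prepare_copies`, `delta_ratio`, `run_octodiff`) are made up for illustration and are not part of Octodiff. The script only assumes that the Octodiff executable accepts the `signature` and `delta` commands exactly as written in the steps.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def prepare_copies(original: Path, workdir: Path) -> tuple[Path, Path]:
    """Create copy1.dll (identical) and copy2.dll (first byte set to 'A')."""
    copy1 = workdir / "copy1.dll"
    copy2 = workdir / "copy2.dll"
    shutil.copyfile(original, copy1)
    shutil.copyfile(original, copy2)
    # "r+b" opens for in-place update, so only the first byte is overwritten.
    with open(copy2, "r+b") as f:
        f.write(b"A")
    return copy1, copy2


def delta_ratio(delta: Path, new_file: Path) -> float:
    """Return the delta size as a fraction of the new file's size."""
    return delta.stat().st_size / new_file.stat().st_size


def run_octodiff(octodiff: str, original: Path, workdir: Path) -> None:
    """Run the three commands from the steps above and print size ratios."""
    copy1, copy2 = prepare_copies(original, workdir)
    signature = workdir / "signature.bin"
    subprocess.run(
        [octodiff, "signature", str(original), str(signature)], check=True
    )
    for copy, name in ((copy1, "delta1.bin"), (copy2, "delta2.bin")):
        delta = workdir / name
        subprocess.run(
            [octodiff, "delta", str(signature), str(copy), str(delta)],
            check=True,
        )
        print(f"{name}: {delta_ratio(delta, copy):.1%} of {copy.name}")
```

Calling `run_octodiff("Octodiff.exe", Path("kernel32.dll"), Path("."))` should print one ratio per delta file. A ratio close to 100% corresponds to the "not delta-y" result described here.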
Other notes
The V1 version of the algorithm produces small delta files, as expected.
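As a generic sanity check, a rolling checksum should give the same value whether a window is reached by rolling or computed from scratch. The Python sketch below shows that property for a textbook Adler-32-style checksum. It is not Octodiff's V1 or V2 code and not a diagnosis of this issue; the function names are hypothetical, and the same kind of check could be ported to a C# test against the real implementations.

```python
from __future__ import annotations

MOD = 65521  # largest prime below 2**16, the modulus used by Adler-32


def checksum(block: bytes) -> tuple[int, int]:
    """Compute the Adler-32 sums (a, b) for a whole block from scratch."""
    a, b = 1, 0
    for byte in block:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return a, b


def rotate(a: int, b: int, size: int, remove: int, add: int) -> tuple[int, int]:
    """Slide a window of `size` bytes forward by one byte.

    `remove` is the byte leaving the window, `add` the byte entering it.
    """
    a = (a - remove + add) % MOD
    b = (b - size * remove + a - 1) % MOD
    return a, b


def rolling_matches_fresh(data: bytes, size: int) -> bool:
    """Check that rolled sums equal freshly computed sums at every offset."""
    a, b = checksum(data[:size])
    for start in range(1, len(data) - size + 1):
        a, b = rotate(a, b, size, data[start - 1], data[start + size - 1])
        if (a, b) != checksum(data[start:start + size]):
            return False
    return True
```

If a check like this failed for some implementation, rolled values would stop matching the values computed from scratch for the same bytes.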