
Get diffs from diff

Open danvk opened this issue 6 years ago • 0 comments

Both diff and git diff have lots of command line options to control how the diff is determined. It would make a lot of sense for webdiff to just use the diffs found by those tools, rather than calculating its own.

See also #100.

You can use git diff to generate a diff between arbitrary files (outside of a git repo) using git diff --no-index filea fileb (source). This also works with directories, e.g. git diff --no-index testdata/manyfiles/{left,right}.
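A minimal sketch of how a tool could shell out to this command from Python. The helper names `build_diff_command` and `run_diff` are hypothetical and are not part of webdiff. One detail to keep in mind: `--no-index` implies `--exit-code`, so an exit status of 1 means "differences found" rather than failure.

```python
import subprocess
from typing import List, Sequence


def build_diff_command(left: str, right: str, options: Sequence[str] = ()) -> List[str]:
    """Build the argv for `git diff [<options>] --no-index -- <left> <right>`.

    Hypothetical helper: `options` is passed through untouched, e.g.
    ["--histogram"] or ["-w", "--minimal"].
    """
    return ["git", "diff", *options, "--no-index", "--", left, right]


def run_diff(left: str, right: str, options: Sequence[str] = ()) -> str:
    """Run git and return the unified diff text (empty if the inputs match)."""
    proc = subprocess.run(
        build_diff_command(left, right, options),
        capture_output=True,
        text=True,
    )
    # Exit status 0 = no differences, 1 = differences; anything else is an error.
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    return proc.stdout
```

With this shape, something like `run_diff("testdata/manyfiles/left", "testdata/manyfiles/right", ["--patience"])` would return the diff that git computed, and any of the options listed below could be forwarded the same way.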

Options that look relevant (from git diff --help):

   --indent-heuristic
       Enable the heuristic that shifts diff hunk boundaries to make patches easier to read. This is the default.

   --no-indent-heuristic
       Disable the indent heuristic.

   --minimal
       Spend extra time to make sure the smallest possible diff is produced.

   --patience
       Generate a diff using the "patience diff" algorithm.

   --histogram
       Generate a diff using the "histogram diff" algorithm.

   --anchored=<text>
       Generate a diff using the "anchored diff" algorithm.

       This option may be specified more than once.

       If a line exists in both the source and destination, exists only once, and starts with this text, this algorithm attempts to prevent
       it from appearing as a deletion or addition in the output. It uses the "patience diff" algorithm internally.

   --diff-algorithm={patience|minimal|histogram|myers}
       Choose a diff algorithm. The variants are as follows:

       default, myers
           The basic greedy diff algorithm. Currently, this is the default.

       minimal
           Spend extra time to make sure the smallest possible diff is produced.

       patience
           Use "patience diff" algorithm when generating patches.

       histogram
           This algorithm extends the patience algorithm to "support low-occurrence common elements".

       For instance, if you configured the diff.algorithm variable to a non-default value and want to use the default one, then you have to
       use --diff-algorithm=default option.

   --word-diff[=<mode>]
       Show a word diff, using the <mode> to delimit changed words. By default, words are delimited by whitespace; see --word-diff-regex
       below. The <mode> defaults to plain, and must be one of:

       color
           Highlight changed words using only colors. Implies --color.

       plain
           Show words as [-removed-] and {+added+}. Makes no attempts to escape the delimiters if they appear in the input, so the output
           may be ambiguous.

       porcelain
           Use a special line-based format intended for script consumption. Added/removed/unchanged runs are printed in the usual unified
           diff format, starting with a +/-/` ` character at the beginning of the line and extending to the end of the line. Newlines in the
           input are represented by a tilde ~ on a line of its own.

       none
           Disable word diff again.

       Note that despite the name of the first mode, color is used to highlight the changed parts in all modes if enabled.
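The porcelain mode looks like the easiest one to consume from a script. Here is a rough sketch of a parser for it, based only on the description above. The function name is hypothetical, and it assumes it is handed just the body of one hunk, i.e. the lines after the `@@ ... @@` header, so that the `---`/`+++` file headers are not mistaken for runs.

```python
from typing import List, Tuple

Run = Tuple[str, str]  # (kind, text), kind is "same", "add" or "del"


def parse_word_diff_porcelain(hunk_body: str) -> List[List[Run]]:
    """Group the runs of a --word-diff=porcelain hunk into input lines."""
    kinds = {" ": "same", "+": "add", "-": "del"}
    lines: List[List[Run]] = []
    current: List[Run] = []
    for raw in hunk_body.splitlines():
        if raw == "~":
            # A lone tilde marks a newline in the input: close the current line.
            lines.append(current)
            current = []
        elif raw[:1] in kinds:
            # First character is the marker, the rest is the run's text.
            current.append((kinds[raw[0]], raw[1:]))
        # Anything else is ignored in this sketch.
    if current:
        lines.append(current)
    return lines


sample = "\n".join([" The quick ", "-brown", "+red", "  fox", "~", " jumps", "~"])
parsed = parse_word_diff_porcelain(sample)
```

Each inner list is one line of the input, which is the shape a side-by-side view with intra-line highlighting would need.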

   -M[<n>], --find-renames[=<n>]
       Detect renames. If n is specified, it is a threshold on the similarity index (i.e. amount of addition/deletions compared to the
       file’s size). For example, -M90% means Git should consider a delete/add pair to be a rename if more than 90% of the file hasn’t
       changed. Without a % sign, the number is to be read as a fraction, with a decimal point before it. I.e., -M5 becomes 0.5, and is thus
       the same as -M50%. Similarly, -M05 is the same as -M5%. To limit detection to exact renames, use -M100%. The default similarity index
       is 50%.
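The threshold rule is easy to misread, so here is the arithmetic spelled out as a small sketch. It models only the rule as quoted above (a `%` suffix means percent, bare digits are read with a decimal point in front); it is not git's actual option parser, and `similarity_threshold` is a hypothetical name.

```python
def similarity_threshold(n: str) -> float:
    """Convert the <n> of -M<n> into a fraction between 0 and 1."""
    if n.endswith("%"):
        # "90%" -> 0.9
        return float(n[:-1]) / 100.0
    if not n.isdigit():
        raise ValueError(f"unsupported threshold: {n!r}")
    # "5" -> 0.5, "05" -> 0.05: put a decimal point before the digits.
    return float("0." + n)
```

So `similarity_threshold("5")` and `similarity_threshold("50%")` both give 0.5, while `similarity_threshold("05")` gives 0.05.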

   -C[<n>], --find-copies[=<n>]
       Detect copies as well as renames. See also --find-copies-harder. If n is specified, it has the same meaning as for -M<n>.

   --find-copies-harder
       For performance reasons, by default, -C option finds copies only if the original file of the copy was modified in the same changeset.
       This flag makes the command inspect unmodified files as candidates for the source of copy. This is a very expensive operation for
       large projects, so use it with caution. Giving more than one -C option has the same effect.

   --ignore-cr-at-eol
       Ignore carriage-return at the end of line when doing a comparison.

   --ignore-space-at-eol
       Ignore changes in whitespace at EOL.

   -b, --ignore-space-change
       Ignore changes in amount of whitespace. This ignores whitespace at line end, and considers all other sequences of one or more
       whitespace characters to be equivalent.

   -w, --ignore-all-space
       Ignore whitespace when comparing lines. This ignores differences even if one line has whitespace where the other line has none.
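The difference between -b and -w can be illustrated with two normalization functions. This is a simplified model of the behavior described above, not git's implementation, and the function names are made up for the example: two lines compare equal under an option when their normalized keys are equal.

```python
import re


def key_ignore_space_change(line: str) -> str:
    """Model of -b: drop whitespace at line end, collapse other runs to one space."""
    return re.sub(r"\s+", " ", line.rstrip())


def key_ignore_all_space(line: str) -> str:
    """Model of -w: remove whitespace entirely before comparing."""
    return re.sub(r"\s+", "", line)


# "a  b " and "a b" differ only in the amount of whitespace: equal under both.
same_under_b = key_ignore_space_change("a  b ") == key_ignore_space_change("a b")
# "a b" and "ab" differ in whether whitespace exists at all: equal only under -w.
differ_under_b = key_ignore_space_change("a b") != key_ignore_space_change("ab")
same_under_w = key_ignore_all_space("a b") == key_ignore_all_space("ab")
```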

   --ignore-blank-lines
       Ignore changes whose lines are all blank.

   --inter-hunk-context=<lines>
       Show the context between diff hunks, up to the specified number of lines, thereby fusing hunks that are close to each other. Defaults
       to diff.interHunkContext or 0 if the config option is unset.

   -W, --function-context
       Show whole function as context lines for each change. The function names are determined in the same way as git diff works out patch
       hunk headers (see Defining a custom hunk-header in gitattributes(5)).

danvk, Oct 07 '19 20:10