Parsing p4 describe patches correctly with unidiff

Open matlo607 opened this issue 7 years ago • 2 comments

Perforce can generate a patch from a changelist (roughly a commit) using p4 describe:

-du : unified output format, showing added and deleted lines with num lines of context, in a form compatible with the patch(1) utility.
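
A minimal sketch of capturing that output from Python, assuming the p4 command-line client is installed, on PATH and already logged in. The helper names p4_describe_command and p4_describe are hypothetical; they are not part of Perforce or unidiff.

```python
import subprocess
from typing import List


def p4_describe_command(changelist: int) -> List[str]:
    """Build the argument list for `p4 describe -du <changelist>`."""
    # -du requests the unified output format described above.
    return ["p4", "describe", "-du", str(changelist)]


def p4_describe(changelist: int) -> str:
    """Run p4 describe and return its output as text.

    Hypothetical helper: assumes a configured p4 client; raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        p4_describe_command(changelist),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling p4_describe(4292377) would then return text shaped like the Perforce patch shown in this issue.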

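However, the format differs slightly from Git's: instead of the ---/+++ file header lines, each file's diff is introduced by a "==== <depot path>#<revision> (<type>) ====" line.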

Perforce patch:

Change 4292377 by [email protected] on 2018/01/05 10:12:22

    perf/aggregate: implement codespeed JSON output
    
    Codespeed (https://github.com/tobami/codespeed/) is an open source
    project that can be used to track how some software performs over
    time. It stores performance test results in a database and can show
    nice graphs and charts on a web interface.
    
    As it can be interesting to use Codespeed to see how Git performance
    evolves over time and releases, let's implement a Codespeed output
    in "perf/aggregate.perl".
    
    Helped-by: Eric Sunshine <[email protected]>
    Signed-off-by: Christian Couder <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>

Affected files ...

... //depot/myproject/t/perf/aggregate.perl#16 edit

Differences ... 

==== //depot/myproject/t/perf/aggregate.perl#16 (text) ====

@@ -3,6 +3,7 @@
 use lib '../../perl/blib/lib';
 use strict;
 use warnings;
+use JSON;
 use Git;
 
 sub get_times {
@@ -35,10 +36,15 @@ sub format_times {
 	return $out;
 }
 
-my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests);
+my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests, $codespeed);
 while (scalar @ARGV) {
 	my $arg = $ARGV[0];
 	my $dir;
+	if ($arg eq "--codespeed") {
+		$codespeed = 1;
+		shift @ARGV;
+		next;
+	}
 	last if -f $arg or $arg eq "--";
 	if (! -d $arg) {
 		my $rev = Git::command_oneline(qw(rev-parse --verify), $arg);
@@ -70,8 +76,10 @@ if (not @tests) {
 }
 
 my $resultsdir = "test-results";
+my $results_section = "";
 if (exists $ENV{GIT_PERF_SUBSECTION} and $ENV{GIT_PERF_SUBSECTION} ne "") {
 	$resultsdir .= "/" . $ENV{GIT_PERF_SUBSECTION};
+	$results_section = $ENV{GIT_PERF_SUBSECTION};
 }
 
 my @subtests;
@@ -174,6 +182,58 @@ sub print_default_results {
 	}
 }
 
+sub print_codespeed_results {
+	my ($results_section) = @_;
+
+	my $project = "Git";
+
+	my $executable = `uname -s -m`;
+	chomp $executable;
+
+	if ($results_section ne "") {
+		$executable .= ", " . $results_section;
+	}
+
+	my $environment;
+	if (exists $ENV{GIT_PERF_REPO_NAME} and $ENV{GIT_PERF_REPO_NAME} ne "") {
+		$environment = $ENV{GIT_PERF_REPO_NAME};
+	} elsif (exists $ENV{GIT_TEST_INSTALLED} and $ENV{GIT_TEST_INSTALLED} ne "") {
+		$environment = $ENV{GIT_TEST_INSTALLED};
+		$environment =~ s|/bin-wrappers$||;
+	} else {
+		$environment = `uname -r`;
+		chomp $environment;
+	}
+
+	my @data;
+
+	for my $t (@subtests) {
+		for my $d (@dirs) {
+			my $commitid = $prefixes{$d};
+			$commitid =~ s/^build_//;
+			$commitid =~ s/\.$//;
+			my ($result_value, $u, $s) = get_times("$resultsdir/$prefixes{$d}$t.times");
+
+			my %vals = (
+				"commitid" => $commitid,
+				"project" => $project,
+				"branch" => $dirnames{$d},
+				"executable" => $executable,
+				"benchmark" => $shorttests{$t} . " " . read_descr("$resultsdir/$t.descr"),
+				"environment" => $environment,
+				"result_value" => $result_value,
+			    );
+			push @data, \%vals;
+		}
+	}
+
+	print to_json(\@data, {utf8 => 1, pretty => 1}), "\n";
+}
+
 binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
 
-print_default_results();
+if ($codespeed) {
+	print_codespeed_results($results_section);
+} else {
+	print_default_results();
+}

Git patch:

commit 05eb1c37ed345d0ea244a239dad18de830e022f6
Author: Christian Couder <[email protected]>
Date:   Fri Jan 5 10:12:22 2018 +0100

    perf/aggregate: implement codespeed JSON output
    
    Codespeed (https://github.com/tobami/codespeed/) is an open source
    project that can be used to track how some software performs over
    time. It stores performance test results in a database and can show
    nice graphs and charts on a web interface.
    
    As it can be interesting to use Codespeed to see how Git performance
    evolves over time and releases, let's implement a Codespeed output
    in "perf/aggregate.perl".
    
    Helped-by: Eric Sunshine <[email protected]>
    Signed-off-by: Christian Couder <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>

diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 3609cb5..5c439f6 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -3,6 +3,7 @@
 use lib '../../perl/blib/lib';
 use strict;
 use warnings;
+use JSON;
 use Git;
 
 sub get_times {
@@ -35,10 +36,15 @@ sub format_times {
 	return $out;
 }
 
-my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests);
+my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests, $codespeed);
 while (scalar @ARGV) {
 	my $arg = $ARGV[0];
 	my $dir;
+	if ($arg eq "--codespeed") {
+		$codespeed = 1;
+		shift @ARGV;
+		next;
+	}
 	last if -f $arg or $arg eq "--";
 	if (! -d $arg) {
 		my $rev = Git::command_oneline(qw(rev-parse --verify), $arg);
@@ -70,8 +76,10 @@ if (not @tests) {
 }
 
 my $resultsdir = "test-results";
+my $results_section = "";
 if (exists $ENV{GIT_PERF_SUBSECTION} and $ENV{GIT_PERF_SUBSECTION} ne "") {
 	$resultsdir .= "/" . $ENV{GIT_PERF_SUBSECTION};
+	$results_section = $ENV{GIT_PERF_SUBSECTION};
 }
 
 my @subtests;
@@ -174,6 +182,58 @@ sub print_default_results {
 	}
 }
 
+sub print_codespeed_results {
+	my ($results_section) = @_;
+
+	my $project = "Git";
+
+	my $executable = `uname -s -m`;
+	chomp $executable;
+
+	if ($results_section ne "") {
+		$executable .= ", " . $results_section;
+	}
+
+	my $environment;
+	if (exists $ENV{GIT_PERF_REPO_NAME} and $ENV{GIT_PERF_REPO_NAME} ne "") {
+		$environment = $ENV{GIT_PERF_REPO_NAME};
+	} elsif (exists $ENV{GIT_TEST_INSTALLED} and $ENV{GIT_TEST_INSTALLED} ne "") {
+		$environment = $ENV{GIT_TEST_INSTALLED};
+		$environment =~ s|/bin-wrappers$||;
+	} else {
+		$environment = `uname -r`;
+		chomp $environment;
+	}
+
+	my @data;
+
+	for my $t (@subtests) {
+		for my $d (@dirs) {
+			my $commitid = $prefixes{$d};
+			$commitid =~ s/^build_//;
+			$commitid =~ s/\.$//;
+			my ($result_value, $u, $s) = get_times("$resultsdir/$prefixes{$d}$t.times");
+
+			my %vals = (
+				"commitid" => $commitid,
+				"project" => $project,
+				"branch" => $dirnames{$d},
+				"executable" => $executable,
+				"benchmark" => $shorttests{$t} . " " . read_descr("$resultsdir/$t.descr"),
+				"environment" => $environment,
+				"result_value" => $result_value,
+			    );
+			push @data, \%vals;
+		}
+	}
+
+	print to_json(\@data, {utf8 => 1, pretty => 1}), "\n";
+}
+
 binmode STDOUT, ":utf8" or die "PANIC on binmode: $!";
 
-print_default_results();
+if ($codespeed) {
+	print_codespeed_results($results_section);
+} else {
+	print_default_results();
+}
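
Comparing the two outputs above, the hunks are identical; what differs is the change description block and the per-file header. The following is a rough sketch of recognizing the Perforce per-file header and pulling out its parts. The pattern only covers the header form seen in the sample above, and the names P4_FILE_HEADER, P4FileHeader and parse_p4_file_header are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches a header such as:
#   ==== //depot/myproject/t/perf/aggregate.perl#16 (text) ====
# Assumption: only this form (depot path, revision, file type) is handled.
P4_FILE_HEADER = re.compile(
    r"^==== (?P<depot_path>//[^#]+)#(?P<revision>\d+) \((?P<file_type>[^)]+)\) ====$"
)


class P4FileHeader(NamedTuple):
    depot_path: str
    revision: int
    file_type: str


def parse_p4_file_header(line: str) -> Optional[P4FileHeader]:
    """Return the header fields, or None if the line is not a p4 file header."""
    m = P4_FILE_HEADER.match(line.rstrip())
    if m is None:
        return None
    return P4FileHeader(
        depot_path=m.group("depot_path"),
        revision=int(m.group("revision")),
        file_type=m.group("file_type"),
    )
```

A Git-style "--- a/t/perf/aggregate.perl" line does not match, so the same function can be used to tell the two header styles apart line by line.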

To use unidiff with Perforce patches, I had to convert them to git-like patches first.

import io
import re


def unifieddiff_p4tognu(p4diff, strip_depth=0):
    """
    Converts p4 describe's unified diff format into GNU unified diff format.
    https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html
    """
    def strip_path(path, count):
        # Drop the first `count` '/'-separated components (slicing never raises).
        return '/'.join(path.split('/')[count:])

    output = io.StringIO()
    # Per-file p4 header, e.g. "==== //depot/project/file#16 (text) ===="
    header = re.compile(r'==== /([^#]+)#\d+ [^ ]+ ====')
    skip_next_line = False
    for line in p4diff.splitlines():
        if skip_next_line:
            # Workaround for unidiff: drop the line that follows the p4 file
            # header, otherwise it raises an exception after the file names.
            skip_next_line = False
            continue
        m = header.match(line)
        if m:
            # m.group(1) keeps one leading '/', so the split path starts with
            # '' and the 'a'/'b' placeholder; strip_depth counts those too.
            path_source = 'a/' + strip_path('/a{}'.format(m.group(1)), strip_depth)
            path_target = 'b/' + strip_path('/b{}'.format(m.group(1)), strip_depth)

            # Emit a GNU unified diff compliant file header.
            output.write('--- {}\n'.format(path_source))
            output.write('+++ {}\n'.format(path_target))
            skip_next_line = True
        else:
            output.write(line + '\n')
    return output.getvalue()

Would it be possible to support this natively in unidiff?

matlo607 • May 25 '18 11:05

Sorry for not replying sooner. I think it would be possible to add some kind of plugin support for unified diff variations: if you know you are dealing with a p4 patch, you could indicate that explicitly and transform it or extract its specific metadata. I will keep this in the queue, though I'm not sure when I can get to it. PRs or other ideas are welcome :-) Thanks.

matiasb • Jun 22 '18 14:06
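
One way to picture the plugin idea is a small registry of per-dialect preprocessors that run before the text reaches the parser. This is only a sketch under assumptions: none of these names (DIALECTS, register_dialect, preprocess, Preprocessed) exist in unidiff, and the p4 handler below only extracts the "Change ... by ... on ..." line, leaving the header rewriting to a conversion step like the one posted earlier in this issue.

```python
import re
from typing import Callable, Dict, NamedTuple


class Preprocessed(NamedTuple):
    text: str       # diff text to hand to the parser
    metadata: dict  # dialect-specific extras (empty if none were found)


# Hypothetical registry: dialect name -> preprocessing function.
DIALECTS: Dict[str, Callable[[str], Preprocessed]] = {}


def register_dialect(name: str):
    """Decorator that registers a preprocessor under the given dialect name."""
    def decorator(func: Callable[[str], Preprocessed]):
        DIALECTS[name] = func
        return func
    return decorator


@register_dialect("plain")
def plain_dialect(text: str) -> Preprocessed:
    # Standard unified diffs need no preprocessing.
    return Preprocessed(text, {})


# First line of p4 describe output: "Change <n> by <user@client> on <date time>"
P4_CHANGE = re.compile(
    r"^Change (\d+) by (\S+) on (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})"
)


@register_dialect("p4")
def p4_dialect(text: str) -> Preprocessed:
    metadata = {}
    m = P4_CHANGE.match(text)
    if m:
        metadata = {
            "change": int(m.group(1)),
            "user": m.group(2),
            "date": m.group(3),
        }
    # Rewriting the "==== ... ====" file headers would also happen here.
    return Preprocessed(text, metadata)


def preprocess(text: str, dialect: str = "plain") -> Preprocessed:
    """Run the preprocessor registered for an explicitly chosen dialect."""
    try:
        handler = DIALECTS[dialect]
    except KeyError:
        raise ValueError("unknown diff dialect: {!r}".format(dialect))
    return handler(text)
```

The caller names the dialect explicitly, as suggested in the comment above, so no auto-detection is attempted and plain unified diffs pass through unchanged.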

I was looking around the internet and came across this repo. I also want to be able to parse unified diffs from Perforce (P4) and have the same issue as OP. I just wanted to bump this thread and say that. I'll try doing @matlo607's idea with the conversion, though.

vishrutdixit • Apr 16 '24 17:04