C File Comparison Result

Open qq664363232 opened this issue 4 years ago • 6 comments

When I use the docker command to compare two C files, the result looks wrong. Some edit actions that the tool reports as moves do not actually exist: the content of that part of the two files is identical. res

The docker command I used is: docker run -v .../GumTreeDiff/test-data/original:/diff/left -v .../GumTreeDiff/test-data/modified:/diff/right -p 4567:4567 gumtreediff/gumtree webdiff left/test_c1.c right/test_c2.c

Is there a problem with the way I am using gumtreediff?

qq664363232 avatar Nov 05 '21 05:11 qq664363232

Hi!

Strange result... Could you try with -m gumtree-simple?

Cheers.

jrfaller avatar Nov 09 '21 18:11 jrfaller

After following your suggestion, the results above are correct. But similar problems appear in the results for header files, whether or not -m gumtree-simple is used.

Here are the input files and the corresponding result: head-test.zip

qq664363232 avatar Nov 15 '21 02:11 qq664363232

Hi!

I'm not sure I see where the problem is on this one?

jrfaller avatar Nov 15 '21 08:11 jrfaller

The actual difference between the two header files includes two areas, as shown below:

image

image

But some false move actions have been identified. In the following case, the relevant content of the two files is actually the same. image

qq664363232 avatar Nov 15 '21 09:11 qq664363232

Thanks for the detailed explanation. You're right, it should not happen :-)

After looking more closely at the case, I think this is due to the parser, because for instance for the code

/* The layout of struct dictated by compiler */
struct kasan_global {
	const void *beg;		/* Address of the beginning of the global variable. */
	size_t size;			/* Size of the global variable. */
	size_t size_with_redzone;	/* Size of the variable + size of the red zone. 32 bytes aligned */
	const void *name;
	const void *module_name;	/* Name of the module where the global variable is declared. */
	unsigned long has_dynamic_init;	/* This needed for C++ */
#if KASAN_ABI_VERSION >= 4
	struct kasan_source_location *location;
#endif
#if KASAN_ABI_VERSION >= 5
	char *odr_indicator;
#endif
};

I receive the following AST:

 <tree type = "Declaration" pos = "3242" length = "543" line_before = "113" col_before = "0" line_after = "126" col_after = "2">
  <tree type = "DeclList" pos = "3242" length = "543" line_before = "113" col_before = "0" line_after = "126" col_after = "2">
  </tree>
 </tree>

Which is clearly not good enough: the whole struct body is collapsed into an empty DeclList node. I suspect that this is the reason why it is mapped against an unrelated struct. Unfortunately, I have no experience with cgum and cannot fix the bug :'(
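
For anyone who wants to check their own files for this kind of parser failure, here is a minimal sketch in Python. It assumes the parser's XML output looks like the fragment above (nested `tree` elements with `type`, `pos` and `length` attributes). The function name `suspicious_leaves` and the `min_length` threshold are hypothetical, chosen only for illustration; this is not part of GumTree or cgum. It simply flags nodes that cover a large span of source text but have no children, which is the symptom seen here.

```python
import xml.etree.ElementTree as ET

# AST fragment copied from the comment above.
AST_XML = """
<tree type = "Declaration" pos = "3242" length = "543" line_before = "113" col_before = "0" line_after = "126" col_after = "2">
  <tree type = "DeclList" pos = "3242" length = "543" line_before = "113" col_before = "0" line_after = "126" col_after = "2">
  </tree>
</tree>
"""


def suspicious_leaves(xml_text, min_length=100):
    """Return (type, pos, length) for childless nodes spanning at least
    min_length characters. min_length is an arbitrary illustrative cutoff."""
    root = ET.fromstring(xml_text)
    hits = []
    # iter() walks the root and all of its descendants.
    for node in root.iter("tree"):
        length = int(node.get("length", "0"))
        # len(node) is the number of direct children of the element.
        if len(node) == 0 and length >= min_length:
            hits.append((node.get("type"), int(node.get("pos", "0")), length))
    return hits


if __name__ == "__main__":
    for node_type, pos, length in suspicious_leaves(AST_XML):
        print(f"{node_type} at pos {pos}: {length} chars but no children")
```

On the fragment above this reports the DeclList node, since it spans 543 characters of source without any child nodes for the struct fields.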

jrfaller avatar Nov 15 '21 12:11 jrfaller

BTW @qq664363232 you might wanna try with the new c tree-sitter backend!

jrfaller avatar Mar 02 '22 11:03 jrfaller

If still happening, please post in cgum or tree-sitter-parser since

jrfaller avatar Nov 27 '23 19:11 jrfaller