binaryninja-api

Show inlined standard functions as function call

Open op2786 opened this issue 3 years ago • 1 comments

Compilers sometimes inline standard functions (strlen, memcpy, strcat, memset, strcmp, memcmp, etc.). I guess their code patterns could be recognized and replaced with a pseudo-call to the corresponding function.

Example disassembly:

1800382a2  488dbda0120000     lea     rdi, [rbp+0x12a0 {Dst}]
1800382a9  33c0               xor     eax, eax  {0x0}
1800382ab  b90c010000         mov     ecx, 0x10c
1800382b0  f3aa               rep stosb byte [rdi]  {0x0}  {0x0}  {0x0}

Output in HLIL:

1800382a2          char (* rdi_1)[0x110] = &Dst
1800382b0          for (int64_t rcx_4 = 0x10c; rcx_4 != 0; rcx_4 = rcx_4 - 1) {
1800382b0              *rdi_1 = 0
1800382b0              rdi_1 = &(*rdi_1)[1]
1800382b0          }

This could be replaced with memset(Dst, 0, 0x10c). It may be related to #2185.
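
To make the equivalence concrete, here is a minimal Python sketch that models the HLIL loop above on a plain `bytearray` and compares it with a single `memset`-style fill. The helper names (`hlil_fill_loop`, `memset_model`) and the flat-memory model are assumptions made for illustration; they are not Binary Ninja APIs.

```python
def hlil_fill_loop(memory: bytearray, dst: int, count: int) -> None:
    """Mirror the HLIL loop: store 0 through the pointer, advance it, decrement the counter."""
    rdi = dst      # hypothetical stand-in for rdi_1
    rcx = count    # hypothetical stand-in for rcx_4
    while rcx != 0:
        memory[rdi] = 0
        rdi += 1
        rcx -= 1


def memset_model(memory: bytearray, dst: int, value: int, count: int) -> None:
    """Model memset(dst, value, count) on the same flat buffer."""
    memory[dst:dst + count] = bytes([value & 0xFF]) * count
```

Running both helpers on copies of the same buffer with `count = 0x10c` should leave the two copies identical, which is the property a recognizer would rely on before rewriting the loop as a call.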

op2786 avatar Aug 03 '22 11:08 op2786

This is a subset of the functionality that would be required for #2185, so we're leaving this issue to track automatically resolving standard library calls that get inlined. The other issue tracks being able to make any HLIL code into an inlined function.

fuzyll avatar Aug 08 '22 17:08 fuzyll

We currently have partial support for this feature. It is limited to "Constant Data", i.e. cases where a constant string or other data is written, usually to sequential stack locations. We recover these and display them as one of:

  • [x] __builtin_strcpy
  • [x] __builtin_strncpy
  • [x] __builtin_memcpy
  • [x] __builtin_wcscpy
  • [x] __builtin_memset
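
As a rough illustration of the "Constant Data" case, the sketch below merges constant stores to adjacent stack offsets into one run and picks a builtin-style rendering for it. Everything here is an assumption for illustration: the `(offset, bytes)` store model, the helper names, the selection rules, and the output format. It does not describe how Binary Ninja actually implements this recovery.

```python
from typing import List, Optional, Tuple

Store = Tuple[int, bytes]  # (stack offset, constant bytes written there); hypothetical model


def coalesce_constant_stores(stores: List[Store]) -> Optional[Tuple[int, bytes]]:
    """Merge stores into one (start offset, data) run, or return None on gaps or overlaps."""
    if not stores:
        return None
    ordered = sorted(stores)
    start, data = ordered[0]
    merged = bytearray(data)
    for offset, chunk in ordered[1:]:
        if offset != start + len(merged):
            return None  # not strictly sequential, so give up in this sketch
        merged += chunk
    return start, bytes(merged)


def describe_run(dst: str, data: bytes) -> str:
    """Pick a builtin-style rendering for a merged run of constant bytes."""
    if len(data) > 1 and len(set(data)) == 1:
        return f"__builtin_memset({dst}, {data[0]:#x}, {len(data):#x})"
    if data.endswith(b"\x00") and 0 not in data[:-1]:
        text = data[:-1].decode("ascii", errors="backslashreplace")
        return f'__builtin_strcpy({dst}, "{text}")'
    return f"__builtin_memcpy({dst}, {data!r}, {len(data):#x})"
```

For example, two stores `(0x10, b"hell")` and `(0x14, b"o\x00")` merge into one six-byte run that this sketch renders as a `__builtin_strcpy` of `"hello"`.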

Still TODO:

Comparison functions:

  • [ ] memcmp
  • [ ] strcmp

Recovery of non-"Constant Data" functions:

  • [ ] strlen
  • [ ] strcpy
  • [ ] strncpy
  • [ ] memcpy
  • [ ] memcmp
  • [ ] wcscpy
  • [ ] strcmp
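
For a sense of what a non-"Constant Data" pattern can look like, one well-known inlined form of strlen on x86-64 is `or rcx, -1; xor eax, eax; repne scasb; not rcx; dec rcx`. The Python sketch below models that idiom on a flat byte buffer. The choice of idiom and the helper name are assumptions for illustration; the thread does not say which patterns would be targeted.

```python
MASK64 = (1 << 64) - 1


def repne_scasb_strlen(memory: bytes, rdi: int) -> int:
    """Model `or rcx, -1; xor eax, eax; repne scasb; not rcx; dec rcx`."""
    rcx = MASK64  # rcx = -1 as an unsigned 64-bit value
    while rcx != 0:
        byte = memory[rdi]
        rdi += 1
        rcx = (rcx - 1) & MASK64
        if byte == 0:  # al is 0, so the scan stops on the terminator
            break
    # not rcx; dec rcx leaves the string length, excluding the terminator
    return ((~rcx & MASK64) - 1) & MASK64
```

The scan reads the terminator as well as the string bytes, which is why the final `dec rcx` step is needed to get the same value a strlen call would return.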

plafosse avatar Jun 26 '23 14:06 plafosse