References to data variables should be shown as constants when used known constant contexts
Version and Platform (required):
- Binary Ninja Version: 3.4.4248-dev, c26685e2
- OS: macos
- OS Version: 13.3
- CPU Architecture: arm64
It looks like BN is producing the wrong IL for movw r2, #0x47e, e.g.
00000bdfa 40f27e42 movw r2, #0x47e
00000bdfe 86e80300 stm r6, {r0, r1} {var_a68} {var_a68} {var_a64} {0x41e5999a} {0x41dc0000}
00000be02 04f10801 add r1, r4, #8 {SUB_BDD8_DATA}
00000be06 b2a8 add r0, sp, #0x2c8 {var_920}
00000be08 bbf08cff bl #memcpy
According to IDA, this is
v6 = memcpy((unsigned int)v176, (char *)SUB_BDD8_DATA, 0x47Eu);
but BN produces HLIL
memcpy(&var_920, &SUB_BDD8_DATA, &data_47e)
This starts from LLIL where it does not recognize 0x47E as an immediate value
40 @ 00000bdfa r2 = &data_47e
It's being inferred to be a pointer to data - it looks like this is because 0x47e is a valid data address. Is the binary base in IDA also 0?
Binary base is zero for both IDA and BN.
On Fri, Apr 21, 2023 at 4:58 PM Josh F @.***> wrote:
It's being inferred to be a pointer to data - it looks like this is because 0x47e is a valid data address. Is the binary base in IDA also 0?
— Reply to this email directly, view it on GitHub https://github.com/Vector35/binaryninja-api/issues/4259#issuecomment-1517877243, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAESZ4RNDNXXIPB7FCFDRDXCKG6TANCNFSM6AAAAAAXG4VKM4 . You are receiving this because you authored the thread.Message ID: @.***>
Ok so while it's TECHNICALLY correct, the way we show it is really misleading. We should either simplify instances of &<datavar> to their static value, or never emit that as a reference in the first place
Edit:
In some places, it's nice to be able to see a reference to a data variable, like memcpy(&data_100, &data_40, 8) so this shouldn't be applied everywhere.
An easy workaround for this in most cases is to rebase the binary to some non-zero value
V35 should search for "Encourage Salesman Prompt Delay" to find the database.
Forgot to add that rebasing binary to a non-zero value may not work as it may break string references.
The only time it would break string references I would expect it to break all data variable references if the binary actually should be based at 0. Are you seeing strings break when rebasing and the binary image isn't supposed to be based at zero? If so, that might be a different bug.
I have a binary that must be based at address 0x500000 (example) for strings to be referenced. This is because the compiler inserted a bunch of fixed references to 0x50xxxx into the code.
I don't think it matters what the base address is, though. You will break string references as long as there are fixed addresses in the code and you rebase to the address the compiler does not expect.
That's my point though. Only if those fixed locations that require a base address of 0x0 are ones that would cause issue. All the rest, like your example don't require the work around but should already have less issue if they're based correctly.