binaryninja-api icon indicating copy to clipboard operation
binaryninja-api copied to clipboard

Add support for Cavium Octeon MIPS instructions, and general MIPS fixes

Open cryptwhoa opened this issue 1 year ago • 2 comments

Bear with me if I'm doing anything weird, this is literally the first time I've ever made a pull request on github. I also do most of my work on a non-internet-connected machine (whose git user name is noone), so hopefully nothing weird that came from that either.

This pull request adds support for Cavium MIPS instructions; these processors are also MIPS64 processors, so I added missing MIPS64 instructions that make Linux disassemble more nicely. While I was doing work here, I figured I might as well fix some lifting bugs as I found them.

Explanations for some potentially head-scratching decisions:

  • The way to turn on Cavium MIPS decoding is a settings option; it would be nice to have separate architectures, but the way that ELFs notify the presence of Cavium instructions is through the "options" field, which doesn't influence how the architecture is chosen, which means that without using settings, either no ELFs use Cavium MIPS or all ELFs use Cavium, which could break people's existing MIPS bndbs.
  • Registers were added for all of the coprocessor 2 registers, which really blows up the register space. As far as I know, this is the only way to put strings into intrinsics (unless you also want to blow up the intrinsic space). It would be nice to add support for something like an enum, that just maps numbers to strings, which can be passed to intrinsics, to help cut down on the number of registers that have to be added to do things like this.
  • Unaligned memory accesses (LWL/LWR/SWL/SWR/LDL/LDR/SDL/SDR) were turned into intrinsics, at least when they weren't part of a pair of adjacent instructions. This was a compromise between what would be really ugly lifted code and the previously-existing incorrect implementation.

Un-done work:

  • The cavium instructions ULW/ULD/USW/USD aren't decoded; their opcodes collide with LWL/LWR/SWL/SWR/LDL/LDR/SDL/SDR, but variably so, depending on the value of a bit in a cavium-specific register (which also makes some of those LWL/etc. into NOPs); this shouldn't be a huge deal, since they do the same thing (unaligned reads and writes), just in one instruction instead of 2.
  • VMM0, VMULU, ZCB, and ZCBT aren't lifted due to not having any code samples that use them. They should still disassemble fine though.
  • "unknown unknowns"; the cavium datasheet I found is from a pretty old generation, so if there are any instructions added in later versions, they're not handled.

An unsolicited suggestion would be that we don't need all of the MIPS version decode tables for 1 through 6; only a few are used anyways (the one for MIPS32, MIPS64, and cavium64), and as far as I can tell there aren't any forwards-incompatible instructions; the one example I saw was MIPS_LHI that only exists in version 1, which isn't an instruction in the documentation and only seems to be in SPIM documentation.

cryptwhoa avatar Jun 16 '24 04:06 cryptwhoa

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


noone seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Jun 16 '24 04:06 CLAassistant

So I do believe we can support this as a separate architecture using some of the newer platform callbacks. I can look them up when I get to a computer, but we do this now for a couple of architectures.

plafosse avatar Jun 16 '24 19:06 plafosse

As it stands right now this does not cause any regressions and the changes which do impact normal MIPS binaries are positive, I am inclined to merge this as-is. The Cavium specific instructions are not easily reviewable, however given the other changes to the MIPS architecture were correct and high-quality, the lifting of those instructions are not a cause for concern.

emesare avatar Jul 08 '24 19:07 emesare

Added in 4.1.5671, thank you again for your patience!

emesare avatar Jul 08 '24 22:07 emesare

There's one minor change I'm tempted to make to this after the next stable is out, that would be to look into swr/swl and lwr/lwl and have one be close to a nop and the other just the memory access (which is which depending on the endianness), potentially with the lifter looking ahead to make sure regs are the same (falling back to current behavior if they are not).

For the samples I've seen, I think would be generally viable and clean up intrinsics somewhat, but my sample size hasn't been enormous. You have any thoughts about this? This is strictly speaking less precise but possibly more useful.

rssor avatar Jul 17 '24 19:07 rssor