architecture doesn't have reference to the BinaryView
This will make certain arches possible - like dex smali.
I need to be able to access bv.session_data.string_table from the Architecture class.
I suspect there's another way around this -- we make Thumb2/Arm work without it, can you clarify a bit more about why you need session data from architecture?
In smali; strings, methods and other data are stored in their own distinctive tables (arrays). The instructions reference those strings and methods by index.
My current idea: save the tables in SessionData, and access it that way.
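To make the lookup concrete, here is a minimal sketch in plain Python with no Binary Ninja API calls. StringTable and render_const_string are hypothetical names used only for illustration, and the table contents are made up; how the table would actually reach the architecture is the open question in this issue.

```python
# Hypothetical illustration: the instruction operand is an index into a
# per-file string table, so it cannot be rendered from its bytes alone.
from dataclasses import dataclass
from typing import List


@dataclass
class StringTable:
    entries: List[str]

    def lookup(self, index: int) -> str:
        # Fall back to a placeholder when the index is out of range.
        if 0 <= index < len(self.entries):
            return self.entries[index]
        return f"<string@{index:#x}>"


def render_const_string(register: int, index: int, table: StringTable) -> str:
    # Render a const-string style instruction with its operand resolved.
    return f'const-string v{register}, "{table.lookup(index)}"'


table = StringTable(["Hello", "World"])  # made-up table contents
print(render_const_string(0, 1, table))
```

Without the table, the best a disassembler can do is print the raw index, which is what the placeholder branch mimics.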
I would also like to see this addressed; the primary motivation is so that Python-based plugins could use settings defined for a Resource to change their disassembly format.
I also have a use case for this: smali is a powerful IL that's used for jvm bytecode transformations, and I've been working on jvm bytecode support for some time (on and off, work is crazy). It's quite simple to convert Dalvik and "Java bytecode" to smali. With this change (and a preprocessing step to perform the initial conversion), I believe both Dalvik classes and "Java" classes would be able to leverage the benefits of the different ILs offered, and it would be much more reasonable to do now that projects support has been added.
Hopefully this issue can be improved, as it greatly limits the flexibility of developing decompilation plugins for unknown instruction sets.
For a C166 architecture, this will allow proper decoding of string references which are addressed in a specific bank context.
@xusheng6 suggested I weigh in here: I'd like to perform some limited instruction decoding (basically a limited linear sweep with no further analysis) to extract data from executable code in a flat binary file that references data needed to inform subsequent loading of the file.
After so long, the most basic problem still hasn't been solved?
It might seem like a most basic issue but it requires some significant core changes to implement. We intentionally made some design decisions early on that made multithreaded analysis better/easier for some architectures but made it more difficult for architectures like this.
That said, I have good news, this issue is currently being worked on and we hope to have it resolved by the 5.1 release (still several months out though so no promises...)
I just want to add some context. We could have taken a simpler approach and modified the GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL APIs to take a BinaryView or Function parameter. There are a few reasons why we don't like this option:
- It's a poor design - for many DSP architectures (i.e. DSP archs that support instruction pipelines, hardware loops, etc.) you'd need to call into the BinaryView on every instruction to query information about previous instructions or other state
- GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL are static methods that are invoked via callbacks - We'd need to invent some kind of storage mechanism on the BinaryView or Function object for storing state across calls to GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL (sure, you could abuse the BV metadata system, but that's gross)
- These API changes would break every existing architecture plugin
The approach that we are actively taking is to add a new method to the Architecture class that plugin developers can choose to override to have full control over basic block analysis at the function level. This method currently lives in core and is responsible for building the basic block list (it is the primary function that calls GetInstructionInfo and GetInstructionLowLevelIL to perform control flow recovery). This method takes a Function object and analysis settings as inputs, and fills out the function's basic blocks.
We are in the process of exposing the core functionality required to perform function-level basic block analysis to the API. This is a slightly larger effort, but we believe it's a better design and will set us up to be able to support DSP architectures, WASM, and other archs that we don't currently have a good answer for.
tldr
Architecture plugins now have the ability to access the BinaryView while performing custom basic block recovery by overriding Architecture::AnalyzeBasicBlocks (analyze_basic_blocks in Python). We have also open sourced Binary Ninja's default implementation of AnalyzeBasicBlocks, which can be referenced as a code example when writing your own: https://github.com/Vector35/binaryninja-api/blob/dev/defaultabb.cpp#L59
Some more detail
ABB takes the Function object and a basic block analysis context as input (virtual void AnalyzeBasicBlocks(Function* function, BasicBlockAnalysisContext& context)). From the Function object, the BinaryView can be accessed with function->GetView(), providing access to the contents of the entire binary. If you look at our default implementation of ABB, you'll notice it invokes GetInstructionInfo and is responsible for building the basic block list / CFG based on the instruction information. If you're writing your own ABB implementation, you technically don't need to invoke GetInstructionInfo at all, and you can perform BB recovery with context from wherever.
One example of a problem this solves is with zero-overhead loops (PIC, Blackfin, TMS-320 C6000, Hexagon, and many other MCU/DSP architectures have these). Consider this example from PIC:
0039C do #1337,0x3a2
0039E nop
003A0 inc.w 0x000c
003A2 nop
The instruction at 0x39c sets the loop end to 0x3a2, the loop start to the next instruction (at 0x39e), and tells the CPU to loop over the instructions in between 1337 times. When GetInstructionInfo is invoked on 0x3a2, there is no way of knowing that a nop instruction is the end of a basic block (and that a conditional branch occurs there). This is something that can be easily handled in a custom implementation of ABB.
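As a rough illustration of the idea (not the actual ABB implementation, and using no Binary Ninja API), the sketch below recovers basic blocks for the four instructions above in plain Python. Insn and recover_blocks are hypothetical names, the instruction step of 2 is taken from the address spacing in the listing, and nested loops or loops sharing an end address are not handled.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    addr: int
    length: int                     # address step, assumed from the listing
    text: str
    loop_end: Optional[int] = None  # set only for the `do` instruction


def recover_blocks(insns: List[Insn]) -> List[Tuple[int, int, List[int]]]:
    """Return (start, end, successors) for each basic block, in address order."""
    # First pass: map each loop-end address to its loop-start address.
    loop_start_for_end: Dict[int, int] = {}
    for insn in insns:
        if insn.loop_end is not None:
            loop_start_for_end[insn.loop_end] = insn.addr + insn.length
    leaders = set(loop_start_for_end.values())

    blocks = []
    start = None
    for insn in insns:
        if start is None:
            start = insn.addr
        next_addr = insn.addr + insn.length
        if insn.addr in loop_start_for_end:
            # Last instruction of a hardware loop: implicit conditional
            # branch back to the loop start, or fall through when done.
            back_edge = loop_start_for_end[insn.addr]
            blocks.append((start, next_addr, [back_edge, next_addr]))
            start = None
        elif next_addr in leaders:
            # The next instruction starts a loop body, so end the block here.
            blocks.append((start, next_addr, [next_addr]))
            start = None
    if start is not None:
        last = insns[-1]
        blocks.append((start, last.addr + last.length, []))
    return blocks


listing = [
    Insn(0x39C, 2, "do #1337,0x3a2", loop_end=0x3A2),
    Insn(0x39E, 2, "nop"),
    Insn(0x3A0, 2, "inc.w 0x000c"),
    Insn(0x3A2, 2, "nop"),
]
for start, end, succs in recover_blocks(listing):
    print(hex(start), hex(end), [hex(s) for s in succs])
```

The point is that the block boundary at 0x3a2 comes from state recorded while visiting 0x39c, which a per-instruction callback has no place to keep.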
Remaining Work
There is still some work to be done on the lifting and disassembly text rendering front. We are tracking the remaining TODOs under a separate issue: https://github.com/Vector35/binaryninja-api/issues/742. I am closing this issue to consolidate.
Thank you, it's great to see this progress! I may end up writing a Wasm disassembler if I end up having to debug a broken Wasm binary again (unless someone beats me to it).
Very cool, though you might have to wait until you can do function at a time lifting for that. Eventually we'd like to be able to provide a way to lift directly to MLIL or HLIL. I was thinking the easiest path for WASM would be direct to HLIL, though I haven't put a ton of thought or research into it.
I'm no WASM expert, but I think what we've done so far will allow for handling WASM's label-based control flow and basic block recovery. So I think you could write a fairly good disassembler with the new changes. Consider the example below: with the new ABB changes you could resolve the target for the instruction at 0041h and unroll the loops (where previously that would require gross hacks, if it was doable at all). There might be some challenges for lifting with the stack-based VM. For example, consider the get_global $g10 instruction: the lifter needs to know where that global data var is in the binary. Local data vars can probably just be represented by temp registers.
Function _crc32_init: (-)-
+0000h: get_global $g10
+0002h: set_local $30
+0004h: get_global $g10
+0006h: i32.const 16
+0008h: i32.add
+0009h: set_global $g10
.........................
+001Dh: i32.const 5243024
+0022h: i32.add
+0023h: i32.const 0
+0025h: i32.store @0h(4)
+0028h: i32.const 128
+002Bh: set_local $0
+002Dh: loop $1
+002Fh: block $2
+0031h: get_local $0
+0033h: set_local $22
+0035h: get_local $22
+0037h: i32.const 0
+0039h: i32.ne
+003Ah: set_local $23
+003Ch: get_local $23
+003Eh: i32.eqz
+003Fh: if $3
+0041h: br $2 (---> break out of $2 (BLOCK))
+0043h: end
+0044h: get_local $12
+0046h: set_local $24
+0048h: get_local $24
.........................
+0106h: get_local $20
+0108h: i32.const 1
+010Ah: i32.shr_u
+010Bh: set_local $21
+010Dh: get_local $21
+010Fh: set_local $0
+0111h: br $1 (---> continue to $1 (LOOP))
+0113h: end
+0114h: end
+0115h: get_local $30
+0117h: set_global $g10
+0119h: return
+011Ah: end
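A minimal sketch of resolving the two br targets in the listing above, in plain Python with no Binary Ninja API. resolve_branches is a hypothetical helper; it matches on the label names as displayed in the listing, ignores else, and assumes the elided parts of the listing contain only balanced block/end pairs.

```python
from typing import Dict, List, Tuple

# (address, mnemonic, label) transcribed from the listing above; the label
# is "" where the instruction has none. Only control instructions are kept.
LISTING: List[Tuple[int, str, str]] = [
    (0x2D, "loop", "$1"),
    (0x2F, "block", "$2"),
    (0x3F, "if", "$3"),
    (0x41, "br", "$2"),
    (0x43, "end", ""),
    (0x111, "br", "$1"),
    (0x113, "end", ""),
    (0x114, "end", ""),
]


def resolve_branches(insns: List[Tuple[int, str, str]]) -> Dict[int, int]:
    """Map each `br` address to the address it transfers control to."""
    open_stack = []  # entries: (kind, label, start address, pending brs)
    targets: Dict[int, int] = {}
    for addr, op, label in insns:
        if op in ("loop", "block", "if"):
            open_stack.append((op, label, addr, []))
        elif op == "br":
            # Search enclosing constructs from innermost to outermost.
            for kind, lbl, start, pending in reversed(open_stack):
                if lbl == label:
                    if kind == "loop":
                        targets[addr] = start  # backward: continue the loop
                    else:
                        pending.append(addr)   # forward: known at the end
                    break
        elif op == "end" and open_stack:
            _kind, _lbl, _start, pending = open_stack.pop()
            for br_addr in pending:
                # Forward branches land on the matching `end`; execution
                # resumes just past it.
                targets[br_addr] = addr
    return targets


for br_addr, target in sorted(resolve_branches(LISTING).items()):
    print(f"br at {br_addr:04X}h -> {target:04X}h")
```

Because the target of the br at 0041h is only known once the end at 0113h has been seen, this kind of resolution needs a function-level view rather than one instruction at a time.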
The global handling (also strings, I think) is what I expect to become possible now that you can get a BinaryView reference, yeah.
I think that the progress we've made so far is sufficient to handle a lot of DSPs (a bunch of TI ones and Hexagon), but for architectures that essentially begin with header-y information (so a lot of VMs, wasm possibly included?) we're definitely going to need to figure out what the best long term approach for the lifting and text callbacks would be. Some way for architectures to attach opaque blobs to functions (either serializable or not, tbd) that we could then expose via callback might be a far more robust long term option for a lot of VMs.
Thank you so much for the progress on this issue, it's truly game-changing for implementing DSP architectures !
I'm currently writing a plugin for the TMS320C6x architecture, and I still have a little problem though.
In this architecture, instructions are fetched in packets of eight 32-bit words (called Fetch Packets, or FPs), but aren't necessarily executed in parallel: this is determined by a parallelism bit placed in the 8th word of a Fetch Packet.
Instructions in parallel form what is called an Execute Packet (or EP), with the particularity that an EP can overlap two Fetch Packets. Here is an example (the || bars indicate that an instruction is to be executed in parallel with the previous one) :
80000234 7246 MV.L1X B4,A3
80000236 1f39 || CMPGT.L2X B0,A6,B1
80000238 a24f || MV.S2 B4,B5
8000023a 3977 || MVK.D2 1,B2
8000023c e6000700 .fphead n, l, W, BU, nobr, nosat, 0110000b <------ 8th word (header)
80000240 02b416a0 || MV.S1X B13,A5
(to be more specific, parallelism bits are in the 8th word only when compact, 16-bit wide instructions are used, which is the case in my example - which instructions are compact is also specified in that 8th word).
Therefore, when disassembling with GetInstructionText, sometimes looking backwards at the previous "8th word" is required to be able to correctly disassemble an instruction, and therefore requires having access to a BinaryView (were it only looking forward, I could work around it by setting Architecture::GetMaxInstructionLength to get access to the Fetch Packet header).
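To illustrate which words are involved, here is a small sketch of the address arithmetic in plain Python. It assumes fetch packets are aligned to their own size (32 bytes), which is consistent with the example above but is an assumption; the function names are hypothetical.

```python
FETCH_PACKET_BYTES = 8 * 4  # eight 32-bit words per fetch packet
HEADER_OFFSET = 7 * 4       # the header, when present, is the 8th word


def fetch_packet_base(addr: int) -> int:
    # Assumes fetch packets are aligned to their own size.
    return addr & ~(FETCH_PACKET_BYTES - 1)


def header_address(addr: int) -> int:
    # Address of the 8th word of the fetch packet containing addr.
    return fetch_packet_base(addr) + HEADER_OFFSET


def previous_header_address(addr: int) -> int:
    # Address of the 8th word of the preceding fetch packet.
    return header_address(addr) - FETCH_PACKET_BYTES


print(hex(header_address(0x80000234)))
print(hex(previous_header_address(0x80000240)))
```

Both calls resolve to 0x8000023c, the .fphead word in the example: it lies ahead of the instruction at 0x80000234 but behind the one at 0x80000240, which is the backward look described above.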
Unfortunately, this work cannot be shifted to the AnalyzeBasicBlocks function at all, as it does not concern control flow analysis, and cannot be done without a BinaryView I think. My current solution is to cache all Fetch Packets in some form of mutex-protected database, much like the binja-hexagon plugin does, but this is very dirty.
Do you know of a way to solve this ? Like you said, this instruction set happens to be "header-y" as well, so maybe this will need to wait for a more long-term solution ?
@Cocosushi6 I recommend setting GetMaxInstructionLength to the size of the FPs. Then disassemble the entire packet (all contained instructions) in a single invocation of GetInstructionText (and lift the entire packet in a single call to GetInstructionLowLevelIL). This approach mirrors the behavior of the DSP hardware, which needs context from the entire packet in order to interpret the contained instructions. The main limitation you'll run into is that we currently don't have a newline disassembly text token. But that is the lowest hanging fruit in our next steps (and we plan to work on it soon). For now, I recommend going forward with this approach and keeping all the instructions on the same line (perhaps using a semi-colon as the separator in cases where || isn't used). Maybe something like:
{ MV.L1X B4,A3 || CMPGT.L2X B0,A6,B1 || MV.S2 B4,B5 || MVK.D2 1,B2 } .fphead n, l, W, BU, nobr, nosat, 0110000b
Obviously, this isn't easy on the eyes and we'll add the newline disassembly text token ASAP. But, in the case of DSPs I've encountered, access to the BinaryView hasn't been required and I have not needed to cache data in static memory across invocations of GetInstructionText or GetInstructionLowLevelIL using the approach described above. But, we are aware of situations where access to the BinaryView in GetInstructionText could help with VM binaries (like WASM).
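The single-line rendering suggested above could be sketched as follows, in plain Python with no Binary Ninja API. render_packet is a hypothetical helper that takes already-decoded instruction text plus a flag saying whether each instruction runs in parallel with the one before it; decoding the packet and emitting real disassembly text tokens are left out.

```python
from typing import List, Tuple


def render_packet(insns: List[Tuple[str, bool]]) -> str:
    """Join decoded instructions on one line.

    Each item is (text, parallel), where parallel means the instruction
    executes in parallel with the one before it.
    """
    parts = []
    for i, (text, parallel) in enumerate(insns):
        if i > 0:
            # "||" marks parallel execution; ";" separates sequential ones.
            parts.append(" || " if parallel else " ; ")
        parts.append(text)
    return "{ " + "".join(parts) + " }"


packet = [
    ("MV.L1X B4,A3", False),
    ("CMPGT.L2X B0,A6,B1", True),
    ("MV.S2 B4,B5", True),
    ("MVK.D2 1,B2", True),
]
print(render_packet(packet))
```

Once a newline text token exists, the same grouping could emit one instruction per line instead of joining them.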