This will make certain arches possible - like dex smali.

I need to be able to access bv.session_data.string_table from the Architecture class.

Nov 17 '16 03:11 lucasduffey

I suspect there's another way around this -- we make Thumb2/Arm work without it, can you clarify a bit more about why you need session data from architecture?

Nov 18 '16 16:11 psifertex

In smali; strings, methods and other data are stored in their own distinctive tables (arrays). The instructions reference those strings and methods by index.

My current idea: save the tables in SessionData, and access it that way.
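To make the problem concrete, here is a minimal sketch in plain Python (it does not use the Binary Ninja API). It assumes the Dalvik `const-string vAA, string@BBBB` encoding (opcode 0x1a, one register byte, a 16-bit little-endian string index); `render_const_string` and the sample table are hypothetical names used only for illustration. Without the table, the best the disassembler can print is the raw index.

```python
import struct

CONST_STRING = 0x1A  # Dalvik opcode for const-string (format 21c)


def render_const_string(code, string_table=None):
    """Render one const-string instruction from raw bytes.

    string_table is a hypothetical list standing in for the DEX string
    table; pass None to model an architecture callback that cannot see it.
    """
    opcode, reg, index = struct.unpack_from("<BBH", code)
    if opcode != CONST_STRING:
        raise ValueError("not a const-string instruction")
    if string_table is None:
        # No access to the table: only the index can be shown.
        return f"const-string v{reg}, string@{index:04x}"
    # With the table, the operand resolves to the actual string.
    return f'const-string v{reg}, "{string_table[index]}"'


raw = bytes([0x1A, 0x00, 0x01, 0x00])
print(render_const_string(raw))
print(render_const_string(raw, ["foo", "hello"]))
```

The second call is what needs the BinaryView (or some other per-file state) to be reachable from the architecture.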

Nov 18 '16 16:11 lucasduffey

I would also like to see this addressed; the primary motivation is so that Python-based plugins could use settings defined for a Resource to change their disassembly format.

May 29 '24 08:05 whitequark

I also have a use case for this: smali is a powerful IL that's used for JVM bytecode transformations, and I've been working on JVM bytecode support for some time (on and off, work is crazy). It's quite simple to convert Dalvik and "Java bytecode" to smali, and with this change (and a preprocessing step to perform the initial conversion), I believe both Dalvik classes and "Java" classes would be able to leverage the benefits of the different ILs offered (and it would be much more reasonable to do now that projects support has been added).

May 29 '24 10:05 ccarpenter04

Hopefully this issue can be improved, as it greatly limits the flexibility of developing decompilation plugins for unknown instruction sets.

Sep 26 '24 10:09 Cussrro

For a C166 architecture, this will allow proper decoding of string references, which are addressed in a specific bank context.

Sep 27 '24 13:09 Martyx00

@xusheng6 suggested I weigh in here: I'd like to perform some limited instruction decoding (basically a limited linear sweep with no further analysis) to extract data from executable code in a flat binary file that references data needed to inform subsequent loading of the file.

Nov 21 '24 13:11 bdemick

After so long, the most basic problem still hasn't been solved?

Apr 18 '25 02:04 Cussrro

It might seem like a most basic issue but it requires some significant core changes to implement. We intentionally made some design decisions early on that made multithreaded analysis better/easier for some architectures but made it more difficult for architectures like this.

That said, I have good news: this issue is currently being worked on, and we hope to have it resolved by the 5.1 release (still several months out though, so no promises...)

Apr 18 '25 08:04 psifertex

I just want to add some context. We could have taken a simpler approach and modified the GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL APIs to take a BinaryView or Function parameter. There are a few reasons why we don't like this option:

- It's a poor design - for many DSP architectures (i.e. DSP archs that support instruction pipelines, hardware loops, etc.) you'd need to call into the BinaryView on every instruction to query information about previous instructions or other state
- GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL are static methods that are invoked via callbacks
- We'd need to invent some kind of storage mechanism on the BinaryView or Function object for storing state across calls to GetInstructionInfo/GetInstructionText/GetInstructionLowLevelIL (sure, you could abuse the BV metadata system, but that's gross)
- These API changes would break every existing architecture plugin

The approach that we are actively taking is to add a new method to the Architecture class that plugin developers can choose to override to have full control over basic block analysis at the function level. This method currently lives in core and is responsible for building the basic block list (it is the primary function that calls GetInstructionInfo and GetInstructionLowLevelIL to perform control flow recovery). This method takes a Function object and analysis settings as inputs, and fills out the function's basic blocks.

We are in the process of exposing the core functionality required to perform function-level basic block analysis to the API. This is a slightly larger effort, but we believe it's a better design and will set us up to be able to support DSP architectures, WASM, and other archs that we don't currently have a good answer for.

Apr 18 '25 13:04 zznop

tldr

Architecture plugins can now access the BinaryView while performing custom basic block recovery by overriding Architecture::AnalyzeBasicBlocks (analyze_basic_blocks in Python). We have also open sourced Binary Ninja's default implementation of AnalyzeBasicBlocks, which can be referenced as a code example when writing your own: https://github.com/Vector35/binaryninja-api/blob/dev/defaultabb.cpp#L59

Some more detail

ABB takes the Function object and basic block analysis context as input (virtual void AnalyzeBasicBlocks(Function* function, BasicBlockAnalysisContext& context)). From the Function object, the BinaryView can be accessed with function->GetView(), providing access to the contents of the entire binary. If you look at our default implementation of ABB, you'll notice it invokes GetInstructionInfo and is responsible for building the basic block list / CFG based on the instruction information. If you're writing your own ABB implementation, you technically don't need to invoke GetInstructionInfo at all, and you can perform BB recovery with context from wherever.

One example of a problem this solves is with zero-overhead loops (PIC, Blackfin, TMS-320 C6000, Hexagon, and many other MCU/DSP architectures have these). Consider this example from PIC:

0039C   do #1337,0x3a2
0039E   nop
003A0   inc.w 0x000c
003A2   nop

The instruction at 0x39c sets the loop end to 0x3a2, the loop start to the next instruction (at 0x39e), and tells the CPU to loop over the instructions in between 1337 times. When GetInstructionInfo is invoked on 0x3a2, there is no way of knowing that a nop instruction is the end of a basic block (and that a conditional branch occurs there). This is something that can be easily handled in a custom implementation of ABB.
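As a rough illustration of the idea (plain Python, not the Binary Ninja API), the sketch below does function-level block recovery over the listing above. Everything in it is an assumption made for the example: the `Block` class, the `recover_blocks` helper, the tuple representation of instructions, and the fixed 2-byte address step taken from the listing. The point is only that a pass which sees the whole function can record the loop end set by `do` and then split the block at the `nop`.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    start: int
    end: int  # address of the last instruction in the block (inclusive)
    successors: list = field(default_factory=list)


# (address, mnemonic, operands), copied from the PIC listing above.
LISTING = [
    (0x39C, "do", (1337, 0x3A2)),
    (0x39E, "nop", ()),
    (0x3A0, "inc.w", (0x000C,)),
    (0x3A2, "nop", ()),
]
INSN_SIZE = 2  # assumption: address step between listing entries


def recover_blocks(listing):
    # Pass 1: record every hardware loop as {loop_end: loop_start}.
    loops = {}
    for addr, mnem, ops in listing:
        if mnem == "do":
            loops[ops[1]] = addr + INSN_SIZE

    # Pass 2: block leaders are the entry point, each loop start,
    # and the instruction following each loop end.
    leaders = {listing[0][0]}
    for loop_end, loop_start in loops.items():
        leaders.add(loop_start)
        leaders.add(loop_end + INSN_SIZE)

    blocks = []
    for addr, _, _ in listing:
        if addr in leaders:
            blocks.append(Block(start=addr, end=addr))
        blocks[-1].end = addr

    # Pass 3: a block ending on a loop end branches back or falls through.
    for block in blocks:
        fallthrough = block.end + INSN_SIZE
        if block.end in loops:
            block.successors = [loops[block.end], fallthrough]
        else:
            block.successors = [fallthrough]
    return blocks


for b in recover_blocks(LISTING):
    print(hex(b.start), hex(b.end), [hex(s) for s in b.successors])
```

A per-instruction callback looking only at the bytes at 0x3a2 has no way to produce the two successors of the second block.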

Remaining Work

There is still some work to be done on the lifting and disassembly text rendering front. We are tracking the remaining TODOs under a separate issue (https://github.com/Vector35/binaryninja-api/issues/742), and I am closing this issue to consolidate.

Jul 02 '25 13:07 zznop

Thank you, it's great to see this progress! I may end up writing a Wasm disassembler if I end up having to debug a broken Wasm binary again (unless someone beats me to it).

Jul 02 '25 13:07 whitequark

Very cool, though you might have to wait until you can do function at a time lifting for that. Eventually we'd like to be able to provide a way to lift directly to MLIL or HLIL. I was thinking the easiest path for WASM would be direct to HLIL, though I haven't put a ton of thought or research into it.

Jul 02 '25 13:07 plafosse

> Thank you, it's great to see this progress! I may end up writing a Wasm disassembler if I end up having to debug a broken Wasm binary again (unless someone beats me to it).

I'm no WASM expert, but I think what we've done so far will allow for handling WASM's label-based control flow and basic block recovery. So I think you could write a fairly good disassembler with the new changes. Consider the example below: with the new ABB changes you could resolve the target for the instruction at 0041h and unroll the loops (where previously that would require gross hacks, if it was doable at all). There might be some challenges for lifting with the stack-based VM. Consider the get_global $g10 instruction, for example: the lifter needs to know where that global data var is in the binary. Local data vars can probably just be represented by temp registers.

Function _crc32_init: (-)-
    +0000h: get_global $g10  
    +0002h: set_local $30    
    +0004h: get_global $g10  
    +0006h: i32.const 16     
    +0008h: i32.add          
    +0009h: set_global $g10  
    .........................
    +001Dh: i32.const 5243024
    +0022h: i32.add          
    +0023h: i32.const 0      
    +0025h: i32.store @0h(4) 
    +0028h: i32.const 128    
    +002Bh: set_local $0     
    +002Dh: loop $1         
    +002Fh:   block $2      
    +0031h:     get_local $0 
    +0033h:     set_local $22
    +0035h:     get_local $22
    +0037h:     i32.const 0  
    +0039h:     i32.ne       
    +003Ah:     set_local $23
    +003Ch:     get_local $23
    +003Eh:     i32.eqz      
    +003Fh:     if $3       
    +0041h:       br $2      (---> break out of $2 (BLOCK))
    +0043h:       end        
    +0044h:     get_local $12
    +0046h:     set_local $24
    +0048h:     get_local $24
    .........................
    +0106h:     get_local $20
    +0108h:     i32.const 1  
    +010Ah:     i32.shr_u    
    +010Bh:     set_local $21
    +010Dh:     get_local $21
    +010Fh:     set_local $0 
    +0111h:     br $1        (---> continue to $1 (LOOP))
    +0113h:     end          
    +0114h:   end            
    +0115h: get_local $30    
    +0117h: set_global $g10  
    +0119h: return           
    +011Ah: end
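Here is a small sketch of that label resolution in plain Python (no Binary Ninja API involved). It assumes the symbolic labels shown in the listing above; real WASM encodes branch targets as relative nesting depths, so a real implementation would index into the stack instead of matching names. `resolve_branches` and the tuple format are hypothetical, and the listing is trimmed to the control-flow instructions plus a couple of neighbours.

```python
# (offset, opcode, label) for the control-flow skeleton of _crc32_init.
LISTING = [
    (0x02D, "loop", "$1"),
    (0x02F, "block", "$2"),
    (0x03F, "if", "$3"),
    (0x041, "br", "$2"),
    (0x043, "end", None),
    (0x044, "get_local", None),
    (0x111, "br", "$1"),
    (0x113, "end", None),
    (0x114, "end", None),
    (0x115, "get_local", None),
]


def resolve_branches(listing):
    """Map each br offset to the offset it transfers control to."""
    stack = []    # open constructs: (kind, label, header_offset, pending_brs)
    targets = {}
    for i, (off, op, label) in enumerate(listing):
        if op in ("block", "loop", "if"):
            stack.append((op, label, off, []))
        elif op == "br":
            frame = next(f for f in reversed(stack) if f[1] == label)
            if frame[0] == "loop":
                # Branching to a loop label continues at the loop header.
                targets[off] = frame[2]
            else:
                # Branching to a block/if label exits it; the target is
                # only known once the matching end is reached.
                frame[3].append(off)
        elif op == "end" and stack:
            _, _, _, pending = stack.pop()
            after = listing[i + 1][0] if i + 1 < len(listing) else None
            for br in pending:
                targets[br] = after
    return targets


for src, dst in sorted(resolve_branches(LISTING).items()):
    print(f"+{src:04X}h -> +{dst:04X}h")
```

Resolving `br $2` needs the matching `end`, which sits well past the branch itself, so this only works when the analysis sees the whole function.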

Jul 02 '25 14:07 zznop

The global handling (also strings, I think) is what I expect to become possible now that you can get a BinaryView reference, yeah.

Jul 02 '25 14:07 whitequark

I think that the progress we've made so far is sufficient to handle a lot of DSPs (a bunch of TI ones and Hexagon), but for architectures that essentially begin with header-y information (so a lot of VMs, wasm possibly included?) we're definitely going to need to figure out what the best long-term approach for the lifting and text callbacks would be. Some way for architectures to attach opaque blobs to functions (either serializable or not, tbd) that we could then expose via callback might be a far more robust long-term option for a lot of VMs.

Jul 02 '25 18:07 rssor

Thank you so much for the progress on this issue, it's truly game-changing for implementing DSP architectures!

I'm currently writing a plugin for the TMS320C6x architecture, and I still have a small problem. In this architecture, instructions are fetched in packets of eight 32-bit words (called Fetch Packets, or FPs), but aren't necessarily executed in parallel: this is determined by a parallelism bit placed in the 8th word of a Fetch Packet. Instructions executed in parallel form what is called an Execute Packet (or EP), with the particularity that an EP can overlap two Fetch Packets. Here is an example (the || bars indicate that an instruction is to be executed in parallel with the previous one):

80000234       7246           MV.L1X        B4,A3
80000236       1f39 ||        CMPGT.L2X     B0,A6,B1
80000238       a24f ||        MV.S2         B4,B5
8000023a       3977 ||        MVK.D2        1,B2
8000023c   e6000700           .fphead       n, l, W, BU, nobr, nosat, 0110000b      <------ 8th word (header)
80000240   02b416a0 ||        MV.S1X        B13,A5

(to be more specific, parallelism bits are in the 8th word only when compact, 16-bit wide instructions are used, which is the case in my example - which instructions are compact is also specified in that 8th word).

Therefore, when disassembling with GetInstructionText, correctly disassembling an instruction sometimes requires looking backwards at the previous "8th word", which in turn requires access to a BinaryView (if it only required looking forward, I could work around it by setting Architecture::GetMaxInstructionLength to get access to the Fetch Packet header).

Unfortunately, this work cannot be shifted to the AnalyzeBasicBlocks function at all, as it does not concern control flow analysis, and cannot be done without a BinaryView I think. My current solution is to cache all Fetch Packets in some form of mutex-protected database, much like the binja-hexagon plugin does, but this is very dirty.
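For reference, the caching workaround looks roughly like the sketch below (plain Python, not the actual plugin code). It assumes Fetch Packets are aligned to their own size of 32 bytes, which matches the example above where the header sits at 0x8000023c; `FetchPacketCache` and its methods are hypothetical names.

```python
import threading

FP_SIZE = 8 * 4  # eight 32-bit words per Fetch Packet


def fetch_packet_base(addr):
    # Assumption: Fetch Packets are aligned to FP_SIZE.
    return addr & ~(FP_SIZE - 1)


def header_address(addr):
    # The header is the 8th (last) word of the packet containing addr.
    return fetch_packet_base(addr) + FP_SIZE - 4


class FetchPacketCache:
    """Lock-protected map from Fetch Packet base address to header word."""

    def __init__(self):
        self._lock = threading.Lock()
        self._headers = {}

    def store(self, addr, header_word):
        with self._lock:
            self._headers[fetch_packet_base(addr)] = header_word

    def previous_header(self, addr):
        # Header of the packet preceding the one that contains addr,
        # or None if that packet has not been seen yet.
        with self._lock:
            return self._headers.get(fetch_packet_base(addr) - FP_SIZE)


cache = FetchPacketCache()
cache.store(0x8000023C, 0xE6000700)
print(hex(header_address(0x80000234)))
print(hex(cache.previous_header(0x80000240)))
```

The weakness is visible in `previous_header`: it returns None whenever the earlier packet was never disassembled, which is why reading the bytes through a BinaryView would be cleaner.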

Do you know of a way to solve this? Like you said, this instruction set happens to be "header-y" as well, so maybe this will need to wait for a more long-term solution?

Jul 08 '25 09:07 Cocosushi6

@Cocosushi6 I recommend setting GetMaxInstructionLength to the size of the FPs. Then disassemble the entire packet (all contained instructions) in a single invocation of GetInstructionText (and lift the entire packet in a single call to GetInstructionLowLevelIL). This approach mirrors the behavior of the DSP hardware, which needs context from the entire packet in order to interpret the contained instructions. The main limitation you'll run into is that we currently don't have a newline disassembly text token. But that is the lowest-hanging fruit in our next steps (and we plan to work on it soon). For now, I recommend going forward with this approach and keeping all the instructions on the same line (perhaps using a semicolon to separate them in cases where || isn't used). Maybe something like:

{ MV.L1X B4,A3 || CMPGT.L2X B0,A6,B1 || MV.S2 B4,B5 || MVK.D2 1,B2 } .fphead n, l, W, BU, nobr, nosat, 0110000b

Obviously, this isn't easy on the eyes and we'll add the newline disassembly text token ASAP. But, in the case of DSPs I've encountered, access to the BinaryView hasn't been required and I have not needed to cache data in static memory across invocations of GetInstructionText or GetInstructionLowLevelIL using the approach described above. But, we are aware of situations where access to the BinaryView in GetInstructionText could help with VM binaries (like WASM).
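A minimal sketch of that single-line rendering, in plain Python rather than Binary Ninja text tokens: `format_packet` is a hypothetical helper, and it assumes the packet has already been decoded into (text, parallel) pairs, where parallel means the instruction executes in parallel with the previous one.

```python
def format_packet(insns, header=None):
    """Join the decoded instructions of one Fetch Packet onto one line.

    insns:  list of (text, parallel) pairs in address order.
    header: optional text for the packet header word, appended at the end.
    """
    parts = []
    for i, (text, parallel) in enumerate(insns):
        if i > 0:
            # "||" marks parallel execution; ";" separates everything else.
            parts.append("||" if parallel else ";")
        parts.append(text)
    line = "{ " + " ".join(parts) + " }"
    if header is not None:
        line += " " + header
    return line


packet = [
    ("MV.L1X B4,A3", False),
    ("CMPGT.L2X B0,A6,B1", True),
    ("MV.S2 B4,B5", True),
    ("MVK.D2 1,B2", True),
]
print(format_packet(packet, ".fphead n, l, W, BU, nobr, nosat, 0110000b"))
```

In a real plugin each piece would be emitted as a disassembly text token instead of being joined into a string.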

Jul 08 '25 12:07 zznop

architecture doesn't have reference to the BinaryView