Call type adjustments should not apply to all calls to the same stub function within a function
Version and Platform (required):
- Binary Ninja Version: 5.1.7313-dev (7e9d5c7a)
- OS: macOS 15.4.1
- CPU Architecture: arm64
Bug Description:
Call type adjustments created by Function::SetAutoCallTypeAdjustment / Function::SetUserCallTypeAdjustment appear to be tracked by the combination of the containing function and the address of the call instruction. This causes incorrect behavior if a call whose type is being adjusted was inlined from another function that was marked as being inlined during analysis. Since the address of the call is the address within the original function, applying a call type adjustment to a single call has the effect of applying the adjustment to every copy of the call that was inlined into the same function.
This could be fixed by tracking call type adjustments by IL instruction index instead.
Steps To Reproduce:
- Open /usr/libexec/syspolicyd on macOS 15.4.x
- Go to the definition of
-[XPCActivityJob .cxx_destruct] - Right-click on one of the three calls to
_objc_storeStrongand selectOverride Call Type… - Update the type of second argument to something that will display differently (
boolorint32_twork). - Note that it updates only a single call.
- Right-click on one of the three calls to
_objc_storeStrongand selectEdit Function Properties…. - Select "Inline during analysis" and Apply.
- Try overriding the call type again.
Expected Behavior: It should only override the types for the specific call, but instead affects all three calls within the function.
Additional Information:
The steps to reproduce involve the UI, but I first saw this in a workflow that programmatically applies call type adjustments. It worked fine on normal Mach-O binaries, but triggered this behavior when run on a dyld shared cache. This happened because the shared cache's workflow calls SetAutoInlinedDuringAnalysis on stub functions when the library containing the destination of the stub call is loaded.
Can confirm this is quite a problematic limitation of call type adjustments. I have a bug in a plugin which was doing call type adjustments and I didn't realise until now that the issue is due to the limitations of the API plus the fact that the DSC plugin makes heavy use of function inlining due to stubs. Essentially I was adjusting the call type of a call in a function that was inlined in another function multiple times. Each call site had different types but because they all have the same address due to the inlining I end up observing all instances having the same call type, which was incorrect in this case.
I have a kind of gross workaround for this, that also helps with #6763, where I use a function workflow to detect stub inlining (specifically just DSC stub functions) and update the inlined instructions to have the same address as the original call to the inlined function. Given stubs, from what I've seen, after often just a couple of lines of disassembly that jump to a specific function, its not really problematic to do this.
Although I think in the long term call type adjustments should apply to IL expressions not addresses, I wonder if there's a quick fix for this, and a general fix for #6763, where there's an option to inline functions but use the address of the call instruction for all inlined instructions, rather than their original addresses. Basically what I'm doing with a function workflow but as a native feature. I think it makes sense for stub functions.
The tracking on (arch, addr) tuples is fundamental. Instruction indices are not stable and cannot be used to ground these. One possible option, in the event that the stub only has one call that could be overridden would be store it against the address of the inlined call -- I've tested something very similar to this approach for resolving jumptable stubs via inlining with some success.
In the event that a function needed to be inlined, and the call site that needed to be overridden was ambiguous, we'd essentially have to move away from (arch, addr) to probably an entire tree of (arch, addr) tuples to describe the path of inlining all the way down to the call site. I'm not sure if that's the answer either, but it can not be instruction indices -- instruction indices can change from all sorts of markup like type information, contents of structs (changing load/store splitting behavior), temps that become unnecessary (or necessary) due to jumptables being solved... it's unsafe, in general, to persist them or use them as anchor points for any reason.
Yeah I realised instruction indices can't work, but I've found my solution I mentioned above to work effectively and I'd expect it to be relatively easy work for it to be implemented. If the API provided a way to mark a function as inlined but also to use the address of the call site where it is being inlined, for all of its instructions, it would present an easy solution. Given the use case of inlining that is problematic is DSC stub function inlining, these functions tend to only have 3-5 instructions. Addresses in HLIL are already a bit disingenuous so I think the trade off is acceptable. Also as I mentioned it fixes #6763 which I've found improves my reversing experience with HLIL and diassembly side by side.
Yeah, that has pretty much the same effect as what I experimented with for the UnresolvedIndirectBranch case, just with a very minor UI artifact. Since that case is also really only used for very short stubs, I was able to get away with scoping it to only fire when there was in fact only one possible location with an indirect branch, and redirect accesses for the branch in question to the address of the call.
My reason for having not pushed that was that it obviously falls over if somebody tries to inline a more substantial function that is called multiple times in a caller, but since this represents the second such case of having a need for this functionality due to stubs I should probably just resurrect that branch, add support for call type adjusts along the same lines, and accept the fact that it really only works for stubs -- this problem has yet to rear its head beyond that.
The inline tree path thing seems necessary, for this and other combinations of il customization with types of inlining. But also in some cases you really do want to apply your customization everywhere the code is inlined, so both are needed somehow...