5.0.1 has broken ARM operand information (in Python 2)

Open gerph opened this issue 2 years ago • 10 comments

Summary

I've just had my CI update to the recently released 5.0.1 and many of my tests have failed.

One of the reasons is that the operand structure for LDR in 32-bit ARM no longer returns the same values in 5.0.1. Where the instruction LDR r1, [r2] previously had two operands, it now has only one. Tested on macOS and on Linux.

Additionally, the cs_version() has not changed between 5.0.0 and 5.0.1.

Test code

#!/usr/bin/env python

import sys

from capstone import *
import capstone.arm_const

code = b'\x00\x10\x92\xe5' # LDR r1,[r2]

md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
md.detail = True
md.mnemonic_setup(capstone.arm_const.ARM_INS_SVC, "SWI")
# Turn off APCS register naming
md.syntax = capstone.CS_OPT_SYNTAX_NOREGNAME

optype_names = dict((getattr(capstone.arm_const, optype), optype) for optype in dir(capstone.arm_const) if optype.startswith('ARM_OP_'))

print("cs_version() = %r" % (cs_version(),))

for i in md.disasm(code, 0x1000):
    print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))
    for index, operand in enumerate(i.operands):
        print("  op#%i: type=%i (%s)" % (index, operand.type, optype_names.get(operand.type, 'unknown')))

Test output for 4.0.2

cs_version() = (4, 0, 1024)
0x1000:	ldr	r1, [r2]
  op#0: type=1 (ARM_OP_REG)
  op#1: type=3 (ARM_OP_MEM)

Test output for 5.0.0

cs_version() = (5, 0, 1280)
0x1000:	ldr	r1, [r2]
  op#0: type=1 (ARM_OP_REG)
  op#1: type=3 (ARM_OP_MEM)

Test output for 5.0.1

cs_version() = (5, 0, 1280)
0x1000:	ldr	r1, [r2]
  op#0: type=1 (ARM_OP_REG)

Expected output

The expected output is the same as for 5.0.0: the operands list in the decoded instruction should describe two operands.

Additionally, notice that the test output is showing 5, 0, 1280 as the cs_version() for both 5.0.0 and 5.0.1, which makes it hard for me to recognise and reject the library that isn't behaving correctly.
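
Since cs_version() cannot tell the two releases apart, one possible workaround is to probe the behaviour directly instead of the version number. The following is only a sketch: the helper name arm_mem_operands_ok is hypothetical (it is not part of capstone), and it assumes that a correctly working library decodes LDR r1, [r2] into one register operand followed by one memory operand, as in the 4.0.2 and 5.0.0 output above.

```python
def arm_mem_operands_ok():
    """Probe the installed capstone for the missing-memory-operand bug.

    Returns True if LDR r1, [r2] decodes to [REG, MEM], False if it does
    not, and None if capstone cannot be imported at all.
    (Hypothetical helper, not a capstone API.)
    """
    try:
        from capstone import Cs, CS_ARCH_ARM, CS_MODE_ARM
        from capstone.arm_const import ARM_OP_REG, ARM_OP_MEM
    except ImportError:
        return None

    md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
    md.detail = True  # operand details are only populated in detail mode

    # Same encoding as in the test code above: LDR r1, [r2]
    insn = next(md.disasm(b'\x00\x10\x92\xe5', 0x1000))
    return [op.type for op in insn.operands] == [ARM_OP_REG, ARM_OP_MEM]


if __name__ == "__main__":
    print("ARM memory operands intact: %r" % (arm_mem_operands_ok(),))
```

A test suite could call this once at start-up and skip or fail early when it returns False, rather than relying on the version tuple.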

gerph avatar Aug 23 '23 20:08 gerph

Seems to be a side effect of https://github.com/capstone-engine/capstone/commit/d2a39a2ef15ce4878f1186f711e97bee68845339

When I remove the lines I get the correct result again:

cd /tmp
python3 -m venv venv
source venv/bin/activate
# Get the source archive listed at https://pypi.org/project/capstone/#files
curl -LO https://files.pythonhosted.org/packages/7a/fe/e6cdc4ad6e0d9603fa662d1ccba6301c0cb762a1c90a42c7146a538c24e9/capstone-5.0.1.tar.gz
tar xzf capstone-5.0.1.tar.gz
cd capstone-5.0.1
# edit files (remove the lines in question from the commit linked above)
python3 setup.py install
# Assumes your code is in test-cs.py
python3 test-cs.py 

cs_version() = (5, 0, 1280)
0x1000:	ldr	r1, [r2]
  op#0: type=1 (ARM_OP_REG)
  op#1: type=3 (ARM_OP_MEM)

@gerph Could you try this as well, just so we are sure?

Rot127 avatar Aug 23 '23 23:08 Rot127

Following your instructions, with 'edit files' being a reversion of the change you reference, I now see:

cs_version() = (5, 0, 1280)
0x1000:	ldr	r1, [r2]
  op#0: type=1 (ARM_OP_REG)
  op#1: type=3 (ARM_OP_MEM)

which is correct :-)

So for this ticket, this addresses the issue :-) Thanks!

gerph avatar Aug 24 '23 17:08 gerph

Yes, I also ran into this bug.

pkilller avatar Nov 18 '23 08:11 pkilller

bump, with ldr pc, [ip, #0xee8]! :grimacing:

>>> insn
<CsInsn 0x56572570 [e8febce5]: ldr pc, [ip, #0xee8]!>
>>> insn.mnemonic
'ldr'
>>> insn.op_str
'pc, [ip, #0xee8]!'
>>> insn.operands[1]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
IndexError: list index out of range

amaanq avatar Jan 13 '24 13:01 amaanq

@amaanq Sorry for the late answer. Have you tried to use the latest v5 branch?

Rot127 avatar Feb 16 '24 04:02 Rot127

seems to be good now, thanks for checking in! I tried on both the next and v5 branch w/ the following snippet:

from capstone import *

c = Cs(CS_ARCH_ARM, CS_MODE_ARM)
c.detail = True
insn = next(c.disasm(b"\xe8\xfe\xbc\xe5", 0))  # ldr pc, [ip, #0xee8]!
print(f"{insn.mnemonic}\t{insn.op_str}")
print(insn.reg_name(insn.operands[0].reg))  # pc
print(insn.reg_name(insn.operands[1].reg))  # ip
print(hex(insn.operands[1].mem.disp))  # 0xee8

this can be closed probably

amaanq avatar Feb 16 '24 06:02 amaanq

@gerph or @kabeor Please close it

Rot127 avatar Feb 16 '24 06:02 Rot127

@amaanq next branch should work though:

./cstool -d arm e8febce5
 0  e8 fe bc e5  ldr	pc, [r12, #0xee8]!
	ID: 4 (ldr)
	op_count: 2
		operands[0].type: REG = r15
		operands[0].access: WRITE
		operands[1].type: MEM
			operands[1].mem.base: REG = r12
			operands[1].mem.scale: 0
			operands[1].mem.disp: 0xee8
		operands[1].access: READ
	Write-back: True
	Post index: False
	Registers read: r12
	Registers modified: r12 r15
	Groups: IsARM jump 
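
For anyone checking from Python rather than cstool, a rough equivalent of the operand part of this dump might look like the sketch below. The function name describe_operands is made up for illustration, only register and memory operands are handled, and it returns None when capstone is not importable. Register names may be printed differently from cstool (for example ip instead of r12) depending on the syntax option in use.

```python
def describe_operands(code=b"\xe8\xfe\xbc\xe5", address=0):
    """Return a list of text lines describing the first decoded instruction.

    Hypothetical helper, not a capstone API. The default bytes are the
    instruction from this thread: ldr pc, [ip, #0xee8]!
    """
    try:
        from capstone import Cs, CS_ARCH_ARM, CS_MODE_ARM
        from capstone.arm_const import ARM_OP_REG, ARM_OP_MEM
    except ImportError:
        return None

    md = Cs(CS_ARCH_ARM, CS_MODE_ARM)
    md.detail = True  # required for insn.operands

    insn = next(md.disasm(code, address))
    lines = ["%s\t%s" % (insn.mnemonic, insn.op_str),
             "op_count: %d" % len(insn.operands)]

    for i, op in enumerate(insn.operands):
        if op.type == ARM_OP_REG:
            lines.append("operands[%d].type: REG = %s"
                         % (i, insn.reg_name(op.reg)))
        elif op.type == ARM_OP_MEM:
            lines.append("operands[%d].type: MEM" % i)
            lines.append("  operands[%d].mem.base: REG = %s"
                         % (i, insn.reg_name(op.mem.base)))
            lines.append("  operands[%d].mem.disp: 0x%x" % (i, op.mem.disp))
        else:
            # Other operand types (immediates, etc.) are not expanded here.
            lines.append("operands[%d].type: %d" % (i, op.type))
    return lines


if __name__ == "__main__":
    for line in describe_operands() or ["capstone is not installed"]:
        print(line)
```

On an affected 5.0.1 install the op_count line would be expected to show the missing memory operand, which makes this a quick way to compare builds.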

Rot127 avatar Feb 16 '24 06:02 Rot127

yeah sorry I meant both worked

amaanq avatar Feb 16 '24 06:02 amaanq

> @gerph or @kabeor Please close it

Are you sure you want it closed? The fix isn't released yet, so this issue remains true of the most recent release, and thus is useful for when someone comes looking for a fix. The code has been committed, but not yet released. Happy to close if that's the intent.

gerph avatar Feb 16 '24 10:02 gerph

I added it to the v5.0.2 milestone, so it will be part of the v5.0.2 release.

Rot127 avatar Mar 20 '24 06:03 Rot127