
Support Python 3.11 decompilation

Open zrax opened this issue 1 year ago • 25 comments

Tasks

  • [X] Bytecode support (pycdas)
  • [ ] Handle new opcodes in AST builder
    • [X] CACHE
    • [X] PUSH_NULL
    • [ ] PUSH_EXC_INFO
    • [ ] CHECK_EXC_MATCH
    • [ ] CHECK_EG_MATCH
    • [ ] BEFORE_WITH
    • [ ] RETURN_GENERATOR
    • [ ] ASYNC_GEN_WRAP
    • [ ] PREP_RERAISE_STAR
    • [x] SWAP
    • [x] POP_JUMP_FORWARD_IF_FALSE
    • [x] POP_JUMP_FORWARD_IF_TRUE
    • [ ] COPY
    • [X] BINARY_OP
    • [ ] SEND
    • [ ] POP_JUMP_FORWARD_IF_NOT_NONE
    • [ ] POP_JUMP_FORWARD_IF_NONE
    • [ ] GET_AWAITABLE
    • [X] JUMP_BACKWARD_NO_INTERRUPT
    • [ ] MAKE_CELL
    • [X] JUMP_BACKWARD
    • [ ] COPY_FREE_VARS
    • [X] RESUME
    • [X] PRECALL
    • [X] CALL
    • [X] KW_NAMES
    • [ ] POP_JUMP_BACKWARD_IF_NOT_NONE
    • [ ] POP_JUMP_BACKWARD_IF_NONE
    • [ ] POP_JUMP_BACKWARD_IF_FALSE
    • [ ] POP_JUMP_BACKWARD_IF_TRUE
  • [ ] Handle new exception table in PyCode

zrax avatar Feb 21 '24 22:02 zrax

POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE: I don't see a case in ASTree.cpp for these opcodes, so I don't think they're supported yet.

TiZCrocodile avatar Feb 27 '24 16:02 TiZCrocodile

You're right, not sure why I marked those... Fixed now, thanks

zrax avatar Feb 27 '24 16:02 zrax

The opcodes POP_JUMP_FORWARD_IF_FALSE and POP_JUMP_FORWARD_IF_TRUE are supported. EDIT: POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE are not supported, just the forward variants.

But anyway, I wanted to ask how to work on this project: the .gitignore file doesn't ignore Visual Studio files, so I see a lot of changes in GitHub Desktop and it's confusing. Also, is there a way to communicate with you about things in this project? For example, if I want to change all the stack.top(); stack.pop(); lines to just call a function, are you going to approve this? Things like that, so I know what you want or not, because I love this project. Thank you very much :)

TiZCrocodile avatar Feb 27 '24 17:02 TiZCrocodile

The opcodes POP_JUMP_FORWARD_IF_FALSE and POP_JUMP_FORWARD_IF_TRUE are supported. EDIT: POP_JUMP_BACKWARD_IF_FALSE and POP_JUMP_BACKWARD_IF_TRUE are not supported, just the forward variants.

Thanks, fixed (again)

how to work on this project: the .gitignore file doesn't ignore Visual Studio files, so I see a lot of changes in GitHub Desktop and it's confusing.

Personally, I do most of my development on it (in the rare opportunities that I have time to do so) from Linux or MSYS2, since that's where the test suite stuff works. However, more generally, when I'm using MSVC with CMake projects, I'll confine the build to a single directory that I can ignore entirely with a line in my local .git/info/exclude. Since there are many IDEs that all have their own mess of files, I don't generally bother putting those in .gitignore. That's just my personal preference though, I'm not opposed to someone else adding MSVC files to .gitignore if they know what to add.

For example, if I want to change all the stack.top(); stack.pop(); lines to just call a function, are you going to approve this? Things like that, so I know what you want or not, because I love this project. Thank you very much :)

The short history there is just that the stack was originally a std::stack<...>, so I kept the API compatible with STL for simplicity. Changing it to use a more ergonomic API just hasn't been a priority so far.

zrax avatar Feb 28 '24 00:02 zrax
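
The local-exclude approach described above can be sketched as a small script. This is only an illustration: the directory name build-msvc/ is made up, and add_local_exclude is a hypothetical helper, not part of pycdc or git. It appends a pattern to .git/info/exclude, which git reads like .gitignore but never commits.

```python
from pathlib import Path


def add_local_exclude(repo_root: str, pattern: str) -> bool:
    """Append `pattern` to <repo_root>/.git/info/exclude unless already listed.

    Returns True if the file was changed. Hypothetical helper for illustration.
    """
    exclude = Path(repo_root) / ".git" / "info" / "exclude"
    # Only create the info/ directory; a missing .git means this is not a repo.
    exclude.parent.mkdir(exist_ok=True)
    text = exclude.read_text() if exclude.exists() else ""
    if pattern in text.splitlines():
        return False
    # Start on a fresh line in case the file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with exclude.open("a") as fh:
        fh.write(prefix + pattern + "\n")
    return True


if __name__ == "__main__":
    # Assumes the script is run from the repository root and that the
    # MSVC/CMake build is confined to build-msvc/ (an assumed name).
    add_local_exclude(".", "build-msvc/")
```

Because the pattern lives in the local clone only, each contributor can ignore their own IDE's files without touching the shared .gitignore.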

Ohh, the .git/info/exclude plus confining the build to one dir was the trick I guess, thank you very much. About changing the stack.top / pop calls: I ask because there is a function named StackPopTop at the start of ASTree.cpp that just pops and returns the value, but it is never used. Anyway, how can I communicate with you on GitHub? Is there a chat or something, instead of chatting in issues?

TiZCrocodile avatar Feb 28 '24 00:02 TiZCrocodile

Please support: JUMP_BACKWARD

sffool avatar Mar 13 '24 14:03 sffool

@sffool could you attach a sample .pyc with that opcode?

greenozon avatar Mar 13 '24 14:03 greenozon

Tomorrow I will send a .pyc file that I want to decompile in an email.

Thank you! My hero.

sffool avatar Mar 13 '24 14:03 sffool

Please stop recommending people attach .pyc files... They are not useful. A much more useful recommendation is to provide a (small) bit of python source that compiles to the opcodes in question, so it can be used as a test case.

zrax avatar Mar 13 '24 15:03 zrax
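
One way to check that a small piece of source really compiles to the opcode in question, before posting it as a test case, is sketched below. The helper names opnames and emits are made up for this example, and the result depends on the interpreter that runs the script, so it should be run under Python 3.11 when targeting this issue.

```python
import dis
import types


def opnames(code: types.CodeType) -> set:
    """Collect opcode names from a code object and all nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


def emits(source: str, opcode: str) -> bool:
    """Return True if compiling `source` on this interpreter yields `opcode`."""
    return opcode in opnames(compile(source, "<test case>", "exec"))


SAMPLE = """
def loop(items):
    for item in items:
        pass
"""

# On CPython 3.11 a `for` loop is expected to close with JUMP_BACKWARD;
# 3.10 and earlier use JUMP_ABSOLUTE, so this prints False there.
print("JUMP_BACKWARD:", emits(SAMPLE, "JUMP_BACKWARD"))
```

A snippet that passes this check is small enough to paste into the issue and can be compiled by anyone with the matching Python version.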

I'm sorry, I don't know how to write Python.

sffool avatar Mar 13 '24 15:03 sffool

@zrax OK, but with a .pyc in hand you can usually a) have a sample to play with and b) create a test case in Python (after reading and understanding the provided .pyc), and thus close the gap.

greenozon avatar Mar 13 '24 15:03 greenozon

I have been exploring the opcodes and what generates them. I know it isn't much, but writing this gave me a list of the new opcodes along with source that triggers each one. Hope it helps.

import dis
import asyncio

class ContextManager:
    def __enter__(self):
        print("Entering context manager")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exiting context manager")

def BEFORE_WITH():
    with ContextManager():
        print("Inside context manager")

def PUSH_EXC_INFO():
    try:
        raise Exception("This is an exception")
    except Exception as e:
        raise e

def CHECK_EXC_MATCH():
    try:
        raise ValueError("This is a ValueError")
    except (ValueError, TypeError) as e:
        print(f"Caught exception: {e}")

async def async_generator():
    await asyncio.sleep(0)
    yield 1
    await asyncio.sleep(0)
    yield 2

async def use_async_generator():
    async_gen = async_generator()
    try:
        value = await async_gen.asend(None)
        print(value)  
        value = await async_gen.asend(None)
        print(value)  
    finally:
        await async_gen.aclose()


def PREP_RERAISE_STAR():  # also CHECK_EG_MATCH; both need the 3.11 `except*` syntax
    try:
        raise ExceptionGroup("group", [ValueError("This is a ValueError")])
    except* ValueError as eg:
        print(f"Caught exception group: {eg!r}")

def SWAP():
    my_array = [1, 2, 3, 4, 5]
    i = 1
    j = 3
    my_array[i], my_array[j] = my_array[j], my_array[i]

def COPY():
    a = 10
    b = 10
    c = 10
    return a == b == c

def subgenerator():
    yield 1
    yield 2
    yield 3

def SEND():
    yield from subgenerator()

def POP_JUMP_FORWARD_IF_NOT_NONE(item): # AND POP_JUMP_FORWARD_IF_NONE
    if item is None:  
        print('case 1')
    else:
        print('case 2')

    while item is None:
        pass
    
    
async def async_function():
    await asyncio.sleep(1)
    return 'Hello, World!'

async def GET_AWAITABLE():
    result = await async_function()
    print(result)

asyncio.run(GET_AWAITABLE())


def generator():
    yield 1
    yield 2
    yield 3

def JUMP_BACKWARD_NO_INTERRUPT():
    yield from generator()

gen = JUMP_BACKWARD_NO_INTERRUPT()
print(next(gen))
print(next(gen))
print(next(gen))

def JUMP_BACKWARD():
    iterable = [1, 2, 3]
    for item in iterable:
        pass

    while True:
        break
    
def COPY_FREE_VARS():
    a = 10
    def inner_function():
        nonlocal a
        print(a)
    inner_function()

def POP_JUMP_BACKWARD_IF_NOT_NONE():
    item = None
    while item is not None:
        pass

def POP_JUMP_BACKWARD_IF_NONE():
    item = None
    while item is None:
        pass

def POP_JUMP_BACKWARD_IF_FALSE():
    cond = False
    while not cond:
        pass

def POP_JUMP_BACKWARD_IF_TRUE():
    cond = True
    while cond:
        cond

if __name__ == "__main__":
    # print("BEFORE_WITH")
    # dis.dis(BEFORE_WITH) # BEFORE_WITH
    # print("PUSH_EXC_INFO")
    # dis.dis(PUSH_EXC_INFO) # PUSH_EXC_INFO
    # print("CHECK_EXC_MATCH")
    # dis.dis(CHECK_EXC_MATCH) # CHECK_EXC_MATCH
    # print("RETURN_GENERATOR and ASYNC_GEN_WRAP")
    # dis.dis(async_generator) # RETURN_GENERATOR, plus ASYNC_GEN_WRAP before each yield
    # print("PREP_RERAISE_STAR")
    # dis.dis(PREP_RERAISE_STAR) # pass the function itself, not its return value
    # print("SWAP")
    # dis.dis(SWAP)
    #print("COPY")
    #dis.dis(COPY)
    #print("SEND")
    #dis.dis(SEND)
    #print("POP_JUMP_FORWARD_IF_NOT_NONE AND POP_JUMP_FORWARD_IF_NONE")
    #dis.dis(POP_JUMP_FORWARD_IF_NOT_NONE) # AND POP_JUMP_FORWARD_IF_NONE
    #print("GET_AWAITABLE")
    #dis.dis(GET_AWAITABLE) # GET_AWAITABLE
    #print("JUMP_BACKWARD_NO_INTERRUPT")
    #dis.dis(JUMP_BACKWARD_NO_INTERRUPT) # JUMP_BACKWARD_NO_INTERRUPT
    #print("JUMP_BACKWARD")
    #dis.dis(JUMP_BACKWARD) # JUMP_BACKWARD
    #print("COPY_FREE_VARS")
    #dis.dis(COPY_FREE_VARS) # COPY_FREE_VARS
    #print("POP_JUMP_BACKWARD_IF_NOT_NONE")
    #dis.dis(POP_JUMP_BACKWARD_IF_NOT_NONE) # POP_JUMP_BACKWARD_IF_NOT_NONE
    #print("POP_JUMP_BACKWARD_IF_NONE")
    #dis.dis(POP_JUMP_BACKWARD_IF_NONE) # POP_JUMP_BACKWARD_IF_NONE
    #print("POP_JUMP_BACKWARD_IF_FALSE")
    #dis.dis(POP_JUMP_BACKWARD_IF_FALSE) # POP_JUMP_BACKWARD_IF_FALSE
    #print("POP_JUMP_BACKWARD_IF_TRUE")
    #dis.dis(POP_JUMP_BACKWARD_IF_TRUE) # POP_JUMP_BACKWARD_IF_TRUE

kibernautas avatar Mar 15 '24 13:03 kibernautas

Please support: JUMP_BACKWARD

possible solution https://github.com/zrax/pycdc/pull/472

greenozon avatar Mar 16 '24 19:03 greenozon

Unsupported opcode: COPY

Please add support for it. I can provide .pyc file if you want.

hern0s-dev avatar Mar 27 '24 23:03 hern0s-dev

Unsupported opcode: CALL_FUNCTION_EX https://docs.python.org/3/library/dis.html#opcode-CALL_FUNCTION_EX Added in version 3.11. Please support: CALL_FUNCTION_EX

scylamb avatar May 11 '24 14:05 scylamb

Unsupported opcode: CALL_FUNCTION_EX https://docs.python.org/3/library/dis.html#opcode-CALL_FUNCTION_EX Added in version 3.11. Please support: CALL_FUNCTION_EX

Also facing this error! Please support: CALL_FUNCTION_EX

uniplate avatar May 19 '24 17:05 uniplate
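
For reference, a minimal piece of source that is expected to compile to CALL_FUNCTION_EX is a call that unpacks its arguments with * and **. The function name forward below is made up for this sketch. Note that the dis documentation lists the opcode as added in 3.6, so it is not specific to 3.11.

```python
import dis


def forward(func, *args, **kwargs):
    # Unpacking with * and ** at the call site is what makes CPython
    # emit CALL_FUNCTION_EX instead of a plain CALL.
    return func(*args, **kwargs)


names = [ins.opname for ins in dis.get_instructions(forward)]
print("CALL_FUNCTION_EX" in names)
```

Source like this, rather than a .pyc, is the kind of test case asked for earlier in the thread.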

I found an amazing method to deal with the unsupported opcodes: I get the disassembly with pycdas, then give it to gpt-4o and let it reverse that to Python code. It did it!

hanfangyuan4396 avatar May 25 '24 04:05 hanfangyuan4396

@zrax Do you accept maybe-source and bytecode for missing features?

Maybe in private?

stdedos avatar Jun 26 '24 20:06 stdedos

could you help support MAKE_CELL

Sun92Go avatar Jul 24 '24 13:07 Sun92Go

please add support for MAKE_CELL

NyanAlex avatar Aug 03 '24 15:08 NyanAlex

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

greenozon avatar Aug 03 '24 17:08 greenozon

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

Idk what you define as "basic". Does a 105-line script (with one non-stdlib dependency) count as "basic"?

stdedos avatar Aug 04 '24 06:08 stdedos

Yeah, something like that (my sample .pyc files produce almost 1M of pycdas output...)

greenozon avatar Aug 04 '24 06:08 greenozon

do you have some basic .py and .pyc that uses the MAKE_CELL opcode?

Sorry for not answering for a long time!

import dis

def make_closure():
    x = 10
    def inner():
        return x
    return inner

def main():
    closure = make_closure()
    
    print("Result from closure:", closure())
    
    print("\nDisassembly of the closure function:")
    dis.dis(closure)

if __name__ == "__main__":
    main()

NyanAlex avatar Aug 29 '24 20:08 NyanAlex

I see no CELL opcodes:

python cell.py
Result from closure: 10

Disassembly of the closure function:
              0 COPY_FREE_VARS           1

  5           2 RESUME                   0

  6           4 LOAD_DEREF               0 (x)
              6 RETURN_VALUE

greenozon avatar Aug 30 '24 07:08 greenozon
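
A likely reason the listing above shows no MAKE_CELL: dis.dis(closure) disassembles only the inner function, which receives the captured variable through COPY_FREE_VARS. On 3.11, MAKE_CELL is emitted in the enclosing function that owns the variable. The sketch below checks both code objects; top_level_opnames is a made-up helper name.

```python
import dis


def make_closure():
    x = 10  # x is captured by inner(), so it becomes a cell variable
    def inner():
        return x
    return inner


def top_level_opnames(func):
    """Opcode names of func's own code object, without nested functions."""
    return [ins.opname for ins in dis.get_instructions(func)]


outer_ops = top_level_opnames(make_closure)
inner_ops = top_level_opnames(make_closure())

print("MAKE_CELL in make_closure:", "MAKE_CELL" in outer_ops)
print("MAKE_CELL in inner:       ", "MAKE_CELL" in inner_ops)
```

So the earlier script already contains a usable test case; calling dis.dis(make_closure) instead of dis.dis(closure) should show the opcode.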

Sir @zrax, when I try to decompile a .pyc script I get an error like std::bad_cast or bad magic. How do I solve it? I'm new to this and don't know anything about it.

Layzzz66 avatar Aug 30 '24 12:08 Layzzz66

@Layzzz66 please open up a new issue, attach your pyc - this is the common way...

greenozon avatar Aug 30 '24 12:08 greenozon

Ohh, I'm bad at Python. I asked ChatGPT, but it seems it didn't produce the right script.

NyanAlex avatar Aug 30 '24 18:08 NyanAlex

app.pyc.zip

NyanAlex avatar Aug 30 '24 18:08 NyanAlex