
RFC: Moving code to top-level package

Open pfalcon opened this issue 8 years ago • 12 comments

The root dir of the project is quite crowded already. Add to that thoughts on how to allow reuse of ScratchABlock in other projects, and it calls for moving all the modules (as opposed to executable scripts) into a package namespace.

This ticket is to consider pros and cons and possible timeline.

pfalcon avatar Jan 05 '18 16:01 pfalcon

@maximumspatium : Would appreciate your opinion here, as well as #17. (I actually hope to dump more of my thoughts on this too.)

pfalcon avatar Jan 07 '18 09:01 pfalcon

I propose to just do that now. Because it is a mess lasting for more than 2 years without any action. No matter how good the splitting will be now, just do that. Any improvements can be done later. The recommended packaging lib is setuptools. The recommended metadata format is setup.cfg.

KOLANICH avatar Jan 08 '20 21:01 KOLANICH

I propose to just do that now.

Oh, the original version, with NOW in capitals was more convincing ;-). So, like right now? But why?

Because it is a mess

Don't hesitate to elaborate on what/where is the mess, and how doing it otherwise would improve specific things.

pfalcon avatar Jan 08 '20 23:01 pfalcon

Don't hesitate to elaborate on what/where is the mess, and how doing it otherwise would improve specific things.

The current state is a mess. There is no package. There is no command. And there are a lot of files in a single dir with prefixes in their names (I propose to drop the prefixes and use a subdir). Every piece of software should be an installable package I can run from any location. An installable package is enough to run with python3 -m if a __main__.py file is created. But when I use a console_scripts entry point, pip also creates scripts or executables suitable for the platform. It doesn't mean I cannot edit the package: I can install using pip install --upgrade -e . and pip will use the implementation from the dir.
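To make the python3 -m part concrete, here is a minimal sketch, not taken from ScratchABlock. The package name demo_pkg, the cli module and its main() function are hypothetical, made up for illustration. The script builds a throwaway package with a __main__.py in a temporary directory and runs it as a module; the comment shows how the same main() could be exposed through a console_scripts entry point in setup.cfg.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a tiny package layout: demo_pkg/{__init__,cli,__main__}.py."""
    pkg = root / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    # The real logic lives in an importable function, so that both
    # `python3 -m demo_pkg` and a console script can call it.
    (pkg / "cli.py").write_text(
        "def main():\n"
        "    print('hello from demo_pkg')\n"
    )
    # __main__.py is what `python3 -m demo_pkg` executes.
    (pkg / "__main__.py").write_text(
        "from .cli import main\n"
        "main()\n"
    )
    # With setuptools, the same function could be exposed as a command
    # through setup.cfg (the command name `demo` is an assumption):
    #
    #   [options.entry_points]
    #   console_scripts =
    #       demo = demo_pkg.cli:main


def run_as_module(root: Path) -> str:
    """Run the package with `python -m` and return what it printed."""
    env = dict(os.environ, PYTHONPATH=str(root))
    result = subprocess.run(
        [sys.executable, "-m", "demo_pkg"],
        cwd=root, env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_package(root)
    output = run_as_module(root)
    print(output)
```

Here the package is found through PYTHONPATH; after pip install -e . it would be importable from any working directory without that variable.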

KOLANICH avatar Jan 09 '20 00:01 KOLANICH

I see, it's a mess because it's a mess. Let me thank you for the feedback and speculate that your time is better spent crusading against webgpu. Show them Kuzka's mother!

pfalcon avatar Jan 09 '20 07:01 pfalcon

Should I expect you doing that yourself in near future? Or should I send a PR? Please note that PRs have a drawback - negotiating changes in them is damn slow, so it's better if you do the needed changes yourself.

KOLANICH avatar Jan 09 '20 07:01 KOLANICH

Please note that PRs have a drawback - negotiating changes in them is damn slow,

Ack. So, thanks for starting the discussion first, it allows short-circuiting unneeded work.

so it's better if you do the needed changes yourself.

Also ack. (You've generally come to a place where these points are known/understood, no worries). Indeed, changes of such nature better be done by maintainers.

Should I expect you doing that yourself in near future?

There're no plans to publish a PyPI package for this project in the foreseeable future, the project just isn't at that stage yet. You install it from PyPI, what happens next? Nothing. You really need to start from git clone to get an idea what it is at all, and what it can do.

Structuring the codebase into Python-level packages (note: a different meaning of the word "package") is something I've almost decided on, and what may happen during this year. (So far, I'm working on other projects.)

So, if you're interested in ScratchABlock, feel free to tell how you used it, or how you plan to use it, and then whether it worked out. "Organizational" topics, like PyPI packages, are largely out of scope for now.

pfalcon avatar Jan 09 '20 08:01 pfalcon

There're no plans to publish a PyPI package

it is not about pypi. It is about installation into the system.

So, if you're interested in ScratchABlock, feel free to tell how you used

I haven't used it yet in fact. The use case is CFG analysis.

what may happen during this year.

It is a matter of a few minutes. So it should be done now without waiting for the infinitely distant future.

If you need a boilerplate, you can use https://github.com/KOLANICH/RichConsole.py as one.

KOLANICH avatar Jan 10 '20 00:01 KOLANICH

it is not about pypi. It is about installation into the system.

Oh, that's simple - there's no need to install anything. Just download it (git clone in this case), unpack (no need in this case) and it just works.

I haven't used it yet in fact.

Oh, and you can't use it until you make the changes you lobby for, right? ;-) Oh, so familiar! I feel about the same, and at about the same time, about somebody else's project ;-). I want to hack SSA and register allocation there. But I can't, just can't until they get their command-line options right :-D. The difference? That project already has a package on PyPI, and already has pretty detailed usage docs. So, I'm just trying to help with "last mile"-style problems. This project is at best a research prototype done in a practical way, and so far, intended to be treated as such.

The use case is CFG analysis.

Given that you didn't try to do it yet, and "CFG analysis" sounds rather generic, feel free to elaborate on your usecase, maybe I will be able to tell whether ScratchABlock is a good match for what you have in mind.

It is a matter of a few minutes. So it should be done now without waiting for the infinitely distant future.

It's a matter of careful planning for months.

And ok, you repeated what you want me to do for you enough times, so I guess this buys me the moral right to say what I'd like you to do, given your desire to jump on arbitrary projects and "help" them ;-). So, love Python package matters? This project https://github.com/pfalcon/pycopy-lib currently publishes source-like packages to PyPI (hundreds of them, you'll love that ;-) ). We need a master plan for how to publish "binary", wheel-like packages for the multiple sub-architectures the parent project (Pycopy) supports. Again, what's needed isn't a patch actually, but a well-thought-out RFC with examples.

Oh, and that other project I mentioned above, it's github.com/windelbouwman/ppci-mirror/ . Have a look, maybe you'll like it too ;-). It can do some CFG analyses too.

pfalcon avatar Jan 10 '20 07:01 pfalcon

Oh, that's simple - there's no need to install anything. Just download it (git clone in this case), unpack (no need in this case) and it just works.

It has a drawback that I have to be tied to the dir where it is situated.

Given that you didn't try to do it yet, and "CFG analysis" sounds rather generic, feel free to elaborate on your usecase, maybe I will be able to tell whether ScratchABlock is a good match for what you have in mind.

I wanna write a kind of Visual Basic 6 decompiler, for programs compiled into native code (they are still a kind of VM). For p-code there are FOSS decompilers, but VB6 by default compiles into native code, so the majority of software written in VB6 is compiled this way. And it should clearly be possible to create such a decompiler, since there are a few commercial ones targeting VB6. All the stuff I currently see on the net is not free software. Though I am not yet sure if I should use ScratchABit/Block or angr or something else.

https://github.com/pfalcon/pycopy-lib

Looks not fine. IMHO it is a wrong approach to create from scratch, if the upstream is monolithic and is likely to be monolithic. IMHO a better way should be to semi-heuristically rip the Python part of the CPython GitHub repo into packages.

https://github.com/windelbouwman/ppci-mirror/

Not sure how it can be useful for me. I need a decompiler, not a compiler.

KOLANICH avatar Jan 10 '20 23:01 KOLANICH

It has a drawback that I have to be tied to the dir where it is situated.

Trust me, that's not the biggest drawback you can face with your task ;-).

I wanna write a kind of Visual Basic 6 decompiler. ... Though I am not yet sure if I should use ScratchABit/Block or angr or something else.

ScratchABlock isn't going to realistically help you with such a task. While theoretically the aim is to decompile almost anything into almost anything, practically it's a collection of routines related to program analysis and transformation. You can't use it to decompile things. You can use it to research decompilation. angr isn't even a (static) decompiler per se, but more like a dynamic binary analysis framework. YMMV in trying it for your usecase (if you do and achieve something, let me know). If you want a "state-of-the-art" open-source decompiler with commercial backing (i.e. something which does some s%^t, not just a research project), it's https://github.com/avast/retdec . Just today in my notifications I got a line mentioning some kind of VB support: https://github.com/avast/retdec/issues/689. I don't deal with it, because I consider C++ to be a waste of life. I'm subscribed to it exactly to watch how many people would contribute something related to decompilation to it. In the 2+ years since it was open-sourced, I saw only one clear case of one guy contributing a series of such patches for a week, before giving up. The rest is mostly the "libfoo doesn't build on macos" kind of thing. Good luck with the project.

https://github.com/pfalcon/pycopy-lib Looks not fine. IMHO it is a wrong approach to create from scratch, if the upstream is monolithic and is likely to be monolithic. IMHO a better way should be to semi-heuristically rip the Python part of the CPython GitHub repo into packages.

That's exactly what's being done there - something written from scratch, something ported from CPython stdlib. Those are details, the aim is to have a minimalist Python implementation, for all those who failed to acquire Stockholm syndrome for Lua. Be my guest when you find yourself affected.

https://github.com/windelbouwman/ppci-mirror/ Not sure how it can be useful for me. I need a decompiler, not a compiler.

Both compilers and decompilers are program analysis frameworks, which reuse 80+% of algorithms. If you don't trust me, ask RetDec - it's built on top of LLVM, which, as we know, is commonly used for compilation. Don't get me wrong though - PPCI is very far from being able to do decompilation any time soon. But it needs hands of every Python hacker out there ;-).
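As one illustration of that overlap, dominator computation over a control-flow graph is needed both when compiling (e.g. for SSA construction) and when decompiling (e.g. for recovering loops and conditionals). Below is a minimal sketch of the classic iterative algorithm. It is not code from ScratchABlock, PPCI or RetDec; the function name, the dict-of-successors graph format and the sample graph are all assumptions made for the example.

```python
def compute_dominators(cfg, entry):
    """Return {node: set of nodes dominating it} for a CFG.

    `cfg` maps each node to a list of its successors; every node must
    appear as a key. Uses the simple iterative dataflow formulation:
    dom(n) = {n} | intersection of dom(p) over all predecessors p.
    """
    nodes = set(cfg)

    # Build the predecessor map from the successor lists.
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}

    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            if preds[n]:
                new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            else:
                # Unreachable node: dominated only by itself.
                new = {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# A diamond: entry -> a -> (b | c) -> d
cfg = {
    "entry": ["a"],
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}
dom = compute_dominators(cfg, "entry")
print(sorted(dom["d"]))  # ['a', 'd', 'entry']
```

Neither b nor c dominates d, because each can be bypassed through the other branch; this is the kind of fact a decompiler would use to recognize an if/else, and a compiler to place phi functions.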

P.S. Oh, and now that you look at the scene, don't pass by one of the latest hits - https://github.com/NationalSecurityAgency/ghidra . Wanna know which tools NSA uses to own you? Now you can!

pfalcon avatar Jan 11 '20 01:01 pfalcon

it's https://github.com/avast/retdec

In fact I tried it first, and it just ate all the RAM available on my machine. In fact it goes OOM even on small binaries (a few MiBs) that IDA easily loads and decompiles on machines with a lot less RAM.

I have seen some recently added VB-specific code parsing some VB structures in it ... but the info parsed seems to be unused. I also haven't seen any VB-specific cfg analysis there.

Haven't tried ghidra yet though.

Thanks.

KOLANICH avatar Jan 11 '20 09:01 KOLANICH