Feature: Async IO?
It doesn't look to me like there is any support in Pwntools for asynchronous I/O (for example, starting 100 remote processes and dealing with them at the same time, rather than one after the other). Since there is also no Python 3 support, the only way I can see this being done is through Python modules like threading or multiprocessing. Just wondering: has this ever been considered for development, or is there an easy way this can be achieved?
No, there's no asynchronous I/O support (i.e. you cannot set a callback for when data arrives).
I expect that it could be added relatively easily with threading, with some simple logic that polls can_recv and fires off a callback.
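To illustrate the polling idea, here is a minimal sketch using only the standard library. The `watch` helper is hypothetical, and `select()` on a plain socket stands in for pwntools' `can_recv`; against a real tube you would poll `tube.can_recv(timeout=...)` instead.

```python
import select
import socket
import threading

def watch(sock, callback):
    """Fire callback(data) from a background thread whenever data arrives."""
    def loop():
        while True:
            # Roughly equivalent to polling can_recv(timeout=0.1)
            ready, _, _ = select.select([sock], [], [], 0.1)
            if not ready:
                continue
            data = sock.recv(4096)
            if not data:   # peer closed the connection
                break
            callback(data)
    t = threading.Thread(target=loop)
    t.daemon = True
    t.start()
    return t
```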
Probably the most efficient way to do this is to define a new class that inherits from pwnlib.tubes.tube.tube, proxies all xxx_raw calls to some other tube object, and supports callback registration as shown below.
import threading

from pwnlib.tubes.tube import tube

class async_tube(tube):
    def __init__(self, child, callback):
        super(async_tube, self).__init__()
        self.child = child
        self.callback = callback
        # Invoke the callback whenever data is available
        self.thread = threading.Thread(target=self.threadfunc)
        self.thread.daemon = True
        self.thread.start()
    def threadfunc(self):
        # Keep delivering data until the callback returns a falsy value
        while self.callback(self, self.recv()):
            pass
    # Proxy all xxxx_raw routines to the child object
    def recv_raw(self, *a, **kw):
        return self.child.recv_raw(*a, **kw)
    def send_raw(self, *a, **kw):
        return self.child.send_raw(*a, **kw)
    # ... and likewise for the remaining *_raw methods
You'd then use it something like:
def my_callback_func(tube, data):
    print('received', repr(data))

r = remote(host, port)
r = async_tube(r, my_callback_func)
r.wait_for_close()
Thanks, I'll look into that. The easiest way I've found so far is to just create a process pool through multiprocessing (or thread pool) and then use the Pool.map() to get an array of results for a particular call. This may not be the most efficient, but I think for most purposes it is good enough.
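For example, a sketch of that pool approach. A local echo server stands in for the real remote here so the snippet is self-contained; the `interact` helper, host, and port are illustrative, not a pwntools API.

```python
import socket
import threading
from multiprocessing.pool import ThreadPool

# Stand-in for the remote service: a local TCP echo server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))
server.listen(100)
port = server.getsockname()[1]

def serve():
    while True:
        conn, _ = server.accept()
        conn.sendall(conn.recv(1024))
        conn.close()

t = threading.Thread(target=serve)
t.daemon = True
t.start()

def interact(n):
    # With pwntools this body would instead be something like:
    #   r = remote(host, port); r.send(...); data = r.recv(); r.close()
    s = socket.create_connection(('127.0.0.1', port))
    s.sendall(b'task %d' % n)
    data = s.recv(1024)
    s.close()
    return data

# Pool.map fans the work out across threads and returns results in order.
pool = ThreadPool(processes=10)
results = pool.map(interact, range(10))
```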
See this blog post.
The biggest restriction with using multiprocessing is that each tube's state will only be valid in the sub-process in which it's running -- and even then only if the tube isn't touched from the main process.
Does this apply even if a ThreadPool is used rather than a Pool of processes? (e.g. from multiprocessing.pool import ThreadPool)
It all depends on how the objects are instantiated. It's outside the scope of this issue to go into Python's limitations when threading or multiprocessing.
Ultimately, threading behaves the way most people expect, and multiprocessing has object state issues for objects passed from the parent process into the child process.
Anything Pwntools supports officially (i.e. if you intend to submit a pull request) should target the threading API first, and possibly support multiprocessing.dummy as a non-default option for performance (at the cost of many gotchas).
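As an aside, multiprocessing.dummy exposes the same Pool API as multiprocessing but is backed by threads rather than processes, so code written against it can be switched between the two. A trivial sketch:

```python
# Thread-backed pool with the multiprocessing.Pool interface
from multiprocessing.dummy import Pool

def work(x):
    return x * x

pool = Pool(4)
print(pool.map(work, range(5)))  # -> [0, 1, 4, 9, 16]
pool.close()
pool.join()
```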
All of that said, this hasn't come up before, so perhaps you only need to hack a one-time solution rather than contribute something back to the core codebase. What are you actually trying to do / solve?
Yes, okay -- if other people would find value in this then I could look at contributing something, but otherwise I'll stick to my temporary solution, which, while not the most efficient, is 'good enough' and easy to implement.