qiling icon indicating copy to clipboard operation
qiling copied to clipboard

epoll support

Open libumem opened this issue 9 months ago • 25 comments

Checklist

Which kind of PR do you create?

  • [ ] This PR only contains minor fixes.
  • [ ] This PR contains major feature update.
  • [x] This PR introduces a new function/api for Qiling Framework.

Coding convention?

  • [x] The new code conforms to Qiling Framework naming convention.
  • [x] The imports are arranged properly.
  • [x] Essential comments are added.
  • [x] The reference of the new code is pointed out.

Extra tests?

  • [ ] No extra tests are needed for this PR.
  • [ ] I have added enough tests for this PR.
  • [x] Tests will be added after some discussion and review.

Changelog?

  • [ ] This PR doesn't need to update Changelog.
  • [x] Changelog will be updated after some proper review.
  • [ ] Changelog has been updated in my PR.

Target branch?

  • [x] The target branch is dev branch.

One last thing


Comments

This PR is intended to provide initial support for the epoll_create, epoll_create, epoll_ctl, and epoll_wait syscalls using the select library; similar to how poll is done already. The binaries for testing have also been PR'd here; but I have since changed the names slightly after checking out the examples/src folder.

libumem avatar Apr 11 '25 03:04 libumem

Hi,

Thanks for the PR and welcome to Qiling Framework.

Please take a look at the error. it seems lik here is some error

xwings avatar Apr 12 '25 11:04 xwings

@xwings

Ok, I disabled the unit tests I had provided, as they won't yet work until my examples are accepted upstream in rootfs; let alone the issues I outlined in my PR description.

libumem avatar Apr 13 '25 05:04 libumem

let me accept that pr

xwings avatar Apr 13 '25 09:04 xwings

you can enable the test. i updated the CI test and the merged rootfs

xwings avatar Apr 13 '25 10:04 xwings

We still have tons of issue witl multithread.

not sure how to fix it and it seems like we dont have a betteer way too.

@wtdcode @elicn you guys might want to take a look at this PR

xwings avatar Apr 13 '25 10:04 xwings

We still have tons of issue witl multithread.

FWIW, one problem I've pointed out is with a single test case -- and only under unit testing, not the underlying Qiling emulation of multithreaded code. The example binaries are single-threaded, as is the epoll implementation I have so far.

libumem avatar Apr 13 '25 22:04 libumem

I appreciate the feedback @elicn

I'm running into a couple of issues with the tests I have written, both in test_elf.py

1: When trying to feed my epoll_0 example input via stdin, similar to the examples in the qiling docs; the input doesn't seem to make it through. If I am not doing anything to modify stdin; I can run the binary under Qiling normally, as it will echo my input or exit on receiving "stop", which is what it's supposed to do. I've disabled the test case for now.

2: When running the test ELFTest.test_elf_linux_x8664_epoll_server by itself, it works fine. But when I run the whole test suite, with PYTHONPATH=</path/to/this/pr/> python3 test_elf.py; I get a "PermissionError: [Errno 1] Operation not permitted" error. I haven't been able to isolate the cause of this, as my server doesn't do anything that ought to require any sort special privileges outside of opening port 8000. If the port was an issue, I would expect the test to also fail when run standalone.

Would you please take a look at these problems?

libumem avatar Apr 26 '25 16:04 libumem

I've patched some of the code earlier today and posted a PR to your repo. Please review the changes I made to see if they make sense. There were a few snippets in which I could not figure out what should be the expected behavior; look for my inline comments starting with "elicn".

elicn avatar Apr 28 '25 19:04 elicn

Note I've posted another PR to your branch. Please review the changes there to see if they align with your intent.

elicn avatar May 04 '25 15:05 elicn

looks great. how's everything going now?

zkw13 avatar May 08 '25 09:05 zkw13

You should roll back your last 2 commits to align back with the code I pushed.

elicn avatar May 09 '25 14:05 elicn

@elicn Thanks for your patience, I've had some stuff come up IRL that's delayed my response.

The functions called later on need the events data, which is the first 32-bit field in that structure (i.e. *epoll_event and not epoll_event, which is the structure address)

That's what I'm doing now in ql_syscall_epoll_ctl

    events = 0
    if event:
        events = ql.mem.read_ptr(event,4)

which should match what I think your intent is. This also matches the expected behavior in the test cases I have; although those aren't passing yet for unrelated reasons.

libumem avatar May 16 '25 02:05 libumem

Looking into the permission denied errors I'm getting when running the whole test suite here

libumem avatar May 16 '25 02:05 libumem

The commits you pushed after my PR are introducing wrong 'fixes'. You modified the pointer handling with a wrong implementation, and replaced a useful comment with a useless one. There is no need to fix what not needs fixing.

elicn avatar May 16 '25 08:05 elicn

Rolled back to b724f4cf63de7a2fd88239b12c3eaca9a8122c80 which fires a UC_ERR_READ_UNMAPPED error in the test_elf_linux_x8664_epoll_server test case.

Ran in PDB, which shows what I think is the issue -- this is what I was trying to deal with in the next commit

-> event_ptr = ql.mem.read_ptr(event)
(Pdb) list
151                 return -EEXIST
152  
153             if not event:
154                 return -EINVAL
155  
156 B->         event_ptr = ql.mem.read_ptr(event)
157             events = ql.mem.read_ptr(event_ptr, 4)
158  
159             # EPOLLEXCLUSIVE was specified in event and fd refers to an epoll instance
160             if isinstance(fd_obj, QlEpollObj) and (op & EPOLLEXCLUSIVE):
161                 return -EINVAL

(Pdb) hex(event)
'0x7ffffffffab8'
(Pdb) n
> <removed>/epoll.py(157)ql_syscall_epoll_ctl()
-> events = ql.mem.read_ptr(event_ptr, 4)
(Pdb) hex(event_ptr)
'0x380000005' -> # invoking events = ql.mem.read_ptr(event_ptr, 4) will cause an access violation

libumem avatar May 16 '25 12:05 libumem

Got it. Undo these two commits so I can rebase my new work on top of yours: e4242ca741116c66fdde120e003b608559ef0b92 f9497bc7d6998c02bef5dd4cddf18e9132b24c7a

elicn avatar May 22 '25 18:05 elicn

@elicn Done

libumem avatar Jun 02 '25 23:06 libumem

I've created a new PR. Please let me know if you have any questions.

elicn avatar Jun 03 '25 10:06 elicn

The test errors suggest that you relocate the x8664_onestraw_server to the bin directory under the rootfs you are using.

elicn avatar Jun 04 '25 07:06 elicn

Commented out the test case, will dig deeper with what's going on in unittest.

libumem avatar Jun 24 '25 00:06 libumem

@elicn Up to you if you want to merge as is, or I can convert to draft PR pending any feedback on the python issue tracker...or elsewhere.

The blocker was/is that the code works outside of unittest, but gets "Operation not permitted" errors when running the ELFTest group.

libumem avatar Jun 24 '25 00:06 libumem

@elicn Thoughts?

libumem avatar Jul 21 '25 00:07 libumem

I prefer avoiding incorporating changes that cannot be tested. Is there anything we can do to have the test work?

elicn avatar Jul 21 '25 12:07 elicn