Assertion failure in rr while running rr record --chaos.

Open andronat opened this issue 4 years ago • 10 comments

I'm trying to run a binary that I'm afraid I can't post publicly, and when I run:

rr record --chaos <private_bin>

I get the following crash:

[FATAL /home/andronat/rr/src/record_syscall.cc:5549:process_mmap()]
 (task 2177935 (rec:2177935) at time 1099)
 -> Assertion `rt->fd_table()->get_monitor(f.fd)->type() == FileMonitor::Type::Mmapped' failed to hold. Expected monitor type Mmapped for fd 100, got monitor type 3

Any idea what the problem is and how I can fix it?

andronat avatar Aug 07 '21 17:08 andronat

Are you running rr built from the latest git rev, or one your distro provided? If the latter, is it reproducible on tip?

khuey avatar Aug 07 '21 17:08 khuey

I'm running the latest from the repo. Last commit 3b75fd00da0b059be5c37768b960a58b0d951f86.

andronat avatar Aug 07 '21 17:08 andronat

Ok. Does your program expect to be able to pick arbitrary fds for its own use? rr uses fd 100 for some internal things, which is what triggers this error.

rr record -n will probably work around this.

khuey avatar Aug 07 '21 17:08 khuey

Ah, this is interesting! Actually my binary automatically goes over a range of FDs and closes them... Is it possible to make the FD that rr uses configurable?

andronat avatar Aug 07 '21 17:08 andronat

The -n option did the trick. What is the trade-off here? Am I going to miss some important feature by disabling the syscall buffer?

andronat avatar Aug 07 '21 17:08 andronat

> The -n option did the trick. What is the trade-off here? Am I going to miss some important feature by disabling the syscall buffer?

Recording performance will suffer, but that's the only drawback.

khuey avatar Aug 07 '21 18:08 khuey

> Ah, this is interesting! Actually my binary automatically goes over a range of FDs and closes them... Is it possible to make the FD that rr uses configurable?

You can configure it in the source by changing RR_DESCHED_EVENT_FLOOR_FD.

khuey avatar Aug 07 '21 18:08 khuey

JFYI I tried to change RR_DESCHED_EVENT_FLOOR_FD to:

#define RR_DESCHED_EVENT_FLOOR_FD 6553

and the only syscall I see from a standard strace I made on my binary is:

strace.log.554131:close(6553)                             = -1 EBADF (Bad file descriptor)

Yet when I run without -n, I still see the assertion error. From the description here, I expected that closing this FD wouldn't cause any issues.
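
As a side note on reading that strace line: close() failing with EBADF only says that the descriptor number was not open in that process at that moment. The same check can be made from inside a program. The sketch below uses a hypothetical helper, `fd_is_open`, built on the standard `fcntl(fd, F_GETFD)` call; it is an illustration and not part of rr.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: returns 1 if fd refers to an open descriptor in
 * this process, 0 if the kernel reports EBADF for it. */
int fd_is_open(int fd) {
    if (fcntl(fd, F_GETFD) != -1) {
        return 1;
    }
    return errno != EBADF;
}
```

Calling `fd_is_open(6553)` just before the cleanup loop, both with and without rr, would show whether the descriptor exists in each case.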

andronat avatar Aug 07 '21 18:08 andronat

The standard file descriptor limit on Linux is 1024. You may need to raise the ulimit on your program first.
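
For reference, a descriptor number is only usable if it is below the process's soft RLIMIT_NOFILE, so fd 6553 would need a limit above that value. A minimal C sketch of checking and raising the soft limit from within a program follows; `ensure_nofile_limit` is a hypothetical name chosen for this example, and only the standard getrlimit/setrlimit calls are used.

```c
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft RLIMIT_NOFILE is at least
 * `wanted`. The soft limit can be raised up to the hard limit without
 * privileges. Returns 0 on success, -1 on failure. */
int ensure_nofile_limit(rlim_t wanted) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    if (rl.rlim_cur >= wanted) {
        return 0; /* already high enough */
    }
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max) {
        return -1; /* would exceed the hard limit */
    }
    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_NOFILE, &rl);
}
```

For example, `ensure_nofile_limit(6554)` asks for a soft limit that makes descriptor number 6553 valid. From a shell, `ulimit -n` before launching the program has the same effect.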

khuey avatar Aug 07 '21 18:08 khuey

Sorry, I forgot to mention that for us the limit is 65535, so the fd number is fine.

andronat avatar Aug 07 '21 19:08 andronat