Eliminate dependency on strace
On Linux, the output of strace is used to do automatic dependency detection on commands without ad-hoc support. Since this is a separate utility that is often not installed by default, the ptrace API should be used directly instead.
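For readers unfamiliar with the strace-based approach, here is a minimal sketch of the kind of parsing it involves. It is not Button's actual code: the `TRACED` set, the `parse_line` helper and the regular expressions are all hypothetical, and it only handles the simple one-line form of strace output (no `<unfinished ...>`/`resumed` pairs, no decoding of escaped characters in paths).

```python
import re

# Matches simple strace lines such as:
#   openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
#   [pid  1234] open("foo.c", O_RDONLY) = 3
# The optional "[pid N]" prefix appears when strace follows children (-f).
LINE_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?'   # optional pid prefix
    r'(?P<name>\w+)\('          # syscall name
    r'(?P<args>.*)\)\s+=\s+'    # raw argument text
    r'(?P<ret>-?\d+)'           # numeric return value
)

# Double-quoted string arguments; escape sequences are kept verbatim.
PATH_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')

# Hypothetical subset of syscalls of interest (assumption, not exhaustive).
TRACED = {"open", "openat", "creat", "chdir", "mkdir", "unlink", "rename"}


def parse_line(line):
    """Return (syscall, [paths], succeeded) or None if the line is ignored."""
    m = LINE_RE.match(line)
    if m is None or m.group("name") not in TRACED:
        return None
    paths = PATH_RE.findall(m.group("args"))
    succeeded = int(m.group("ret")) >= 0
    return m.group("name"), paths, succeeded
```

A caller would feed it the lines strace writes to stderr and record the returned paths as inputs or outputs of the command being traced.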
Unfortunately, the ptrace API is far from trivial, so this could be a lot of work. The biggest problem is that tables of system calls need to be maintained for each architecture. strace hardcodes all of this, and there seems to be no easy way of obtaining it automatically. Luckily, only a handful of syscalls need to be traced (e.g., open, creat, chdir, mkdir, unlink, rename), so maintaining the entire syscall table is unnecessary.
You need more than just those six syscalls for tracing file accesses. You also need to handle openat, unlinkat, truncate, utime, utimes, futimesat, lstat, lstat64, readlink, readlinkat, stat, stat64, mkdirat, symlink, symlinkat, execve, execveat, renameat, link, linkat, getdents, getdents64, and chdir. You could get by with fewer if you don't want to handle symlinks or directory accesses, but not by much.
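One way to keep such a list manageable is to group the syscalls by what they mean for a build tool. The sketch below does that for the names mentioned so far in this thread. The grouping and the category names are my own assumptions, not how bigbro or Button classify them, and some calls (for example link and linkat, which also read the existing name) fit more than one category.

```python
# Hypothetical grouping of the syscalls named in this thread by their
# effect on the filesystem, as seen by a build tool.
SYSCALL_KINDS = {
    # Read or write depending on the flags argument (O_RDONLY, O_CREAT, ...).
    "depends_on_flags": {"open", "openat"},
    # The file's contents or metadata are an input of the command.
    "read": {"stat", "stat64", "lstat", "lstat64",
             "readlink", "readlinkat", "execve", "execveat"},
    # The directory listing is an input of the command.
    "readdir": {"getdents", "getdents64"},
    # Something is created or modified, i.e. an output of the command.
    "write": {"creat", "truncate", "utime", "utimes", "futimesat",
              "mkdir", "mkdirat", "symlink", "symlinkat", "link", "linkat"},
    "delete": {"unlink", "unlinkat"},
    "rename": {"rename", "renameat"},
    # Changes how later relative paths must be resolved.
    "cwd": {"chdir"},
}

# Flat lookup table: syscall name -> category.
KIND_OF = {name: kind
           for kind, names in SYSCALL_KINDS.items()
           for name in names}


def kind_of(name):
    """Return the category for a syscall name, or None if it is not traced."""
    return KIND_OF.get(name)
```

A tracer would still need the per-architecture syscall numbers for each of these names; the table above only covers the bookkeeping once a call has been identified.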
If you want to share work on this, you could use my bigbro library https://github.com/droundy/bigbro, which uses ptrace to track changes to the filesystem by a child process. It is used by my fac build system, and I hope it could be easily integrated into Button. It is currently limited to x86, x86-64 and ARM, but it shouldn't be hard to port to any hardware that you can test it on (and that runs Linux).
Hi David, I'm happy to see that you've taken a look at this build system. It seems like there has been some convergent evolution going on with Button and fac.
You're right. I know more than just those six syscalls are needed for proper tracing. Those were just examples.
I'll definitely be taking a close look at libbigbro. I didn't know anyone else had implemented this, particularly for a build system. However, I was hoping to eliminate a dependency by getting rid of strace. If I end up using this, then I'd just be replacing one dependency with another. (I know it's not quite the same; strace is a system dependency, not a library dependency.) My original idea was to implement ptrace dependency tracking directly inside Button. I'll probably try integrating libbigbro into Button and see how well it works. I was certainly not looking forward to implementing all of that, especially for Windows. You can expect me to be opening some issues and maybe pull requests in the future. From a cursory look at the source, it looks like it needs quite a bit of clean-up.
Have you done any performance comparisons between bigbro and strace?
Hi Jason,
I hadn't thought to compare timings of libbigbro with strace, since I never considered using strace. I've just now run a couple of timings compiling SDL2, which happened to be a bit of source code compiled with make that I had lying around to provide a plausible job.
time make:
real 1m54.943s user 1m46.436s sys 0m4.976s
real 1m50.245s user 1m42.320s sys 0m4.724s
time bigbro make 2> /dev/null:
real 2m6.008s user 1m47.172s sys 0m17.128s
real 2m8.089s user 1m48.900s sys 0m17.208s
time strace -f make 2> /dev/null:
real 2m11.039s user 1m51.252s sys 0m18.384s
real 2m14.951s user 1m54.396s sys 0m18.616s
So it looks like strace is surprisingly close to bigbro, given that strace does so much more. On the other hand, bigbro does a lot more for a build tool than strace does: it tracks files read and modified, and handles symlinks properly, so accessing a symlink in a path registers as a readlink.
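To put rough numbers on the timings above, the following sketch averages the two "real" times for each command and expresses the result as overhead relative to plain make. The `seconds` and `mean_real` helpers are just illustration, and with only two runs per command the percentages should be read as ballpark figures.

```python
import re


def seconds(t):
    """Convert a `time`-style duration such as '1m54.943s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    return int(m.group(1)) * 60 + float(m.group(2))


# Wall-clock ("real") times copied from the runs above.
REAL = {
    "make": ["1m54.943s", "1m50.245s"],
    "bigbro make": ["2m6.008s", "2m8.089s"],
    "strace -f make": ["2m11.039s", "2m14.951s"],
}


def mean_real(label):
    """Average wall-clock time in seconds for one command."""
    runs = [seconds(t) for t in REAL[label]]
    return sum(runs) / len(runs)


baseline = mean_real("make")
for label in REAL:
    mean = mean_real(label)
    overhead = 100 * (mean / baseline - 1)
    print(f"{label:15s} {mean:7.1f}s  {overhead:+5.1f}%")
```

On these numbers bigbro adds roughly 13% to the wall-clock time and strace roughly 18%.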
See also https://github.com/jacereda/fsatrace
I prefer a library solution myself. I had some discussions with @jacereda regarding converting fsatrace to a library, which bogged down as he started work on https://github.com/jacereda/traced-fs