collector/ftrace: Use top-level buffer instance
Abandon the use of non-toplevel buffer instance (-B parameter), as they are not worth the hassle. A number of frace features do not interact well with them such as:
- bprint event emitted by trace_printk(), that always lands in the top-level buffer no matter what.
- synthetic events that can also only reasonably be emitted in the top-level buffer as of Linux 6.13
events e.g. bprint which are not possible to use with non top-level buffers become usable again.
They were still usable, but in the top-level instance. You can collect more than one instance in a single trace.dat, so that is what happened.
The downside is that we "lose" the ability for concurrent instruments to be isolated with separate buffers however in practice this didn't seem to achieve the desired effect?
Yes, isolation was the only upside I was chasing when I introduced that. After a couple of years of trying it out, it's basically 100% (mild) pain for 0% benefit I could see, and today, the pain turned into an actual blocker because some kernel feature (synthetic ftrace events) does not play well with instances because of some API design issue (I submitted some patches, but even if they get in, it will be years before all android kernels get fixed ...)
However from an end user's perspective the interface shouldn't change so we don't have to worry about compatibility issues if we revert this behaviour again?
Technically you can see the difference in trace-cmd report output. Instead of having this:
devlib: trace-cmd-620 [003] 2440.867617: irq_disable: caller=enter_from_kernel_mode+0x38 parent=el1_interrupt+0x24
you'll get that:
trace-cmd-620 [003] 2440.867617: irq_disable: caller=enter_from_kernel_mode+0x38 parent=el1_interrupt+0x24
If someone had an alternative parser for trace.dat, they would also be able to witness the difference. However, I'd be really surprised if a tool could cope with instances without handling the top-level one (usually it's more the opposite), and on top of that (almost) everything ended up in the "devlib" instance, so it's not like it was used as some sort of sorting mechanism. And since you could not choose the buffer in the devlib API, people can't have grown fancy use cases.
So if anything, reverting to using the top-level instance means the traces produced by devlib are more "vanilla" and less likely to trip other tools (although most tooling around trace.dat is made by the same people making ftrace, so support for buffers is normally good).
Also I realized that we do not pass -t to trace-cmd extract. According to its man page:
-t
Extracts the top level instance buffer. Without the -B or -a option this is the same as the default. But if -B or -a is used, this is required if the top level instance buffer should also be extracted.
So I'm not sure why we still get the top-level instance in practice. Maybe there is a bug in trace-cmd that makes the effect of -t enabled even if not passed on the CLI.
Either way, if this does not get merged, we should fix that and pass -t. If this gets merged, then it's just the default once we removed -B so nothing to change in the current PR.
I also noted that when -B is used, using trace-cmd reset becomes mandatory between runs in order to write to trace_marker. If not done, this error shows up:
# echo foobar > /sys/kernel/tracing/trace_marker
sh: write error: Bad file descriptor
This is on a 6.13 kernel and trace-cmd 3.1.5 (so not as old as in devlib).