tracee [BUG] KernelConfig not necessarily initialized

Prerequisites

[x] This affects latest released version.
[x] This affects current development tree (origin/HEAD).
[x] There isn't an issue describing the bug.

Select one OR another:

[ ] I'm going to create a PR to solve this (assign to yourself).
[x] Someone else should solve this.

Bug description

Currently, when initializing KernelConfig in the CLI, we avoid crashing if initialization fails. This may cause issues later on, so better behavior should be defined.

Jul 25 '22 08:07 NDStrahilevitz

@AlonZivony we only need kconfig because of kconfig relocations within libbpf. Last we spoke about this, you started probing features from kallsyms (checking whether symbols existed, to know whether a kconfig option was set or not).

Could you write a bit here about the work you did, point to the issues that motivated it, and describe whether we still need the kconfig relocations or can rely entirely on the method you created?

Jul 25 '22 15:07 rafaeldtinoco

Currently, it doesn't really seem like an issue to me if the kconfig isn't initialized: I think our assumed values are good enough for the small number of cases where it happens. But I haven't investigated it thoroughly. However, in the future I want to use many kconfig values, and it will be necessary to get the correct ones.

To do so, I noticed that kconfig values affect which symbols and functions are compiled, so some symbols exist or not depending on the config. For example, CONFIG_DYNAMIC_MEMORY_LAYOUT results in the existence of the page_offset_base symbol, so if the symbol exists, we know the config was set. However, this takes a lot of effort to verify, because it might change one day with a new kernel version and has to be checked for every kernel version we support. We also saw that kernel symbols differ between distributions, which might produce false values. We can add this behavior to support the case where the kconfig values aren't available, but I don't think we should change the behavior to use only the kernel symbols. It would also make the CAP_SYSLOG capability mandatory for initialization.

Jul 28 '22 09:07 AlonZivony

Yes, I also think the need to read the kconfig file should be removed, and that we should rely on checking whether symbols exist. Maybe we should have a table of things to assume and check for in ksyms:

Feature A:
- Symbol/Condition 1: kernel version range
- Symbol/Condition 2: kernel version range
- Symbol/Condition 3: kernel version range
Feature B:
- ...

This would "free" us from depending on userland (or containerland) when making decisions (whether we are on the host or in a container, whether we have access to kconfig through procfs or just a file, etc.).
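One possible shape for such a table, as a minimal sketch: all type and function names here are hypothetical, and the version range in the example entry is a placeholder that has not been verified against real kernels.

```go
package main

import "fmt"

// kernelVersion is a simplified (major, minor) pair; patch levels are ignored.
type kernelVersion struct{ major, minor int }

// less reports whether v is an older version than o.
func (v kernelVersion) less(o kernelVersion) bool {
	return v.major < o.major || (v.major == o.major && v.minor < o.minor)
}

// symbolRule states that, for kernels in [from, to], the presence of
// symbol is taken to mean the feature is enabled.
type symbolRule struct {
	symbol   string
	from, to kernelVersion
}

// featureTable maps a feature to its rules.
// The range below is a PLACEHOLDER for illustration, not a verified fact.
var featureTable = map[string][]symbolRule{
	"CONFIG_DYNAMIC_MEMORY_LAYOUT": {
		{symbol: "page_offset_base", from: kernelVersion{4, 17}, to: kernelVersion{6, 99}},
	},
}

// featureEnabled returns (enabled, known). known is false when no rule
// covers the running kernel, so the caller can fall back to another source.
func featureEnabled(feature string, running kernelVersion, symbols map[string]bool) (bool, bool) {
	for _, r := range featureTable[feature] {
		if running.less(r.from) || r.to.less(running) {
			continue // rule does not apply to this kernel version
		}
		return symbols[r.symbol], true
	}
	return false, false
}

func main() {
	// symbols would normally be built from kallsyms; hard-coded here.
	symbols := map[string]bool{"page_offset_base": true}
	on, known := featureEnabled("CONFIG_DYNAMIC_MEMORY_LAYOUT", kernelVersion{5, 15}, symbols)
	fmt.Println("enabled:", on, "known:", known)
}
```

Returning a separate `known` flag makes the "no rule for this kernel" case explicit instead of silently reporting the feature as disabled.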

Jul 28 '22 11:07 rafaeldtinoco

But it makes you vulnerable to errors per kernel version and distro. It is already very hard for us to guarantee that tracee works for all distros and versions. Do you think it would be wise to rely entirely on the symbols this way?

Jul 28 '22 11:07 AlonZivony

> But it makes you vulnerable to errors per kernel version and distro. It is already very hard for us to guarantee that tracee works for all distros and versions. Do you think it would be wise to rely entirely on the symbols this way?

Well, you are right, actually. I like to see how others have solved the same problem, and if the libbpf developers thought kconfig was important enough to get a relocation method of its own, then it likely won't be easy to get rid of it. Another approach would be to do all tracee cmdline configuration (for the container only? or maybe for the regular command as well...) through a wrapper that takes care of all those details (check if /proc/config.gz exists, check if /boot/config-$(uname -r) exists, check if you're running inside a container, check if you're starting a container and, if you are, pass the correct docker cmdline sharing the host's kconfig file, etc.).

Every day I see more reason to have a "wrapper" simplifying many of the things we do that wouldn't fit the tool initialization process itself (like making tracee do all this).

Jul 28 '22 13:07 rafaeldtinoco

> > But it makes you vulnerable to errors per kernel version and distro. It is already very hard for us to guarantee that tracee works for all distros and versions. Do you think it would be wise to rely entirely on the symbols this way?
>
> Well, you are right, actually. I like to see how others have solved the same problem, and if the libbpf developers thought kconfig was important enough to get a relocation method of its own, then it likely won't be easy to get rid of it. Another approach would be to do all tracee cmdline configuration (for the container only? or maybe for the regular command as well...) through a wrapper that takes care of all those details (check if /proc/config.gz exists, check if /boot/config-$(uname -r) exists, check if you're running inside a container, check if you're starting a container and, if you are, pass the correct docker cmdline sharing the host's kconfig file, etc.).
>
> Every day I see more reason to have a "wrapper" simplifying many of the things we do that wouldn't fit the tool initialization process itself (like making tracee do all this).

Do we have an issue for such a "wrapper"?

Jul 28 '22 13:07 NDStrahilevitz

> > Every day I see more reason to have a "wrapper" simplifying many of the things we do that wouldn't fit the tool initialization process itself (like making tracee do all this).
>
> Do we have an issue for such a "wrapper"?

I don't think so. We've only discussed it in roadmap meetings, with nothing concrete so far.

Aug 01 '22 12:08 rafaeldtinoco

Now we do: https://github.com/aquasecurity/tracee/issues/2038

Aug 02 '22 02:08 rafaeldtinoco