tracee icon indicating copy to clipboard operation
tracee copied to clipboard

log: introduce simple logger package

Open geyslan opened this issue 3 years ago • 14 comments

Initial Checklist

  • [x] There is an issue describing the need for this PR.
  • [x] Git log contains summary of the change.
  • [x] Git log contains motivation and context of the change.
  • [ ] If part of an EPIC, PR git log contains EPIC number.
  • [ ] If part of an EPIC, PR was added to EPIC description.

Description (git log)

Introduce logger package and make sample changes to cmd/tracee-ebpf/main.go.

This is a proposal based on the discussion in #2093 PR.

commit 4db97d9fdfdd0d26c60c82b7ddec2d2d499f4dd3 (HEAD -> 1787-logger-simple, origin/1787-logger-simple)
Author: Geyslan Gregório <[email protected]>
Date:   Tue Aug 30 08:51:33 2022 -0300

    log: kickstarting logger use

commit a055ca72d6bc4c05183542235d8cc701212421bc
Author: Geyslan Gregório <[email protected]>
Date:   Tue Aug 30 08:43:04 2022 -0300

    log: introduce simple logger package
    
    This logger can be used through all runtime directly from its
    package-level instance as well as from instances maintained by the
    consumer itself.
    
    Its base is a zap logger with the addition of log (origin) counting.
    
    Environment variables can be used to set the base logger:
    TRACEE_LOGGER_LVL, TRACEE_LOGGER_ENCODER and TRACEE_LOGGER_AGGREGATE.

Fixes: #1787

Type of change

  • [ ] Bug fix (non-breaking change fixing an issue, preferable).
  • [ ] Quick fix (minor non-breaking change requiring no issue, use with care)
  • [x] Code refactor (code improvement and/or code removal)
  • [x] New feature (non-breaking change adding functionality).
  • [ ] Breaking change (cause existing functionality not to work as expected).

How Has This Been Tested?

Tests being included in this PR:

  • [ ] Test File A
  • [ ] Test File B

Reproduce the test by running:

  • command 01
  • command 02

Final Checklist:

Pick "Bug Fix" or "Feature", delete the other and mark appropriate checks.

  • [ ] I have made corresponding changes to the documentation.
  • [x] My code follows the style guidelines (C and Go) of this project.
  • [x] I have performed a self-review of my own code.
  • [x] I have commented all functions/methods created explaining what they do.
  • [x] I have commented my code, particularly in hard-to-understand areas.
  • [x] My changes generate no new warnings.
  • [ ] I have added tests that prove my fix, or feature, is effective.
  • [ ] New and existing unit tests pass locally with my changes.
  • [ ] Any dependent changes have been merged and published before.

Git Log Checklist:

My commits logs have:

  • [x] Subject starts with "subsystem|file: description".
  • [x] Do not end the subject line with a period.
  • [x] Limit the subject line to 50 characters.
  • [x] Separate subject from body with a blank line.
  • [x] Use the imperative mood in the subject line.
  • [x] Wrap the body at 72 characters.
  • [x] Use the body to explain what and why instead of how.

geyslan avatar Aug 26 '22 19:08 geyslan

Cool! Log FTW :)

Couple questions:

1 - can't we use SugaredLogger? It has good performance, with better syntax. Eg: It doesn't seem like log performance is critical to use the strongly-typed approach.

# this 

logger.Debug("os_info", "os_realease_field", helpers.OS_KERNEL_RELEASE, "OS_KERNEL_RELEASE", OSInfo.GetOSReleaseFieldValue(helpers.OS_KERNEL_RELEASE))

# intead of 

logger.Debug("os_info", logger.Uint32("os_realease_field", uint32(helpers.OS_KERNEL_RELEASE)), logger.String("OS_KERNEL_RELEASE", OSInfo.GetOSReleaseFieldValue(helpers.OS_KERNEL_RELEASE)))

2 - Why are we adding complexity of log counting? It was pointed to me this issue https://github.com/aquasecurity/tracee/issues/2036, and it seemed the reasoning was because the same error message getting logged multiple times?

josedonizetti avatar Aug 29 '22 18:08 josedonizetti

Ok, practically this looks good to go ahead, please note my comments below.

:+1:

Do you think we should replace all logging with this in the scope of this PR or move that to a later one?

I think we can make this PR only an introduction to logger package, demonstrating how to use it with changes to the main file. As it is right now (but I would rebase it and commit accordingly). But let me know.

Just more a couple questions.

  • Currently, by default, it aggregates log origins (not the full logs) inside the LogCount. WDYT of changing it to optional making possible it to be set from env TRACEE_LOGGER_AGREGATE=true and through LoggerConfig?
  • logger.Dump() is the way of get counts of log origins (aggregated). How do you plan on using this eventually? Triggered via signal?

geyslan avatar Aug 29 '22 20:08 geyslan

1 - can't we use SugaredLogger? It has good performance, with better syntax. Eg: It doesn't seem like log performance is critical to use the strongly-typed approach.

Yeah, I suppose it's not an issue to use the sugared version.

In any case the sugared zap is in second place in performance.

image

2 - Why are we adding complexity of log counting? It was pointed to me this issue #2036, and it seemed the reasoning was because the same error message getting logged multiple times?

I guess @NDStrahilevitz and @AlonZivony can answer you about this more precisely.

--- EDIT

Yeah, I suppose it's not an issue to use the sugared version.

But I'll wait more thoughts before changing it.

geyslan avatar Aug 29 '22 20:08 geyslan

So about sugared logs - by design we shouldn't have noisy logs, so i'm ok with moving there.

@josedonizetti We want to keep the log counting to silence the noisy stuff and track how many occurred.

@geyslan I would like this PR at least to encompass the tracee-ebpf (and possibly tracee-rules) cmd directories if that's not too much. We could do the packages in a later PR. However do note that by skipping that we may miss bugs in the logger now (especially with regard to the log counting).

NDStrahilevitz avatar Aug 30 '22 12:08 NDStrahilevitz

@geyslan @NDStrahilevitz @rafaeldtinoco Wonder if it would be okay to simplifyy the scope here?

1 - add a basic logger, and change both tracee-ebpf and tracee-rules main.go to use it 2 - change the whole project to use a logger 3 - add log counting

It seems to me it is easier to review, understand the scope, and test things. We would also get a logger quicker to the project, allowing new PRs to already use it. WDYT?

josedonizetti avatar Aug 30 '22 13:08 josedonizetti

@geyslan @NDStrahilevitz @rafaeldtinoco Wonder if it would be okay to simplifyy the scope here?

1 - add a basic logger, and change both tracee-ebpf and tracee-rules main.go to use it 2 - change the whole project to use a logger 3 - add log counting

It seems to me it is easier to review, understand the scope, and test things. We would also get a logger quicker to the project, allowing new PRs to already use it. WDYT?

Good plan, let's keep this PR for point 1.

NDStrahilevitz avatar Aug 30 '22 13:08 NDStrahilevitz

Good plan, let's keep this PR for point 1.

Ok, as point 3 is almost 100% I'll do this embrace it as well.

note:@geyslan When returning to this PR:

  • [x] change the logger core to the sugared one.
    • changed API to Debugw(), Errorw(), ... to reflect the zap sugared functions API.
  • [x] implement TRACEE_LOGGER_AGGREGATE=true and LoggerConfig logic, making aggregation (couting) not default.
  • [x] change tracee-ebpf and tracee-rules main.go for using the new logger.

geyslan avatar Aug 30 '22 15:08 geyslan

@geyslan would you mind rebasing this ? @josedonizetti would you mind reviewing this ? @NDStrahilevitz would you mind giving a +1 after those ^ ?

Thanks!

rafaeldtinoco avatar Sep 14 '22 22:09 rafaeldtinoco

@geyslan would you mind rebasing this ?

Done.

@josedonizetti that's your call.

geyslan avatar Sep 15 '22 12:09 geyslan

@geyslan Perhaps some of the design decisions here are due to requirements I don't know, let me know:

1 - why do support env variables to configure the logger? We are not using env variables for anything else on tracee. 2 - why only env vars and not command line flags? 3 - it is weird to me to couple the env variables under the log package, calling methods on init. This hides implementation details that I don't think are related to logging, the env vars value should be used to initialize the logger configuration which is passed as a dependency to the logger initialization once.

main.go

/// config := getLogConfig() // checks for the env vars if it is not there, uses the default values
// log.Init(config) // no need to check if it was set from the env or not, only one point to initialize the logs (no init inside the pkg)

This gives the package a single initialization point and also decouples it from the env vars, which should only be used for configuration. Also avoids duplication, eg, every method checking for an env var must check and set setFromEnv, and removes the need to check IsSetFromEnv whenever one is initializing the logger.

WDYT?

josedonizetti avatar Sep 15 '22 17:09 josedonizetti

1 - why do support env variables to configure the logger? We are not using env variables for anything else on tracee.

Tracee can be consumed as a library. https://github.com/aquasecurity/tracee/pull/2110#pullrequestreview-1088405734

2 - why only env vars and not command line flags?

Ditto.

3 - it is weird to me to couple the env variables under the log package, calling methods on init. This hides implementation details that I don't think are related to logging, the env vars value should be used to initialize the logger configuration which is passed as a dependency to the logger initialization once.

Foundations about the init() use are here https://github.com/aquasecurity/tracee/pull/2110#discussion_r957140226.

This gives the package a single initialization point...

Yes, but doesn't guarantee that's the unique point calling logger.Init() and doesn't make it so crystal. About that topic I really don't know how to enforce it. The more we provide options we make it decentralized.

How about instead of this:

			// Avoiding to override package-level logger
			// when it's already set by logger environment variables
			if !logger.IsSetFromEnv() {
				// Logger Setup
				logger.Init(
					&logger.LoggerConfig{
						Writer:    os.Stderr,
						Level:     logger.InfoLevel,
						Encoder:   logger.NewJSONEncoder(logger.NewProductionConfig().EncoderConfig),
						Aggregate: false,
					},
				)
			}

Do something like this?

			logger.Init(
				&logger.LoggerConfig{
					Writer:    os.Stderr,
					Level:     logger.InfoLevel,
					Encoder:   logger.NewJSONEncoder(logger.NewProductionConfig().EncoderConfig),
					Aggregate: false,
					// Don't override package-level logger config
					// when it's already set by logger environment variables
					OverrideEnv: false,
				},
			)

Then, through a single flag we can set the logger (at any point) for reusing or not env vars.

geyslan avatar Sep 15 '22 19:09 geyslan

Tracee can be consumed as a library. #2110 (review)

Hmm, at least for me, if you consuming Tracee as a lib, you should set your own logger logger.SetBase() it will respect whatever your application is already using. One can bring their own env vars and config files. Parse them as they want not related to tracee library. Their app should handle their log initialization. Because one consuming it as a library should be changing it programmatically. What if a library user forgets to set the env vars?

2 - why only env vars and not command line flags?

Ditto.

But one doesn't cancel the other. A user now passes every configuration he wants tracee to do via command line, why for logs he would have to use an env variable? (but perhaps this should be another PR?)

Yes, but doesn't guarantee that's the unique point calling logger.Init() and doesn't make it so crystal. About that topic I really don't know how to enforce it. The more we provide options we make it decentralized.

We could if we want, for example, panic if the logger is already set for example.

Well, it seems most of the requirements came from the product, which @NDStrahilevitz is more aware of, and there is no right or wrong. Probably best to get his review, besides the different opinions on the design, the code seems :+1:

josedonizetti avatar Sep 15 '22 19:09 josedonizetti

Hmm, at least for me, if you consuming Tracee as a lib, you should set your own logger logger.SetBase() it will respect whatever your application is already using. One can bring their own env vars and config files. Parse them as they want not related to tracee library. Their app should handle their log initialization. Because one consuming it as a library should be changing it programmatically. What if a library user forgets to set the env vars?

Agree with @josedonizetti here. Maybe @NDStrahilevitz can explain what he meant in his comment https://github.com/aquasecurity/tracee/pull/2110#pullrequestreview-1088405734?

yanivagman avatar Sep 15 '22 19:09 yanivagman

So about the flags or arguments "debate", we can have both, and i've noted in previous reviews that the flag isn't working anymore, perhaps I didn't make it clear that I do want it to still function, just have the env flag taking priority.

So about the initialization, my reasoning was as such:

  1. Tracee has a whole bunch of required initialization logic, adding more adds difficulty in consuming
  2. To make the logging simpler it's solution should implicitly initializing the logger as it is part of tracee's mechanism of reporting problems
  3. If a consuming app is aware of this and wants to redirect the logs and change configuration behavior, they could still use something like AddBase but otherwise would not be concerned with it or have to even know of it.

I'm usually for not having implicit behavior, but in the case of logging I thought making an exception would be best so tracee can be initialized more smoothly, especially since we will likely use the global logger in other packages, and wanted it to work for other apps "out of the box".

I do agree we can make the initial logic "prettier" though, if we set implicitly via env variables on logger.init() we can just have a new config with cli arguments overriding it if they exist, with something like this similar to @josedonizetti's suggestion:

func main(c *cli.Ctx) {
    // ...logger is implicitly ready by this point
    config := getCliConfig(c)
    log.SetConfig(config)
}

NDStrahilevitz avatar Sep 18 '22 10:09 NDStrahilevitz

@NDStrahilevitz this PR version expired. I'm going to rebase it for you. Do you need something else?

geyslan avatar Sep 28 '22 16:09 geyslan

@geyslan I think this is good enough for now, i'll approve and let's merge after the rebase. About all the init logic and how I think it will become clearer after some usage what would be best. One nit that I have is that I think we should just do Debug instead of Debugw since it's not as if we have both to differentiate between having a wide version and not, but up to you.

NDStrahilevitz avatar Sep 29 '22 08:09 NDStrahilevitz