tryCatchLog icon indicating copy to clipboard operation
tryCatchLog copied to clipboard

suppression settings for each condition type

Open nickdickinson opened this issue 4 years ago • 18 comments

In #44 you asked an open question about whether there should be a matrix configuration to supress the trace specifically by condition type. This would be great in scripts where messages are used for informational purposes and should not require an entire trace, while an error or warning may indeed require the trace to be there. This was not added when #44 was closed as far as I can tell. Is there a workaround perhaps? Otherwise it would be a great feature.

nickdickinson avatar Dec 12 '21 10:12 nickdickinson

I guess you are referring to this open question I have documented in #44 for myself:

Open questions:

  1. Does it make sense to allow different suppression settings for each condition type (error, warning, message...)? This would span a configuration "matrix".

I did not implement this together with #44 since I had no real use case for this and nobody has asked for this so far (which has obviously changed now ;-)

Another pending decision was whether tryCatchLog should suppress the stack trace in the log output or whether this should better be done at the logging framework side.

Possible work-arounds

In the logging framework

Since the log message (incl. the compact or full stack trace) is created in my code (https://github.com/aryoda/tryCatchLog/blob/master/R/tryCatchLog.R#L299) the logging framework cannot "undo" this easily.

At the logging framework side I could imagine two "dirty" work-arounds:

  1. Write only the first n lines to the output by using a custom logging function instead of the default one for the desired severity level(s) and enable it via tryCatchLog::set.logging.functions(), e. g. when using my internal logging functions via:
library(tryCatchLog)

options(tryCatchLog.include.full.call.stack = FALSE)
tryLog(log(-1))  # shows the compact call stack
# tryCatchLog::set.logging.functions(warn.log.func = function(msg) tryCatchLog:::log2console("WARN", substr(msg, 1, 20))) # show only the first n chars
tryCatchLog::set.logging.functions(warn.log.func = function(msg) tryCatchLog:::log2console("WARN", head(unlist(strsplit(msg, "\n", fixed = TRUE)), 1)))
tryLog(log(-1)) # shows only the first line
  1. Write the log output to different files (one file per severity level).

You can use the logging level as filter to write to different files (looks complicated but has the advantage of always seeing all logged messages of the same or higher log levels in one file...):

library(futile.logger)
library(tryCatchLog)

options(tryCatchLog.include.full.call.stack = FALSE)
tryLog(log(-1))  # shows the compact call stack
# Set an appender to a file named using the severity level (ERROR.log, WARN.log, INFO.log) applied for filtering:
# Each file contains ALL logging output of the according or higher log levels!
flog.appender(appender.file2('~l.log', console = TRUE)) # ~l = log level (see ?futile.logger::flog.layout)
tryLog(message("just an info")) # only logged in INFO.log
tryLog(log(-1)) # not logged in ERROR.log
tryLog(log("a")) # this error is logged in ERROR.log, WARN.log and INFO.log

None of the above work-arounds is nice or perfect :-(

In tryCatchLog

This requires code changes and I think there is no easy work-around for end users of tryCatchLog for now.

Next steps

Please give me some time to think about your feature request and how it could be implemented easily (incl. ease-of-use for end users). I don't want to make it too complicated for the end user though offer the possibility for customization...

I also want to re-think about another feature request in this context: Supressing selected ("hand-picked") conditions - see eg. #11

Any proposals are welcome :-)

aryoda avatar Dec 12 '21 13:12 aryoda

So I can see that #11 is a challenge fundamentally. But perhaps first education of package developers to follow patterns that will help us differentiate conditions.

For this one, you might consider the following for a form of configuration. I still need to consider your work arounds and look through the code but I might have some ideas for tryCatchLog.

options(test_option = list(warning = c(include.full.stack = FALSE, include.compact.stack = TRUE), message = c(include.full.stack = FALSE, include.compact.stack = FALSE), my_custom_condition = c(include.full.stack = TRUE, include.compact.stack = FALSE)))

getOption("test_option")$warning["include.full.stack"]

getOption("test_option")$my_custom_condition["include.full.stack"]

nickdickinson avatar Dec 13 '21 20:12 nickdickinson

So after having a think and looking at the code, I think the easiest for the end user would be to have a way to configure existing and new conditions with a simple function rather than directly in options.

I would think the severity level would be configurable and also the tryCatchLog package would have a way to register new user created conditions. One can then presumably use the registered conditions, rather than the static character strings ("warning", "error", etc.) to test for inheritance. Would this be feasible? The user does not have to know too much about how to build options and you could still leave the include.full.stack in the arguments for backward compatibility.

It might look like this:

In package code (to replicate current behavior) on load:

conditionLogOption("message", severity = "INFO") 
conditionLogOption("warning", severity = "WARN") 
conditionLogOption("error", severity = "ERROR") 
conditionLogOption("condition", severity = "INFO") 

In user code, to modify behavior:

conditionLogOption("message", include.full.stack = FALSE, include.compact.stack = FALSE) 
conditionLogOption("my_custom_warning_condition", include.full.stack = FALSE, include.compact.stack = TRUE, severity = "WARN")  

The option might be saved in a format such as in my prior message to store the condition names and behaviors. Of course, this pattern would open up possibilities to do other interesting customizations based on conditions where the user can let, for example, the tryCatchLog know if for example attributes are ok on conditions (for duplicate checking). One might even be able to start to deal with #11 in a more systematic way if this would be become a wider used pattern upstream. But that is all out of scope. Interested in hearing if you think this may be feasible. I think it would require some minimal refactoring of the current tryCatchLog function....

nickdickinson avatar Dec 14 '21 11:12 nickdickinson

Not to mention, the option to decide if any condition should be muffled like this: https://github.com/r-lib/rlang/issues/482 and https://rlang.r-lib.org/reference/cnd_muffle.html

nickdickinson avatar Dec 14 '21 11:12 nickdickinson

I liked the second workaround for immediate application but it appears, futile.logger is no longer maintained since 2016: https://github.com/MazamaScience/MazamaCoreUtils/issues/54 Perhaps best to switch to this one: https://github.com/daroczig/logger I tried the workaround with logging by error level but that requires installing from GitHub rather than CRAN, which is the version I have (1.4.3).

nickdickinson avatar Dec 14 '21 11:12 nickdickinson

I think this is also related to #70

nickdickinson avatar Dec 14 '21 11:12 nickdickinson

Ok now also seeing this #62 where you have a positive list of signal conditions (and better understanding the overall function!). I think this option could also be managed in the same way through the same helper function while remaining backward compatible...

nickdickinson avatar Dec 14 '21 11:12 nickdickinson

futile.logger is no longer maintained since 2016: MazamaScience/MazamaCoreUtils#54

I am trying to kick-start a new CRAN release of futile.logger, see: https://github.com/zatonovo/futile.logger/issues/98

Perhaps best to switch to this one: https://github.com/daroczig/logger

Regarding daroczig's logger package: On my backlist is also to add support for more logging frameworks (see https://github.com/aryoda/tryCatchLog/issues/42).

As of today the most frequently used logging frameworks (regarding CRAN downloads) are these:

library(cranlogs)
library(data.table)
res <- cran_downloads(when = "last-month", packages = c("futile.logger", "logging", "logR", "logr", "loggr", "dtq", "loggit", "rlogging", "log4r", "rsyslog", "luzlogr", "debugme", "plogr", "logger"))
setDT(res)
res[, .(.N, downloads = sum(count)), by = .(package)][downloads > 0,][order(-downloads),]
# Dec 14, 2021:
# package  N downloads
# 1: futile.logger 30    121455
# 2:         plogr 30     94182
# 3:        logger 30     23239
# 4:       logging 30     20735
# 5:         log4r 30     18342
# 6:       debugme 30     11725
# 7:        loggit 30      5176
# 8:          logr 30      1079
# 9:       rsyslog 30       790
# 10:      luzlogr 30       467

So logger is a good candidate :-)

Note: plogr is NO logging framework for R code but for C/C++ code called by R. So I will not support it...

aryoda avatar Dec 14 '21 14:12 aryoda

Wow, yes, a new release would be good but since I work across posix and windows, I think the current issue may be a show stopper. What do you think of this proposal? https://github.com/aryoda/tryCatchLog/issues/71#issuecomment-993426043

nickdickinson avatar Dec 14 '21 19:12 nickdickinson

What do you think of this proposal? #71 (comment)

I really like the clear functional approach with a clear explicit signature in your proposal and I am still thinking about an even more general implementation since it is difficult later to change the public API of a package again.

I also like the "state-less" (options-based) approach where the packages does not store a concrete configuration setup [edit: between two function calls].

I am thinking eg. about using a configuration file instead of (too) many (small) options that may get confusing since.

The config file could be specified by a single option and is injected globally then.

RfC: The requirements I want to implement are:

  1. Per condition class (error, warning, message, condition and custom condition classes) specify

    a) the log content:

    • Include no, compact or full stack trace (no = show only the message)
    • Logging severity level (eg. a condition may be logged as info or debug...)
    • Exclude the condition class from logging ("do not log")

    b) the condition propagation to other registered handlers:

    • Propagate to registered handlers
    • "Muffle" (= do not propagate)
  2. Support condition class inheritance:

    • Apply the most specific class name found in the config file (eg. my_custom_condition_class before condition) [edit: The processing order shall not depend on the row order of the condition class names in the config]
  3. The configuration shall be applicable globally as well as for a single try*Log() call:

    • The function signature must be extended with a configuration argument
  4. [Edit: Added later] Nice to have: Rethrow a catched condition as another condition class: Eg. to "repackage" technical errors into user-understandable errors...

  5. Non-functional:

    • Easy to configure (eg. via a CSV file or options)
    • Ensure good performance (condition class inheritance handling and config files may be slow)

Open questions:

  • How to inform the end user about parsing errors or wrong file path?
  • State-less or state-full configuration: If tryCatchLog shall work stateless each function call must parse the config file again and again (which may become a performance bottleneck) [edit: or better the client/caller passes the desired config "state" with each function call]
  • Would it be more difficult for end users to use a config file instead of options (which contains lists)?
  • How to cope with the existing arguments silent.warnings and silent.messages [edit: and logged.conditions] which may be "overwritten" in the configuration? Deprecate them? This may be an (public) API-breaking change...

Did I miss any requirement of your feature request (or do you propose more)?

aryoda avatar Dec 14 '21 22:12 aryoda

Ok great, the stateless approach can be very flexible. A global config file could be an R script/function with a series of short calls as proposed.

A simple structured config as a dataframe/csv could be useful at times for those with a more structured (and heavier) approach where there should be a different config for different operations, possibly nested. Condition handling as data!

  1. Nice

  2. Yes, would either require some work from the user to provide a sequence for all but the condition class or would you detect inheritance and build a different order than specified? This could create overhead so I would propose a configuration object/structure that can store a particular config and can be passed as an argument.

  3. A simple condition handling configuration object could address any performance issues so your checks in tryCatchLog just run over the object.

Open questions:

  1. Throw an error on parse errors. Allow the config to be built using the same functions possibly. If you use a dataframe then the file can be a csv or rds etc. and tryCatchLog does not have to handle that or can use an existing function.

  2. I would have the user load as many config objects in advance as they need to pass to their functions. A one time only action. I imagine many will only use a global config.

  3. I like the approach of list options that are made to facilitate condition and other checks in the package functions but with a dataframe style config for the user. No parsing of files to avoid potential introduction of bugs when there are already existing ways to store and load data. Even stronger if one can programmatically build the config with the helper function and save it as a dataframe and/or list object.

  4. Yes, I would respect existing arguments and actually give them priority over global config or passed config options but possibly deprecate over a sufficiently long period of time. I think it is safe to say that if someone is using the most powerful config option then they can eventually stop using the other arguments. I can't say if deprecating now is important.

I cant think of anything missed now. This would be quite powerful. Is there any way I can help?

On Tue, Dec 14, 2021, 23:49 aryoda @.***> wrote:

What do you think of this proposal? #71 (comment) https://github.com/aryoda/tryCatchLog/issues/71#issuecomment-993426043

I really like the clear functional approach with a clear explicit signature in your proposal and I am still thinking about an even more general implementation since it is difficult later to change the public API of a package again.

I also like the "state-less" (options-based) approach where the packages does not store a concrete configuration setup.

I am thinking eg. about using a configuration file instead of (too) many (small) options that may get confusing since.

The config file could be specified by a single option and is injected globally then.

RfC: The requirements I want to implement are:

Per condition class (error, warning, message, condition and custom condition classes) specify

a) the log content:

  • Include no, compact or full stack trace (no = show only the message)
    • Logging severity level (eg. a condition may be logged as info or debug...)
    • Exclude the condition class from logging ("do not log")

b) the condition propagation to other registered handlers:

  • Propagate to registered handlers
    • "Muffle" (= do not propagate)

Support condition class inheritance:

  • Apply the most specific class name found in the config file (eg. my_custom_condition_class before condition)

Non-functional:

  • Easy to configure (eg. via a CSV file or options)
    • Ensure good performance (condition class inheritance handling and config files may be slow)

Open questions:

  • How to inform the end user about parsing errors or wrong file path?
  • State-less or state-full configration: If tryCatchLog shall work stateless each function call must parse the config file again and again (which may become a performance bottleneck)
  • Would it be more difficult for end users to use a config file instead of options (which contains lists)?
  • How to cope with the existing arguments silent.warnings and silent.messages which may be "overwritten" in the configuration? Deprecate them? This may be an (public) API-breaking change...

Did I miss any requirement of your feature request (or do you propose more)?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/aryoda/tryCatchLog/issues/71#issuecomment-994114261, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOP4ABLZKTUDAXQ3JIQDTDUQ7CYFANCNFSM5J4DTILQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

nickdickinson avatar Dec 15 '21 08:12 nickdickinson

Is there any way I can help?

Thanks for offering your help!

I am starting the implementation today and will publish the first version in this feature branch.

It would be great to get your feed back then (just if you want and have time of course :-)

aryoda avatar Dec 20 '21 00:12 aryoda

@aryoda I've not seen changes in the remote feature branch. Let me know if there is some way I can help out (testing, code/PR, etc.)

nickdickinson avatar Jan 28 '22 12:01 nickdickinson

@nickdickinson Sorry, I have lost the track on this implementation. I have pushed the current state here in the feature branch and will continue this week (I hope - it depends on my work load). I have started with a new function tryCatchLogExt() to offer an extended config option that is independent of the "old classical" tryCatchLog() since the API would become quite difficult to understand IMHO. This is just for testing and subject of discussion. I hope I can provide working code soon and really would appreciate your feed back and testing then.

aryoda avatar Jan 29 '22 18:01 aryoda

FYI: I have just pushed a first working version in the feature branch .

You can now do something like:

library(tryCatchLog)
config <- config.create()  # creates a standard config if no args are passed
# you could modify the config then...
options("tryCatchLog.global.config" = config)

tryCatchLog(simpleMessage("low water"))  # will suppress the full call stack (as it is configured so)

The standard (default) config returned by config.create() is: tryCatchLog - standard config example

BTW: I am still working on these missing parts:

  • DONE: Extend the logging functions to support all severity levels (DEBUG, FATAL...)
  • wip: Documentation
  • Vignette for configs (topics: silencing logs unsilencable conditions only, precedence, args are used if no config row matches...)
  • DONE: Implement config.add.row() and probably more convenience functions
  • DONE: Implement config validation "return channel" to the user (eg. throw error)
  • DONE: Silencing errors and warnings should consider inheritance too
  • NOT REQUIRED: Replace every \n with the platform constant (esp. in (config.validate()`)
  • DONE: Missing unit tests for 100 % code coverage
  • Final review to finalize names, design decisions, estimate impact to other open issues
  • Work through all TODOs
  • wip: Ask at r-devel for a unique identifier for conditions (follow the r-devel discussion here).

PS: A little background about the new semantics and important design decisions:

    # 29.07.2022 The new configuration feature (FR https://github.com/aryoda/tryCatchLog/issues/71)
    #            is injected here and overwrites (takes precedence over) the corresponding arguments (passed directly
    #            or via their default value from an option).
    #            Backward compatibility:
    #            - If you don not use configurations the semantics works like before
    #            - If you do use configurations it takes precedence over all actual arguments
    #              since the configuration allows finer-coarsed control.
    #              The actual arguments are only applied if there is no matching condition class row in the config!
    #              -> Using a configuration for existing code with precedence of the actual arguments
    #                cannot work since actual args are not really optional (due to the default values)
    #                and would therefore ALWAYS win (= the configuration would always be ignored!).
    #            - If a config is passed user-defined conditions are always logged
    #              (the logged.conditions arg is ignored)
    # Design decisions:
    #            1. Extend the existing API with a config option instead of adding a new one to allow a soft migration
    #               of existing client code to configuration-based code.
    #               This supports a slow migration by deprecating the arguments of the "old" API one day (if required at all!).
    #            2. Invalid configurations are ignored and logged with a warning and execute as if no config was passed (safe fall-back)
    #            3. The configuration row for the most specific condition class wins (will be applied)

aryoda avatar Jul 31 '22 09:07 aryoda

Hi @aryoda, apologies for not picking this up sooner. I'm going to test this out now. Let me know if there is anything I should take into account (changes / other branches) while testing.

nickdickinson avatar Mar 16 '24 11:03 nickdickinson

Hi @nickdickinson, I hope you are doing well (saw your web site and thought that you have much - and important - work to do).

It would be great if you could do some testing.

I have lost track of this issue (spending my time on another FOSS project) but always wanted to finalize this feature and prepare a CRAN release (and can also fix the CHECK note of the CRAN build then).

I have a few free days in April and plan to implement unit tests for this new feature... Edit: I just saw in my above TODO list that I have already implemented unit tests - couldn't remember - or the TODO list is lying ;-)

BTW: I saw you are working with knowledge graphs. Which KG database are you using and how do you enforce ontologies when loading data into the KG? I am curious because I am working in the data* area too...

aryoda avatar Mar 19 '24 23:03 aryoda

So far it is working ok, I've only done some simple configuration so far to remove traces from the regular messages. Do you remember if it also works with custom messages/conditions if I add new columns?

Right our ontology is still under development and super simple (only a few predicates/relation types). The concept is more to have a manually curated smaller KG that can link to entities in the text of our documents. We can also continue this in a separate thread / email :-)

nickdickinson avatar Mar 20 '24 08:03 nickdickinson