
Improved logging

Open sarthakpati opened this issue 4 years ago • 11 comments

Is your feature request related to a problem? Please describe. Currently, we have our own logging class, which is fine, but it doesn't provide options for extended debugging or error reporting.

Describe the solution you'd like Something like loguru would be good to have. It gives more flexibility in logging and provides more functionality related to tracing.

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context N.A.

sarthakpati avatar Jan 01 '22 13:01 sarthakpati

Stale issue message

github-actions[bot] avatar Mar 02 '22 19:03 github-actions[bot]

Another option: https://neptune.ai/product#how-it-works

This is a well fleshed-out MLOps solution, and it has an offline mode.

sarthakpati avatar Apr 01 '22 00:04 sarthakpati

The more I think about this, the more I realize that perhaps using Tensorboard in a nicely thought-out manner would be enough to record the information for pretty much every kind of experimentation matrix we are running.

sarthakpati avatar Apr 01 '22 01:04 sarthakpati

Stale issue message

github-actions[bot] avatar Jun 04 '22 19:06 github-actions[bot]

Stale issue message

github-actions[bot] avatar Aug 04 '22 19:08 github-actions[bot]

I recently came across wandb, which is free and seems to be good for hyperparameter sweeps and visualization in general - https://wandb.ai/site.

meghbhalerao avatar Sep 07 '22 00:09 meghbhalerao

Thanks!

I have seen this before and it is pretty good. Only one issue, though: it needs to be deployed as a web app and isn't self-contained (unlike, for instance, TensorBoard). Ideally, it would be great to have TensorBoard-like functionality integrated into our workflow: it provides enough flexibility for local deployment and use, while still offering the option of server-side deployment.

sarthakpati avatar Sep 07 '22 02:09 sarthakpati

There are 2 major things we want to accomplish from this:

  1. Visualize results from a training process during hyper-parameter tuning
  2. Save console output to file [ref]

sarthakpati avatar Sep 07 '22 19:09 sarthakpati

There are 2 major things we want to accomplish from this:

  1. Visualize results from a training process during hyper-parameter tuning
  2. Save console output to file [ref]

I feel 2 can be done well using the default logging module. A basic example shows that this works well in our multi-module structure. However, it would require a significant engineering effort.

Thoughts?

sarthakpati avatar Sep 10 '22 15:09 sarthakpati

Agreed that the built-in logging module is best for this. I can handle that work. This will mean we need to start requesting changes on PRs that use plain print statements. Loguru and snoop are cool, but my hunch is that, with the type of code we are writing, they would mostly just print Python object representations like "<numpy.array at 0xdeadbeefbadbabe>".

Can I ask what the intended user workflow is to visualize hyperparameter tuning? Do we expect this to be part of "gandlf_collectStats", (i.e., done post-hoc after a training is performed)? Or is this something we want to be generating/visualizing while training is running?

AlexanderGetka-cbica avatar Sep 10 '22 17:09 AlexanderGetka-cbica

I can handle that work.

Awesome, thank you! Please let me know how I can help.

This will mean that we will need to start requesting changes PRs with plain print statements.

Once you have the logger class set up, we would need to define which print statements become warnings, errors, and so on. Can the print statements be redirected to the logger class config?
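One possible stopgap for redirecting existing print statements (a sketch only, not GaNDLF's actual mechanism; the `PrintToLogger` class and logger name are hypothetical) is a file-like shim assigned to `sys.stdout` until the calls are converted properly:

```python
import logging
import sys

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

class PrintToLogger:
    """File-like shim that forwards print() output to a logger."""

    def __init__(self, logger, level=logging.INFO):
        self.logger = logger
        self.level = level

    def write(self, message):
        message = message.strip()
        if message:  # skip the bare newlines that print() emits
            self.logger.log(self.level, message)

    def flush(self):
        pass  # logging handlers flush themselves on each record

# Route legacy print() calls through the logging pipeline...
sys.stdout = PrintToLogger(logging.getLogger("gandlf.legacy"))
print("this now goes through the logging pipeline")
# ...and restore normal output afterwards.
sys.stdout = sys.__stdout__
```

The downside is that every redirected message arrives at a single fixed level, which is why converting the print statements to explicit `logger.warning(...)` / `logger.error(...)` calls is still the end goal.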

Can I ask what the intended user workflow is to visualize hyperparameter tuning? Do we expect this to be part of "gandlf_collectStats", (i.e., done post-hoc after a training is performed)? Or is this something we want to be generating/visualizing while training is running?

I wanted to discuss with you all how to obtain the "best hyperparameters" after a set of N experiments. I guess gandlf_collectStats would be the most extensible and maintainable way to do this. What do you think? Also, we should keep this in a separate issue altogether to make the PRs easier to review.

sarthakpati avatar Sep 10 '22 18:09 sarthakpati

Stale issue message

github-actions[bot] avatar Nov 09 '22 19:11 github-actions[bot]