pecan icon indicating copy to clipboard operation
pecan copied to clipboard

connect do.conversions overwrite to settings

Open mdietze opened this issue 7 years ago • 8 comments

@infotroph recently discovered that the logical overwrite flags being passed into do.conversions were not connected to any information in the settings file, and thus couldn't be controlled via advanced edit of the pecan.xml.

Furthermore, the names and identities of these flags are fixed, which means that we would have to modify the function definition every time we add a new input conversion into do.conversions, which doesn't seem sustainable.

Proposed solution:

  • Deprecate overwrite flags as do.conversions arguments
  • Instead, overwrite should be set as a logical variable in settings$run$inputs$<tag>$overwrite
  • In the loop over inputs within do.conversions, we should just pull out the overwrite info and pass it on to the specific conversion function being called.

mdietze avatar Apr 30 '18 11:04 mdietze

While we're at it, do_conversions needs better documentation. If the function is only going to take a settings list as an argument (which I generally oppose in favor of explicit arguments1, but I can see how it could make sense here), then we need to be crystal clear about exactly which elements of settings it needs. We should also be explicitly checking that all the required elements are there, and failing loudly if they are not. E.g.

PEcAn.logger::severeifnot(
  "run" %in% names(settings),
  "host" %in% names(settings),
  ...,
  msg = "One or more elements are missing from the settings list"
)

(Or better yet, we should have a logger or utils utility needs_element(list, elements) that checks a list for required elements and returns an informative error if any are missing).

While we're on the topic, I also think do_conversions should be moved to the settings package. In general, I think all of our functions that work primarily with a settings object should live in PEcAn.settings, and the remaining functions should take all their arguments explicitly (see also the below idea about S3 methods).


1 One idea I've been tossing around is to use S3 methods with a settings class to do get the best of both worlds. For instance:

met.process <- function(...) {
   UseMethod("met.process")
}

met.process.default <- function(site, start_date, end_date, <etc>) {
   # Do stuff...
}

met.process.settings <- function(settings) {
  needs_elements(settings, c("run::site", "run::start_date", "run::end_date"))
  met.process.default(settings$run$site, settings$run$start_date, settings$run$end_date)
}

That said, I don't think this would necessarily work well for do_conversions. I think this could also make R CMD check complain because S3 methods are supposed to have the same dispatch argument name.

ashiklom avatar Apr 30 '18 14:04 ashiklom

While we're on the topic, I also think do_conversions should be moved to the settings package

Not PEcAn.workflow as discussed in #1722? It does takes a settings object as an argument, but the work it does on it uses, like, most of PEcAn.

infotroph avatar Apr 30 '18 16:04 infotroph

You're right, workflow makes more sense.

Perhaps it's worth thinking about whether the settings package is actually necessary, or if it makes more sense to eventually merge it with PEcAn.workflow? I know I've traditionally aired on the side of more, smaller packages with more precise purposes, but in this case, I'm not convinced a settings object as defined by PEcAn.settings makes sense outside the context of a PEcAn workflow.

ashiklom avatar Apr 30 '18 16:04 ashiklom

I tend to favor keeping the settings package more narrowly focused on read, writing, and parsing/checking the setting object.

Understand the point about being more explicit in args and documentation. There's a few trade-offs here since this is a pretty high-level function. One is that if we made the args explicit then we'd need a new wrapper function for the MultiSetting detection and papply, which means that from workflow.R things would stay just as opaque. Another is that I could see us whittling down the args to be being the inputs list and then the other errata (db connection, etc), but that inputs list is going to remain complex and functions also very explicitly update that list, which would need to be re-inserted back into the overall settings.

mdietze avatar Apr 30 '18 17:04 mdietze

Yeah, so as long as we have good documentation and checks, I think it does make sense to design workflow-related functions with settings as an argument -- just not lower-level functions, which could be useful as external utilities and should be called explicitly. If all of those lower-level functions have ... as an argument (even if it isn't used), then we can safely do things like do.call(met.process, list_of_args) even if list_of_args has extra arguments (since those will be silently and safely soaked up in the ...).


As far as modifying lists, modifyList is a fantastic function for this (and as I recently learned, it works recursively!). I use it all the time in my code to modify default parameters in function arguments:

f <- function(x, opts_list = list() {
  default_list <- list(a = TRUE, b = "something", c = list(c1 = 3, c2 = 7))
  final_list <- modifyList(default_list, opts_list)
  # do stuff with final_list
}
f(x) # Use defaults
f(x, list(a = FALSE, d = "custom thing"))  # Change a to FALSE, add new variable d, use defaults for everything else
f(x, list(c = list(c2 = 10))) # Works recursively! This changes only final_list$c$c2, but uses default final_list$c$c1

In the context of PEcAn, where we are definitely only modifying specific subsections of settings, I was envisioning something like this, which makes it clearer that we are changing only one part of settings:

new_inputs <- do_inputs_thing(settings$inputs, ...)
settings$inputs <- modifyList(settings$inputs, new_inputs)

For multiSettings, since we already have a multiSettings class with all the appropriate formalities, I'd be in favor of eventually using S3 method dispatch rather than our current if-else recursion, i.e.

do_conversions <- function(settings, ...) {
  UseMethod("do_conversions", settings)
}

do_conversions.multiSettings <- function(settings, ...) {
  papply(settings, do_conversions.default, ...)
}

do_conversions.default <- function(settings, ...) {
   # do stuff
}

...but it definitely shouldn't be a priority since the current system works fine too.

ashiklom avatar Apr 30 '18 18:04 ashiklom

This issue is stale because it has been open 365 days with no activity.

github-actions[bot] avatar Apr 14 '20 00:04 github-actions[bot]

Unstaling. Function has been moved as discussed, but the overwrite flags that were the main point of this issue still need to be changed.

infotroph avatar Apr 14 '20 12:04 infotroph

This issue is stale because it has been open 365 days with no activity.

github-actions[bot] avatar Apr 15 '21 00:04 github-actions[bot]