Add function to Optionally get `site.info` if not present
Description
This PR aims to refactor site.id and improve str_ns. The end goal is to make the the System independent of DB. Currently, I'm refactoring to create a siteID if not present already. I'll add test cases to check this util function too. Work still in pending
Some comments from @mdietze :
- Here's my slightly different interpretation :
- Require the user to input a dataframe with lat/lon and optionally siteID; if siteID is not provided, >construct a unique identifier from lat/lon
- replace str_ns with provided unique identifier
- Provide a helper function that takes in a BETY connection and site IDs and returns lat, lon, and str_ns
Motivation and Context
May Fix a Subtask of #3307
Review Time Estimate
- [ ] Immediately
- [ ] Within one week
- [x] When possible
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [ ] My change requires a change to the documentation.
- [ ] My name is in the list of CITATION.cff
- [ ] I have updated the CHANGELOG.md.
- [ ] I have updated the documentation accordingly.
- [x] I have read the CONTRIBUTING document.
- [ ] I have added tests to cover my changes.
- [x] All new and existing tests passed.
@meetagrawal09 can you cross check if corresponding changes in test.met.process are valid or not? :)
I think you're taking the wrong approach to task 2 here: Yes, "if siteID is not provided, construct a unique identifier from lat/lon", but it needs to be constructed without using the DB. As we discussed in Slack, this could be as simple as something like id_str <- paste0(lat, "_", lon).
Current code is updated for db-less interaction. What more improvements can be made? Would creating a new site-id be correct for a pre existing lat and lon or should we consider fetching it from db if connection is present. Else-if con=NULL, we can generate a new siteID. We can create a int+str (For eg : dcf544e6c etc) styled site-id to distinguish this str_ns from those site-id's when con!=NULL.
Should we add a unit test to check if get.new.site() works correctly?
Fixing the CI checks and test errors
@Sweetdevil144 Thank you for the extensive work you put into this PR. It has become quite large compared to the simple idea it started from, and at the same time we've been gaining more experience running PEcAn with no database and learning how much we can simplify things by making more inputs truly required rather than trying to fill them in for you. Specifically, it's now clear that we almost always want to force the user to provide a site id rather than try to create it for them, and that in the cases we create an ID on the fly we should generally be thinking of it as a site name (i.e. something that lets a human figure out which site's in this folder) rather than guarantee its permanence or uniqueness as a global identifier.
I've taken the liberty of cherry-picking commits from this PR into a much smaller set of changes in #3656, which will close this one. I think the replacement PR captures the core ideas of this one (avoiding DB access when we know location already, allowing met download from locations with no preassigned siteid) with a much smaller footprint and better alignment with our current run setup approaches.
Let's close this Pr once #3656 is merged