validate icon indicating copy to clipboard operation
validate copied to clipboard

Switchover to using registry-common library to communicate with OpenSearch Serverless Registry

Open jordanpadams opened this issue 1 year ago • 8 comments

💡 Description

Complete switchover for referential integrity checker

⚔️ Parent Epic / Related Tickets

No response

jordanpadams avatar May 09 '24 14:05 jordanpadams

@jordanpadams @tloubrieu-jpl

Design choice - I am split or would not be asking - on config file for validate-refs. I get that the most convenient thing for the user is to let them use harvest config. The new config is more complex and controlled with a schema. Great. For validate-ref, this means it now needs to:

  1. continue sax parsing and hope harvest config does not change too much more
  2. depend on harvest as well as registry-common
  3. not use harvest config as auth input but do like all other registry-mgr tools
  4. move validate-refs out of validate and into registry-mgr since it has nothing to do with validate other than the name.
  5. move validate-refs to harvest since it has more in common with it than validate

I think 2, 4 and 5 are up for bat. 2 seems the most direct and with maven maybe we do not care one wit about all the packages that we reference. 4 offers a cleaner solution since validate-refs uses nothing in validate at all but is probably there because the same user that runs harvest and validate also wants to run validate-refs but cares nothing for tools in registry-mgr. If true about users that use harvest want to do validate-refs, then 5 seems like the best solution since they share the config and both having nothing in common with validate itself.

al-niessner avatar May 15 '24 16:05 al-niessner

Hi @al-niessner ,

I don't have a final answer on that, but to feed your internal discussion, I think we would like all the tools provided to the discipline node users to be available in a consistent way. What they need is:

  • validate
  • validate-ref
  • harvest
  • registry-mgr, archive status mostly

I don't think the same users need validate, validate-refs and harvest/registry-mgr. For example SDS development guys use validate, but not harvest or registry-mgr or validate-refs.

I don't know though where that would make more sense to move validate-refs.

We also have this ticket on which we could piggy back your suggested changes: https://github.com/NASA-PDS/registry-loader/issues/11. If we move forward with that ticket, we would not have to choose in between harvest or registry-mgr to move validate-ref (options 4 or 5).

Can that decision wait for Jordan to be back, next week ? Thanks

tloubrieu-jpl avatar May 15 '24 17:05 tloubrieu-jpl

@jordanpadams @tloubrieu-jpl

No rush on decision. I can easily punt for now.

I would suggest not implementing NASA-PDS/registry-loader#11. With Java anyway, smaller repos are better -- Java bloat is really bad. Maybe a registry-loader that depends on all the other repos for the user that then gives them a better/simpler install 1 to get 5, but combining the java code will get unwieldy fast.

al-niessner avatar May 15 '24 17:05 al-niessner

I am realizing: should we switch to using the search API directly, since Alex fixed the pagination performance issue.

tloubrieu-jpl avatar May 20 '24 17:05 tloubrieu-jpl

@jordanpadams @al-niessner ☝️

tloubrieu-jpl avatar May 20 '24 17:05 tloubrieu-jpl

@alexdunnjpl @jordanpadams @tloubrieu-jpl

Short answer: update and march forward.

Longer answer:

We all blabber about the learning curve and it is more expensive to change later than now . However, we never include the forgot curve. The real effort is the area between these two curves. It can also be shown that the forgot curve always plummets faster than the learning curve rises. By the time you eventually upgrade you will have forgotten all of the workarounds, patches, cruft, and other bad code to use the older interface that now also need updating.

If the pagination stuff is broken in SDK, then opensearch will fix it because the rest of the world will apply pressure. If they do not fix it because we are using the wrong technology (rest of world thinks pagination works well enough as is), then, well, change technologies.

Having rewritten harvest registry-common, and registry-mgr to use SDK v2, it is wonderful compared to the API direct interface. v2 does mean a complete rethink and recode of registry-api.

al-niessner avatar May 20 '24 18:05 al-niessner

@jordanpadams @tloubrieu-jpl

I am going to move validate-refs to registry-mgr unless there is compelling reason to make a mess of validate (the repo) with depending on opensearch-java just to get this ticket done.

al-niessner avatar Aug 02 '24 16:08 al-niessner

@al-niessner I would prefer we do not do that. validate-refs depends on the registry, but it is performing an integral step in validation. I know it is messier in terms of dependencies, but it is not really a function of managing the registry or loading data into the registry. I would prefer we avoid conflating the two.

jordanpadams avatar Aug 02 '24 16:08 jordanpadams