bagit-java icon indicating copy to clipboard operation
bagit-java copied to clipboard

make BagVerifier extendable

Open rvanheest opened this issue 7 years ago • 3 comments

This PR aims to make BagVerifier extendable. In order to do so, I removed the final keyword from class level and added it to the methods isValid, isComplete, checkHashes, getExecutor and getManifestVerifier. This way, the class is extendable, but still ensures that the implementations of these methods is not changed (as these are defined in https://tools.ietf.org/html/draft-kunze-bagit). Besides, I gave the method checkHashes access level protected, such that not only other classes within the same package, but also classes that extend BagVerifier can call this. Since this method was not part of the public API, I decided against public access level here, such that it remains outside the public API.

The reason for making BagVerifier extendable comes from a use case we have in our project. Besides 'complete' and 'valid', we define 'virtually valid'. This requires us to partially copy a large chunk of this class and also copy all functionality defined in checkHashes. Rather than copying, we would prefer to just extend the BagVerifier class and call checkHashes. That way we can still rely on this library and not end up with duplicate code.

As an example, as well as a way to check this functionality works at compile time, I added ExtendedBagVerifier to the integration folder. It is not clear to me, though, whether this folder is picked up by Gradle for compilation/testing, so I'm not sure if this is actually the correct place to put this file and ensure that it gets checked on compile time.

rvanheest avatar Aug 06 '18 10:08 rvanheest

Coverage Status

Coverage remained the same at 98.282% when pulling a56225c0460e241eba4c7563e29f01b702fe6893 on rvanheest-DANS-KNAW:extendable-verifier into 2b7002e62d721f5eb50461e4c4c70a6ef643ec1d on LibraryOfCongress:master.

coveralls avatar Aug 06 '18 10:08 coveralls

@jscancella I'm not really sure what you mean by

If you merge this PR in I would recommend extensive testing to ensure any assumptions made during the original design still hold and extended classes wouldn't break functionality.

Rather than having a final class, I now made all public methods final. Since extended classes aren't able to override the public methods, they are not able to break current functionality. Of course, testing is good, but I don't know if one can test that overriding is not possible. This would generally result in compile errors, while I read your comment as 'creating unit tests to verify this'.

I only now see your comment in https://github.com/LibraryOfCongress/bagit-java/pull/125#issuecomment-407854411 (missed it because of holidays). Indeed we could also do it that way, but that would require extra disk space (note that for large bags, you need a lot of extra space!). That's exactly why I wanted to do it like this.

For now we decided to keep 'virtually valid' on the backburner. We'll first look at other things while we keep this one in mind. Until then, this PR may be unnecessary. We'll see how far we come without it :-)

rvanheest avatar Aug 07 '18 14:08 rvanheest

@rvanheest I would recommend you take a look at Effective Java 3rd Edition by Joshua Bloch since he helped design the language while at SUN. In he mentions many items that recommend against your proposal with reasons why they should be avoided. Specifically I am thinking of chapter 4 on classes and interfaces (and more specifically item 19 - Design and document for inheritance or else prohibit it).

As for having virtually valid bags take up more space, I would recommend you not store them as bags. Instead I would use a object store with built in redundancy and de-duplication (like swift or ceph.

jscancella avatar Aug 15 '18 13:08 jscancella