Is it OK to have a missing version directory?

Open zimeon opened this issue 5 years ago • 21 comments

Fixture https://github.com/OCFL/fixtures/pull/79 / E010_missing_versions brings up an interesting question for me. Is it really necessary to have a version directory for every version? There will be a version directory if there is an inventory for every version, but this is not required. If an implementation isn't storing an inventory for every version and a version doesn't add any new content files, is an empty version directory required?

zimeon avatar Apr 20 '21 13:04 zimeon

I brought this issue up previously (https://github.com/OCFL/spec/issues/535) and at the time was told that you must always have a version and if you aren't storing an inventory for every version you're doing it wrong.

pwinckles avatar Apr 20 '21 20:04 pwinckles

Getting back to the core OCFL principles, I would argue that the first paragraph of "3.3 Version Directories" defines an important characteristic of OCFL. However, given the loophole raised in https://github.com/OCFL/spec/issues/535, I would suggest we add wording to the middle of the first paragraph of "3.7 Version Inventory and Inventory Digest" along the lines of:

In the case where no files have been added or updated in a given version, which would result in an empty and therefore absent "content" directory (see https://ocfl.io/1.0/spec/#content-directory), such a version directory MUST include an inventory file.

awoods avatar Apr 25 '21 14:04 awoods

I feel uncomfortable with the idea that we might require an inventory.json just as a way to keep the version directory in implementations that choose otherwise not to have an inventory in the version directories.

zimeon avatar Nov 16 '21 14:11 zimeon

Suggested change to 3.3 Version Directories. Paragraph 1 should read (changes are highlighted):

OCFL Object content MUST be stored as a sequence of one or more versions. The sequence of version numbers is the sequence of positive, base-ten integers: 1, 2, 3, etc., and the version directory name is constructed by adding the prefix v. The version number sequence MUST start at 1 and MUST be continuous without missing integers. Each object version MUST be stored in a version directory under the object root.

and then the last paragraph should read (changes are highlighted):

There MUST be no other files as children of a version directory, other than an inventory file, an inventory digest, or a .no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. Once created, the contents of a version directory are expected to be immutable.

I don't think the suggested changes indicate that you can't have an empty version directory which is of course the whole point of this ticket.
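
As a rough illustration of the first quoted paragraph, a validator could check that every version from 1 up to the head version has a directory under the object root. This is only a sketch: `missing_version_dirs` and its `head` parameter (the head version number, which would come from the root inventory) are hypothetical names, and only non-zero-padded directory names (`v1`, `v2`, ...) are handled.

```python
import re
from pathlib import Path

# Assumption: non-zero-padded version directory names only (v1, v2, ...).
VERSION_DIR = re.compile(r"^v([1-9][0-9]*)$")


def missing_version_dirs(object_root, head):
    """Return the version numbers in 1..head that have no directory under object_root.

    `head` is assumed to be the head version number read from the root inventory.
    """
    present = set()
    for child in Path(object_root).iterdir():
        match = VERSION_DIR.match(child.name)
        if match and child.is_dir():
            present.add(int(match.group(1)))
    return [n for n in range(1, head + 1) if n not in present]
```

An object whose version 2 only renamed a file, and whose implementation stores no version inventories, would show up here with `2` in the returned list, which is the case this ticket is about.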

rosy1280 avatar Nov 30 '21 03:11 rosy1280

We need additional language that lets readers know that a .no_content file should exist when (1) you don't store an inventory file in your version directories AND (2) your version does not have any content to be stored (e.g. the version was created to document a file name change).

Suggestion on language to use welcome.

cc: @zimeon @awoods

rosy1280 avatar Nov 30 '21 14:11 rosy1280

Per Slack discussion with @neilsjefferies, I don't see any benefit in making the no_content file a "dot"/hidden file.

Questions:

  1. Is it allowed to have a no_content file AND an inventory? (I think YES)
  2. Is it allowed to have a no_content file AND a content sub-directory? (I think NO)
  3. Is it preferred to have a no_content file in the case that there is no content sub-directory, even if there is an inventory? (I think YES)
  4. What should the content of the no_content file be? (I suggest empty, SHOULD?)

Assuming my answers to the above, I think we could write something like the following, although it ends up as a bit of a mouthful:

There MUST be no files as children of a version directory except an inventory file, an inventory digest, or a no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. The version directory MUST NOT be empty and in the case that there is no content sub-directory there SHOULD be a no_content file. If present, the no_content file SHOULD be empty. Once created, the contents of a version directory are expected to be immutable.

zimeon avatar Nov 30 '21 15:11 zimeon

That language does not enforce point 2

pwinckles avatar Nov 30 '21 16:11 pwinckles

Is there any harm in making a no_content file mandatory in the absence of a content subdirectory?

neilsjefferies avatar Nov 30 '21 16:11 neilsjefferies

We also discussed whether or not 'no_content' should have content and felt that for validation it didn't matter; we would just check for presence. Therefore we would remain silent on whether or not there is content in the no_content file. @zimeon can you explain why we need to dictate that no_content is zero length?

rosy1280 avatar Nov 30 '21 16:11 rosy1280

@pwinckles re. https://github.com/OCFL/spec/issues/540#issuecomment-982779257 - yes indeed, good point

@neilsjefferies re. https://github.com/OCFL/spec/issues/540#issuecomment-982783608 - if we make it mandatory then we don't have backward compatibility with 1.0... but now I think about it, requiring no_content when there isn't a content directory is also not backwards compatible so maybe this whole change has to wait for 2.0?

@rosy1280 re. https://github.com/OCFL/spec/issues/540#issuecomment-982786512 - no particular reason why no_content should have no content (though you have to admit it is kinda cute). I do think it is better to recommend something as that avoids someone having to make an arbitrary implementation decision.

Taking the above into account, a revised proposal might be:

There MUST be no files as children of a version directory except an inventory file, an inventory digest, or a no_content file. The version directory SHOULD NOT contain any directories other than the designated content sub-directory. The version directory MUST NOT be empty. In the case that there is no designated content sub-directory there [SHOULD|MUST] be a file named no_content, and there MUST NOT be a file or directory named no_content otherwise. If present, the no_content file [SHOULD be empty|MAY be empty or have any content]. Once created, the contents of a version directory are expected to be immutable.
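
To make the draft wording above concrete, here is a minimal sketch of how a validator might apply it. Everything in it is an assumption for illustration: the function name `check_version_dir`, the `no_content_required` flag (standing in for the unresolved [SHOULD|MUST] choice), the `content_dir` parameter (the designated content sub-directory, defaulting to `content`), and the approximation that any file named `inventory.json.<something>` is the inventory digest. It does not check the contents of `no_content`, since that point is still open.

```python
from pathlib import Path


def check_version_dir(version_dir, content_dir="content", no_content_required=True):
    """Return a list of (level, message) tuples for the draft no_content rules."""
    path = Path(version_dir)
    children = list(path.iterdir())
    issues = []

    # "The version directory MUST NOT be empty."
    if not children:
        issues.append(("error", "version directory is empty"))

    # Only an inventory, an inventory digest, or no_content may appear as files.
    for child in children:
        allowed = (
            child.name in ("inventory.json", "no_content")
            or child.name.startswith("inventory.json.")  # assumed digest sidecar
        )
        if child.is_file() and not allowed:
            issues.append(("error", f"unexpected file: {child.name}"))

    marker = path / "no_content"
    if (path / content_dir).is_dir():
        # With a content sub-directory, nothing named no_content may exist.
        if marker.exists():
            issues.append(("error", "no_content present alongside content directory"))
    elif not marker.is_file():
        # Without one, a no_content file SHOULD or MUST be present.
        level = "error" if no_content_required else "warning"
        issues.append((level, "no content directory and no no_content file"))

    return issues
```

Under this reading, a version directory holding only a `no_content` file passes cleanly, while one holding only an `inventory.json` is reported at whichever level the flag selects.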

zimeon avatar Nov 30 '21 22:11 zimeon

@zimeon 👍🏼 to it may be a breaking change. I've been wondering that as we drafted this. Should have said something sooner.

rosy1280 avatar Nov 30 '21 22:11 rosy1280

Given the fact that we do not want to introduce any breaking changes in a 1.1 release, would a softening of my earlier suggestion to a SHOULD instead of MUST be sufficient guidance for this release?

In the case where no files have been added or updated in a given version, which would result in an empty and therefore absent "content" directory (see https://ocfl.io/1.0/spec/#content-directory), such a version directory SHOULD include an inventory file.

awoods avatar Dec 01 '21 01:12 awoods

I am now leaning towards @awoods's suggestion as the minimal change to the spec required to resolve the issue. It is a little untidy only if you are taking the NOT RECOMMENDED route of not having version inventories.

neilsjefferies avatar Dec 01 '21 10:12 neilsjefferies

I do not think we should adopt the earlier suggestion with SHOULD instead of MUST, because it doesn't solve the problem: omitting a version directory would still only produce warnings, albeit now two of them (no inventory and no directory).

I think we should punt this to v2.0 with the understanding that in v1.0 (and v1.1) it is possible (though not recommended) to not have a version directory in the case of no files updated and no version inventories stored. I don't see a non-breaking correction/fix without other implications.

zimeon avatar Dec 01 '21 13:12 zimeon

I agree that my updated suggestion does not solve the problem. It does, however, provide clear guidance on how to address the empty version directory scenario.

If that guidance is less helpful than not, I am happy to leave the text as-is, and punt to 2.0.

awoods avatar Dec 01 '21 14:12 awoods

Is it spelled out somewhere what the compatibility between 1.0 and 1.1 is supposed to be? Is 1.1 supposed to just be 1.0, but with a few validations made explicit?

It's true that the no_content change would make the representation on disk of 1.0 and 1.1 versions substantively different so that some 1.0 versions would be invalid per 1.1 and some 1.1 versions would be invalid per 1.0. However, this is only true depending on how validators are intended to behave.

If an object is created under 1.0 and is later "upgraded" to 1.1, should the 1.0 versions be validated against the 1.0 spec or the 1.1 spec?

For my validators, I had originally planned on simply updating everything to validate to 1.1, because the majority of the changes were providing clarity to constraints that could already be inferred from the 1.0 spec.

If 1.1 were to include the no_content change then things become more complicated, and I was thinking of validating versions based on the spec version that they were created under. It additionally introduces the complication for OCFL clients that would need to create versions slightly differently depending on the current spec version the object conforms to.

All of that to say, I think you could put the no_content change in 1.1 if you wanted. It would make clients and validators more complicated, but it wouldn't "break" anything. Personally, I'm just as happy punting on it because it means I have less work to do, and this is a very niche edge case.

pwinckles avatar Dec 01 '21 14:12 pwinckles

@pwinckles : per your comment about validating versions: The spec is clear that versions should be validated against the version they were written to conform to, but this isn't actually clear without a version inventory... I have created https://github.com/OCFL/spec/issues/569 to discuss

zimeon avatar Dec 01 '21 14:12 zimeon

Requiring a namaste file in all versions would solve the empty directory problem. :D

pwinckles avatar Dec 01 '21 14:12 pwinckles

+1 Punt this one to 2.0

neilsjefferies avatar Dec 01 '21 14:12 neilsjefferies

...is there any mileage in putting something about this in the Implementation Notes?

neilsjefferies avatar Dec 01 '21 15:12 neilsjefferies

Agreement in community call to delay until 2.0 (@rosy1280 @awoods @julianmorley @zimeon present). Removing 1.1 tag

zimeon avatar Dec 09 '21 00:12 zimeon