fs icon indicating copy to clipboard operation
fs copied to clipboard

Expand on "File object will likely be no longer readable"

Open a-sully opened this issue 4 years ago • 7 comments

Migrated from https://github.com/WICG/file-system-access/issues/121 and https://github.com/WICG/file-system-access/issues/244 (see https://github.com/whatwg/fs/issues/2)

@foolip:

https://fs.spec.whatwg.org/#api-filesystemfilehandle-getfile says in domintro:

If the file on disk changes or is removed after this method is called, the returned File object will likely be no longer readable.

This isn't the normative text, but when the normative algorithm is written down will there really be any uncertainty to justify "likely" or can it be defined exactly what happens and when.

For a File that is no longer readable, does that mean that it's somehow neutered? Would lastModified reflect the change/removal time, or how would one tell that a File object is in this state?

@mkruisselbrink:

The note there is mostly trying to describe what is already (not well) defined in FileAPI. I.e. this is the behavior for all File objects that represent actual files on disk. Unfortunately this is also not very well specified in FileAPI (see for example https://github.com/w3c/FileAPI/issues/47, but also https://github.com/w3c/FileAPI/issues/75). I definitely intend to fix that, but will need to fix the FileAPI side of things before I can fix this side.

@mkruisselbrink:

We should define behavior of FileSystemFileHandle.getFile() when the underlying file no longer exists. Currently the spec is pretty much silent on this, we should probably make it clear/explicit that this should reject with a NotFoundError.

a-sully avatar Mar 08 '22 20:03 a-sully

So "snapshot state" ("Set f’s snapshot state to the current state of entry.") indicates that the data returned may not reflect later changes to the file; implying a read-into-memory (or copy) operation when the snapshot state is created. Note that the definition of 'snapshot state' is somewhat vague and circular if you follow the links. This is at odds with the comments in this issue and in the getFile definition.

jesup avatar Jun 16 '22 02:06 jesup

See the discussion in https://github.com/w3c/FileAPI/issues/47. @saschanaz has been doing some work in this area and it seems rather hard to completely prevent a File object from being readable after its backing file was modified on disk. For typical cases where the backing file's timestamp changes we can do it, but if that stays the change there's not really a way to do it.

annevk avatar Jun 16 '22 07:06 annevk

The wpt tests assume that using a blob after the original file has been removed will reject with NotFoundError; this isn't in the spec (and may not be always possible per above, and shouldn't be the case if my comment from June 15 is correct (that snapshot state implies the data is read or copied at creation time))

jesup avatar Jul 28 '22 21:07 jesup

Is this spec the right place for this discussion? I don't think anyone actually wants to create a copy of the file in getFile(). I agree with the sentiment that "snapshot state" is somewhat misleading phrase, but the terminology is from the File API.

It seems to me like the updating the File API spec to better specify when a file becomes unreadable would allow this spec to just point to that?

a-sully avatar Jul 28 '22 21:07 a-sully

I think we need snapshot state to be fully defined. The File API says it's set to the "state" of the file on disk; this could be interpreted to mean the state including (all) the data on disk, or it could be something else, but it sounds like the entire state of the file (i.e. a copy). Later on talking about errors, it implies that it's snapshotting some type of metadata and returns errors if they don't match. I suspect they meant it to refer to some type of metadata, but what and how this would interact with the system is confusing.

Also, if the file changes it may well still be readable. Also we don't want to spec something that requires stat() before every operation just to check if the 'snapshot state' has changed...

jesup avatar Aug 02 '22 04:08 jesup

Also we don't want to spec something that requires stat() before every operation just to check if the 'snapshot state' has changed...

I'm not a WebKit/Blink expert, but when I last checked their File API behavior they seemed to check the timestamp (which does require stat()) to detect the change. That's really unfortunate :(

saschanaz avatar Aug 02 '22 22:08 saschanaz

Yes, that's how blink implements file-backed Blobs/File objects. For the general case I don't really see a better option, but for fs-api backed blobs (i.e. the ones this spec deals with) I don't see any reason why we couldn't do better and rather than checking on every operation instead have any operation that modifies a file in the fs API effectively invalidate all Blob/File objects pointing to the file. The general file case needs to deal with external modifications to files, but for files in the fs API it should be fine to assume that there are no external modifications. w3c/FileAPI#154 was my partial attempt at some point to better define snapshot state etc (or at least lay the groundwork for enabling defining it better). I still hope to revisit that at some point, although I don't have any concrete timeline in mind.

mkruisselbrink avatar Aug 02 '22 23:08 mkruisselbrink