Capture last modification date of remote file

Open RenskeW opened this issue 3 years ago • 0 comments

Hello,

Would it be possible for cwltool to automatically retain the last modification date of a remote file, instead of setting the download timestamp as the last modification date?

When I download a file in a CWL workflow run using an explicit download step (wget, or curl --remote-time), the modification timestamp of the remote file is carried over to the downloaded file. This is not the case when I skip the download step and instead specify the remote file itself as the input for the next step: the download date is then set as the last modification date (see https://github.com/RenskeW/cwlprov-provenance/tree/83e230bff19501fbc461c3d93fac9d9c1432cc53/modification_date for a demonstration).

Since the last modification date may be of interest from a provenance perspective, it would be really helpful if cwltool also captured the remote timestamp by default (possibly in addition to the download date).

Expected Behavior

file_to_download:
  class: File
  location: https://<some_remote_file> # has last modification date  

should result in: A downloaded file with the same modification timestamp as on the remote location, instead of the download date
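
To make the request concrete, here is a minimal sketch of the behaviour I have in mind, assuming the remote server sends an HTTP Last-Modified header. The helper name apply_remote_mtime is hypothetical and this is not cwltool's actual download code; it only illustrates copying the remote timestamp onto an already-downloaded file.

```python
import os
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def apply_remote_mtime(path: str, last_modified: Optional[str]) -> bool:
    """Set the mtime of `path` from an HTTP Last-Modified header value.

    Hypothetical helper, not part of cwltool. Returns True if the
    timestamp was applied, False if the header was missing or could not
    be parsed (the download-time mtime is then left untouched).
    """
    if not last_modified:
        return False
    try:
        remote_time = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        # Unparseable header: keep the existing modification time.
        return False
    if remote_time.tzinfo is None:
        # HTTP dates are expressed in GMT; treat a zone-less value as UTC.
        remote_time = remote_time.replace(tzinfo=timezone.utc)
    # Keep the access time unchanged and overwrite only the modification time.
    os.utime(path, (os.stat(path).st_atime, remote_time.timestamp()))
    return True
```

A downloader could call this right after writing the file, passing whatever Last-Modified value the response carried (for example `response.headers.get("Last-Modified")` with an HTTP client of choice). Returning False instead of raising keeps the current behaviour as the fallback when the server provides no usable timestamp.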

Actual Behavior

See https://github.com/RenskeW/cwlprov-provenance/tree/83e230bff19501fbc461c3d93fac9d9c1432cc53/modification_date

Your Environment

  • cwltool version: 3.1.20220502060230

RenskeW, May 30 '22 11:05