Feature request: pyxet commit history

Open xdssio opened this issue 2 years ago • 0 comments

Rationale

Our "timestamps" are commits, so we should be able to explore them easily. The function name could be "commit", "commits", or "history". I prefer "history" over "commit": it has less of a git connotation, it feels more file-oriented (unlike a branch merge, which is also a commit but spans many files), and it is very explicit.

  • If files is True, each entry also lists all files changed in that commit. This is a simple way to answer questions like: what were the model card, metrics, and database state when model X was uploaded?

Use cases

  • Checking the local data commit for reproducibility at the beginning of an experiment.
  • Checking the model commit at the end of an experiment to record the link between model, data, and code.
  • Committing a preprocessing script together with the processed data, so that both are tracked in the same commit.
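The first two use cases could look roughly like the sketch below. It assumes a `history` callable with the signature proposed in this issue (nothing like it is confirmed to exist in pyxet); `record_provenance` and its arguments are made-up names for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical: `history` stands in for the fs.history proposed in this issue.
# It is passed in as a parameter because this sketch does not assume pyxet
# already provides such a method.
HistoryFn = Callable[..., List[Dict]]


def record_provenance(history: HistoryFn, data_path: str, model_path: str) -> Dict[str, str]:
    """Return the latest commit hash for the data path and the model path."""
    data_commit = history(data_path, limit=1)[0]    # newest commit touching the data
    model_commit = history(model_path, limit=1)[0]  # newest commit touching the model
    return {"data": data_commit["hash"], "model": model_commit["hash"]}
```

Storing the returned dictionary next to the experiment results would be one way to tie a model back to the exact data it was trained on.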

API suggestion:

$ fs.history("xet://user/repo/branch/file-or-folder" | "local-file-or-folder", limit:int=1, files=False)
[{"hash": ..., "message": ..., "author": ..., "date": ..., "files": [ ]}, ...] # a list sorted from newest to oldest.
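To make the intended behaviour concrete, here is a minimal sketch of the suggested semantics over an in-memory commit log. It is not pyxet code: the `COMMITS` data, the path-prefix matching rule, and the choice to return an empty `files` list when `files=False` are all assumptions made for illustration.

```python
from typing import Dict, List

# Made-up commit log standing in for a repository; every value is illustrative.
COMMITS = [
    {"hash": "a1", "message": "add data", "author": "alice",
     "date": "2023-08-01", "files": ["data/train.csv"]},
    {"hash": "b2", "message": "add model", "author": "bob",
     "date": "2023-08-10", "files": ["models/m.pkl", "models/card.md"]},
    {"hash": "c3", "message": "update data", "author": "alice",
     "date": "2023-08-12", "files": ["data/train.csv", "scripts/prep.py"]},
]


def history(path: str, limit: int = 1, files: bool = False,
            commits: List[Dict] = COMMITS) -> List[Dict]:
    """Return up to `limit` commits that touched `path`, newest first."""
    prefix = path.rstrip("/")
    # Assumption: a commit "touches" a path if it changed that file
    # or any file under that folder.
    touched = [
        c for c in commits
        if any(f == prefix or f.startswith(prefix + "/") for f in c["files"])
    ]
    touched.sort(key=lambda c: c["date"], reverse=True)  # ISO dates sort lexically
    result = []
    for c in touched[:limit]:
        entry = {k: c[k] for k in ("hash", "message", "author", "date")}
        # With files=True, list every file changed in that commit;
        # otherwise keep the key but leave it empty (an assumption).
        entry["files"] = list(c["files"]) if files else []
        result.append(entry)
    return result
```

With this toy log, `history("data", limit=2)` yields the commits `c3` and `a1` in that order, and `history("models/m.pkl", files=True)` shows that the model card was committed together with the model.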

xdssio, Aug 14 '23 10:08