Dataset branches with unprivileged maintainers

Open effigies opened this issue 3 years ago • 0 comments

Is your feature request related to a problem? Please describe.

Problems may be discovered in uploaded datasets, most often incomplete or inaccurate metadata. If a problem is discovered while the original dataset owner is unresponsive (e.g., on vacation, lacking time to maintain the dataset, or having left academia), the preferred solution of asking the owner to update the dataset may be unworkable or at least inconveniently slow.

In many cases, the solution is evident and unambiguous to an experienced researcher, but it is not within OpenNeuro's remit to decide which researchers should be given editorial discretion over datasets.

Describe the solution you'd like

Researchers could be allowed to create branches of the dataset in which they provide a fix. The permissions would be limited, and the original dataset owner should be given the opportunity to merge those branches into the mainline dataset. These branches would have their own tag structure.

Each branch would be identified by a special "-contrib" suffix appended to the dataset's accession number. Suppose we have ds000114, owned by user A, with versions 1.0.0 and 1.0.1.

v1.0.0 -- v1.0.1

User B forks it to create ds000114-contrib001, which gets its own landing page https://openneuro.org/datasets/ds000114-contrib001. This is forked from the latest snapshot (v1.0.1). Internally, this is simply a git branch:

git branch contrib001 v1.0.1

After making changes, User B creates a snapshot (v1.0.0) that is implemented as tag contrib001.v1.0.0:

git tag -a contrib001.v1.0.0

v1.0.0 -- v1.0.1
                 \
                  contrib001.v1.0.0

This is given DOI doi:10.18112/openneuro.ds000114-contrib001.v1.0.0. User B proceeds to publish a paper based on the modified dataset, citing this DOI.
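
The mapping between the public identifiers and the git refs in this example could be sketched as below. The helper names are hypothetical, and the three-digit zero padding is only inferred from "contrib001"; none of this is existing OpenNeuro code.

```shell
#!/bin/sh
# Hypothetical helpers; names and padding width are assumptions for illustration.

# contrib_branch N -> contrib001 (N is a plain decimal integer)
contrib_branch() {
    printf 'contrib%03d\n' "$1"
}

# contrib_dataset_id ACCESSION N -> ds000114-contrib001
contrib_dataset_id() {
    printf '%s-%s\n' "$1" "$(contrib_branch "$2")"
}

# contrib_tag N VERSION -> contrib001.v1.0.0 (the git tag behind a snapshot)
contrib_tag() {
    printf '%s.v%s\n' "$(contrib_branch "$1")" "$2"
}

# contrib_doi ACCESSION N VERSION -> DOI string for a contrib snapshot
contrib_doi() {
    printf 'doi:10.18112/openneuro.%s.v%s\n' \
        "$(contrib_dataset_id "$1" "$2")" "$3"
}

# Identifiers for the first contrib branch of ds000114, snapshot 1.0.0.
contrib_dataset_id ds000114 1
contrib_tag 1 1.0.0
contrib_doi ds000114 1 1.0.0
```

Keeping the branch name, tag prefix, and dataset ID derived from one counter is what preserves the visible link to the parent accession number.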

User A returns from vacation and sees a contrib branch. After inspection, they agree to adopt the changes as v1.1.0:

v1.0.0 -- v1.0.1 ------------------ v1.1.0
                 \                 /
                  contrib001.v1.0.0
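
For anyone who wants to try the history above locally, here is a minimal sketch that replays it in a throwaway repository. The file contents, commit messages, and identity are placeholders, and the use of a `--no-ff` merge for the adoption step is an assumption chosen to match the diagram, not a statement of how OpenNeuro would implement it.

```shell
#!/bin/sh
# Sketch only: replays the proposed history in a scratch repository.
# build_demo_repo is a hypothetical helper; it prints the repository path.

build_demo_repo() {
    repo=$(mktemp -d) || return 1
    desc="$repo/dataset_description.json"
    # Run git inside the scratch repo; keep stdout free for the final path.
    g() { git -C "$repo" "$@" >&2; }

    g init -q &&
    g symbolic-ref HEAD refs/heads/main &&
    g config user.name "Demo User" &&
    g config user.email "demo@example.com" &&

    # Owner (user A) publishes v1.0.0 and v1.0.1 on main.
    echo '{"Name": "demo"}' > "$desc" &&
    g add dataset_description.json &&
    g commit -q -m "Initial upload" &&
    g tag -a v1.0.0 -m "v1.0.0" &&
    echo '{"Name": "demo", "License": "CC0"}' > "$desc" &&
    g commit -q -am "Add license" &&
    g tag -a v1.0.1 -m "v1.0.1" &&

    # Contributor (user B) branches from the latest snapshot and tags a fix.
    g checkout -q -b contrib001 v1.0.1 &&
    echo '{"Name": "demo", "License": "CC0", "Authors": ["A"]}' > "$desc" &&
    g commit -q -am "Fix missing authors" &&
    g tag -a contrib001.v1.0.0 -m "contrib001 v1.0.0" &&

    # Owner adopts the contribution as v1.1.0 via a merge commit.
    g checkout -q main &&
    g merge -q --no-ff -m "Adopt contrib001.v1.0.0" contrib001 &&
    g tag -a v1.1.0 -m "v1.1.0" &&

    printf '%s\n' "$repo"
}
```

Running `repo=$(build_demo_repo)` followed by `git -C "$repo" log --graph --oneline --decorate --all` should show the same shape as the diagram.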

Notes:

  1. A contrib branch must be fast-forwardable from the latest canonical snapshot before it can be tagged. If the dataset owner creates a new tag, the contributor cannot tag again without first merging in main.
  2. Modifying annexed objects should probably require admin approval.
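
The fast-forward condition in note 1 could be checked with `git merge-base --is-ancestor`, as in the sketch below. It assumes canonical snapshots are the tags matching `v*` and that the highest version is the latest; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch of the pre-tagging check from note 1; helper names are assumptions.

# latest_canonical_tag REPO -> highest tag matching v* (e.g. v1.0.1).
# Contrib tags such as contrib001.v1.0.0 do not match the v* pattern.
latest_canonical_tag() {
    git -C "$1" tag --list --sort=-v:refname 'v*' | head -n 1
}

# can_tag_contrib REPO CONTRIB_BRANCH
# Succeeds only if the latest canonical snapshot is an ancestor of the
# contrib branch, i.e. the owner could fast-forward to it.
can_tag_contrib() {
    latest=$(latest_canonical_tag "$1")
    [ -n "$latest" ] || return 1
    git -C "$1" merge-base --is-ancestor "$latest^{commit}" "$2^{commit}"
}

# Usage: can_tag_contrib /path/to/dataset contrib001 && echo "ok to tag"
```

If the owner has published a newer snapshot, the check fails until the contributor merges main into the contrib branch.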

Describe alternatives you've considered

  1. Entirely new accession numbers. Loses obvious connection between branches.
  2. GitHub-based workflow. Prefer to retain loose coupling.

effigies avatar Aug 24 '22 21:08 effigies