Etherpad export should not include all changesets if limited by revision
I hope it's okay I am bundling these issues together, as I suppose they might be related.
1. The /etherpad export endpoint always returns the head revision: /p/{padname}/{rev}/export/etherpad returns head for all 'rev' values. Unlike both the /html/ and /txt/ exports, which work as expected. This also affects (and can be reproduced via) timeslider's "Export current version (as Etherpad)".
2. On the API side however, getHTML(padID, [rev]) ignores 'rev', always returning head revision.
3. ... while getText(padID, [rev]) works as expected by taking 'rev' into account. But it has a different return schema when 'rev' is specified. It wraps the "text" object in another "text" object, and also adds an "attribs" blob.
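To make the first point easier to reproduce, here is a minimal sketch that builds the three export URLs for one pad and one revision. The path with 'rev' is the one quoted above; the path without 'rev', the base URL http://localhost:9001 and the pad name "mypad" are assumptions for illustration, and export_url is a hypothetical helper, not part of Etherpad.

```python
from urllib.parse import quote


def export_url(base_url, pad_name, fmt, rev=None):
    """Build a pad export URL; rev=None omits the revision segment."""
    root = base_url.rstrip("/")
    pad = quote(pad_name, safe="")
    if rev is None:
        # Assumed path for "latest revision" exports.
        return f"{root}/p/{pad}/export/{fmt}"
    # Path as quoted in the report: /p/{padname}/{rev}/export/{fmt}
    return f"{root}/p/{pad}/{rev}/export/{fmt}"


# Same pad, same revision, three formats. Per the report, html and txt
# honour the revision, while etherpad comes back as the head revision.
urls = [
    export_url("http://localhost:9001", "mypad", fmt, rev=5)
    for fmt in ("html", "txt", "etherpad")
]
```

Fetching the three URLs in turn and comparing the contents against the timeslider at the same revision should show the difference.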
Etherpad 1.8.13 (no plugins), Ubuntu 18.04.5 LTS, Node v12.22.1, npm 6.14.12
Thanks for the report! Marking as bug due to getText not behaving as expected.
RE 1: Not sure if I would consider this a bug. Maybe we should entirely remove the rev parameter here. The etherpad export is special and is meant to return the full set of all information that is necessary to e.g. import the pad again, do your own analysis etc. It would be relatively easy to include only the changesets up to a given revision, but we would also need to limit the number of attributes (so that attributes are not included that were only used in revisions after our given revision). Not sure if this is useful?
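As a rough illustration of the "relatively easy" half, the sketch below drops revision records above a cutoff from an export-like dictionary. It assumes the export is a flat JSON object whose revision entries are keyed as pad:<padId>:revs:<n>; that layout, the sample data and the helper name drop_later_revisions are assumptions, not confirmed by this thread. It deliberately leaves the pad record (text, attribute pool, head) untouched, which is exactly the harder part described above.

```python
def drop_later_revisions(records, pad_id, rev):
    """Return a copy of records without revision entries newer than rev."""
    prefix = f"pad:{pad_id}:revs:"  # assumed key layout
    kept = {}
    for key, value in records.items():
        if key.startswith(prefix) and int(key[len(prefix):]) > rev:
            continue  # changeset from after the cutoff
        kept[key] = value
    return kept


# Toy data: values are placeholders, not real changesets.
sample = {
    "pad:demo": {"head": 3},
    "pad:demo:revs:0": {"changeset": "c0"},
    "pad:demo:revs:1": {"changeset": "c1"},
    "pad:demo:revs:2": {"changeset": "c2"},
    "pad:demo:revs:3": {"changeset": "c3"},
    "globalAuthor:a.x": {"name": "someone"},
}
trimmed = drop_later_revisions(sample, "demo", 1)
```

A usable limited export would still have to recompute the pad text at that revision and prune attributes and authors that only appear later.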
RE 2: I think I cannot reproduce this. Seems to work for me.
RE 3: This is a bug. Ignore the attribs for now. Caused by https://github.com/ether/etherpad-lite/commit/e7dc0766fdcc3c8477e97c83393fcc2683fab72e
Good catch!
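Until that is fixed, an API client could tolerate both return shapes. A minimal sketch, assuming the payload looks as described in the report (a plain "text" string without 'rev', a nested "text" object with "attribs" when 'rev' is given); extract_text is a hypothetical helper and the sample values are made up.

```python
def extract_text(data):
    """Return the pad text from a getText data payload, flat or nested."""
    text = data["text"]
    if isinstance(text, dict):
        # Nested shape reported when 'rev' is specified:
        # {"text": {"text": "...", "attribs": "..."}}
        return text["text"]
    return text


flat = {"text": "hello\n"}
nested = {"text": {"text": "hello\n", "attribs": "placeholder"}}
```

Both extract_text(flat) and extract_text(nested) yield the same string, so the caller does not need to know whether 'rev' was passed.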
Thank you @webzwo0i for such a brief reply.
RE 3: Great :)
RE 2: Sorry, my bad, I had a bug on my side & retract that.
RE 1: I don't think it's behaving as expected, and I think getting a specific limited changeset revision is useful. I found out the revisioned export didn't work after a collective using my pad server was confused: they tried to revert accidental changes someone made, by going to the timeslider, finding the last good revision, and export→importing it. (I'm aware there's a few different ways of doing that, but also think this one is legitimate).
Another use is "forking" pads: you might for example author a template, continue with filling it up, but would later like to split the template out, while retaining its original history.
A workaround (I've done this before) is to:
- export the latest revision to a file
- restoreRevision(padId, rev) to some historic rev
- re-import the latest .etherpad to another pad
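The three steps can be sketched as a list of HTTP requests. This only builds the request plan and sends nothing. The /api/{version}/{method} URL layout, the apikey and padID parameter names, the /p/{pad}/import upload path, the version string and all sample values are assumptions to check against your own server; api_url and workaround_steps are hypothetical helpers.

```python
from urllib.parse import quote, urlencode


def api_url(base_url, api_version, method, **params):
    """Build an HTTP API URL (assumed layout: /api/{version}/{method})."""
    return f"{base_url.rstrip('/')}/api/{api_version}/{method}?{urlencode(params)}"


def workaround_steps(base_url, api_version, api_key, pad_id, good_rev, copy_pad_id):
    """Return (HTTP method, URL) pairs in the order the steps should run."""
    root = base_url.rstrip("/")
    pad = quote(pad_id, safe="")
    copy = quote(copy_pad_id, safe="")
    return [
        # 1. Save the full history of the pad to a .etherpad file.
        ("GET", f"{root}/p/{pad}/export/etherpad"),
        # 2. Roll the original pad back to the last good revision.
        ("GET", api_url(root, api_version, "restoreRevision",
                        apikey=api_key, padID=pad_id, rev=good_rev)),
        # 3. Upload the saved file into a different pad (assumed path).
        ("POST", f"{root}/p/{copy}/import"),
    ]


steps = workaround_steps("http://localhost:9001", "1.2.11", "secret",
                         "mypad", 5, "mypad-full-history")
```

The order matters: the export has to happen before restoreRevision so that the saved file still reflects the pad as it was before the rollback.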
OK I think you have a point here. I'd consider it a feature request to make the etherpad export more useful to different use cases.
Thanks! I think it'd be really great to allow users to make backups / revert accidents in pad contents this way.
(IRL, in my experience, these will happen, and it deters groups of people from adopting pads, even if they have an editor with some chops. I mentioned the API possibility on #1791, but this isn't available for non-admins.)
But in any case, the label in timeslider and especially the export endpoint, containing "rev", are misleading.