imara-diff
feat: word diffs
Adds a word diffing feature for each hunk. The following questions need to be answered to move this out of draft state:
EDIT: all done!
- The current implementation is based on bytes. This is consistent with how strings are used internally in other places. Is this useful for word-diffs? We could also use .chars() or .graphemes(true) (via this crate). It's also possible to offer several of these options. (EDIT: we now use words)
- The current implementation reuses the diff, but it reallocates the bytes of the underlying strings for every hunk. Is this acceptable? If no, which alternative should we go with? (EDIT: no longer applies since we no longer use bytes)
- The current implementation reuses the diff, but it reallocates the InternedInput for every hunk. Is this acceptable? If no, which alternative should we go with? (EDIT: we now reuse the interner)
- The current implementation estimates the token count to be 256 for byte token sources. Please let me know if that is an unreasonable heuristic for byte token sources. (EDIT: no longer applies since we no longer use bytes)
- The current implementation always uses the Myers algo. Should we offer a second method which performs a minimal diff? (EDIT: no, we can add that later if requested)
- Is there anything else I am missing?
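
To make the bytes vs. chars vs. words question above concrete, here is a minimal std-only sketch of a word tokenizer. It is not the tokenizer from this PR: `split_words` and its boundary rule (runs of alphanumerics or `_`, runs of whitespace, every other character on its own) are assumptions chosen for illustration.

```rust
/// Hypothetical helper (not part of imara-diff): splits `text` into word tokens.
/// A token is a maximal run of alphanumeric characters (or `_`), a maximal run
/// of whitespace, or a single character of any other kind (punctuation etc.).
/// Concatenating the returned tokens yields the input again.
fn split_words(text: &str) -> Vec<&str> {
    #[derive(PartialEq, Clone, Copy)]
    enum Kind {
        Word,
        Space,
        Other,
    }

    fn kind(c: char) -> Kind {
        if c.is_alphanumeric() || c == '_' {
            Kind::Word
        } else if c.is_whitespace() {
            Kind::Space
        } else {
            Kind::Other
        }
    }

    let mut tokens = Vec::new();
    let mut start = 0; // byte offset where the current token begins
    let mut prev: Option<Kind> = None;

    // `char_indices` yields byte offsets that are always valid slice boundaries.
    for (idx, c) in text.char_indices() {
        let current = kind(c);
        if let Some(previous) = prev {
            // Close the token when the kind changes, or after every `Other`
            // character so that punctuation is never merged into a run.
            if previous != current || previous == Kind::Other {
                tokens.push(&text[start..idx]);
                start = idx;
            }
        }
        prev = Some(current);
    }

    // Flush the trailing token, if any.
    if start < text.len() {
        tokens.push(&text[start..]);
    }
    tokens
}
```

For a line such as "héllo wörld", byte tokens give 13 tokens, `.chars()` gives 11, and this word split gives 3, which is why word tokens tend to produce more readable intra-hunk diffs on prose and code.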
Closes #1.
@pascalkuthe @Byron I have addressed all comments. This is ready for review now.
@pascalkuthe is there a way for me to make this easier to review?
left one comment otherwise lgtm
Thanks! Can you allow CI to run?
Seems the build failed due to the usual need to deref
Let's try again