Change trunc to handle unicode (rune counting)

Open andrewmostello opened this issue 4 years ago • 0 comments

Update the trunc func to count and truncate by rune instead of by byte. When truncating by byte, a multi-byte Unicode character that straddles the cut point is split, which leaves invalid UTF-8 in the result. trunc now compares against the rune count and slices a []rune to perform the truncation.
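To illustrate the problem, here is a minimal sketch of byte-based truncation splitting a character. `truncBytes` is a hypothetical stand-in written for this example, not the library's actual code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncBytes is a hypothetical stand-in for byte-based truncation:
// it keeps the first n bytes of s, regardless of character boundaries.
func truncBytes(n int, s string) string {
	if n >= 0 && len(s) > n {
		return s[:n]
	}
	return s
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8

	cut := truncBytes(2, s) // keeps 'h' plus only the first byte of 'é'
	fmt.Println(utf8.ValidString(cut)) // false: the result is not valid UTF-8
}
```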

After some searching, it looks like len([]rune(s)) is optimized by the compiler (https://go-review.googlesource.com/c/go/+/108985), so checking the rune count should not require building the slice. The string is only converted into a rune slice once trunc knows that truncation is needed.
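The approach described above could look roughly like the sketch below. The name `truncRunes` and its signature are assumptions made for illustration, and the sketch only handles non-negative lengths; it is not necessarily the code in this PR.

```go
package main

import "fmt"

// truncRunes is a hypothetical name for a rune-aware truncation helper.
// It returns at most n runes of s. Negative n is not handled in this
// sketch and returns s unchanged.
func truncRunes(n int, s string) string {
	// Count runes first. Per the linked change, the compiler is expected
	// to evaluate len([]rune(s)) without materializing the slice.
	if n >= 0 && len([]rune(s)) > n {
		// Only now convert to a rune slice and cut on a rune boundary.
		return string([]rune(s)[:n])
	}
	return s
}

func main() {
	fmt.Println(truncRunes(2, "héllo"))  // hé
	fmt.Println(truncRunes(10, "héllo")) // héllo
}
```

Strings that are already short enough take the early return, so they are never converted to a rune slice.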

Note that this does not fix abbrev, which counts bytes just as trunc did but lives in a separate package. If this PR is merged, or if a maintainer requests it during review, I can open a PR against the goutils package.

Jun 14 '21 16:06 andrewmostello