
shift-based DFA implementation of utf8ValidateSlice

Open hollmmax opened this issue 8 months ago • 1 comments

I've been playing around with shift-based DFAs and I think they could be a good fit for the Zig standard library. In my testing on an i5-8350U, they are 3 to 8 times faster than the current `utf8ValidateSlice` implementation.

The main difference from the original gist is that this implementation uses only 32-bit shifts, making it compatible with important Zig targets like ARM, RISC-V and good old x86. The awkward downside is that this is achieved by validating the string from end to start.
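To make the mechanics concrete, here is a minimal Python sketch of a shift-based DFA that scans end to start and keeps every table row within 32 bits. It is not the submitted Zig code: the state names, the 5-bit field layout, and the function names are all assumptions made for illustration. It also only checks the lead/continuation structure, so it does not reject overlong encodings, surrogates, or code points above U+10FFFF the way a full validator must.

```python
# Hypothetical sketch: structural UTF-8 check, scanning from end to start.
# Each state value doubles as the bit offset of its 5-bit field in a row.
ACCEPT, TAIL1, TAIL2, TAIL3, ERROR = 0, 5, 10, 15, 20
STATES = (ACCEPT, TAIL1, TAIL2, TAIL3, ERROR)
MASK = 31  # 5 bits, enough to hold any offset below 32


def next_state(state: int, byte: int) -> int:
    """Reference transition function, only used to build the table."""
    if state == ERROR:
        return ERROR
    if byte < 0x80:  # ASCII is only legal between complete sequences
        return ACCEPT if state == ACCEPT else ERROR
    if byte < 0xC0:  # continuation byte: count one more pending tail byte
        return {ACCEPT: TAIL1, TAIL1: TAIL2, TAIL2: TAIL3}.get(state, ERROR)
    if 0xC2 <= byte <= 0xDF:  # 2-byte lead closes exactly one tail byte
        return ACCEPT if state == TAIL1 else ERROR
    if 0xE0 <= byte <= 0xEF:  # 3-byte lead closes exactly two tail bytes
        return ACCEPT if state == TAIL2 else ERROR
    if 0xF0 <= byte <= 0xF4:  # 4-byte lead closes exactly three tail bytes
        return ACCEPT if state == TAIL3 else ERROR
    return ERROR  # 0xC0, 0xC1 and 0xF5..0xFF never appear in UTF-8


def build_row(byte: int) -> int:
    """Pack the next state for every current state into one integer."""
    row = 0
    for state in STATES:
        row |= next_state(state, byte) << state
    return row


TABLE = [build_row(b) for b in range(256)]
assert all(row < 2**32 for row in TABLE)  # every row fits in 32 bits


def validate_reverse(data: bytes) -> bool:
    state = ACCEPT
    for byte in reversed(data):
        # The whole transition is one load, one shift and one mask.
        state = (TABLE[byte] >> state) & MASK
    return state == ACCEPT
```

With this sketch, `validate_reverse("héllo €".encode())` returns `True` and `validate_reverse(b"\xe2\x82")` returns `False`, while the overlong sequence `b"\xe0\x80\x80"` is still accepted because of the simplification noted above.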

I chose to submit an implementation that first defines a DFA and converts it to a shift transition table using comptime, although I am open to replacing that code with a pre-built table literal.
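As a rough illustration of the "define a DFA, then convert it" approach, the Python sketch below turns a readable transition function into a shift table when the module is loaded, which is loosely analogous to doing the conversion at comptime. The helper `compile_shift_table`, the state names, and the forward-scanning toy automaton are hypothetical, and the automaton again checks only lead/continuation structure rather than full UTF-8 validity.

```python
# Hypothetical sketch: convert a named-state DFA into a shift table.
def compile_shift_table(states, step, word_bits=32):
    """Return (offsets, table, mask) for the DFA described by `step`.

    `states` is an ordered list of state names and `step(state, byte)`
    returns the name of the next state.
    """
    width = (word_bits - 1).bit_length()  # bits needed for any offset
    if len(states) * width > word_bits:
        raise ValueError("too many states for this word size")
    offsets = {name: i * width for i, name in enumerate(states)}
    table = []
    for byte in range(256):
        row = 0
        for name in states:
            # Store the next state's offset in the current state's field.
            row |= offsets[step(name, byte)] << offsets[name]
        table.append(row)
    return offsets, table, (1 << width) - 1


def step(state, byte):
    """Toy forward automaton: counts continuation bytes after a lead."""
    if state == "error":
        return "error"
    if state == "start":
        if byte < 0x80:
            return "start"
        if 0xC2 <= byte <= 0xDF:
            return "need1"
        if 0xE0 <= byte <= 0xEF:
            return "need2"
        if 0xF0 <= byte <= 0xF4:
            return "need3"
        return "error"
    if 0x80 <= byte <= 0xBF:  # in need1..need3, consume one tail byte
        return {"need1": "start", "need2": "need1", "need3": "need2"}[state]
    return "error"


STATE_NAMES = ["start", "need1", "need2", "need3", "error"]
OFFSETS, SHIFT_TABLE, FIELD_MASK = compile_shift_table(STATE_NAMES, step)


def check_forward(data: bytes) -> bool:
    state = OFFSETS["start"]
    for byte in data:
        state = (SHIFT_TABLE[byte] >> state) & FIELD_MASK
    return state == OFFSETS["start"]
```

The generated `SHIFT_TABLE` could equally be printed once and pasted in as a literal, which is the trade-off mentioned above: the generator keeps the automaton readable, while a literal avoids the conversion step.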

hollmmax, May 22 '25 22:05

Hi, thanks for the feedback. For some reason I thought enums with an explicit tag type didn't play well with exhaustive switches, and since `State` is only a comptime construct, I'd rather have exhaustive switches than the tag type.

That was an incorrect assumption on my part. The enum now has an explicit tag type and values and the helper methods are completely gone.

hollmmax, May 23 '25 23:05