Bug: detecting special symbols as a word

Open lyqht opened this issue 3 years ago • 3 comments

Hello there, thank you for the great library! I tried testing it on sentences like these:

  chinese1 : [`你好吗? ✨😊`, 3],
  chinese2 : [`你好吗?!~`, 3],

but both are returning 4 words instead of the expected 3.
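For reference, here is a minimal sketch of the counting behaviour the test cases above expect. It is not alfaaz's implementation; `countWordsSketch` and `WORD_PATTERN` are hypothetical names. It assumes that each Han character counts as one word, that a run of other letters or digits counts as one word, and that punctuation, symbols, and emoji are separators.

```javascript
// Hypothetical helper, NOT part of alfaaz.
// Assumptions: each Han character is one word; a run of other letters or
// digits is one word; everything else (punctuation, emoji, spaces) separates.
const WORD_PATTERN = /\p{Script=Han}|(?:(?!\p{Script=Han})[\p{L}\p{N}])+/gu;

function countWordsSketch(text) {
  // match() returns null when nothing matches, so fall back to 0.
  const matches = text.match(WORD_PATTERN);
  return matches === null ? 0 : matches.length;
}

console.log(countWordsSketch("你好吗? ✨😊")); // 3
console.log(countWordsSketch("你好吗?!~")); // 3
```

Under these assumptions, trailing punctuation and emoji never add to the count. Note that an apostrophe also acts as a separator here, so a contraction such as "don't" would count as two words.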

lyqht avatar Apr 21 '23 11:04 lyqht

Hello, any update on this? It is also counting dashes and em dashes as words when they should be treated as spaces.

vonWolfehaus avatar Jul 29 '25 18:07 vonWolfehaus

@vonWolfehaus examples would be great!

thecodrr avatar Jul 30 '25 04:07 thecodrr

> @vonWolfehaus examples would be great!

Image

It counts this string as 10 words, but it actually contains only 8. It also incorrectly counts the dashed word ("some—words") as one word, and it counts em dashes as words in their own right when they are surrounded by spaces.

You can see a live example here. You can also select words and the word count at the bottom will display the selected word count too so you can see what it is counting.
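Until the counter itself handles dashes, one possible workaround is to replace dashes with spaces before counting. The sketch below is not alfaaz code; `replaceDashesWithSpaces` and `countWhitespaceSeparated` are hypothetical names, and the whitespace-splitting counter is only a stand-in for whatever counter is actually used. It assumes that hyphens, en dashes, and em dashes should all act as separators.

```javascript
// Hypothetical pre-processing step, NOT part of alfaaz.
// Matches hyphen-minus (U+002D), en dash (U+2013), and em dash (U+2014).
const DASHES = /[-\u2013\u2014]/g;

function replaceDashesWithSpaces(text) {
  return text.replace(DASHES, " ");
}

// Stand-in counter for illustration only: splits on runs of whitespace.
function countWhitespaceSeparated(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// A dash joining two words now separates them.
console.log(countWhitespaceSeparated(replaceDashesWithSpaces("some\u2014words"))); // 2
// A free-standing dash no longer counts as a word.
console.log(countWhitespaceSeparated(replaceDashesWithSpaces("one \u2014 two"))); // 2
```

This also splits hyphenated compounds such as "well-known" into two words, which may or may not be the behaviour you want.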

vonWolfehaus avatar Jul 31 '25 19:07 vonWolfehaus