Documentation of EntityContext (v2)
Are EntityContext Start and End zero-based?
I'm assuming that, since Go indexes strings from zero, the Start and End of the Spans in an EntityContext are zero-based as well. Could you confirm this and perhaps update the godoc? I traced into the UsingEntities method and found the adjustPos method, which seems to compensate for off-by-one problems. I'd like to be sure my training model is correct.
(to be clear, on the v2 branch)
At the core of this question is really: "How can I verify that my prose.ModelFromData("name", prose.UsingEntities(data)) is accurate?" I'm building a model that includes Unicode, which I realize might tie into the question of supporting other languages. Is twitter-english really English, with all the Unicode emoji?
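In the meantime, here is a minimal sketch of the sanity check I have in mind for my own training data: slice the text with each span's offsets and see whether the labeled entity comes back. It assumes zero-based offsets with an exclusive End, which is exactly the part I'd like confirmed. `sliceBytes` and `sliceRunes` are hypothetical helpers that work on plain strings and ints; they do not call prose at all.

```go
package main

import "fmt"

// sliceBytes treats start/end as zero-based byte offsets (end exclusive).
// Assumption for illustration only; prose may use a different convention.
func sliceBytes(text string, start, end int) (string, bool) {
	if start < 0 || end > len(text) || start > end {
		return "", false
	}
	return text[start:end], true
}

// sliceRunes treats start/end as zero-based rune offsets (end exclusive).
func sliceRunes(text string, start, end int) (string, bool) {
	runes := []rune(text)
	if start < 0 || end > len(runes) || start > end {
		return "", false
	}
	return string(runes[start:end]), true
}

func main() {
	// One emoji (4 bytes, 1 rune) ahead of the entity shifts byte offsets by 3.
	text := "Visit \U0001F5FD New York"

	byRunes, _ := sliceRunes(text, 8, 16)
	byBytes, _ := sliceBytes(text, 11, 19)
	fmt.Println(byRunes == "New York") // true
	fmt.Println(byBytes == "New York") // true

	// The rune offsets applied as byte offsets do not recover the entity.
	wrong, _ := sliceBytes(text, 8, 16)
	fmt.Println(wrong == "New York") // false
}
```

Running a check like this over every span before calling ModelFromData would at least show whether the offsets in my data are internally consistent under one interpretation or the other.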