feat(web): check for low-probability exact + exact-key correction matches 📚

Open jahorton opened this issue 1 year ago • 1 comments

One long-time pet peeve of mine when it comes to auto-correct: it should never correct away from a perfectly valid word of the language. Even if "it's" is a more common word in English than "its", I believe that "its" should be left in place. (I find it quite the pain on iOS, as I've had to fight iOS to leave "its" in place before.)

Even if we were to auto-correct away, we should ensure that "its" is, at least, easily accessible on the banner as an option so that a user can prevent auto-correction that way. (Admittedly, iOS does present it... I probably need to adapt.) That is, we need to detect exact matches and exact-key matches and ensure they're always visible, with priority. This is something our engine currently doesn't do well, but thanks to #11869, we now have the perfect tool to remedy the issue.

Secondly... the ModelCompositor.predict method is long and rather "monolithic". It'd be nice to spin off as much "keep"-related handling as possible into its own method.

jahorton avatar Jun 26 '24 08:06 jahorton
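
A minimal sketch of the detection described above, for illustration only. It assumes the lexical model keys words by lowercasing and stripping diacritics and apostrophes, so that "it's" and "its" share a key; `toKey`, `classifyMatch`, and `prioritize` are hypothetical names, not functions from the Keyman codebase.

```typescript
// Hypothetical sketch; none of these names come from the Keyman source.
type MatchKind = 'exact' | 'exact-key' | 'other';

// Assumed keying function: decomposes, drops combining diacritics and
// apostrophes, then lowercases. A real model may key words differently.
function toKey(text: string): string {
  return text
    .normalize('NFKD')
    .replace(/[\u0300-\u036f]/g, '')
    .replace(/['\u2019]/g, '')
    .toLowerCase();
}

// 'exact' when the suggestion equals the typed context verbatim;
// 'exact-key' when only the keyed forms agree; 'other' otherwise.
function classifyMatch(context: string, suggestion: string): MatchKind {
  if (suggestion === context) return 'exact';
  if (toKey(suggestion) === toKey(context)) return 'exact-key';
  return 'other';
}

// Moves exact matches first, then exact-key matches, and otherwise keeps
// the incoming (for example, probability-based) order.
function prioritize(context: string, suggestions: string[]): string[] {
  const rank: Record<MatchKind, number> = { 'exact': 0, 'exact-key': 1, 'other': 2 };
  return suggestions
    .map((s, i) => ({ s, i, r: rank[classifyMatch(context, s)] }))
    .sort((a, b) => a.r - b.r || a.i - b.i)
    .map((e) => e.s);
}
```

Under these assumptions, a context of "its" would keep "its" at the front of the banner and place "it's" directly after it, regardless of how the model ranks other corrections.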

I am tempted to make you feel better by saying "there, their, they're"...

Interestingly, I've seen other users who love this. "its" vs "it's" is impossible to get right without some grammatical awareness, of course... but in the more general case, I wonder if this should be surfaced as an option for users?

mcdurdin avatar Jul 05 '24 07:07 mcdurdin

I was actually thinking something similar, making it a configurable option.

jahorton avatar Jul 05 '24 08:07 jahorton

  • #11931

mcdurdin avatar Jul 05 '24 08:07 mcdurdin

Noticed this while working on focused unit tests for the followup to #11940: even with this PR in place, we're not currently auto-selecting something like "can't" with priority when the current context is "cant", where there's no exact context match but there is an exact-key match. I'll want to fix that, whether within this PR or within a descendant.

The probability-ratio requirement should probably only apply to suggestions within the same "similarity tier". If a suggestion is in a lower tier, we should probably straight-up ignore its probability component in the sum used for the thresholding ratio.

jahorton avatar Jul 09 '24 03:07 jahorton
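
One possible reading of the tier-restricted ratio from the comment above, sketched under stated assumptions: each candidate carries a numeric tier (lower means more similar to the typed context), and auto-selection requires the best candidate to hold at least `minRatio` of the probability mass within the most-similar tier that is present. The `Candidate` shape, `pickAutoSelect`, and `minRatio` are hypothetical and not taken from the Keyman source.

```typescript
// Hypothetical shapes; not taken from the Keyman source.
interface Candidate {
  text: string;
  tier: number; // assumed: 0 = exact match, 1 = exact-key match, 2 = other correction
  p: number;    // model probability for this candidate
}

// Returns the candidate to auto-select, or undefined when no candidate
// dominates its own tier. Candidates in less-similar tiers contribute
// nothing to the sum used in the ratio.
function pickAutoSelect(candidates: Candidate[], minRatio: number): Candidate | undefined {
  if (candidates.length === 0) return undefined;

  // Only the most-similar tier present is considered.
  const bestTier = Math.min(...candidates.map((c) => c.tier));
  const sameTier = candidates.filter((c) => c.tier === bestTier);

  const total = sameTier.reduce((sum, c) => sum + c.p, 0);
  if (total <= 0) return undefined;

  // Highest-probability candidate within that tier.
  const best = sameTier.reduce((a, b) => (b.p > a.p ? b : a));
  return best.p / total >= minRatio ? best : undefined;
}
```

With this rule, a low-probability exact-key match such as "can't" for the context "cant" is not outvoted by higher-probability corrections from a less-similar tier, because their probabilities never enter the denominator.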

Changes in this pull request will be available for download in Keyman version 18.0.75-alpha

keyman-server avatar Jul 25 '24 18:07 keyman-server