[sil_hebrew] diacritic marks not rendering consistently cross-platform on the OSK

Open MakaraSok opened this issue 2 years ago • 17 comments

It has been noticed that some key caps on the default layer of the Hebrew SIL keyboard appear empty, but they do output characters.

  • Keyman version: 16.0.37
  • iOS version: 16.3.1
  • Keyboard: Hebrew (SIL) keyboard

Interestingly, they do show up in an older version of iOS:

[Screenshots: iOS 16.3.1, iOS 15.7.3]

MakaraSok avatar Mar 23 '23 02:03 MakaraSok

A discussion has already been started at https://community.software.sil.org/t/hebrew-sil-keyboard-has-some-invisible-characters-on-ios-16-3-1-obu/7273.

MakaraSok avatar Mar 23 '23 02:03 MakaraSok

See also:

  • https://community.software.sil.org/t/keyboard-fails-to-load-load-correctly-on-iphone-xr-11/6958/3
  • https://github.com/keymanapp/keyman/issues/7914

mcdurdin avatar Apr 13 '23 05:04 mcdurdin

Thanks for working through this issue here. Is the keyboard made by SIL as well, or did a community member make this one? Would there be any possibility of the keyboard owner trying the 25CC to 200F fix in the keyboard directly instead of in the Keyman codebase?

Agreed, having a functional font for Hebrew would be ideal, but it sounds like we don't have someone to do that. It seems like changing the characters in the keyboard would be a quick fix (although it would take more time and devices to test that it doesn't break elsewhere).

DylanCross avatar Apr 13 '23 13:04 DylanCross

Would there be any possibility of the keyboard owner trying the 25CC to 200F fix in the keyboard directly instead of in the Keyman codebase?

I am pretty sure that trying a keyboard-based fix is likely to lead to significant time and effort in testing that ultimately will lead us to conclude that we need a font-based fix :grin: (Also, in this instance, the keyboard owner is the same team that works on fonts anyway.)

mcdurdin avatar Apr 17 '23 03:04 mcdurdin

The first step to a resolution here is to get a full list of the characters (Unicode values) required. If there are keys with multiple glyphs on the key, then we'd need to list those sets as well.

mcdurdin avatar May 05 '23 03:05 mcdurdin

@mcdurdin For the full list of characters, would the list of character sequences in the .kvks file be sufficient? I suppose that combining marks would need a base (space or dotted circle?) with them for proper display.
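To make the "base" question concrete, here is a minimal Python sketch of prefixing a base character to a key cap that starts with a combining mark. The function names and the choice of U+25CC as the default base are assumptions for illustration only; this is not how Keyman itself handles key caps.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"  # assumed default base for this sketch


def is_combining_mark(ch: str) -> bool:
    """True for characters whose general category is Mn, Mc, or Me."""
    return unicodedata.category(ch).startswith("M")


def with_base(key_cap: str, base: str = DOTTED_CIRCLE) -> str:
    """Prefix a base when the key cap starts with a combining mark."""
    if key_cap and is_combining_mark(key_cap[0]):
        return base + key_cap
    return key_cap


# U+05B8 (qamats) gains a dotted circle; U+05D0 (alef) is left alone.
print(with_base("\u05b8") == "\u25cc\u05b8")
print(with_base("\u05d0") == "\u05d0")
```

Passing `base=" "` would try the space alternative instead; whether either base renders consistently is exactly what varies across platforms.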

DavidLRowe avatar May 06 '23 22:05 DavidLRowe

I suppose that combining marks would need a base (space or dotted circle?) with them for proper display.

Well, that's the entire problem we're trying to solve :grin:; refer to the proposal document at https://docs.google.com/document/d/1gJrWpZebOlPn_yn91BGk_Z8hEvVDVdxXB0ihgkW7wdg/edit#heading=h.9kamffv3tas9 for how we can solve this.

I have written a tool to analyze the .kvks and .keyman-touch-layout files and return an ordered, de-duped list of all the keycap texts as a single file, in Markdown or JSON format. I'll include the results for sil_hebrew below as an example. I have not yet published the code, but it will be part of Keyman Developer 17 shortly.
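For readers who want a feel for what such a report involves, the following Python sketch builds an ordered, de-duplicated list of key cap texts with their code points. It is not the kmc implementation; `osk_char_use` and `code_points` are hypothetical names, and the input is assumed to be a plain list of key cap strings already extracted from the .kvks and .keyman-touch-layout files.

```python
def code_points(text: str) -> str:
    """Format each character as U+XXXX, separated by spaces."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def osk_char_use(key_caps):
    """Return (code points, key cap) pairs, de-duplicated and sorted.

    Python compares strings by code point, which gives the same kind of
    ordering as the report below (for example "ZWJ" sorts before "zwj").
    """
    unique = sorted({cap for cap in key_caps if cap})
    return [(code_points(cap), cap) for cap in unique]


for points, cap in osk_char_use(["zwj", "!", "ZWJ", "!", "\u05e9\u05c1"]):
    print(points, cap)
```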

This would be the first step to building a spec for a given keyboard, I guess. Next step would be to assign PUA codes to the key caps that could be problematic for rendering (that is, any chars that require a base). We already have the KeymanWeb OSK font for special chars -- NBSP, ZWNJ, etc.
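As a rough illustration of the PUA assignment step, the Python sketch below gives each key cap that contains a combining mark its own Private Use Area code point. The starting value U+E000, the function names, and the "contains any mark" test are all assumptions made for this example; the real spec may select key caps and allocate codes differently.

```python
import unicodedata

PUA_START = 0xE000  # hypothetical first code point; the BMP PUA is U+E000..U+F8FF


def contains_mark(key_cap: str) -> bool:
    """Conservative check: any character with general category M*."""
    return any(unicodedata.category(ch).startswith("M") for ch in key_cap)


def assign_pua(key_caps, start: int = PUA_START):
    """Map each mark-bearing key cap to one PUA character, in sorted order."""
    mapping = {}
    next_code = start
    for cap in sorted(set(key_caps)):
        if contains_mark(cap):
            mapping[cap] = chr(next_code)
            next_code += 1
    return mapping


mapping = assign_pua(["\u05d0", "\u25cc\u05b0", "\u0307"])
for cap, pua in mapping.items():
    print(f"U+{ord(pua):04X}", [f"U+{ord(ch):04X}" for ch in cap])
```

A font for the on screen keyboard would then need a glyph at each assigned code point, which is why the key cap list has to be settled first.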

You'll note that some attempts to mitigate the issue with combining marks on the key caps are visible in this report. When we fix them, we'll need to clean up the key caps.

Markdown format

kmc analyze -a osk-char-use /c/Projects/keyman/keyboards/release/sil/sil_hebrew -f markdown -o sil_hebrew.md

Code Points Key Caps
U+0021 !
U+0022 "
U+0024 $
U+0027 '
U+0028 (
U+0029 )
U+002C ,
U+002E .
U+002F /
U+0030 0
U+0031 1
U+0032 2
U+0033 3
U+0034 4
U+0035 5
U+0036 6
U+0037 7
U+0038 8
U+0039 9
U+003B ;
U+003F ?
U+0043 U+0047 U+004A CGJ
U+004E U+006F U+0020 U+0062 U+0072 U+0065 U+0061 U+006B U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 No break space
U+0053 U+0020 U+0041 U+004C U+0054 S ALT
U+0053 U+0020 U+0041 U+006C U+0074 S Alt
U+005A U+0057 U+004A ZWJ
U+005B [
U+005C \
U+005D ]
U+006E U+006F U+0020 U+0062 U+0072 U+0065 U+0061 U+006B U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 no break space
U+0074 U+0068 U+0069 U+006E U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 thin space
U+007A U+0077 U+006A zwj
U+007A U+0077 U+006E U+006A zwnj
U+007B {
U+007C |
U+007D }
U+00AB «
U+00BB »
U+0307 ̇
U+0308 ̈
U+030A ̊
U+0336 ̶
U+05BE ־
U+05C0 ׀
U+05C3 ׃
U+05C5 ׅ
U+05C6 ׆
U+05C6 U+0307 ׆̇
U+05D0 א
U+05D1 ב
U+05D2 ג
U+05D3 ד
U+05D4 ה
U+05D5 ו
U+05D5 U+05B9 וֹ
U+05D5 U+05BA וֺ
U+05D5 U+05BC U+05BA וֺּ
U+05D5 U+05BC U+05D5 U+05B9 וּוֹ
U+05D5 U+05D5 U+05B9 ווֹ
U+05D6 ז
U+05D7 ח
U+05D8 ט
U+05D9 י
U+05DA ך
U+05DB כ
U+05DC ל
U+05DD ם
U+05DE מ
U+05DF ן
U+05E0 נ
U+05E0 U+0307 נ̇
U+05E1 ס
U+05E2 ע
U+05E3 ף
U+05E4 פ
U+05E5 ץ
U+05E6 צ
U+05E7 ק
U+05E8 ר
U+05E9 ש
U+05E9 U+05C1 שׁ
U+05E9 U+05C2 שׂ
U+05EA ת
U+05F3 ׳
U+05F4 ״
U+2013 –
U+2014 —
U+2022 •
U+20AA ₪
U+20AC €
U+25CC ◌
U+25CC U+0307 U+0020 U+0020 ◌̇
U+25CC U+0308 U+0020 U+0020 ◌̈
U+25CC U+030A ◌̊
U+25CC U+0591 ◌֑
U+25CC U+0592 ◌֒
U+25CC U+0593 ◌֓
U+25CC U+0594 ◌֔
U+25CC U+0595 ◌֕
U+25CC U+0596 ◌֖
U+25CC U+0597 ◌֗
U+25CC U+0598 ◌֘
U+25CC U+0599 ◌֙
U+25CC U+059A ◌֚
U+25CC U+059B ◌֛
U+25CC U+059C ◌֜
U+25CC U+059D ◌֝
U+25CC U+059E ◌֞
U+25CC U+059F ◌֟
U+25CC U+05A0 ◌֠
U+25CC U+05A1 ◌֡
U+25CC U+05A2 ◌֢
U+25CC U+05A3 ◌֣
U+25CC U+05A4 ◌֤
U+25CC U+05A5 ◌֥
U+25CC U+05A6 ◌֦
U+25CC U+05A7 ◌֧
U+25CC U+05A8 ◌֨
U+25CC U+05A9 ◌֩
U+25CC U+05AA ◌֪
U+25CC U+05AB ◌֫
U+25CC U+05AC ◌֬
U+25CC U+05AD ◌֭
U+25CC U+05AE ◌֮
U+25CC U+05AF ◌֯
U+25CC U+05B0 ◌ְ
U+25CC U+05B1 ◌ֱ
U+25CC U+05B2 ◌ֲ
U+25CC U+05B3 ◌ֳ
U+25CC U+05B4 ◌ִ
U+25CC U+05B5 ◌ֵ
U+25CC U+05B6 ◌ֶ
U+25CC U+05B7 ◌ַ
U+25CC U+05B8 ◌ָ
U+25CC U+05B9 ◌ֹ
U+25CC U+05BB ◌ֻ
U+25CC U+05BC ◌ּ
U+25CC U+05BD ◌ֽ
U+25CC U+05BF ◌ֿ
U+25CC U+05C4 ◌ׄ
U+25CC U+05C7 ◌ׇ
U+25E6 ◦

JSON format

kmc analyze -a osk-char-use /c/Projects/keyman/keyboards/release/sil/sil_hebrew -f json -o sil_hebrew.json

[
  "!",
  "\"",
  "$",
  "'",
  "(",
  ")",
  ",",
  ".",
  "/",
  "0",
  "1",
  "2",
  "3",
  "4",
  "5",
  "6",
  "7",
  "8",
  "9",
  ";",
  "?",
  "CGJ",
  "No break space",
  "S ALT",
  "S Alt",
  "ZWJ",
  "[",
  "\\",
  "]",
  "no break space",
  "thin space",
  "zwj",
  "zwnj",
  "{",
  "|",
  "}",
  "«",
  "»",
  "̇",
  "̈",
  "̊",
  "̶",
  "־",
  "׀",
  "׃",
  "ׅ",
  "׆",
  "׆̇",
  "א",
  "ב",
  "ג",
  "ד",
  "ה",
  "ו",
  "וֹ",
  "וֺ",
  "וֺּ",
  "וּוֹ",
  "ווֹ",
  "ז",
  "ח",
  "ט",
  "י",
  "ך",
  "כ",
  "ל",
  "ם",
  "מ",
  "ן",
  "נ",
  "נ̇",
  "ס",
  "ע",
  "ף",
  "פ",
  "ץ",
  "צ",
  "ק",
  "ר",
  "ש",
  "שׁ",
  "שׂ",
  "ת",
  "׳",
  "״",
  "–",
  "—",
  "•",
  "₪",
  "€",
  "◌",
  "◌̇  ",
  "◌̈  ",
  "◌̊",
  "◌֑",
  "◌֒",
  "◌֓",
  "◌֔",
  "◌֕",
  "◌֖",
  "◌֗",
  "◌֘",
  "◌֙",
  "◌֚",
  "◌֛",
  "◌֜",
  "◌֝",
  "◌֞",
  "◌֟",
  "◌֠",
  "◌֡",
  "◌֢",
  "◌֣",
  "◌֤",
  "◌֥",
  "◌֦",
  "◌֧",
  "◌֨",
  "◌֩",
  "◌֪",
  "◌֫",
  "◌֬",
  "◌֭",
  "◌֮",
  "◌֯",
  "◌ְ",
  "◌ֱ",
  "◌ֲ",
  "◌ֳ",
  "◌ִ",
  "◌ֵ",
  "◌ֶ",
  "◌ַ",
  "◌ָ",
  "◌ֹ",
  "◌ֻ",
  "◌ּ",
  "◌ֽ",
  "◌ֿ",
  "◌ׄ",
  "◌ׇ",
  "◦"
]

mcdurdin avatar May 08 '23 23:05 mcdurdin

The first step to a resolution here is to get a full list of the characters (Unicode values) required. If there are keys with multiple glyphs on the key, then we'd need to list those sets as well.

Would this include the hataf vowels? I don't see them distinctly in the JSON list. I'm not sure whether each has its own Unicode value or is a composite of two, such as ֲ or ֱ .

Sorry if I'm not the most useful. Trying to contribute what I can though.

DylanCross avatar May 12 '23 13:05 DylanCross

Thanks for your comment. I believe that the two items you list (U+05B1 HEBREW POINT HATAF SEGOL and U+05B2 HEBREW POINT HATAF PATAH) are in the JSON list.
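One way to check this yourself is with Python's standard `unicodedata` module, sketched below. The `sample` list is a small stand-in for the real sil_hebrew.json contents, and the helper names are made up for this example.

```python
import json
import unicodedata


def describe(ch: str) -> dict:
    """Report the code point, Unicode name, and whether it decomposes."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "is_composite": unicodedata.decomposition(ch) != "",
    }


def caps_containing(key_caps, ch: str):
    """Return the key caps that include the given character."""
    return [cap for cap in key_caps if ch in cap]


# Stand-in for json.load() on the real report file.
sample = json.loads('["\\u25cc\\u05b1", "\\u25cc\\u05b2", "\\u05d0"]')

for mark in ("\u05b1", "\u05b2"):
    print(describe(mark), len(caps_containing(sample, mark)))
```

Each hataf vowel is a single code point with no decomposition, so it is not a composite of two characters; in the list it appears after a dotted circle, which makes it easy to overlook.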

DavidLRowe avatar May 12 '23 20:05 DavidLRowe

Sorry if I'm not the most useful. Trying to contribute what I can though.

Thank you for your contributions! Appreciate your engagement with the issue.

As @DavidLRowe noted, these are in the JSON data. They are not the most visible, though, precisely due to the formatting issues with unattached bases which we are trying to solve for the On Screen Keyboard. Note that we are hoping to extend this work later this year to make a general proposal to Unicode, but that's a much bigger can of worms!

mcdurdin avatar May 14 '23 05:05 mcdurdin

Checking in. I'm trying to follow the chain here. It looks like there was another issue that was also intended to fix this for Hebrew. That issue has a Merge Request and was marked as closed, but the problem still persists for Hebrew. Is there another step specifically for Hebrew, or what's the current plan? Thanks again for your work on this. Keyman is awesome, but this makes the Hebrew keyboard really difficult to learn for new typists.

https://github.com/keymanapp/keyman/issues/9031

DylanCross avatar Jul 14 '23 14:07 DylanCross

@DylanCross the fix has been implemented in keymanapp/keyman#9032 and related PRs. The way we ended up addressing this is with a comprehensive feature in the Keyman Developer compiler that will be released together with Keyman 17.0, later this year, hopefully around September. This feature allows us to resolve the problem for any writing system that has inconsistent diacritic display (and there are a bunch of them, not just Hebrew). It's been an issue we've been wrestling with for some time, and I believe that the solution we've arrived at is about as clean as we could make it. See more details in the feature documentation: &displayMap store.

For now, as far as we know, this specific issue with Hebrew diacritics affects only the latest releases of iOS (and possibly macOS), and the fix works consistently on iOS, macOS, Android, Windows, and web, on all versions of those platforms (Linux has a separate blocking issue keymanapp/keyman#6186 because it does not currently support custom on screen keyboard fonts).

Once Keyman 17.0 is released, we'll be updating the compiler version referenced in the keyboards repository and can publish an update (#2258) for the Hebrew keyboard. Note that the updated Hebrew keyboard will continue to work with earlier versions of Keyman; the feature implementation is contained in the new compiler.

mcdurdin avatar Jul 17 '23 01:07 mcdurdin

This may not be a high-priority fix as KM Dev gets replaced, but I'm documenting it here as it is related. I just noticed that most of the Hebrew diacritics don't get dotted circles in the Character Map, while Arabic is about three quarters full of dotted circles. Maybe some of those are non-combining, but surely not all of them.

MattGyverLee avatar Oct 04 '23 16:10 MattGyverLee

@MattGyverLee the Keyman Developer character map uses the system renderer and fonts, so it's up to those as to how they render. We're not planning to make changes there, now or in the future. (It's not quite the same as with the keyboard, where inconsistent rendering, particularly of multiple diacritics together, has been a pain for a long time.)

mcdurdin avatar Oct 05 '23 08:10 mcdurdin

@mcdurdin

@DylanCross the fix has been implemented in keymanapp/keyman#9032 and related PRs. The way we ended up addressing this is with a comprehensive feature in the Keyman Developer compiler that will be released together with Keyman 17.0, later this year, hopefully around September. This feature allows us to resolve the problem for any writing system that has inconsistent diacritic display (and there are a bunch of them, not just Hebrew). It's been an issue we've been wrestling with for some time, and I believe that the solution we've arrived at is about as clean as we could make it. See more details in the feature documentation: &displayMap store.

For now, as far as we know, this specific issue with Hebrew diacritics affects only the latest releases of iOS (and possibly macOS), and the fix works consistently on iOS, macOS, Android, Windows, and web, on all versions of those platforms (Linux has a separate blocking issue keymanapp/keyman#6186 because it does not currently support custom on screen keyboard fonts).

Once Keyman 17.0 is released, we'll be updating the compiler version referenced in the keyboards repository and can publish an update (#2258) for the Hebrew keyboard. Note that the updated Hebrew keyboard will continue to work with earlier versions of Keyman; the feature implementation is contained in the new compiler.

A couple of questions. 1. Is there an update on the Keyman 17 release date? It's past September. I did check the blogs, but am not sure where to look. 2. Should this work in the alpha build of 17, or does something else have to be done with the compiler release first? I've been using the v17 alpha and still don't see the characters, so I figure there's another step left.

Thanks for the work to get the vowels back in Hebrew. The Masoretes and I greatly appreciate it.

DylanCross avatar Feb 05 '24 19:02 DylanCross

  1. Is there an update on the Keyman 17 release date? It's past September.

Yeah, the release slipped a fair bit because of substantial changes in the Unicode LDML Keyboards specification (which is now nearly final), and we have to implement the changes. The target release date is now March 2024 -- and hopefully going into beta in the next week or two.

  2. Should this work in the alpha build of 17, or does something else have to be done with the compiler release first? I've been using the v17 alpha and still don't see the characters, so I figure there's another step left.

This is addressed in the compiler in v17. Existing keyboards in the keyboards repository will not benefit until we release v17, when we can deploy the compiler to the repository and update them. So it's a matter of getting all our ducks in a row to launch this change. (We are still a very small team with very limited resources, so please bear with us!)

mcdurdin avatar Feb 06 '24 06:02 mcdurdin

@mcdurdin Thanks for the heads up. Like I said, I appreciate the work you guys do. I'm not quite an expert on how Keyman works, so thanks for accommodating my requests for clarification.

DylanCross avatar Feb 06 '24 15:02 DylanCross