[sil_hebrew] diacritic marks not rendering consistently cross-platform on the OSK
Some key caps on the default layer of the Hebrew (SIL) keyboard appear empty, although the keys still output characters.

- Keyman version: 16.0.37
- iOS version: 16.3.1
- Keyboard: Hebrew (SIL)

Interestingly, the key caps do show up in an older version of iOS:
| iOS 16.3.1 | iOS 15.7.3 |
|---|---|
| (screenshot) | (screenshot) |
This is already being discussed at https://community.software.sil.org/t/hebrew-sil-keyboard-has-some-invisible-characters-on-ios-16-3-1-obu/7273.
See also:
- https://community.software.sil.org/t/keyboard-fails-to-load-load-correctly-on-iphone-xr-11/6958/3
- https://github.com/keymanapp/keyman/issues/7914
Thanks, guys, for working through this issue here. Is the keyboard made by SIL as well, or did a community member make this one? Would it be possible for the keyboard owner to try the 25CC to 200F fix in the keyboard directly, instead of in the Keyman codebase?
Agreed, having a functional font for Hebrew would be ideal, but it sounds like we don't have someone to do that. Changing the characters in the keyboard seems like a quick fix (although it would take more time and devices to test that it doesn't break elsewhere).
> Would there be any possibility of the keyboard owner trying the 25CC to 200F fix in the keyboard directly instead of in the Keyman codebase?

I am pretty sure that trying a keyboard-based fix is likely to lead to significant time and effort in testing that ultimately will lead us to conclude that we need a font-based fix :grin: (Also, in this instance, the keyboard owner is the same team that works on fonts anyway.)
The first step to a resolution here is to get a full list of the characters (Unicode values) required. If there are keys with multiple glyphs on the key, then we'd need to list those sets as well.
@mcdurdin For the full list of characters, would the list of character sequences in the .kvks file be sufficient? I suppose that combining marks would need a base (space or dotted circle?) with them for proper display.
> I suppose that combining marks would need a base (space or dotted circle?) with them for proper display.

Well, that's the entire problem we're trying to solve :grin:; refer to the proposal document at https://docs.google.com/document/d/1gJrWpZebOlPn_yn91BGk_Z8hEvVDVdxXB0ihgkW7wdg/edit#heading=h.9kamffv3tas9 for how we can solve this.
I have written a tool to analyze the .kvks and .keyman-touch-layout files and return an ordered, de-duped list of all the keycap texts as a single file, in Markdown or JSON format. I'll include the results for sil_hebrew below as an example. I have not yet published the code, but it will be part of Keyman Developer 17 shortly.
This would be the first step to building a spec for a given keyboard, I guess. Next step would be to assign PUA codes to the key caps that could be problematic for rendering (that is, any chars that require a base). We already have the KeymanWeb OSK font for special chars -- NBSP, ZWNJ, etc.
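As a rough illustration of that next step, the sketch below picks out the key caps that contain a combining mark and gives each one a Private Use Area code point. It is only a sketch under stated assumptions: the function names and the starting code point `0xF100` are hypothetical and are not taken from the Keyman compiler or the KeymanWeb OSK font.

```python
import unicodedata

# Hypothetical starting point in the Private Use Area; an assumption for
# illustration only, not a value used by Keyman.
PUA_START = 0xF100


def needs_base(cap: str) -> bool:
    """Return True if the key cap contains any combining mark (category M*)."""
    return any(unicodedata.category(ch).startswith("M") for ch in cap)


def assign_pua(key_caps, start=PUA_START):
    """Map each problematic key cap to a sequential PUA character."""
    mapping = {}
    next_code = start
    # Sort the de-duped caps so that the assignment is reproducible.
    for cap in sorted(set(key_caps)):
        if needs_base(cap):
            mapping[cap] = chr(next_code)
            next_code += 1
    return mapping


if __name__ == "__main__":
    caps = ["\u05D0", "\u25CC\u05B1", "\u05B2", "!"]
    for cap, pua in assign_pua(caps).items():
        shown = " ".join(f"U+{ord(c):04X}" for c in cap)
        print(f"{shown} -> U+{ord(pua):04X}")
```

A font would still need a glyph at each assigned code point; the sketch covers only the bookkeeping.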
You'll note that this report makes visible some earlier attempts to mitigate the issue with combining marks on the key caps. When we fix them, we'll need to clean up the key caps.
Markdown format
```
kmc analyze -a osk-char-use /c/Projects/keyman/keyboards/release/sil/sil_hebrew -f markdown -o sil_hebrew.md
```
| Code Points | Key Caps |
|---|---|
| U+0021 | ! |
| U+0022 | " |
| U+0024 | $ |
| U+0027 | ' |
| U+0028 | ( |
| U+0029 | ) |
| U+002C | , |
| U+002E | . |
| U+002F | / |
| U+0030 | 0 |
| U+0031 | 1 |
| U+0032 | 2 |
| U+0033 | 3 |
| U+0034 | 4 |
| U+0035 | 5 |
| U+0036 | 6 |
| U+0037 | 7 |
| U+0038 | 8 |
| U+0039 | 9 |
| U+003B | ; |
| U+003F | ? |
| U+0043 U+0047 U+004A | CGJ |
| U+004E U+006F U+0020 U+0062 U+0072 U+0065 U+0061 U+006B U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 | No break space |
| U+0053 U+0020 U+0041 U+004C U+0054 | S ALT |
| U+0053 U+0020 U+0041 U+006C U+0074 | S Alt |
| U+005A U+0057 U+004A | ZWJ |
| U+005B | [ |
| U+005C | \ |
| U+005D | ] |
| U+006E U+006F U+0020 U+0062 U+0072 U+0065 U+0061 U+006B U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 | no break space |
| U+0074 U+0068 U+0069 U+006E U+0020 U+0073 U+0070 U+0061 U+0063 U+0065 | thin space |
| U+007A U+0077 U+006A | zwj |
| U+007A U+0077 U+006E U+006A | zwnj |
| U+007B | { |
| U+007C | \| |
| U+007D | } |
| U+00AB | « |
| U+00BB | » |
| U+0307 | ̇ |
| U+0308 | ̈ |
| U+030A | ̊ |
| U+0336 | ̶ |
| U+05BE | ־ |
| U+05C0 | ׀ |
| U+05C3 | ׃ |
| U+05C5 | ׅ |
| U+05C6 | ׆ |
| U+05C6 U+0307 | ׆̇ |
| U+05D0 | א |
| U+05D1 | ב |
| U+05D2 | ג |
| U+05D3 | ד |
| U+05D4 | ה |
| U+05D5 | ו |
| U+05D5 U+05B9 | וֹ |
| U+05D5 U+05BA | וֺ |
| U+05D5 U+05BC U+05BA | וֺּ |
| U+05D5 U+05BC U+05D5 U+05B9 | וּוֹ |
| U+05D5 U+05D5 U+05B9 | ווֹ |
| U+05D6 | ז |
| U+05D7 | ח |
| U+05D8 | ט |
| U+05D9 | י |
| U+05DA | ך |
| U+05DB | כ |
| U+05DC | ל |
| U+05DD | ם |
| U+05DE | מ |
| U+05DF | ן |
| U+05E0 | נ |
| U+05E0 U+0307 | נ̇ |
| U+05E1 | ס |
| U+05E2 | ע |
| U+05E3 | ף |
| U+05E4 | פ |
| U+05E5 | ץ |
| U+05E6 | צ |
| U+05E7 | ק |
| U+05E8 | ר |
| U+05E9 | ש |
| U+05E9 U+05C1 | שׁ |
| U+05E9 U+05C2 | שׂ |
| U+05EA | ת |
| U+05F3 | ׳ |
| U+05F4 | ״ |
| U+2013 | – |
| U+2014 | — |
| U+2022 | • |
| U+20AA | ₪ |
| U+20AC | € |
| U+25CC | ◌ |
| U+25CC U+0307 U+0020 U+0020 | ◌̇ |
| U+25CC U+0308 U+0020 U+0020 | ◌̈ |
| U+25CC U+030A | ◌̊ |
| U+25CC U+0591 | ◌֑ |
| U+25CC U+0592 | ◌֒ |
| U+25CC U+0593 | ◌֓ |
| U+25CC U+0594 | ◌֔ |
| U+25CC U+0595 | ◌֕ |
| U+25CC U+0596 | ◌֖ |
| U+25CC U+0597 | ◌֗ |
| U+25CC U+0598 | ◌֘ |
| U+25CC U+0599 | ◌֙ |
| U+25CC U+059A | ◌֚ |
| U+25CC U+059B | ◌֛ |
| U+25CC U+059C | ◌֜ |
| U+25CC U+059D | ◌֝ |
| U+25CC U+059E | ◌֞ |
| U+25CC U+059F | ◌֟ |
| U+25CC U+05A0 | ◌֠ |
| U+25CC U+05A1 | ◌֡ |
| U+25CC U+05A2 | ◌֢ |
| U+25CC U+05A3 | ◌֣ |
| U+25CC U+05A4 | ◌֤ |
| U+25CC U+05A5 | ◌֥ |
| U+25CC U+05A6 | ◌֦ |
| U+25CC U+05A7 | ◌֧ |
| U+25CC U+05A8 | ◌֨ |
| U+25CC U+05A9 | ◌֩ |
| U+25CC U+05AA | ◌֪ |
| U+25CC U+05AB | ◌֫ |
| U+25CC U+05AC | ◌֬ |
| U+25CC U+05AD | ◌֭ |
| U+25CC U+05AE | ◌֮ |
| U+25CC U+05AF | ◌֯ |
| U+25CC U+05B0 | ◌ְ |
| U+25CC U+05B1 | ◌ֱ |
| U+25CC U+05B2 | ◌ֲ |
| U+25CC U+05B3 | ◌ֳ |
| U+25CC U+05B4 | ◌ִ |
| U+25CC U+05B5 | ◌ֵ |
| U+25CC U+05B6 | ◌ֶ |
| U+25CC U+05B7 | ◌ַ |
| U+25CC U+05B8 | ◌ָ |
| U+25CC U+05B9 | ◌ֹ |
| U+25CC U+05BB | ◌ֻ |
| U+25CC U+05BC | ◌ּ |
| U+25CC U+05BD | ◌ֽ |
| U+25CC U+05BF | ◌ֿ |
| U+25CC U+05C4 | ◌ׄ |
| U+25CC U+05C7 | ◌ׇ |
| U+25E6 | ◦ |
JSON format
```
kmc analyze -a osk-char-use /c/Projects/keyman/keyboards/release/sil/sil_hebrew -f json -o sil_hebrew.json
```
[
"!",
"\"",
"$",
"'",
"(",
")",
",",
".",
"/",
"0",
"1",
"2",
"3",
"4",
"5",
"6",
"7",
"8",
"9",
";",
"?",
"CGJ",
"No break space",
"S ALT",
"S Alt",
"ZWJ",
"[",
"\\",
"]",
"no break space",
"thin space",
"zwj",
"zwnj",
"{",
"|",
"}",
"«",
"»",
"̇",
"̈",
"̊",
"̶",
"־",
"׀",
"׃",
"ׅ",
"׆",
"׆̇",
"א",
"ב",
"ג",
"ד",
"ה",
"ו",
"וֹ",
"וֺ",
"וֺּ",
"וּוֹ",
"ווֹ",
"ז",
"ח",
"ט",
"י",
"ך",
"כ",
"ל",
"ם",
"מ",
"ן",
"נ",
"נ̇",
"ס",
"ע",
"ף",
"פ",
"ץ",
"צ",
"ק",
"ר",
"ש",
"שׁ",
"שׂ",
"ת",
"׳",
"״",
"–",
"—",
"•",
"₪",
"€",
"◌",
"◌̇ ",
"◌̈ ",
"◌̊",
"◌֑",
"◌֒",
"◌֓",
"◌֔",
"◌֕",
"◌֖",
"◌֗",
"◌֘",
"◌֙",
"◌֚",
"◌֛",
"◌֜",
"◌֝",
"◌֞",
"◌֟",
"◌֠",
"◌֡",
"◌֢",
"◌֣",
"◌֤",
"◌֥",
"◌֦",
"◌֧",
"◌֨",
"◌֩",
"◌֪",
"◌֫",
"◌֬",
"◌֭",
"◌֮",
"◌֯",
"◌ְ",
"◌ֱ",
"◌ֲ",
"◌ֳ",
"◌ִ",
"◌ֵ",
"◌ֶ",
"◌ַ",
"◌ָ",
"◌ֹ",
"◌ֻ",
"◌ּ",
"◌ֽ",
"◌ֿ",
"◌ׄ",
"◌ׇ",
"◦"
]
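For readers who want to see how the two reports relate, here is a minimal sketch that turns a list of key cap strings into a de-duped, ordered list and a Markdown table of code points. It is not the `kmc analyze` implementation; the function names are hypothetical, and plain code point ordering is an assumption that merely appears consistent with the listings above.

```python
def code_points(text: str) -> str:
    """Format a string as space-separated U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def osk_char_use(key_caps):
    """De-dupe key caps and order them (Python compares str by code point)."""
    return sorted(set(key_caps))


def to_markdown(caps) -> str:
    """Render the caps as a two-column Markdown table."""
    lines = ["| Code Points | Key Caps |", "|---|---|"]
    for cap in caps:
        shown = cap.replace("|", "\\|")  # keep a literal pipe from breaking the table
        lines.append(f"| {code_points(cap)} | {shown} |")
    return "\n".join(lines)


if __name__ == "__main__":
    caps = ["\u05E9\u05C1", "!", "\u05D0", "!", "\u25CC\u05B1"]
    print(to_markdown(osk_char_use(caps)))
```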
> The first step to a resolution here is to get a full list of the characters (Unicode values) required. If there are keys with multiple glyphs on the key, then we'd need to list those sets as well.

Would this include the hataf vowels? I don't see them distinctly in the JSON list. I'm not sure whether each has its own Unicode value or is a composite of two, such as ֲ or ֱ .
Sorry if I'm not the most useful. Trying to contribute what I can though.
Thanks for your comment. I believe that the two items you list (U+05B1 HEBREW POINT HATAF SEGOL and U+05B2 HEBREW POINT HATAF PATAH) are in the JSON list.
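One way to check this yourself is with Python's standard `unicodedata` module, as in the small sketch below. The helper name `describe` is made up for illustration. It shows that each hataf vowel is a single code point with no decomposition, which is why each appears in the list as one mark after the dotted circle.

```python
import unicodedata


def describe(ch: str):
    """Return (name, decomposition, is_combining) for a single character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),  # empty string means no decomposition
        unicodedata.combining(ch) > 0,  # non-zero combining class
    )


if __name__ == "__main__":
    for ch in ("\u05B1", "\u05B2", "\u05B3"):
        name, decomposition, is_combining = describe(ch)
        print(f"U+{ord(ch):04X} {name} "
              f"decomposition={decomposition!r} combining={is_combining}")
```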
> Sorry if I'm not the most useful. Trying to contribute what I can though.

Thank you for your contributions! Appreciate your engagement with the issue.
As @DavidLRowe noted, these are in the JSON data. They are not the most visible, though, precisely due to the formatting issues with unattached bases which we are trying to solve for the On Screen Keyboard. Note that we are hoping to extend this work later this year to make a general proposal to Unicode, but that's a much bigger can of worms!
Checking in. I'm trying to follow the chain here. It looks like there was another issue that was also intended to fix this for Hebrew. That issue has a merge request and was marked as closed, but the problem still persists for Hebrew. Is there another step specifically for Hebrew, or what's the current plan? Thanks again for you guys' work on this. Keyman is awesome, but this makes the Hebrew keyboard really difficult to learn for new typists.
https://github.com/keymanapp/keyman/issues/9031
@DylanCross the fix has been implemented in keymanapp/keyman#9032 and related PRs. We ended up addressing this with a comprehensive feature in the Keyman Developer compiler, which will be released together with Keyman 17.0 later this year, hopefully around September. This feature allows us to resolve the problem for any writing system that has inconsistent diacritic display (and there are a bunch of them, not just Hebrew). It's an issue we've been wrestling with for some time, and I believe the solution we've arrived at is about as clean as we could make it. See more details in the feature documentation: `&displayMap` store.
For now, as far as we know, this specific issue with Hebrew diacritics affects only the latest releases of iOS (and possibly macOS), and the fix works consistently on iOS, macOS, Android, Windows, and web, on all versions of those platforms (Linux has a separate blocking issue keymanapp/keyman#6186 because it does not currently support custom on screen keyboard fonts).
Once Keyman 17.0 is released, we'll update the compiler version referenced in the keyboards repository and can publish an update (#2258) for the Hebrew keyboard. Note that the updated Hebrew keyboard will continue to work with earlier versions of Keyman; the feature implementation is contained in the new compiler.
This may not be a high priority fix as KM Dev gets replaced, but I'm documenting it here as it is related.
I just noticed that most of the Hebrew diacritics don't get dotted circles in the Character Map,
and Arabic is about three quarters full of dotted circles.
Maybe some of those are non-combining, but surely not all of them.
@MattGyverLee the Keyman Developer character map uses the system renderer and fonts, so it's up to those as to how they render. We're not planning to make changes there, now or in the future. (It's not quite the same as with the keyboard, where inconsistent rendering, particularly of multiple diacritics together, has been a pain for a long time.)
@mcdurdin
A couple of questions:

1. Is there an update on the Keyman 17 release date? It's past September. I did check the blogs, but am not sure where to look.
2. Should this work in the alpha build of 17, or does something else have to be done with the compiler release first? I've been using the v17 alpha and still don't see the characters, so I figure there's another step left.

Thanks for the work to get the vowels back in Hebrew. The Masoretes and I greatly appreciate it.
> Is there an update on the Keyman 17 release date? It's past September.

Yeah, the release slipped a fair bit because of substantial changes in the Unicode LDML Keyboards specification (which is now nearly final), which we have to implement. The target release date is now March 2024, and we hope to go into beta in the next week or two.
> Should this work in the alpha build of 17, or does something else have to be done with the compiler release first? I've been using the v17 alpha and still don't see the characters, so I figure there's another step left.

This is addressed in the compiler in v17. Existing keyboards in the keyboards repository will not benefit until we release v17, when we can deploy the compiler to the repository and update them. So it's a matter of getting all our ducks in a row to launch this change. (We are still a very small team with very limited resources, so please bear with us!)
@mcdurdin Thanks for the heads up. Like I said, I appreciate the work you guys do. I'm not quite an expert on how Keyman works, so thanks for accommodating my requests for clarification.

