String resource markup format
I stumbled over this comment in the converter source:
# TODO: transform and cleanup the read strings...
# (strip html, insert formatchars/identifiers, ...)
When you take a look at the strings, the following format is being used:
- <Team>
- <b>Mini-map<b>
- Create <b> Samurai<b> (<cost>)
- ..
I see two different "features" in there. The first one is the pseudo-HTML (it's not HTML as no proper closing tags are being used) that's being used for markup. The other thing is placeholders.
My suggestion is the following:
- Use a markdown subset for markup: *italic*, **bold**, ***bold italic***. Maybe also the underscore variants (although I prefer simplicity).
- Use a placeholder syntax inspired by Python 3's named .format() syntax: Create **Samurai** ({cost}). I don't think we need formatting syntax like !s or :.2.
- Convert all uppercase placeholders to lowercase. Convert camelCase (not sure if that even exists) to snake_case.
- Strip unneeded whitespace
So the examples above would become:
- {team}
- **Mini-map**
- Create **Samurai** ({cost})
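A minimal Python sketch of that conversion, assuming `<b>` is the only markup tag in the source strings and every other `<Name>` is a placeholder. The function names are hypothetical and not part of the existing converter, and literal asterisks or braces in the input are not escaped here.

```python
import re


def to_snake_case(name: str) -> str:
    # Put "_" before an uppercase letter that follows a lowercase letter
    # or digit, then lowercase everything: "unitCost" -> "unit_cost".
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def convert_legacy_string(raw: str) -> str:
    # The source uses <b> both to open and to close bold text, so every
    # second chunk after splitting on <b> is treated as bold.
    parts = re.split(r"<b>", raw, flags=re.IGNORECASE)
    chunks = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            part = "**" + part.strip() + "**"
        chunks.append(part)
    text = "".join(chunks)

    # Anything else in angle brackets is assumed to be a placeholder.
    text = re.sub(
        r"<([A-Za-z_][A-Za-z0-9_]*)>",
        lambda m: "{" + to_snake_case(m.group(1)) + "}",
        text,
    )

    # Strip unneeded whitespace.
    return re.sub(r"\s+", " ", text).strip()


print(convert_legacy_string("Create <b> Samurai<b> (<cost>)"))
```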
An alternative placeholder could be like in the Django template language: {{ placeholder_name }}. That would also leave open the possibility for filters: {{ begin_date|date:"d.m.y H:M" }}
Opinions?
:+1: at the whole proposal. i'd go for the {rofl} replacements, i don't think {{}} is needed.
The problem with markdown is that it's designed to be written by humans, not by machines. For example, there's no proper way (that I know of) to escape things.
How would I convert <b>a**b<b>? Whatever the answer, it probably won't be nice for both reading and writing by both humans and machines.
So, instead I suggest using some sort of escape character, like \ is used in C++ strings:
<b>a**b<b> could be converted to something like \bold\a**b\regular\... tbh that looks pretty retarded, so maybe you have better ideas.
Note that the string processing could be done entirely in python, since it's not very performance-critical. We might thus leverage the full potential of Python and the not-yet-existing python/C interface. The format strings could even contain simple python expressions.
Markdown specifies backslash escapes: https://daringfireball.net/projects/markdown/syntax#backslash
With that we'd arrive at the following markup spec:
- Use * for italic
- Use ** for bold
- Use *** for bold italic
- Use \ to escape the asterisk
The parsing would be quite trivial using a very simple state machine. Converting the current markup would be easy too.
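A rough sketch of such a state machine in Python. It assumes Markdown's convention (single asterisk toggles italic, double toggles bold, triple toggles both) and that a backslash makes the next character literal. `parse_markup` and its `(text, bold, italic)` output are hypothetical choices for illustration.

```python
def parse_markup(source):
    """Split source into (text, bold, italic) runs."""
    spans = []   # finished runs
    buf = []     # characters of the run currently being collected
    bold = italic = False
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\\" and i + 1 < len(source):
            # Backslash escape: take the next character literally.
            buf.append(source[i + 1])
            i += 2
        elif ch == "*":
            # Count up to three consecutive asterisks.
            run = 1
            while run < 3 and i + run < len(source) and source[i + run] == "*":
                run += 1
            # The style changes here, so close the current run first.
            if buf:
                spans.append(("".join(buf), bold, italic))
                buf = []
            if run in (1, 3):
                italic = not italic
            if run in (2, 3):
                bold = not bold
            i += run
        else:
            buf.append(ch)
            i += 1
    if buf:
        spans.append(("".join(buf), bold, italic))
    return spans


print(parse_markup("Create **Samurai** ({cost})"))
print(parse_markup(r"**a\*\*b**"))
```

With the escape in place, the `<b>a**b<b>` case from earlier can be written as `**a\*\*b**` and comes back as a single bold run.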
If we actually do everything in Python we could even settle for the full Python 3 .format() syntax. It's quite powerful. (Alternatively we can limit ourselves to the {name} version.)
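For the limited `{name}` variant, one possible approach is to check the template with the standard library's `string.Formatter().parse()` before calling `.format()`. This is only a sketch; the `render` helper is a hypothetical name.

```python
from string import Formatter


def render(template: str, **values) -> str:
    # Formatter().parse() yields (literal_text, field_name, format_spec,
    # conversion) tuples; field_name is None for trailing plain text.
    for _literal, field, spec, conversion in Formatter().parse(template):
        if field is None:
            continue
        # Reject "{0}", "{}", "{a.b}", "{a!s}" and "{a:.2f}".
        if not field.isidentifier() or spec or conversion:
            raise ValueError(
                f"only plain {{name}} placeholders are allowed, got {field!r}"
            )
    return template.format(**values)


print(render("Create **Samurai** ({cost})", cost=60))
```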
With \ escapes, the markdown thing looks good. However, there's a few additional features that would be great:
- being able to colorize (parts of) the text
- clickable hyperlinks
- text that shows additional info when you hover over it with the mouse
- switch between monospaced/regular font
Maybe we need to differentiate two different representations:
The one used to store the formatstrings, and the one used to pass the processed strings to the font renderer.
The formatstring representation could be an extended subset of Markdown. I'm sure we'll be able to invent some syntax for coloring and hovertexts.
The font-renderer representation could look like this:
(for hyperlinks)
- escape sequence: begin hyperlink
- click here to visit the openage homepage!
- escape sequence: begin hyperlink url
- http://openage.sft.mx
- escape sequence: end hyperlink
(for hoverable text)
- escape sequence: begin hoverable text
- escape sequence: color: red
- error
- escape sequence: end color
- something went wrong!
- escape sequence: begin hover text
- engine.cpp, line 235; thread 3
- escape sequence: end hover text
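To make the font-renderer representation a bit more concrete, here is a sketch that models it as a flat token list in Python. The `Kind` and `Escape` names and the `visible_runs` helper are assumptions made up for this example; nothing like them exists in the engine yet.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    BEGIN_HYPERLINK = auto()
    BEGIN_HYPERLINK_URL = auto()
    END_HYPERLINK = auto()
    BEGIN_HOVERABLE = auto()
    BEGIN_HOVER_TEXT = auto()
    END_HOVER_TEXT = auto()
    BEGIN_COLOR = auto()
    END_COLOR = auto()


@dataclass(frozen=True)
class Escape:
    kind: Kind
    arg: Optional[str] = None  # e.g. the color name


hyperlink = [
    Escape(Kind.BEGIN_HYPERLINK),
    "click here to visit the openage homepage!",
    Escape(Kind.BEGIN_HYPERLINK_URL),
    "http://openage.sft.mx",
    Escape(Kind.END_HYPERLINK),
]

hoverable = [
    Escape(Kind.BEGIN_HOVERABLE),
    Escape(Kind.BEGIN_COLOR, "red"),
    "error",
    Escape(Kind.END_COLOR),
    "something went wrong!",
    Escape(Kind.BEGIN_HOVER_TEXT),
    "engine.cpp, line 235; thread 3",
    Escape(Kind.END_HOVER_TEXT),
]


def visible_runs(tokens):
    """Return the text runs that are drawn by default (no URL, no hover text)."""
    hidden = False
    runs = []
    for tok in tokens:
        if isinstance(tok, Escape):
            if tok.kind in (Kind.BEGIN_HYPERLINK_URL, Kind.BEGIN_HOVER_TEXT):
                hidden = True
            elif tok.kind in (Kind.END_HYPERLINK, Kind.END_HOVER_TEXT):
                hidden = False
        elif not hidden:
            runs.append(tok)
    return runs


print(visible_runs(hyperlink))
print(visible_runs(hoverable))
```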
(for hyperlinks)
Hyperlinks exist in Markdown as [text](url). GitHub Flavoured Markdown uses a similar syntax for images, ![alt text](url). We could probably steal/borrow a similar syntax for colours as well, maybe [red!]{#f00} or #[red!](f00)? The [] combination is pretty nice for marking pieces of text.
Are there any examples of how the hover text would be used? Would it be similar to the title text on links and images in HTML (just a short description) or would they also have to be able to fit large paragraphs?
An awesome hovertext would be hovering on any unit, and all its props are visible in a popup then (for debugging purposes). This popup can be stickied, so one can follow the data changes. So yes, it can be a large paragraph.
I like the idea of colors [asdf my text lol]{red} or [text]{#ff0000}, this could be a start integrating it.
Hyperlinks could be done the regular markdown way.
The trigger for popups is not defined in the translation itself i'd say, this has to be done from the outside.
Would we ever create hyperlinks to HTML anchors? That would conflict with the # syntax.
If not, I like the [text]{#ff0000} suggestion.
What exactly do you mean with hovertext? Strings don't define their own behavior, they can be used however the coder likes. A text box that appears when hovering over a unit is not something we'd control using the strings, but rather by setting some event handlers on the unit itself.
i stumbled upon that problem as well, but note the different parens: link: []() attribute: []{}.
There won't be any conflict, linking to # will still work.
Ah, obviously.
So now we have:
- Use * for italic
- Use ** for bold
- Use *** for bold italic
- Use \ to escape the asterisk
- Use [link text](link target) for links
- Use [colored text]{#rrggbbaa} for colored text
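A quick Python sketch of how the two bracket forms could be told apart with one regular expression. The `split_spans` name and the tuple output are made up for illustration, and escapes and nested brackets are not handled.

```python
import re

# [text](url) is a link, [text]{color} is colored text.
SPAN = re.compile(
    r"\[(?P<text>[^\]]*)\]"
    r"(?:\((?P<url>[^)]*)\)|\{(?P<color>[^}]*)\})"
)


def split_spans(source):
    """Return ("text", s), ("link", text, url) and ("color", text, color) tuples."""
    out = []
    pos = 0
    for m in SPAN.finditer(source):
        if m.start() > pos:
            out.append(("text", source[pos:m.start()]))
        if m.group("url") is not None:
            out.append(("link", m.group("text"), m.group("url")))
        else:
            out.append(("color", m.group("text"), m.group("color")))
        pos = m.end()
    if pos < len(source):
        out.append(("text", source[pos:]))
    return out


print(split_spans("[error]{#ff0000ff}: see [the homepage](http://openage.sft.mx#top)"))
```

Because the kind of span is decided by the bracket type that follows `[...]`, a `#` inside a link target is not mistaken for a color.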
What's up with the hover stuff?
My main use case for hover text:
If log messages are printed on-screen (such as next to the unit that they were logged by), you can hover over them to show additional info like filename, line of code, function name, thread id, timestamp etc.
So hovering would work like this:
The font renderer renders the text, but ignores the parts that are marked as 'hover'. That is, except if the mouse pointer happens to be over the 'hover text trigger' word.
I think it's a really bad idea making the font renderer aware of any mouse position. This is the responsibility of the gui framework. The markup for hover has to be something different which is preprocessed by the gui subsystem, which then triggers the activation of some textbox depending on mouse positions/hover durations.
Well, I don't really care at what point in the abstraction stack the hover text is rendered, but it should be.
Exactly. For that you would render a string, attach an event handler on it, and then display another string resource on mouseover.
Did you try implementing the string markup conversion? Should be pretty easy now that the python interface is there to define the markup at one place and use it from c++ as well.
Atm I don't really have time, so if someone else has time to pick up this issue, please feel free to do so :(
Although this discussion is pretty old, I want to add some points to consider here.
The biggest issue that we will run into will be the limited amount of space for most text messages on screen. Hover text and adjustable font size make that even more complicated. I suggest that we allow a set of formatting commands that can be embedded into the text. Maybe a LaTeX-inspired syntax like \command[options]{content} instead of Markdown would be better in that case? Although this could lead to the syntax being less intuitive at first. But it could be beneficial because modders could define their own macros.
Example syntax:
- \bold{text} for **text**
- \italic{text} for *text*
- \bold{\italic{text}} for ***text***
- \ for escape
- \link{target}
- \url{hyperlink}
- \color[red]{text}
Thoughts on this?
jup. good idea. let's do it like that!
some \hover{text}{you hovered the word 'text'!} displayed along an image \image{unique.imageid}.
we can add \bold{bold text} and insert variables like <cost> with \var{cost}, for example.
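A small recursive-descent sketch in Python for the \command[options]{content} idea, to show that nesting and multiple arguments (as in \hover) stay manageable. The node layout `(name, options, args)` and both function names are assumptions for this example, and error handling (unclosed braces or brackets) is left out.

```python
def parse(src, i=0, stop=None):
    """Parse src[i:] up to the `stop` character; return (nodes, next_index).

    A node is either a plain string or a (name, options, args) tuple,
    where every entry of args is itself a list of nodes.
    """
    nodes, text = [], []
    while i < len(src) and src[i] != stop:
        ch = src[i]
        if ch == "\\" and i + 1 < len(src) and src[i + 1].isalpha():
            if text:
                nodes.append("".join(text))
                text = []
            j = i + 1
            while j < len(src) and src[j].isalpha():
                j += 1
            name, options, args = src[i + 1:j], None, []
            if j < len(src) and src[j] == "[":
                end = src.index("]", j)
                options = src[j + 1:end]
                j = end + 1
            while j < len(src) and src[j] == "{":
                arg, j = parse(src, j + 1, "}")
                j += 1  # skip the closing brace
                args.append(arg)
            nodes.append((name, options, args))
            i = j
        elif ch == "\\" and i + 1 < len(src):
            text.append(src[i + 1])  # escaped literal character
            i += 2
        else:
            text.append(ch)
            i += 1
    if text:
        nodes.append("".join(text))
    return nodes, i


def plain_text(nodes, variables):
    """Drop all markup, substitute \\var{name}, keep each command's first argument."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
            continue
        name, _options, args = node
        if name == "var":
            out.append(str(variables[plain_text(args[0], variables)]))
        elif args:
            out.append(plain_text(args[0], variables))
    return "".join(out)


tree, _ = parse(r"Create \bold{Samurai} (\var{cost})")
print(plain_text(tree, {"cost": 60}))
```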
Random citizen here 😕 .format() is mentioned a few times and seems powerful enough. Why not just go with that? Seems easy both for implementation and usage. Syntax looks a lot like the formatters from other languages.
@Jogai
Using control code syntax (\command[options]{content}) it's not that different from .format() anyway. The main advantage is that markup and placeholder commands have the same syntax and that the markup commands can be easily extended to fit our needs. We need some advanced info embedded in the string (coloring, hovertext, tooltips, links) that the simple formatters usually don't care about. But they are relevant for a game engine, so our solution is a bit more sophisticated. From the user's standpoint it's not that much of a stretch. Whether you write {cost} or \var{cost} in a string shouldn't be that much of a difference.
For proper translation we should go with a well-established solution here (especially for integration in community translation websites), and I think maybe https://projectfluent.org/ could be a candidate.
There is also a unicode standard for translation strings called ICU MessageFormat:
- https://github.com/unicode-org/message-format-wg
- https://unicode.org/reports/tr35/tr35-messageFormat.html