
RFC: Localizing the Atlas documentation

Open whallin opened this issue 3 months ago • 16 comments

[!NOTE] What is an "RFC"? A Request for Comments ("RFC") is an official document used to propose, discuss, and define the technical rules that make the internet work. We're using it here as a term for a larger technical project idea that needs discussion before any implementation.

We've been down this road before - how do we structure and operate a localization project for the AtlasOS documentation to improve user accessibility? We're asking ourselves this question today since the majority of our users aren't native English speakers, at least if our per-country web analytics are anything to go by.

Ex-maintainer Amy and I looked into this a while back, trying to work out the ideal technical implementation for localizing our documentation with our current framework, MkDocs (with mkdocs-material as our theme). While mkdocs-material adds many features on top of MkDocs, its maintainers also recognize the limitations MkDocs brings for i18n support[^1].

There are third-party plugins for MkDocs, like mkdocs-static-i18n. However, their compatibility and ease of use with MkDocs running mkdocs-material is undocumented, and untested by both us at Atlas and the team behind mkdocs-material. This might still be a possible option, though.

What would we like to achieve with localized documentation?

  1. Better user accessibility - By providing localized documentation to our users, we hope that users whose proficiency in English may be insufficient can instead consume understandable documentation in their native language.
  2. Better localized SEO - With alternate localized routes for each documentation page, we could see higher chances of the documentation appearing in search results when someone searches for “install atlasos” on, for example, google.com.br (Google Brazil).
  3. Easier product accessibility - This partly ties back to point 1. If we're able to offer documentation in native languages rather than solely English, we might convert users whose primary blocker to using Atlas was documentation they couldn't understand due to its current language[^2].

What criteria must our implementation of localization meet?

  1. The documentation framework (in our case, currently, MkDocs) must have some method, native or third-party, for generating SEO-ready and accessible[^3] localized pages based on localized content files (in Markdown or a Markdown flavor).
  2. The localized content files must be stored in a folder structure that scales. E.g. /en for English documentation files and sub-folders, /sv for Swedish documentation files and sub-folders, etc.
  3. The documentation must be easily translatable by a third party to honor our community-driven and open-source nature. Using localization tooling like Crowdin[^4], Weblate, or POEditor is the goal.
    1. Optionally: offering foolproof documentation on how to contribute translations with Git could be an option, but we want as low a barrier as possible for translation contributions.
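
To make criterion 1 a little more concrete, here is a minimal Python sketch of the "alternate" tags that footnote 3 refers to. It assumes the `/en`, `/sv` folder layout from criterion 2 maps directly onto URL prefixes. The function name `alternate_links`, the `docs.example.org` domain, and the choice of English as the `x-default` target are illustrative assumptions, not part of MkDocs or any existing Atlas tooling.

```python
from html import escape


def alternate_links(base_url: str, page_path: str, locales: list[str],
                    default_locale: str = "en") -> list[str]:
    """Build one <link rel="alternate"> tag per locale for a single page."""
    base = base_url.rstrip("/")
    page = page_path.strip("/")

    def page_url(locale: str) -> str:
        # Assumption: each locale lives under its own URL prefix (/en/, /sv/, ...).
        return f"{base}/{locale}/{page}/" if page else f"{base}/{locale}/"

    tags = [
        f'<link rel="alternate" hreflang="{escape(locale)}" href="{escape(page_url(locale))}">'
        for locale in locales
    ]
    # x-default marks the version to serve when no listed locale matches.
    tags.append(
        f'<link rel="alternate" hreflang="x-default" href="{escape(page_url(default_locale))}">'
    )
    return tags


if __name__ == "__main__":
    # Hypothetical domain and page path, for illustration only.
    for tag in alternate_links("https://docs.example.org", "getting-started/install", ["en", "sv"]):
        print(tag)
```

Whichever framework we end up with would need to emit something equivalent in the `<head>` of every localized page, ideally without us maintaining a script like this ourselves.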

What are possible challenges with localization?

  1. Keeping the documentation translated - Our biggest worry with a project like this has always been keeping the documentation up-to-date with changes. We tend to, at times frequently, add new pages, correct incorrect information, or push out other types of fixes. How do we keep documentation translated in all our languages as we update and elaborate? Could we safely fall back on the core translation?
  2. The technical implementation - Since MkDocs has its limitations (see footnote 1), it might be challenging to translate the entire visible page (documentation, page names in sidebar, metadata, interface components, etc.). Are there any third-party plugins we could use to fill the gaps in native i18n functionality? Do we need to migrate to an entirely new documentation framework or platform?
  3. Finding contributors willing to translate - This ties back into point 1. With evolving documentation, who'll lead the localization effort? While we can't force contributions, we need to at least make people aware of the possibility to translate our documentation. How do we communicate the translation opportunity to people? How do we integrate those who want to contribute in an efficient and effective way?
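
One way to reason about challenge 1 is to track, per translated page, which version of the English source it was translated from. The Python sketch below is only an illustration of that idea, assuming the `/en`, `/sv` folder layout from the criteria above. The `recorded` mapping (relative path to the SHA-256 of the English source at translation time) and the function names are hypothetical; a tool like Crowdin or Weblate would normally track this for us.

```python
import hashlib
from pathlib import Path


def source_digest(path: Path) -> str:
    """SHA-256 of a source file, used as a cheap 'version' of its content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_stale(docs_root: Path, locale: str, recorded: dict[str, str],
               source_locale: str = "en") -> dict[str, str]:
    """Return {relative_path: status} for pages that need translator attention.

    `recorded` is a hypothetical manifest: for each translated page, the digest
    of the source page at the time the translation was last updated.
    """
    report: dict[str, str] = {}
    source_dir = docs_root / source_locale
    for source_file in sorted(source_dir.rglob("*.md")):
        rel = source_file.relative_to(source_dir).as_posix()
        translated = docs_root / locale / rel
        if not translated.exists():
            report[rel] = "missing"      # never translated
        elif recorded.get(rel) != source_digest(source_file):
            report[rel] = "outdated"     # source changed since translation
    return report
```

A report like this could run in CI to show, per language, which pages are missing or outdated, which in turn gives potential contributors a clear list of where help is needed.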

What options do we have for localization?

  1. Relying on existing functionality - mkdocs-material has simple functionality for hosting multiple languages[^5]. However, how easy it is to translate sidebar names, generate the proper HTML metadata, and keep interface components translated is unclear and remains to be tested.
  2. Implementing a third-party plugin - With a third-party integration like mkdocs-static-i18n, could we bring full i18n support to a setup that currently lacks it? Would any plugin be compatible with mkdocs-material with little or no technical intervention? We're not looking to take on technical debt by maintaining our own integration.
  3. Switching documentation platforms - There are many open-source or open-source-friendly documentation frameworks and platforms that have better native i18n support than MkDocs currently has. You have the likes of Starlight by Astro, VitePress, Nextra, Docus, GitBook, or Mintlify. While we've already established ourselves well on MkDocs + mkdocs-material, a framework switch could be one of the few viable ways to maintain localization in the long term.

Closing words

I'm leaving this RFC completely open for discussion. As mentioned in the beginning, Amy and I worked hard on this topic a while back, but it's about time our previous findings, possible blockers, and potential solutions are well-documented so our community can contribute their smart minds.

I'd like you, the reader, to ask yourself the 4 questions I've asked with each heading in this RFC. If you get to answers other than the ones I've come up with, or you're able to elaborate on one of my existing answers, then leave a comment on this RFC and let's discuss it.

I'm open to editing, updating, and adding any new points or sections to this RFC as research and discussion progress.

[^2]: There's currently no feedback or user messaging that indicates we're losing users in the “funnel” due to the language of our documentation. It's purely an assumption of mine.

[^3]: By “SEO-ready and accessible” I mean statically generated HTML that has the correct and relevant meta tags and attributes to tell crawlers and accessibility tools what language the site is in and where to find other languages for the same content (e.g. alternate meta tags, language switcher).

[^4]: Crowdin seemingly supports both Markdown and MDX, so Crowdin specifically shouldn't be difficult to implement for pure documentation files.

[^5]: https://squidfunk.github.io/mkdocs-material/setup/changing-the-language/

whallin avatar Oct 21 '25 18:10 whallin

I advise against it. The best you can do is rely on machine translation, which is at a good level nowadays. The reason is the human factor. Every time a new docs commit comes, there has to be a bunch of translators to implement it. This is not sustainable and will lead to more confused users seeking help because of outdated and incomplete info.

triuk avatar Oct 21 '25 19:10 triuk

I advise against it.

That has pretty much always been the consensus internally when this topic has been brought up. However, to follow up on your comment:

Do you have any ideas for how we could get high-quality machine translated documentation or reliable real-time machine translations easily accessible to end-users? I'm not thinking your typical "Translate this page" in Chrome or embedding the Google Translate toolbar on our site. I'm thinking actual quality machine translations that do a better job at respecting terminology and specific terms than Google Translate.

there has to be a bunch of translators to implement it.

Looking at the PreMiD project - they've got a handful of languages and a ton of community translators. Sure, they still lack translations, and they're not translating purely documentation. I personally believe this is fine. As long as you implement a sleek technical fallback that serves the source language if the translated version doesn't exist, I think that'd still be a win in Atlas' book.

If we're looking at the PreMiD project as a role model for our efforts, the question becomes: How have they gotten translators to contribute? What has their communication looked like to catch people's interest? Have any incentives been offered for their pro bono work?

whallin avatar Oct 21 '25 19:10 whallin

The standard mkdocs-static-i18n plugin fulfills all criteria in this RFC and is fully Material compatible. It has been deployed at scale already by FastAPI.

https://ultrabug.github.io/mkdocs-static-i18n/

It could be combined with Crowdin or Weblate for ease of use for translators.

As for convincing people to take part, that's a whole other issue.


It is also possible to modify the markdown or the generated HTML directly (through a Python file) with mkdocs-macros.

https://mkdocs-macros-plugin.readthedocs.io/en/latest/post_production/

Ast3risk-ops avatar Oct 21 '25 19:10 Ast3risk-ops

If we're looking at the PreMiD project as a role model for our efforts

Software translations are different from docs. In software you just translate buttons and short descriptions that each have a fixed place and hardly change meaning during development. You can easily fall back to the default language if a translation is not provided. Now look at the docs: how are you going to divide whole paragraphs in the same way? Can you imagine how the "fallback to default lang" would work in this environment? Because I cannot. BTW, I don't even see them translating docs: https://docs.premid.app/

triuk avatar Oct 21 '25 20:10 triuk

Can you imagine how the "fallback to default lang" would work in this environment?

If a translated page is missing, most i18n plugins will fall back to a developer-specified default locale (usually English).
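
As a rough illustration of that page-level fallback (not taken from any specific plugin), here is a minimal Python sketch. It assumes one folder per locale under a docs root, and the function name `resolve_page` is hypothetical.

```python
from pathlib import Path


def resolve_page(docs_root: Path, locale: str, rel_path: str,
                 default_locale: str = "en") -> tuple[Path, bool]:
    """Return (file to render, is_fallback) for a page in a given locale."""
    localized = docs_root / locale / rel_path
    if localized.is_file():
        return localized, False
    # No translation yet: serve the whole page in the default locale instead.
    fallback = docs_root / default_locale / rel_path
    if fallback.is_file():
        return fallback, True
    raise FileNotFoundError(f"{rel_path} does not exist in '{locale}' or '{default_locale}'")
```

Note that the fallback works on whole pages, not on individual paragraphs, and the `is_fallback` flag could be used to show a "this page is not translated yet" notice.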

Ast3risk-ops avatar Oct 21 '25 20:10 Ast3risk-ops

@Ast3risk-ops I am not going to repeat myself. Even the linked role model PreMiD project with "ton of community translators" does not have translated docs ...

triuk avatar Oct 21 '25 20:10 triuk

@Ast3risk-ops I am not going to repeat myself. Even the linked role model PreMiD project with "ton of community translators" does not have translated docs ...

I never said it did. Look a little closer at what part I'm responding to.

Ast3risk-ops avatar Oct 21 '25 20:10 Ast3risk-ops

Hi everyone 👋

I just wanted to suggest an option that might fit your needs regarding translations and community contributions.

Crowdin offers a free plan for open-source projects, which could make localization and doc translations much easier to manage.

You can request the setup here → https://crowdin.com/page/open-source-project-setup-request

It’s completely free for open-source projects as long as:

the code is public and under an open-source license,

the project isn’t tied to a paid/commercial product,

and there’s an active community of contributors.

Crowdin integrates nicely with GitHub (auto-syncs pull requests for new or updated translations) and provides a good interface for community translators.

I know several open-source communities that use it successfully — for example stardb.gg and paimon.moe — and it works really well for organizing translations and keeping them up to date.

It might be worth considering if Atlas OS plans to make its documentation or UI multilingual in the future. 🙂

kdefarge avatar Oct 22 '25 18:10 kdefarge

I'd argue Weblate is the better option since Atlas does have servers to run it on. Why work with a SaaS when you can host and control the whole translation platform?

Ast3risk-ops avatar Oct 23 '25 13:10 Ast3risk-ops

I'd argue Weblate is the better option since Atlas does have servers to run it on.

I'd argue the same. However, while we do have our own hardware, we don't necessarily have the people capacity to make sure everything is in order. If Weblate were to go down, or it needs a major update that involves migrations, or similar - the Atlas project just doesn't have the people available to check in on these kinds of things.

I'm not ruling out a self-hosted Weblate instance by any means. I, too, know that it has incredible levels of power for being a self-hosted tool. I'm just worried about our internal capacity in being able to keep our infrastructure monitored and managed as needed for deploying a partly community-critical tool.

whallin avatar Oct 24 '25 07:10 whallin

Just a small suggestion — why not try the Crowdin open-source offer first? It’s completely free for open-source projects and doesn’t require any hosting or maintenance on your side, which could save time and resources while still giving the community a clean, user-friendly translation interface.

Honestly, I find Crowdin quite simple to use, and several open-source communities already rely on it successfully — for example:

paimon.moe

stardb.gg

If it turns out not to fit Atlas’s workflow, you could still revisit the idea of a self-hosted Weblate instance later.

As my computer science teacher used to say — no need to reinvent the wheel!

kdefarge avatar Oct 24 '25 18:10 kdefarge

But you're still relying on the """generosity""" of a company that could take this all away or make unwanted changes at any time.

GitHub is already going to shit, so who's to say Crowdin won't be next? Having control over something this crucial is important so you won't be stuck if something happens. I know this from moving the blendOS source code to a self-hosted GitLab server: the amount of control and security you have that way is unrivaled. You're not relying on anybody else and you're responsible for everything (for better or for worse).

@whallin is not inexperienced with this, and I really think self-hosted Weblate should at least be attempted before crawling to a SaaS and asking for an offer that could just get retracted at any time.

Ast3risk-ops avatar Oct 24 '25 19:10 Ast3risk-ops

I totally get the concern about relying on a third-party service — it's a valid point.

That said, I think the real key here is good backup practice, not full self-hosting. If GitHub went down tomorrow, we'd have far bigger issues than the Atlas translations 😅

Crowdin lets you sync or export all translation files automatically (via API or GitHub integration), so as long as we keep regular backups, there’s no real vendor lock-in.

Projects like paimon.moe have used Crowdin for over 5 years now without any issue. It’s a proven, stable solution — and if one day things change, we can always migrate elsewhere.

To me, the important thing is not reinventing the wheel when a solid, free open-source offer already exists.

kdefarge avatar Oct 24 '25 21:10 kdefarge

If we're going to pick one to try first, better the self-hostable option you can take down right away than an offer you have to commit to.

Ast3risk-ops avatar Oct 25 '25 02:10 Ast3risk-ops

I totally understand your point about control and self-hosting — it’s a fair concern.

I just wanted to share Crowdin’s open-source offer as an option, since it’s free and easy to test.

Anyway, I’ll leave it to the Atlas team and volunteers to decide what fits best for their workflow. I completely get that time and people are valuable resources, and sometimes it’s better to focus them on other priorities within the project.

kdefarge avatar Oct 25 '25 16:10 kdefarge

The Material for MkDocs authors are abandoning MkDocs, as it is becoming a serious supply chain issue due to lack of maintenance. They've created a new static site generator, which means everything discussed here is now null and void: https://squidfunk.github.io/mkdocs-material/blog/2025/11/05/zensical/

It is compatible with existing Material for MkDocs configs, and the HTML/CSS is structured exactly the same, but how i18n and macros would work in the WIP module system is unknown (would anybody make them?).

Ast3risk-ops avatar Nov 05 '25 15:11 Ast3risk-ops