forrtproject.github.io

Generate translated glossaries

Open LukasWallrich opened this issue 1 year ago • 8 comments

Currently, the English glossary exists, and translations are waiting to be added. For that, we need a script similar to summaries.py that parses the GDocs and creates the glossary in /content/glossary/german and the other language folders.

Links to German docs [these currently contain English text as well, which needs to be removed during parsing]: https://drive.google.com/drive/u/0/folders/1PrX97lGjRGHvvUJgTZYGtZoC6GHqTU7p

So, the tasks are:

  • [x] generate JSONs from Docs
  • [x] generate MDs from JSONs
  • [x] make this multi-lingual (likely with Master G-Sheet including all file links)
  • [ ] add language selector to English _index.md
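For illustration, the "generate MDs from JSONs" step could look roughly like the sketch below. This is not the actual script: the entry fields (`title`, `language`, `definition`), the helper names, and the front-matter keys are all assumptions made for the example; only the /content/glossary/<language> layout comes from this issue.

```python
import json
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a term title into a file-name-safe slug (hypothetical helper)."""
    slug = re.sub(r"[^\w]+", "-", title.lower())
    return slug.strip("-")


def entry_to_markdown(entry: dict) -> str:
    """Render one glossary entry as Markdown with YAML front matter.

    The keys 'title', 'language' and 'definition' are assumed field names.
    json.dumps is used only to get a safely double-quoted string.
    """
    front_matter = "\n".join([
        "---",
        f"title: {json.dumps(entry['title'], ensure_ascii=False)}",
        f"language: {json.dumps(entry['language'], ensure_ascii=False)}",
        "---",
    ])
    return f"{front_matter}\n\n{entry['definition']}\n"


def write_glossary(entries: list, content_root: Path) -> list:
    """Write each entry to <content_root>/glossary/<language>/<slug>.md."""
    written = []
    for entry in entries:
        folder = content_root / "glossary" / entry["language"]
        folder.mkdir(parents=True, exist_ok=True)
        path = folder / f"{slugify(entry['title'])}.md"
        path.write_text(entry_to_markdown(entry), encoding="utf-8")
        written.append(path)
    return written
```

Calling `write_glossary(entries, Path("content"))` with entries loaded from the JSON would then produce one file per term and language.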

LukasWallrich avatar Jul 03 '24 14:07 LukasWallrich

@flavioazevedo can you check and add your email with the other relevant links?

LukasWallrich avatar Jul 03 '24 14:07 LukasWallrich

Yes, thanks for this @LukasWallrich! I am super happy you will be able to help us with this! 🎉 Please let me know if I can be useful somehow!

Re: language selector: if possible, we want large buttons on the page that people can click to go directly to the desired language. Below is a picture of the buttons implemented on our website's Curated resources page: https://forrt.org/resources/

[image: buttons on the Curated resources page]

The text I shared previously is below. I hope it is useful.

Our community defined 350 terms in open science in a series of Google Docs, where we worked collaboratively. Then we used a Python script to scrape the information from the Google Docs and transform it into a JSON file, from which .md files are generated; these are then read and included as individual entries on our website. Our website is built with Hugoblox (https://hugoblox.com/).

Our issue is that we lost the code we used for this operation. Thankfully, a big part of this work was also done for another project, so we still have something really close to what we did for the glossary (code and output): the summaries project (https://forrt.org/summaries/), which used the same procedure as above. We have the Python script (https://github.com/forrtproject/forrtproject.github.io/blob/master/content/summaries/summaries.py), which should give you a leg up, and the .json file: https://github.com/forrtproject/forrtproject.github.io/blob/master/data/summaries.json

Here are two of the Google Docs we want to extract information from:

1. https://docs.google.com/document/d/1Br-tqLh_nOXnjmddBmKCmTFLdDFmbhD5FA6T9v8GatU/edit?usp=drive_link
2. https://docs.google.com/document/d/196z4wBqjQAuNg3I8dwZY-di6S1sF_hwx1Y5AFwikKHI/edit?usp=drive_link

flavioazevedo avatar Jul 03 '24 15:07 flavioazevedo

I have now created a master file that will need to contain the links to the published versions of all glossary entries. This is done for German, but @flavioazevedo, you can add Arabic etc. here as well when those are done (it would be good to have more than one language in place for final testing, but no rush).
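As a rough sketch of how a script might consume such a master file, assuming it is exported as CSV with `language` and `url` columns (both column names are hypothetical, as is the function name):

```python
import csv
import io
from collections import defaultdict


def load_master_links(csv_text: str) -> dict:
    """Group published-document links by language.

    Assumes a CSV export with 'language' and 'url' columns; rows with
    either value missing are skipped.
    """
    links = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        language = (row.get("language") or "").strip().lower()
        url = (row.get("url") or "").strip()
        if language and url:
            links[language].append(url)
    return dict(links)
```

Adding a language would then only mean adding rows to the sheet; the parsing loop itself would not need to change.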

LukasWallrich avatar Jul 03 '24 19:07 LukasWallrich

[I now got this to largely work with the data as is - some entries would require manual fixes, but that is not worth the effort imho.]

@flavioazevedo how much do we care about links to related concepts? Currently, they are broken in the English version as well (#132) - if they are important to have, we need to add the original English title to each translated glossary entry (I cannot extract them as some of the titles already include brackets).

Given that users can just scroll to the right entry in the navigation, I don't think this is crucial, but obviously nice to have.
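If the original English title were stored with each translated entry, resolving related concepts could be sketched as below. The field names (`english_title`, `slug`, `related_terms`) and the URL pattern are assumptions for illustration, not what the current data or site contains.

```python
def build_link_index(entries: list) -> dict:
    """Map each entry's English title (lower-cased) to its translated slug."""
    return {e["english_title"].strip().lower(): e["slug"] for e in entries}


def resolve_related(entry: dict, index: dict, language: str):
    """Return (links, missing) for the related terms of one entry.

    Related terms are assumed to be listed by their English titles, so the
    lookup does not depend on parsing brackets out of translated titles.
    """
    links, missing = [], []
    for term in entry.get("related_terms", []):
        slug = index.get(term.strip().lower())
        if slug is None:
            missing.append(term)  # report instead of emitting a broken link
        else:
            links.append(f"/glossary/{language}/{slug}/")
    return links, missing
```

Terms that cannot be matched are returned separately, so they can be logged rather than silently turned into dead links.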

LukasWallrich avatar Jul 03 '24 20:07 LukasWallrich

When terms get created, the glossary folder is overwritten so that entries can be removed. However, _index.md must not be deleted. @LukasWallrich: fix that in the script, and also check in the script whether any _index.md is missing and warn the user (most relevant when new languages are added).
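A minimal sketch of that fix, assuming each language has its own folder of generated .md files (the function name is hypothetical):

```python
import warnings
from pathlib import Path


def clear_language_folder(folder: Path) -> None:
    """Delete generated entries in `folder` but keep `_index.md`.

    Emits a warning if `_index.md` is missing, e.g. for a newly added language.
    """
    folder.mkdir(parents=True, exist_ok=True)
    # Materialise the listing first so files are not deleted mid-iteration.
    for path in list(folder.glob("*.md")):
        if path.name != "_index.md":
            path.unlink()
    if not (folder / "_index.md").exists():
        warnings.warn(f"{folder} has no _index.md; add one before building the site.")
```

Using `warnings.warn` rather than raising keeps the run going for the other languages while still flagging the missing file.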

LukasWallrich avatar Jul 10 '24 21:07 LukasWallrich

/remind me on Friday to see whether reminders now work [this is just a test]

LukasWallrich avatar Jul 10 '24 21:07 LukasWallrich

@LukasWallrich set a reminder for 7/12/2024

github-actions[bot] avatar Jul 10 '24 21:07 github-actions[bot]

:wave: @LukasWallrich, see whether reminders work [this is just a test]

github-actions[bot] avatar Jul 13 '24 01:07 github-actions[bot]

This seems done - @flavioazevedo if there are issues remaining with the glossary translations, please let me know and maybe add a more targeted issue? We need to add Arabic and Serbian, but that is with Mahmoud for now ...

LukasWallrich avatar Sep 10 '24 13:09 LukasWallrich

Thank you Lukas! Seems done to me as well :)

flavioazevedo avatar Sep 12 '24 13:09 flavioazevedo