Generate translated glossaries
Currently, the English glossary exists, and translations are waiting to be added. For that, we need a script similar to summaries.py that parses the Google Docs and creates the glossary in /content/glossary/german and the other language folders.
Links to the German docs [these currently also contain English text, which needs to be removed during parsing]: https://drive.google.com/drive/u/0/folders/1PrX97lGjRGHvvUJgTZYGtZoC6GHqTU7p
So, the tasks are:
- [x] generate JSONs from Docs
- [x] generate MDs from JSONs
- [x] make this multi-lingual (likely with Master G-Sheet including all file links)
- [ ] add language selector to English _index.md
@flavioazevedo can you check and add your email with the other relevant links?
Yes, thanks for this @LukasWallrich ! I am super happy you will be able to help us with this! Please let me know if I can be useful somehow!
re: language selector, if possible, we want large buttons on the page that people can click to go directly to the desired language. Below is a picture of the buttons as implemented on the Curated resources page of our website: https://forrt.org/resources/
So the text I had shared previously is below. I hope it is useful
Our community defined 350 open science terms collaboratively in a series of Google Docs. We then used a Python script to scrape the information from the Google Docs and transform it into a JSON file, from which .md files are generated; these are read and included as individual entries on our website. Our website is built with Hugoblox (https://hugoblox.com/). Our issue is that we lost the code we used for this operation. Thankfully, a big part of this work was also done for another project, so we still have something really close to what we did for the glossary (code and output) from the summaries project (https://forrt.org/summaries/), which used the same procedure. We have the Python script (https://github.com/forrtproject/forrtproject.github.io/blob/master/content/summaries/summaries.py), which should give you a leg up, and the .json file: https://github.com/forrtproject/forrtproject.github.io/blob/master/data/summaries.json

Here are two of the Google Docs we want to extract information from:
1. https://docs.google.com/document/d/1Br-tqLh_nOXnjmddBmKCmTFLdDFmbhD5FA6T9v8GatU/edit?usp=drive_link
2. https://docs.google.com/document/d/196z4wBqjQAuNg3I8dwZY-di6S1sF_hwx1Y5AFwikKHI/edit?usp=drive_link
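For orientation, the last step of that procedure (JSON to one .md file per entry) could look roughly like the sketch below. This is not the lost script and not what summaries.py does. The field names `title` and `definition`, the YAML front matter layout and the helper names `slugify` and `write_entries` are all assumptions made for illustration.

```python
import json
import re
from pathlib import Path


def slugify(title):
    """Turn a title into a file-name-safe slug (Unicode letters are kept)."""
    slug = re.sub(r"[^\w]+", "-", title.lower()).strip("-")
    return slug or "entry"


def write_entries(json_path, out_dir):
    """Write one .md file per entry in the JSON file; return the written paths."""
    entries = json.loads(Path(json_path).read_text(encoding="utf-8"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in entries:
        # "title" and "definition" are hypothetical field names.
        # json.dumps yields a double-quoted string, which YAML also accepts.
        title = json.dumps(entry["title"], ensure_ascii=False)
        text = f"---\ntitle: {title}\n---\n\n{entry['definition']}\n"
        path = out_dir / f"{slugify(entry['title'])}.md"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

Quoting the title through `json.dumps` is meant to keep titles that contain colons or brackets from breaking the front matter.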
I have now created a master file that will need to contain the links to the published versions of all glossary entries. This is done for German, but @flavioazevedo you can add Arabic etc. here as well when that is done (it would be good to have more than one language in place for final testing, but no rush).
[I have now got this to largely work with the data as is. Some entries would require manual fixes, but that is not worth the effort imho.]
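As a rough illustration of how a script could consume such a master file, here is a minimal sketch that groups links by language. It assumes the sheet is exported as CSV with columns named `language` and `link`; those column names and the function name `links_by_language` are hypothetical, and the real master file may be laid out differently.

```python
import csv
import io
from collections import defaultdict


def links_by_language(csv_text):
    """Group document links by language, skipping rows without a link."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # "language" and "link" are assumed column names.
        language = row["language"].strip().lower()
        link = row["link"].strip()
        if link:
            grouped[language].append(link)
    return dict(grouped)
```

Rows with an empty link are skipped, so languages that are not ready yet can already be listed in the sheet.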
@flavioazevedo how much do we care about links to related concepts? Currently, they are broken in the English version as well (#132) - if they are important to have, we need to add the original English title to each translated glossary entry (I cannot extract them as some of the titles already include brackets).
Given that users can just scroll to the right entry in the navigation, I don't think this is crucial, but obviously nice to have.
When terms get created, the glossary folder is overwritten so that entries can be removed. However, _index.md must not be deleted. @LukasWallrich: fix that in the script, and also check in the script whether any _index.md is missing and warn the user (most relevant when new languages are added).
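A minimal sketch of that fix, assuming one folder per language under the glossary root with entries stored as flat .md files (the function names `clear_entries` and `missing_index_dirs` are hypothetical, not taken from the existing script):

```python
import warnings
from pathlib import Path


def clear_entries(language_dir):
    """Delete generated .md files in language_dir, keeping _index.md."""
    language_dir = Path(language_dir)
    # list() so the directory is fully read before any file is deleted.
    for path in list(language_dir.glob("*.md")):
        if path.name != "_index.md":
            path.unlink()


def missing_index_dirs(glossary_root):
    """Warn about language folders without _index.md; return their names."""
    missing = []
    for language_dir in sorted(Path(glossary_root).iterdir()):
        if language_dir.is_dir() and not (language_dir / "_index.md").exists():
            warnings.warn(f"{language_dir} has no _index.md")
            missing.append(language_dir.name)
    return missing
```

Deleting file by file instead of removing the whole folder means _index.md never has to be recreated, and the check can run after every regeneration.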
/remind me on Friday to see whether reminders now work [this is just a test]
@LukasWallrich set a reminder for 7/12/2024
:wave: @LukasWallrich, see whether reminders work [this is just a test]
This seems done - @flavioazevedo if there are issues remaining with the glossary translations, please let me know and maybe add a more targeted issue? We need to add Arabic and Serbian, but that is with Mahmoud for now ...
Thank you Lukas! Seems done to me as well :)