more languages
Thanks a lot for the impressive tool. How can additional languages be included? It seems that the sentence-transformers library supports many more …
Thank you for your interest in Bertalign! The LaBSE model supports over 100 languages. However, Bertalign relies on sentence-splitter for sentence segmentation, which currently supports only 25 languages.
If you need to align other languages, you might consider using alternative sentence segmentation tools such as pySBD and Ersatz. These tools offer broader language support and may better suit your needs.
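As a rough sketch of that approach, the snippet below segments text with pySBD when it is installed and otherwise falls back to a naive regex splitter. The helper names (`naive_split`, `get_splitter`, `to_one_sentence_per_line`) are made up for illustration and are not part of Bertalign. The only pySBD calls assumed are `pysbd.Segmenter(language=..., clean=False)` and its `segment` method. Whether a given language code is accepted depends on the pySBD version you have installed.

```python
import re
from typing import Callable, List


def naive_split(text: str) -> List[str]:
    """Fallback splitter (illustrative only): break after '.', '!' or '?'
    when followed by whitespace. Far less accurate than a real segmenter."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def get_splitter(lang: str) -> Callable[[str], List[str]]:
    """Return a sentence splitter for `lang`.

    Uses pySBD if it can be imported; otherwise returns the naive fallback.
    """
    try:
        import pysbd  # third-party: pip install pysbd
    except ImportError:
        return naive_split

    # clean=False keeps the original text instead of normalising it.
    segmenter = pysbd.Segmenter(language=lang, clean=False)
    return lambda text: [s.strip() for s in segmenter.segment(text) if s.strip()]


def to_one_sentence_per_line(text: str, lang: str) -> str:
    """Segment `text` and join the sentences with newlines."""
    return "\n".join(get_splitter(lang)(text))


if __name__ == "__main__":
    print(to_one_sentence_per_line("Hello world. How are you?", "en"))
```

The output is one sentence per line, so the segmentation step can be swapped out without touching the rest of a pipeline. For real data, the regex fallback should be replaced by a proper segmenter for the language in question.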
Hi, my data has already been split into sentences. How can I skip the sentence-splitting step?
Thanks for your work!!