bertalign
Multilingual sentence alignment using sentence embeddings
Hi, thanks for providing this code! Could you please give more information (e.g. a brief explanation) of the following options? - max_align=5 - top_k=3 - win=5 - skip=-0.1 - margin=True...
In case you haven't seen it: Same as alvinntnu in #3, I would require different ways of handing over data to the package. I have detailed the ideas in the...
Thanks for the nice package! Following up on [the issue concerning preprocessing suggestions](https://github.com/bfsujason/bertalign/issues/3), I have implemented an alternative parametrization that I would like to discuss. I hope you have time...
Thank you for creating this wonderful package. I just had a quick question about improving the accuracy of the alignment. Do you have any suggestions about text preprocessing, especially with...
This code by default loads `LaBSE` when `bertalign` is imported. I think it would be more convenient to load the encoder when the `align_sents` method is called and give the...
Instead of relying on `googletrans`, it would be convenient if I could do: ``` aligner = Bertalign( src=en, tgt=de, src_lang='en', tgt_lang='de', ``` Language only matters for Sentence Splitting and it...
If I want to attempt a trilingual alignment of a literary work, is it more efficient to align the third language with one of the already segmented texts from an...
Thanks a lot for the impressive tool. How can additional languages be included? It seems that the sentence-transformers library supports many more …
Hello, thanks for your awesome aligner! Unfortunately, a recent change in the huggingface_hub has broken your code: https://github.com/huggingface/huggingface_hub/releases/tag/v0.26.0 The cached_download() method was completely removed and bertalign complains that `ImportError: cannot...
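The breakage reported in the last issue can be checked locally before importing `bertalign`. The following is a minimal sketch, not part of the package: the helper names `cached_download_available` and `installed_hub_version` are hypothetical, and it only assumes what the issue states, namely that `cached_download` can no longer be imported from newer `huggingface_hub` releases.

```python
from importlib import metadata


def cached_download_available() -> bool:
    """Return True if the installed huggingface_hub still exports cached_download.

    Hypothetical helper for illustration; it is not part of bertalign.
    """
    try:
        from huggingface_hub import cached_download  # noqa: F401
    except ImportError:
        # Raised both when huggingface_hub is not installed and when the
        # name has been removed from the installed release.
        return False
    return True


def installed_hub_version():
    """Return the installed huggingface_hub version string, or None if absent."""
    try:
        return metadata.version("huggingface_hub")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("huggingface_hub version:", installed_hub_version())
    print("cached_download importable:", cached_download_available())
```

If the second line prints `False` while a version is reported, the environment is likely affected by the removal described in the issue.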