bertalign

Multilingual sentence alignment using sentence embeddings

Results 11 bertalign issues

Hi, thanks for providing this code! Could you please give more information (e.g. a brief explanation) of the following options? - max_align=5 - top_k=3 - win=5 - skip=-0.1 - margin=True...

In case you haven't seen it: Same as alvinntnu in #3, I would require different ways of handing over data to the package. I have detailed the ideas in the...

Thanks for the nice package! Following up on [the issue concerning preprocessing suggestions](https://github.com/bfsujason/bertalign/issues/3), I have implemented an alternative parametrization that I would like to discuss. I hope you have time...

Thank you for creating this wonderful package. I just had a quick question about improving the accuracy of the alignment. Do you have any suggestions about text preprocessing, especially with...

This code by default loads `LaBSE` when `bertalign` is imported. I think it would be more convenient to load the encoder when the `align_sents` method is called and give the...

Instead of relying on `googletrans`, it would be convenient if I could do: ``` aligner = Bertalign( src=en, tgt=de, src_lang='en', tgt_lang='de', ``` Language only matters for sentence splitting and it...

If I want to attempt a trilingual alignment of a literary work, is it more efficient to align the third language with one of the already segmented texts from an...

Thanks a lot for the impressive tool. How can additional languages be included? It seems that the sentence-transformers library supports many more …

Hello, thanks for your awesome aligner! Unfortunately, a recent change in the huggingface_hub has broken your code: https://github.com/huggingface/huggingface_hub/releases/tag/v0.26.0 The cached_download() method was completely removed and bertalign complains that `ImportError: cannot...
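A quick way to check whether an environment is affected, as a sketch only: it relies on the statement above that `cached_download()` was removed in `huggingface_hub` v0.26.0. The helper names `major_minor`, `has_cached_download` and `installed_hub_version` are hypothetical and not part of any library.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def major_minor(ver):
    """Extract (major, minor) from a version string such as '0.26.0'."""
    match = re.match(r"(\d+)\.(\d+)", ver)
    if match is None:
        raise ValueError(f"unrecognised version string: {ver!r}")
    return int(match.group(1)), int(match.group(2))


def has_cached_download(hub_version):
    """True if this huggingface_hub version predates the v0.26.0 removal."""
    return major_minor(hub_version) < (0, 26)


def installed_hub_version():
    """Return the installed huggingface_hub version, or None if absent."""
    try:
        return version("huggingface_hub")
    except PackageNotFoundError:
        return None
```

If `has_cached_download(installed_hub_version())` is false, one possible workaround (an assumption, not a fix confirmed in this thread) is to pin `huggingface_hub` below 0.26 or to upgrade the package that still imports `cached_download`.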