bertalign
allow user to swap encoders
Currently, LaBSE is loaded by default as soon as bertalign is imported. I think it would be more convenient to load the encoder when the align_sents method is called, and to give the user the option to specify a different model.
This would be convenient when the user already knows which languages they are dealing with and is running on a CPU. Smaller encoders that support fewer languages and have smaller embedding dimensions run faster than LaBSE.
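To make the request concrete, here is a rough sketch of what deferred loading could look like. It is not bertalign's actual code: `LazyEncoder`, `DEFAULT_MODEL`, and the `loader` parameter are hypothetical names I made up for illustration, and I am assuming the encoder is a sentence-transformers model.

```python
from typing import Callable, List

# Assumption: LaBSE stays the default so existing behaviour does not change.
DEFAULT_MODEL = "sentence-transformers/LaBSE"


def _load_sentence_transformer(model_name: str):
    # Imported inside the function so that importing this module stays cheap.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


class LazyEncoder:
    """Hypothetical wrapper that loads the model on first use, not at import."""

    def __init__(self, model_name: str = DEFAULT_MODEL,
                 loader: Callable = _load_sentence_transformer):
        self.model_name = model_name
        self._loader = loader
        self._model = None  # nothing is loaded yet

    @property
    def model(self):
        # Load once, on first access, then reuse the same instance.
        if self._model is None:
            self._model = self._loader(self.model_name)
        return self._model

    def encode(self, sentences: List[str]):
        return self.model.encode(sentences)
```

With something like this, the alignment step could accept a model name and only pay the loading cost when it actually runs, e.g. `LazyEncoder("paraphrase-multilingual-MiniLM-L12-v2")` for a smaller multilingual encoder. How this would be wired into align_sents is up to the maintainers.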