
Alignment Scoring

Open jlotthammer opened this issue 2 years ago • 4 comments

Hi, this project looks very interesting.

I was just wondering if there's support for custom/different scoring matrices in the alignment procedure? If not, would this be challenging to implement or an easy addition?

Thanks so much in advance.

jlotthammer avatar Apr 09 '24 21:04 jlotthammer

I'm also interested in this issue, but it seems like the answer is no.

maovshao avatar Apr 27 '24 11:04 maovshao

Hi @jlotthammer, this is not supported out of the box by FAMSA, but I may be able to hack something together by accessing the private attributes of the C++ aligner class.

althonos avatar Apr 29 '24 12:04 althonos

Hello @jlotthammer @maovshao @althonos, sorry for the delay! In the current version there is no way to replace the scoring matrix. In principle, this should be easy to implement. I will try to do this soon (there are some other pending issues I need to resolve first), but please take "soon" to mean something like a month or so ;)

agudys avatar Apr 29 '24 12:04 agudys

That’d be fantastic - thanks so much guys!

jlotthammer avatar Apr 29 '24 13:04 jlotthammer

I am doing some refactoring, because this is the fifth library of mine that would use scoring matrices, so I figured it's better to centralize the code. I created a new repo: https://github.com/althonos/scoring-matrices

In the next release you'll be able to create a ScoringMatrix the way you want and pass it to the Aligner on initialization, but I need to fix some things regarding conda distribution first :+1:

althonos avatar May 04 '24 08:05 althonos

Thanks to all creators for your attention.

Looking forward to the update!

Best.

maovshao avatar May 05 '24 13:05 maovshao

You can now use custom scoring matrices with the PyFAMSA aligner in release 0.4.0, which has just been pushed to PyPI. Bioconda will take a little more time.

Either pass a matrix name:

import pyfamsa

aligner = pyfamsa.Aligner(scoring_matrix="BLOSUM62")

Or a ScoringMatrix object from the scoring-matrices package, using the FAMSA alphabet:

import pyfamsa
import scoring_matrices

matrix = scoring_matrices.ScoringMatrix.from_name("BLOSUM62")
aligner = pyfamsa.Aligner(scoring_matrix=matrix.shuffle(pyfamsa.FAMSA_ALPHABET))
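For anyone wondering why the matrix needs to be put into the FAMSA alphabet first: a scoring matrix is only meaningful together with the order of its letters, so the rows and columns have to be permuted when the consumer expects a different order. The plain-Python sketch below illustrates that idea only. It is not the `scoring-matrices` implementation; `reorder_matrix` is a hypothetical helper, the scores are toy values rather than real BLOSUM entries, and the target ordering is made up (it is not the actual value of `pyfamsa.FAMSA_ALPHABET`).

```python
def reorder_matrix(rows, alphabet, target_alphabet):
    """Permute the rows and columns of a square score matrix.

    `rows[i][j]` is the score for the pair (alphabet[i], alphabet[j]);
    the result is indexed by `target_alphabet` instead.
    """
    index = {letter: i for i, letter in enumerate(alphabet)}
    missing = [c for c in target_alphabet if c not in index]
    if missing:
        raise ValueError(f"letters not in source alphabet: {missing}")
    # Position in the source matrix of each letter of the target alphabet.
    order = [index[c] for c in target_alphabet]
    return [[rows[i][j] for j in order] for i in order]


# Toy 3-letter matrix (illustrative values, not real BLOSUM scores).
alphabet = "ACD"
rows = [
    [4, 0, -2],
    [0, 9, -3],
    [-2, -3, 6],
]

# Hypothetical target ordering, chosen only for the example.
target = "DCA"
reordered = reorder_matrix(rows, alphabet, target)

# The score of a given pair is unchanged; only its position moves.
a_d_before = rows[alphabet.index("A")][alphabet.index("D")]
a_d_after = reordered[target.index("A")][target.index("D")]
print(a_d_before, a_d_after)
```

The point of the sketch is that reordering never changes the score of a pair, only where it is stored, which is why a matrix left in the wrong order would silently score the wrong residues.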

althonos avatar May 06 '24 15:05 althonos