ggmsa plot error due to unique names

Open akjrobijns opened this issue 4 years ago • 4 comments

Hello, I'm trying to use ggmsa to plot an amino acid sequence alignment. I aligned the sequences with the msa() function and then converted the result to an AAbin object to pass to ggmsa().

When I try to make the ggmsa plot like this:

ggmsa(AA_alignment1_cv, 120, 220, color = "Clustal", font = "DroidSansMono", char_width = 0.5)

I get this error:

Error in tidy_msa(msa, start = start, end = end) : Sequences must have unique names

My sequences all have different names (I am using gene IDs in many cases). What counts as a 'unique name'? For example, are these two treated as non-unique because they start the same way: "Alyli.0014s0106" and "Alyli.0091s0126"?

Thanks!

akjrobijns, Nov 24 '21 10:11

Hi @akjrobijns,

"Unique names" means that no two sequences share the same full name. Sharing a prefix such as "Alyli" is allowed.

For example:

  • [x] "Alyli.0014s0106" and "Alyli.0091s0126" are allowed (√)
  • [ ] "Alyli.0014s0106" and "Alyli.0014s0106" are not allowed (×)

You can check whether the alignment contains duplicate sequence names:

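n <- names(AA_alignment1_cv)  # sequence names of the alignment
dup <- n[duplicated(n)]       # names that occur more than once
dup                           # print the repeated names, if any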
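
As a small worked illustration of the rule above, here is a base-R sketch on made-up ID vectors. The vectors `ids` and `prefix_only` are hypothetical examples, not data from this issue. It only shows how `duplicated()` treats full names. It does not reproduce what ggmsa does internally.

```r
# Hypothetical IDs: the third entry repeats the first one exactly.
ids <- c("Alyli.0014s0106", "Alyli.0091s0126", "Alyli.0014s0106")

# duplicated() compares whole strings, so only the exact repeat is flagged.
dup_ids <- ids[duplicated(ids)]
dup_ids

# Two IDs that share only the "Alyli." prefix: anyDuplicated() returns 0,
# meaning no full name is repeated.
prefix_only <- c("Alyli.0014s0106", "Alyli.0091s0126")
shared_prefix_result <- anyDuplicated(prefix_only)
shared_prefix_result
```

Here `dup_ids` holds the one repeated name, while `shared_prefix_result` is 0 because a shared prefix is not a duplicate.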

Thanks, Lang

nyzhoulang, Nov 24 '21 16:11

Hi,

I'm struggling with the same issue here, trying to plot an MSA generated with msa().

The output of this function is an MsaAAMultipleAlignment object that is apparently not understood by ggmsa(). My solution was the same as explained here: using msa::msaConvert(x, type="ape::AAbin") to input an AAbin object into ggmsa().

So far I have not succeeded and I get the same error message as indicated above:

Error in tidy_msa(msa, start = start, end = end) : Sequences must have unique names

I've checked and there are definitely no duplicated IDs in my dataset. I should note, though, that names(alignment) returns NULL. The way to get the names from my AAbin alignment seems to be labels(alignment).
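
To illustrate why a names()-based duplicate check can come back empty when names() returns NULL, here is a minimal base-R sketch. The toy character matrix only stands in for a matrix-style alignment, and `seq_labels()` is a hypothetical helper, not part of ggmsa or ape. Falling back to rownames() is an assumption about where the labels are stored. With a real AAbin object, labels(alignment) is the accessor mentioned above.

```r
# Toy stand-in for a matrix-style alignment: labels live in the row names,
# and both rows deliberately carry the same label.
toy <- matrix(c("M", "K", "M", "R"), nrow = 2, byrow = TRUE,
              dimnames = list(c("seqA", "seqA"), NULL))

n <- names(toy)                # NULL for a matrix
naive_dup <- n[duplicated(n)]  # NULL: nothing was actually compared
length(naive_dup)

# Hypothetical helper: use names() when present, otherwise the row names.
seq_labels <- function(aln) {
  n <- names(aln)
  if (is.null(n)) n <- rownames(aln)
  n
}

lab <- seq_labels(toy)
dup <- unique(lab[duplicated(lab)])
dup
```

In this toy case the naive check finds nothing even though both rows carry the same label, so an empty result from names() alone does not show that the labels are unique.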

Here's my code if it is of any help:

library(msa)
library(ggmsa)

seqs <- readAAStringSet("data.fasta")
alignment <- msa(seqs, method="ClustalOmega")
alignment <- msaConvert(alignment, type="ape::AAbin")
ggmsa(alignment, char_width = 0.5) + geom_seqlogo() + geom_msaBar()

rvazqf, Jul 27 '22 13:07

Hello, I am having the same issue. Any help would be greatly appreciated!

maxfieldk, Feb 14 '23 16:02