
SV genotyping: Multiple BAMs for single sample

Open. tgong1 opened this issue 8 months ago • 1 comment

Dear author,

Thank you for providing and maintaining this useful software.

We have a question regarding SV genotyping with graphtyper. Ten samples in our cohort each have two BAMs, generated on different Illumina machines. The RG SM tag is the same in both BAMs and matches the sample name in the VCF.

How does graphtyper handle multiple BAMs for one sample? Is it OK to provide all BAMs in the input BAMLIST, or would you suggest merging the two BAMs for each sample first and then providing one BAM per sample?

Thank you for your time and help!

Best, Tingting

tgong1, May 20 '25 07:05

Hi,

If you have both in your bamlist, graphtyper will make calls for each of them. The output VCF would then have duplicated sample names (which is against the spec), so it would cause some problems.

So yes, if you want one call for both BAMs, I think you need to merge them with samtools first.
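
For reference, the merge step could be scripted roughly as below. This is only a sketch: it builds the `samtools merge` and `samtools index` command lines for one sample without running them. The helper name `merge_commands`, the `merged` output directory, and the example file names are made up for illustration, and it assumes the input BAMs are already coordinate-sorted.

```python
import shlex
from pathlib import PurePosixPath


def merge_commands(sample_id, bams, out_dir="merged"):
    """Build the samtools commands that merge one sample's BAMs into <ID>.bam.

    Hypothetical helper: names and the output layout are illustrative only.
    """
    if len(bams) < 2:
        raise ValueError("need at least two BAMs to merge")
    out_bam = str(PurePosixPath(out_dir) / f"{sample_id}.bam")
    # samtools merge takes the output file first, then the sorted inputs.
    merge = ["samtools", "merge", out_bam, *[str(b) for b in bams]]
    # The merged BAM needs its own index before genotyping.
    index = ["samtools", "index", out_bam]
    return [merge, index]


if __name__ == "__main__":
    # Print the commands for inspection instead of executing them.
    for cmd in merge_commands("SAMPLE01", ["SAMPLE01.run1.bam", "SAMPLE01.run2.bam"]):
        print(shlex.join(cmd))
```

The printed commands can be checked by eye and then run in a shell or passed to `subprocess.run`. The bamlist would then contain one merged BAM per sample.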

If you want separate calls for the two files in a valid VCF, you can make symlinks with names such as <ID>.bam and <ID>-DUP.bam and use the --get_sample_names_from_filename option. Graphtyper will then ignore the RG SM tags and use <ID> and <ID>-DUP as sample names.
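
The symlink approach might look something like the following sketch. It links a sample's two BAMs as `<ID>.bam` and `<ID>-DUP.bam` and writes a bamlist pointing at the links. The helper names `link_pair` and `write_bamlist` and the directory layout are hypothetical, and linking a `.bam.bai` index next to each BAM is an assumption about where the indexes live.

```python
import os
from pathlib import Path


def link_pair(sample_id, first_bam, second_bam, link_dir):
    """Symlink two BAMs of one sample as <ID>.bam and <ID>-DUP.bam.

    Hypothetical helper; raises FileExistsError if a link already exists.
    """
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for suffix, bam in (("", first_bam), ("-DUP", second_bam)):
        src = Path(bam).resolve()
        link = link_dir / f"{sample_id}{suffix}.bam"
        os.symlink(src, link)
        # Assumption: the index sits next to the BAM as <file>.bam.bai.
        bai = Path(f"{src}.bai")
        if bai.exists():
            os.symlink(bai, Path(f"{link}.bai"))
        links.append(link)
    return links


def write_bamlist(links, bamlist_path):
    """Write one BAM path per line, the usual layout for a bamlist file."""
    Path(bamlist_path).write_text("".join(f"{link}\n" for link in links))
```

The resulting bamlist would then be passed to graphtyper together with `--get_sample_names_from_filename`, so that the sample names come from the link names rather than from the RG SM tags.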

Best, Hannes

hannespetur, May 23 '25 10:05