
Different haplogroups assigned to genomic samples coming from the same person

Open vmorozov opened this issue 9 years ago • 0 comments

Thanks for developing MToolBox and making it available to the scientific community! I have noticed that it sometimes assigns different haplogroups to genomic samples coming from the same person (sample collection replicates). Here are the top haplogroup assignments from two samples that belong to one person:

   Sequence.Name  Predicted.Haplogroup  N    Nph  Nph_tot  Nph_exp  P_Hg     Missing.sites
1  Contig.1       H20                   190  44   44       44       100.000
2  Contig.1       H86                   190  44   44       44       100.000
3  Contig.1       H1bt                  190  44   45       45       97.778   Transition(3010)
4  Contig.1       H20c                  190  44   45       45       97.778   Transition(7334)
5  Contig.1       H20a                  190  44   45       45       97.778   Transversion(16328)
6  Contig.1       H3x                   190  44   45       45       97.778   Transition(6776)

   Sequence.Name  Predicted.Haplogroup  N    Nph  Nph_tot  Nph_exp  P_Hg     Missing.sites
1  Contig.1       H20b                  211  45   49       49       91.837   Transition(2835);Transition(10115);Transition(14968);Transition(15562)
2  Contig.1       H2a2a1f               211  45   50       50       90.000   Transition(263);Transition(750);Transition(1438);Transition(4769);Transition(15326)
3  Contig.1       H20                   211  44   44       44       100.000
4  Contig.1       H86                   211  44   44       44       100.000
5  Contig.1       H18                   211  44   45       45       97.778   Transition(14364)
6  Contig.1       H1bt                  211  44   45       45       97.778   Transition(3010)
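
As a rough check, the P_Hg values in both listings above are consistent with the percentage of expected haplogroup-defining sites that were found, i.e. 100 * Nph / Nph_exp. This formula is inferred from the numbers in this issue, not taken from the MToolBox source, and the helper name `p_hg` is hypothetical:

```python
def p_hg(nph: int, nph_exp: int) -> float:
    """Percentage of expected haplogroup-defining sites found (assumed formula)."""
    return round(100.0 * nph / nph_exp, 3)


# (haplogroup, Nph, Nph_exp, reported P_Hg) copied from the second sample's listing
rows = [
    ("H20b", 45, 49, 91.837),
    ("H2a2a1f", 45, 50, 90.000),
    ("H20", 44, 44, 100.000),
    ("H18", 44, 45, 97.778),
]

for name, nph, nph_exp, reported in rows:
    # Each recomputed value should agree with the reported one to 3 decimals
    assert abs(p_hg(nph, nph_exp) - reported) < 1e-3, name
```

If the formula holds, a haplogroup with more matched sites (larger Nph) can still have a lower P_Hg when it also expects more sites that are missing.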

The classification tool at http://www.hmtdb.uniba.it/ makes the same haplogroup assignments for both samples (Haplogroup Prediction Results: H86 (97.73%); H (97.68%); H_152 (97.62%); …) when given the MToolBox contig.fasta sequences.

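In this case, the different haplogroup assignments come from MToolBox preferring the longest match to Phylotree (largest Nph), which is not the best haplogroup match. For example, in the second sample H20b (Nph 45, P_Hg 91.837) is ranked above H20 and H86 (Nph 44, P_Hg 100.000).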
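
To illustrate the difference, here is a minimal sketch that re-ranks the second sample's candidates in two ways. It is not MToolBox's or HmtDB's actual code: the `Candidate` type and both ranking functions are hypothetical, and the "Nph first" key is only an assumption that happens to reproduce the order reported above.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    haplogroup: str
    nph: int
    nph_exp: int
    p_hg: float


# Top candidates for the second replicate, copied from the listing above
candidates = [
    Candidate("H20b", 45, 49, 91.837),
    Candidate("H2a2a1f", 45, 50, 90.000),
    Candidate("H20", 44, 44, 100.000),
    Candidate("H86", 44, 44, 100.000),
    Candidate("H18", 44, 45, 97.778),
    Candidate("H1bt", 44, 45, 97.778),
]


def rank_by_nph(cands):
    # Assumed current behaviour: largest Nph first, P_Hg as tie-breaker
    return sorted(cands, key=lambda c: (-c.nph, -c.p_hg))


def rank_by_p_hg(cands):
    # Alternative: best-supported haplogroup first, Nph as tie-breaker
    return sorted(cands, key=lambda c: (-c.p_hg, -c.nph))


print([c.haplogroup for c in rank_by_nph(candidates)][:3])
print([c.haplogroup for c in rank_by_p_hg(candidates)][:3])
```

With the second ranking, H20 comes out on top for this replicate, which agrees with the top assignment for the first replicate.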

I have another case where different haplogroups are assigned to a mother-child sample pair:

    FamID  SubjectID  type    sample           TissueID  batch   HaploGroup  Sequence.Name  Predicted.Haplogroup  N    Nph  Nph_tot  Nph_exp  P_Hg     Missing.sites                        rank
 1  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1                160  49   49       49       100.000                                       1
 2  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1b               160  49   50       50       98.000   Transition(9299)                     2
 3  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1d               160  49   50       50       98.000   Transition(16172)                    3
 4  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1e               160  49   50       50       98.000   Transition(8182)                     4
 5  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1f               160  49   50       50       98.000   Transition(93)                       5
 6  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       G2a1g                 160  49   50       50       98.000   Transition(16227)                    6
 7  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1a               160  49   50       50       98.000   Transition(15314)                    7
 8  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a1c               160  49   51       51       96.078   Transition(6632);Transition(16051)   8
 9  169    135        child   ALSTDI_135_1     135       G90599  H2a2a1      Contig.1       H2a2a                 160  48   48       48       100.000                                       9
10  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h                   162  44   44       44       100.000                                       1
11  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h1                  162  44   45       45       97.778   Transition(8705)                     2
12  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h6                  162  44   45       45       97.778   Transition(4025)                     3
13  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h5                  162  44   45       45       97.778   Transition(10589)                    4
14  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h3                  162  44   45       45       97.778   Transition(13967)                    5
15  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h2                  162  44   45       45       97.778   Transition(5960)                     6
16  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3x                   162  44   45       45       97.778   Transition(16311)                    7
17  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3h3a                 162  44   46       46       95.652   Transition(13967);Transition(15470)  8
18  169    169        parent  ALSTDI_242960_1  242960    G85830  H3h         Contig.1       H3_16311              162  43   43       43       100.000                                       9
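
Cases like this can be flagged automatically by comparing the rank-1 haplogroup of each family member. The sketch below is only an illustration: the tuple layout and the function name `discordant_families` are hypothetical, and the two rows are the rank-1 entries from the listing above.

```python
from collections import defaultdict

# (FamID, SubjectID, type, Predicted.Haplogroup, rank) for the rank-1 rows above
top_hits = [
    (169, 135, "child", "H2a2a1", 1),
    (169, 169, "parent", "H3h", 1),
]


def discordant_families(rows):
    """Return {FamID: {SubjectID: haplogroup}} for families whose members differ."""
    by_family = defaultdict(dict)
    for fam_id, subject_id, _type, haplogroup, rank in rows:
        if rank == 1:  # only compare the top-ranked prediction per subject
            by_family[fam_id][subject_id] = haplogroup
    return {
        fam: members
        for fam, members in by_family.items()
        if len(set(members.values())) > 1
    }


print(discordant_families(top_hits))
```

Note that this compares haplogroup names as exact strings, so a pair such as H3h and H3h1 would also be reported even though one is a sub-branch of the other.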

The tool at http://www.hmtdb.uniba.it:8080/hmdb/ correctly assigns the same haplogroups to both subjects when given the MToolBox contigs.fasta sequences as input:

169 parent ALSTDI_242960_1: Haplogroup Prediction Results: H3h (97.73%); H3_16311 (97.68%); HV_16311 (97.50%); H3h1 (95.57%); H3h2 (95.57%); H3h3 (95.57%); H3h5 (95.57%); H3h6 (95.57%); H3 (95.46%);
135 child ALSTDI_135_1: Haplogroup Prediction Results: H3h (97.73%); H3_16311 (97.68%); HV_16311 (97.50%); H3h1 (95.57%); H3h2 (95.57%); H3h3 (95.57%); H3h5 (95.57%); H3h6 (95.57%); H3 (95.46%);

Is it possible to run MToolBox using the same haplogroup assignment algorithm as in http://www.hmtdb.uniba.it/?

vmorozov, Apr 11 '16 12:04