
use regex on Bio.Nexus.Trees.Tree parser

Open jrom99 opened this issue 1 year ago • 6 comments

  • [X] I hereby agree to dual licence this and any previous contributions under both the Biopython License Agreement AND the BSD 3-Clause License.

  • [X] I have read the CONTRIBUTING.rst file, have run pre-commit locally, and understand that continuous integration checks will be used to confirm the Biopython unit tests and style checks pass with these changes.

  • [ ] I have added my name to the alphabetical contributors listings in the files NEWS.rst and CONTRIB.rst as part of this pull request, am listed already, or do not wish to be listed. (This acknowledgement is optional.)

The current Nexus tree parser is slow for big trees because it loops over every character and builds a slice at each position. A simpler approach is to use a regular expression to find the relevant characters and "reconstruct" the position p from each match. This change does not address the quadratic complexity of parsing the same string multiple times (each self._parse(subtree) call re-scans characters that were already iterated over at every enclosing level), since the proposed change already provides a big speedup.
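To illustrate the idea, here is a minimal sketch, not the code from this pull request. It assumes node comments are delimited by `[&` and `]`, and the names `TOKEN` and `split_children` are hypothetical. Instead of testing every character and slicing at each position, a compiled pattern jumps straight to the few characters that matter, and the match start is used as the position p.

```python
import re

# Only these tokens affect the split: comment delimiters, parentheses, commas.
# The regex engine skips everything else, including long comment bodies.
TOKEN = re.compile(r"\[&|\]|\(|\)|,")


def split_children(tree: str) -> list[str]:
    """Split "(a,b,(c,d))label" into its top-level child substrings."""
    closing = tree.rfind(")")  # sketch: assumes no ")" after the last clade
    children = []
    depth = 0
    in_comment = False
    prev = 1  # skip the opening "("
    # Scan only between the outer parentheses (pos=1, endpos=closing).
    for match in TOKEN.finditer(tree, 1, closing):
        token = match.group()
        p = match.start()  # the "reconstructed" position
        if in_comment:
            # Inside a comment, only the closing bracket is meaningful.
            if token == "]":
                in_comment = False
        elif token == "[&":
            in_comment = True
        elif token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        elif token == "," and depth == 0:
            children.append(tree[prev:p])
            prev = p + 1
    children.append(tree[prev:closing])
    return children


if __name__ == "__main__":
    print(split_children("(a[&x={1,2}],b,(c,d))root"))
```

Each level still re-scans the text of its subtrees when the function is applied recursively, so the quadratic behaviour mentioned above remains. The saving comes from no longer creating a slice at every character position.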

I ran the following script on a 6 MB tree that had huge comments at each node.

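# Imports assumed for this excerpt; the original script's imports are not shown.
import time
from datetime import timedelta

from Bio import Phylo
from Bio.Phylo.BaseTree import Tree  # assumed source of the Tree annotation

if __name__ == "__main__":
    # parse_args() is defined elsewhere in the script (not shown)
    args = parse_args()

    print("Reading tree...")

    # MCC nexus format requires conversion nexus -> figtree -> nexus
    # biopython doesn't understand "translate" block
    t0 = time.perf_counter()
    tree: Tree = Phylo.read(args.tree_file, format=args.tree_format)  # type: ignore

    t1 = time.perf_counter()
    print(f"Tree read in {timedelta(seconds=t1-t0)}")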

python -m cProfile output for previous version:

Tree read in 0:17:41.177134

9068840 function calls (9064100 primitive calls) in 1064.358 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    157/1    0.001    0.000 1064.358 1064.358 {built-in method builtins.exec}
        1    0.002    0.002 1064.358 1064.358 constant_in_time.py:1(<module>)
8411/7453    0.001    0.000 1061.178    0.142 {built-in method builtins.next}
        1    0.000    0.000 1061.177 1061.177 _io.py:52(read)
        2    0.000    0.000 1061.177  530.589 _io.py:33(parse)
        2    0.000    0.000 1061.177  530.588 NexusIO.py:32(parse)
        1    0.000    0.000 1061.176 1061.176 Nexus.py:626(__init__)
        1    0.005    0.005 1061.176 1061.176 Nexus.py:693(read)
        2    0.011    0.006 1061.090  530.545 Nexus.py:760(_parse_nexus_block)
        1    0.000    0.000 1061.057 1061.057 Nexus.py:1162(_tree)
        1    0.000    0.000 1060.667 1060.667 Trees.py:58(__init__)
     91/1 1057.755   11.624 1060.661 1060.661 Trees.py:87(_parse)
     45/1    0.000    0.000  196.309  196.309 Trees.py:133(<listcomp>)

python -m cProfile output for proposed version:

Tree read in 0:00:00.832164

805783 function calls (801024 primitive calls) in 3.893 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    157/1    0.002    0.000    3.893    3.893 {built-in method builtins.exec}
        1    0.002    0.002    3.893    3.893 constant_in_time.py:1(<module>)
       92    0.012    0.000    2.745    0.030 constant_in_time.py:52(get_data)
       94    2.451    0.026    2.583    0.027 constant_in_time.py:26(get_consensus)
8411/7453    0.001    0.000    0.833    0.000 {built-in method builtins.next}
        1    0.000    0.000    0.832    0.832 _io.py:52(read)
        2    0.000    0.000    0.832    0.416 _io.py:33(parse)
        2    0.000    0.000    0.832    0.416 NexusIO.py:32(parse)
        1    0.000    0.000    0.832    0.832 Nexus.py:626(__init__)
        1    0.005    0.005    0.832    0.832 Nexus.py:693(read)
        2    0.010    0.005    0.746    0.373 Nexus.py:760(_parse_nexus_block)
        1    0.000    0.000    0.715    0.715 Nexus.py:1162(_tree)
       24    0.001    0.000    0.564    0.023 __init__.py:1(<module>)

Feb 09 '24 04:02 jrom99

python setup.py test --offline output:

running test
Skipping any tests requiring internet access
/home/junior/Downloads/biopython/Bio/__init__.py:138: BiopythonWarning: You may be importing Biopython from inside the source tree. This is bad practice and might lead to downstream issues. In particular, you might encounter ImportErrors due to missing compiled C extensions. We recommend that you try running your code from outside the source tree. If you are outside the source tree then you have a setup.py file in an unexpected directory: /home/junior/Downloads/biopython
  warnings.warn(
Python version: 3.11.6 (main, Oct  8 2023, 05:06:43) [GCC 13.2.0]
Operating system: posix linux
test_Ace ... ok
test_Affy ... ok
test_AlignIO ... ok
test_AlignIO_ClustalIO ... ok
test_AlignIO_EmbossIO ... ok
test_AlignIO_FastaIO ... ok
test_AlignIO_MauveIO ... ok
test_AlignIO_PhylipIO ... ok
test_AlignIO_convert ... ok
test_AlignInfo ... ok
test_Align_Alignment ... ok
test_Align_a2m ... ok
test_Align_bed ... ok
test_Align_bigbed ... ok
test_Align_bigmaf ... ok
test_Align_bigpsl ... ok
test_Align_chain ... ok
test_Align_clustal ... ok
test_Align_codonalign ... ok
test_Align_emboss ... ok
test_Align_exonerate ... ok
test_Align_fasta ... ok
test_Align_hhr ... ok
test_Align_maf ... ok
test_Align_mauve ... ok
test_Align_msf ... ok
test_Align_nexus ... ok
test_Align_phylip ... ok
test_Align_psl ... ok
test_Align_sam ... ok
test_Align_stockholm ... ok
test_Align_tabular ... ok
test_Application ... ok
test_BWA_tool ... skipping. Install bwa and correctly set the file path to the program if you want to use it from Biopython
test_BioSQL_MySQLdb ... skipping. BioSQL test configuration file biosql.ini missing (see biosql.ini.sample)
test_BioSQL_MySQLdb_online ... skipping. internet not available
test_BioSQL_mysql_connector ... skipping. BioSQL test configuration file biosql.ini missing (see biosql.ini.sample)
test_BioSQL_mysql_connector_online ... skipping. internet not available
test_BioSQL_psycopg2 ... skipping. BioSQL test configuration file biosql.ini missing (see biosql.ini.sample)
test_BioSQL_psycopg2_online ... skipping. internet not available
test_BioSQL_sqlite3 ... ok
test_BioSQL_sqlite3_online ... skipping. internet not available
test_Blast_Record ... ok
test_Blast_parser ... ok
test_CAPS ... ok
test_Chi2 ... ok
test_ClustalOmega_tool ... skipping. Install clustalo if you want to use Clustal Omega from Biopython.
test_Clustalw_tool ... skipping. Install clustalw or clustalw2 if you want to use it from Biopython.
test_Cluster ... ok
test_CodonTable ... ok
test_ColorSpiral ... skipping. Error: Install reportlab if you want to use Bio.Graphics.

test_Compass ... ok
test_Consensus ... ok
test_Dialign_tool ... skipping. Install DIALIGN2-2 if you want to use the Bio.Align.Applications wrapper.
test_EMBL_unittest ... ok
test_Emboss ... skipping. Install EMBOSS if you want to use Bio.Emboss.
test_EmbossPhylipNew ... skipping. Install the Emboss package 'PhylipNew' if you want to use the Bio.Emboss.Applications wrappers for phylogenetic tools.
test_EmbossPrimer ... ok
test_Entrez ... ok
test_Entrez_online ... skipping. internet not available
test_Entrez_parser ... ok
test_Enzyme ... ok
test_ExPASy ... skipping. internet not available
test_Fasttree_tool ... ok
test_File ... ok
test_GenBank ... ok
test_GenomeDiagram ... skipping. Error: Install reportlab if you want to use Bio.Graphics.

test_GraphicsBitmaps ... skipping. Error: Install ReportLab if you want to use Bio.Graphics.

test_GraphicsChromosome ... skipping. Error: Install reportlab if you want to use Bio.Graphics.

test_GraphicsDistribution ... skipping. Install reportlab if you want to use Bio.Graphics.
test_GraphicsGeneral ... skipping. Install reportlab if you want to use Bio.Graphics.
test_HMMCasino ... ok
test_HMMGeneral ... ok
test_KEGG ... ok
test_KEGG_online ... skipping. internet not available
test_KGML_graphics ... skipping. Error: Please install ReportLab if you want to use Bio.Graphics. You can find ReportLab at http://www.reportlab.com/software/opensource/

test_KGML_graphics_online ... skipping. Install reportlab if you want to use Bio.Graphics.
test_KGML_nographics ... ok
test_KeyWList ... ok
test_LogisticRegression ... ok
test_MSAProbs_tool ... skipping. Install msaprobs if you want to use MSAProbs from Biopython.
test_MafIO_index ... ok
test_Mafft_tool ... ok
test_MarkovModel ... ok
test_Medline ... ok
test_Muscle_tool ... skipping. Install MUSCLE if you want to use the Bio.Align.Applications wrapper.
test_NCBIXML ... ok
test_NCBI_BLAST_tools ... skipping. Install the NCBI BLAST+ command line tools if you want to use the Bio.Blast.Applications wrapper.
test_NCBI_qblast ... ok
test_NMR ... ok
test_NaiveBayes ... ok
test_Nexus ... ok
test_PAML_baseml ... ok
test_PAML_codeml ... ok
test_PAML_tools ... skipping. Install PAML if you want to use the Bio.Phylo.PAML wrapper.
test_PAML_yn00 ... ok
test_PDBList ... skipping. internet not available
test_PDB_CEAligner ... ok
test_PDB_DSSP ... ok
test_PDB_Dice ... ok
test_PDB_Disordered ... ok
test_PDB_Exposure ... ok
test_PDB_FragmentMapper ... ok
test_PDB_KDTree ... ok
test_PDB_MMCIF2Dict ... ok
test_PDB_MMCIFIO ... ok
test_PDB_MMCIFParser ... ok
test_PDB_NACCESS ... ok
test_PDB_PDBIO ... ok
test_PDB_PDBParser ... ok
test_PDB_PSEA ... skipping. Download and install psea from ftp://ftp.lmcp.jussieu.fr/pub/sincris/software/protein/p-sea/. Make sure that psea is on path
test_PDB_Polypeptide ... ok
test_PDB_QCPSuperimposer ... ok
test_PDB_ResidueDepth ... ok
test_PDB_SASA ... ok
test_PDB_SMCRA ... ok
test_PDB_Selection ... ok
test_PDB_StructureAlignment ... ok
test_PDB_Superimposer ... ok
test_PDB_internal_coords ... skipping. Error: Install mmtf to use Bio.PDB.mmtf (e.g. pip install mmtf-python)

test_PDB_parse_pdb_header ... ok
test_PDB_vectors ... ok
test_PQR ... ok
test_Pathway ... ok
test_Phd ... ok
test_Phylo ... ok
test_PhyloXML ... ok
test_Phylo_CDAO ... skipping. Install RDFlib if you want to use the CDAO tree format.
test_Phylo_NeXML ... ok
test_Phylo_igraph ... skipping. Install igraph if you wish to use it with Bio.Phylo
test_Phylo_matplotlib ... ok
test_Phylo_networkx ... skipping. Install networkx if you wish to use it with Bio.Phylo
test_PopGen_GenePop ... skipping. Install GenePop if you want to use Bio.PopGen.GenePop.
test_PopGen_GenePop_EasyController ... skipping. Install GenePop if you want to use Bio.PopGen.GenePop.
test_PopGen_GenePop_nodepend ... ok
test_Prank_tool ... skipping. Install PRANK if you want to use the Bio.Align.Applications wrapper.
test_Probcons_tool ... skipping. Install PROBCONS if you want to use the Bio.Align.Applications wrapper.
test_ProtParam ... ok
test_RCSBFormats ... ok
test_Restriction ... ok
test_SCOP_Astral ... ok
test_SCOP_Cla ... ok
test_SCOP_Des ... ok
test_SCOP_Dom ... ok
test_SCOP_Hie ... ok
test_SCOP_Raf ... ok
test_SCOP_Residues ... ok
test_SCOP_Scop ... ok
test_SCOP_online ... skipping. internet not available
test_SVDSuperimposer ... ok
test_SearchIO_blast_tab ... ok
test_SearchIO_blast_tab_index ... ok
test_SearchIO_blast_xml ... ok
test_SearchIO_blast_xml_index ... ok
test_SearchIO_blat_psl ... ok
test_SearchIO_blat_psl_index ... ok
test_SearchIO_exonerate ... ok
test_SearchIO_exonerate_text_index ... ok
test_SearchIO_exonerate_vulgar_index ... ok
test_SearchIO_fasta_m10 ... ok
test_SearchIO_fasta_m10_index ... ok
test_SearchIO_hhsuite2_text ... ok
test_SearchIO_hmmer2_text ... ok
test_SearchIO_hmmer2_text_index ... ok
test_SearchIO_hmmer3_domtab ... ok
test_SearchIO_hmmer3_domtab_index ... ok
test_SearchIO_hmmer3_tab ... ok
test_SearchIO_hmmer3_tab_index ... ok
test_SearchIO_hmmer3_text ... ok
test_SearchIO_hmmer3_text_index ... ok
test_SearchIO_interproscan_xml ... ok
test_SearchIO_model ... ok
test_SearchIO_write ... ok
test_SeqFeature ... /home/junior/Downloads/biopython/Bio/SeqFeature.py:1040: BiopythonParserWarning: Attempting to fix invalid location '3..2' as it looks like incorrect origin wrapping. Please fix input file, this could have unintended behavior.
  warnings.warn(
ok
test_SeqIO ... ok
test_SeqIO_AbiIO ... ok
test_SeqIO_FastaIO ... ok
test_SeqIO_Gck ... ok
test_SeqIO_Gfa ... ok
test_SeqIO_Insdc ... ok
test_SeqIO_NibIO ... ok
test_SeqIO_PdbIO ... ok
test_SeqIO_QualityIO ... ok
test_SeqIO_SeqXML ... ok
test_SeqIO_SnapGene ... ok
test_SeqIO_TwoBitIO ... ok
test_SeqIO_Xdna ... ok
test_SeqIO_features ... ok
test_SeqIO_index ... ok
test_SeqIO_online ... skipping. internet not available
test_SeqIO_write ... ok
test_SeqRecord ... ok
test_SeqUtils ... ok
test_Seq_objs ... ok
test_SffIO ... ok
test_SwissProt ... ok
test_TCoffee_tool ... skipping. Install TCOFFEE if you want to use the Bio.Align.Applications wrapper.
test_TogoWS ... skipping. internet not available
test_TreeConstruction ... ok
test_Tutorial ... ok
test_UniGene ... ok
test_UniProt ... skipping. internet not available
test_UniProt_GOA ... ok
test_UniProt_Parser ... ok
test_XXmotif_tool ... skipping. Install XXmotif if you want to use XXmotif from Biopython.
test_align ... ok
test_align_substitution_matrices ... ok
test_bgzf ... ok
test_cellosaurus ... ok
test_codonalign ... ok
test_geo ... ok
test_kNN ... ok
test_mmtf ... skipping. Error: Install mmtf to use Bio.PDB.mmtf (e.g. pip install mmtf-python)

test_mmtf_online ... skipping. Error: Install mmtf to use Bio.PDB.mmtf (e.g. pip install mmtf-python)

test_motifs ... ok
test_motifs_online ... skipping. internet not available
test_pairwise2 ... ok
test_pairwise2_no_C ... ok
test_pairwise_aligner ... ok
test_pairwise_alignment_map ... ok
test_phenotype ... ok
test_phenotype_fit ... ok
test_phyml_tool ... skipping. Couldn't find the PhyML software. Install PhyML 3.0 or later if you want to use the Bio.Phylo.Applications wrapper.
test_prodoc ... ok
test_prosite ... ok
test_raxml_tool ... ok
test_samtools_tool ... skipping. Install samtools and correctly set the file path to the program if you want to use it from Biopython
test_seq ... ok
test_translate ... ok
Bio docstring test ... ok
Bio.Affy docstring test ... ok
Bio.Affy.CelFile docstring test ... ok
Bio.Align docstring test ... ok
Bio.Align.AlignInfo docstring test ... ok
Bio.Align.Applications docstring test ... ok
Bio.Align.Applications._ClustalOmega docstring test ... ok
Bio.Align.Applications._Clustalw docstring test ... ok
Bio.Align.Applications._Dialign docstring test ... ok
Bio.Align.Applications._MSAProbs docstring test ... ok
Bio.Align.Applications._Mafft docstring test ... ok
Bio.Align.Applications._Muscle docstring test ... ok
Bio.Align.Applications._Prank docstring test ... ok
Bio.Align.Applications._Probcons docstring test ... ok
Bio.Align.Applications._TCoffee docstring test ... ok
Bio.Align._codonaligner docstring test ... ok
Bio.Align._pairwisealigner docstring test ... ok
Bio.Align.a2m docstring test ... ok
Bio.Align.analysis docstring test ... ok
Bio.Align.bed docstring test ... ok
Bio.Align.bigbed docstring test ... ok
Bio.Align.bigmaf docstring test ... ok
Bio.Align.bigpsl docstring test ... ok
Bio.Align.chain docstring test ... ok
Bio.Align.clustal docstring test ... ok
Bio.Align.emboss docstring test ... ok
Bio.Align.exonerate docstring test ... ok
Bio.Align.fasta docstring test ... ok
Bio.Align.hhr docstring test ... ok
Bio.Align.interfaces docstring test ... ok
Bio.Align.maf docstring test ... ok
Bio.Align.mauve docstring test ... ok
Bio.Align.msf docstring test ... ok
Bio.Align.nexus docstring test ... ok
Bio.Align.phylip docstring test ... ok
Bio.Align.psl docstring test ... ok
Bio.Align.sam docstring test ... ok
Bio.Align.stockholm docstring test ... ok
Bio.Align.substitution_matrices docstring test ... ok
Bio.Align.tabular docstring test ... ok
Bio.AlignIO docstring test ... ok
Bio.AlignIO.ClustalIO docstring test ... ok
Bio.AlignIO.EmbossIO docstring test ... ok
Bio.AlignIO.FastaIO docstring test ... ok
Bio.AlignIO.Interfaces docstring test ... ok
Bio.AlignIO.MafIO docstring test ... ok
Bio.AlignIO.MauveIO docstring test ... ok
Bio.AlignIO.MsfIO docstring test ... ok
Bio.AlignIO.NexusIO docstring test ... ok
Bio.AlignIO.PhylipIO docstring test ... ok
Bio.AlignIO.StockholmIO docstring test ... ok
Bio.Application docstring test ... ok
Bio.Blast docstring test ... ok
Bio.Blast.Applications docstring test ... ok
Bio.Blast.NCBIWWW docstring test ... ok
Bio.Blast.NCBIXML docstring test ... ok
Bio.Blast._parser docstring test ... ok
Bio.CAPS docstring test ... ok
Bio.Cluster docstring test ... ok
Bio.Cluster._cluster docstring test ... ok
Bio.Compass docstring test ... ok
Bio.Data docstring test ... ok
Bio.Data.CodonTable docstring test ... ok
Bio.Data.IUPACData docstring test ... ok
Bio.Data.PDBData docstring test ... ok
Bio.Emboss docstring test ... ok
Bio.Emboss.Applications docstring test ... ok
Bio.Emboss.Primer3 docstring test ... ok
Bio.Emboss.PrimerSearch docstring test ... ok
Bio.Entrez.Parser docstring test ... ok
Bio.ExPASy.Enzyme docstring test ... ok
Bio.ExPASy.Prodoc docstring test ... ok
Bio.ExPASy.Prosite docstring test ... ok
Bio.ExPASy.ScanProsite docstring test ... ok
Bio.File docstring test ... ok
Bio.GenBank docstring test ... ok
Bio.GenBank.Record docstring test ... ok
Bio.GenBank.Scanner docstring test ... ok
Bio.GenBank.utils docstring test ... ok
Bio.Geo docstring test ... ok
Bio.Geo.Record docstring test ... ok
Bio.Graphics docstring test ... skipped, missing Python dependency
Bio.Graphics.BasicChromosome docstring test ... skipped, missing Python dependency
Bio.Graphics.ColorSpiral docstring test ... skipped, missing Python dependency
Bio.Graphics.Comparative docstring test ... skipped, missing Python dependency
Bio.Graphics.DisplayRepresentation docstring test ... skipped, missing Python dependency
Bio.Graphics.Distribution docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._AbstractDrawer docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._CircularDrawer docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._Colors docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._CrossLink docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._Diagram docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._Feature docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._FeatureSet docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._Graph docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._GraphSet docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._LinearDrawer docstring test ... skipped, missing Python dependency
Bio.Graphics.GenomeDiagram._Track docstring test ... skipped, missing Python dependency
Bio.Graphics.KGML_vis docstring test ... skipped, missing Python dependency
Bio.HMM docstring test ... ok
Bio.HMM.DynamicProgramming docstring test ... ok
Bio.HMM.MarkovModel docstring test ... ok
Bio.HMM.Trainer docstring test ... ok
Bio.HMM.Utilities docstring test ... ok
Bio.KEGG docstring test ... ok
Bio.KEGG.Compound docstring test ... ok
Bio.KEGG.Enzyme docstring test ... ok
Bio.KEGG.Gene docstring test ... ok
Bio.KEGG.KGML docstring test ... ok
Bio.KEGG.KGML.KGML_parser docstring test ... ok
Bio.KEGG.KGML.KGML_pathway docstring test ... ok
Bio.KEGG.Map docstring test ... ok
Bio.KEGG.REST docstring test ... ok
Bio.LogisticRegression docstring test ... ok
Bio.MarkovModel docstring test ... ok
Bio.MaxEntropy docstring test ... ok
Bio.Medline docstring test ... ok
Bio.NMR docstring test ... ok
Bio.NMR.NOEtools docstring test ... ok
Bio.NMR.xpktools docstring test ... ok
Bio.NaiveBayes docstring test ... ok
Bio.Nexus docstring test ... ok
Bio.Nexus.Nexus docstring test ... ok
Bio.Nexus.Nodes docstring test ... ok
Bio.Nexus.StandardData docstring test ... ok
Bio.Nexus.Trees docstring test ... ok
Bio.Nexus.cnexus docstring test ... ok
Bio.PDB docstring test ... ok
Bio.PDB.AbstractPropertyMap docstring test ... ok
Bio.PDB.Atom docstring test ... ok
Bio.PDB.Chain docstring test ... ok
Bio.PDB.DSSP docstring test ... ok
Bio.PDB.Dice docstring test ... ok
Bio.PDB.Entity docstring test ... ok
Bio.PDB.FragmentMapper docstring test ... ok
Bio.PDB.HSExposure docstring test ... ok
Bio.PDB.MMCIF2Dict docstring test ... ok
Bio.PDB.MMCIFParser docstring test ... ok
Bio.PDB.Model docstring test ... ok
Bio.PDB.NACCESS docstring test ... ok
Bio.PDB.NeighborSearch docstring test ... ok
Bio.PDB.PDBExceptions docstring test ... ok
Bio.PDB.PDBIO docstring test ... ok
Bio.PDB.PDBList docstring test ... ok
Bio.PDB.PDBParser docstring test ... ok
Bio.PDB.PICIO docstring test ... ok
Bio.PDB.PSEA docstring test ... ok
Bio.PDB.Polypeptide docstring test ... ok
Bio.PDB.Residue docstring test ... ok
Bio.PDB.ResidueDepth docstring test ... ok
Bio.PDB.SASA docstring test ... ok
Bio.PDB.SCADIO docstring test ... ok
Bio.PDB.Selection docstring test ... ok
Bio.PDB.Structure docstring test ... ok
Bio.PDB.StructureAlignment docstring test ... ok
Bio.PDB.StructureBuilder docstring test ... ok
Bio.PDB.Superimposer docstring test ... ok
Bio.PDB.ccealign docstring test ... ok
Bio.PDB.cealign docstring test ... ok
Bio.PDB.ic_data docstring test ... ok
Bio.PDB.ic_rebuild docstring test ... ok
Bio.PDB.internal_coords docstring test ... ok
Bio.PDB.kdtrees docstring test ... ok
Bio.PDB.mmcifio docstring test ... ok
Bio.PDB.mmtf docstring test ... skipped, missing Python dependency
Bio.PDB.mmtf.DefaultParser docstring test ... skipped, missing Python dependency
Bio.PDB.mmtf.mmtfio docstring test ... skipped, missing Python dependency
Bio.PDB.parse_pdb_header docstring test ... ok
Bio.PDB.qcprot docstring test ... ok
Bio.PDB.vectors docstring test ... ok
Bio.Pathway docstring test ... ok
Bio.Pathway.Rep docstring test ... ok
Bio.Pathway.Rep.Graph docstring test ... ok
Bio.Pathway.Rep.MultiGraph docstring test ... ok
Bio.Phylo docstring test ... ok
Bio.Phylo.Applications docstring test ... ok
Bio.Phylo.Applications._Fasttree docstring test ... ok
Bio.Phylo.Applications._Phyml docstring test ... ok
Bio.Phylo.Applications._Raxml docstring test ... ok
Bio.Phylo.BaseTree docstring test ... ok
Bio.Phylo.CDAO docstring test ... ok
Bio.Phylo.CDAOIO docstring test ... skipped, missing Python dependency
Bio.Phylo.Consensus docstring test ... ok
Bio.Phylo.NeXML docstring test ... ok
Bio.Phylo.NeXMLIO docstring test ... ok
Bio.Phylo.Newick docstring test ... ok
Bio.Phylo.NewickIO docstring test ... ok
Bio.Phylo.NexusIO docstring test ... ok
Bio.Phylo.PAML docstring test ... ok
Bio.Phylo.PAML._paml docstring test ... ok
Bio.Phylo.PAML._parse_baseml docstring test ... ok
Bio.Phylo.PAML._parse_codeml docstring test ... ok
Bio.Phylo.PAML._parse_yn00 docstring test ... ok
Bio.Phylo.PAML.baseml docstring test ... ok
Bio.Phylo.PAML.chi2 docstring test ... ok
Bio.Phylo.PAML.codeml docstring test ... ok
Bio.Phylo.PAML.yn00 docstring test ... ok
Bio.Phylo.PhyloXML docstring test ... ok
Bio.Phylo.PhyloXMLIO docstring test ... ok
Bio.Phylo.TreeConstruction docstring test ... ok
Bio.Phylo._cdao_owl docstring test ... ok
Bio.Phylo._io docstring test ... ok
Bio.Phylo._utils docstring test ... ok
Bio.PopGen docstring test ... ok
Bio.PopGen.GenePop docstring test ... ok
Bio.PopGen.GenePop.Controller docstring test ... ok
Bio.PopGen.GenePop.EasyController docstring test ... ok
Bio.PopGen.GenePop.FileParser docstring test ... ok
Bio.PopGen.GenePop.LargeFileParser docstring test ... ok
Bio.Restriction docstring test ... ok
Bio.Restriction.PrintFormat docstring test ... ok
Bio.Restriction.Restriction docstring test ... ok
Bio.Restriction.Restriction_Dictionary docstring test ... ok
Bio.SCOP docstring test ... ok
Bio.SCOP.Cla docstring test ... ok
Bio.SCOP.Des docstring test ... ok
Bio.SCOP.Dom docstring test ... ok
Bio.SCOP.Hie docstring test ... ok
Bio.SCOP.Raf docstring test ... ok
Bio.SCOP.Residues docstring test ... ok
Bio.SVDSuperimposer docstring test ... ok
Bio.SearchIO docstring test ... ok
Bio.SearchIO.BlastIO docstring test ... ok
Bio.SearchIO.BlastIO.blast_tab docstring test ... ok
Bio.SearchIO.BlastIO.blast_xml docstring test ... ok
Bio.SearchIO.BlatIO docstring test ... ok
Bio.SearchIO.ExonerateIO docstring test ... ok
Bio.SearchIO.ExonerateIO._base docstring test ... ok
Bio.SearchIO.ExonerateIO.exonerate_cigar docstring test ... ok
Bio.SearchIO.ExonerateIO.exonerate_text docstring test ... ok
Bio.SearchIO.ExonerateIO.exonerate_vulgar docstring test ... ok
Bio.SearchIO.FastaIO docstring test ... ok
Bio.SearchIO.HHsuiteIO docstring test ... ok
Bio.SearchIO.HHsuiteIO.hhsuite2_text docstring test ... ok
Bio.SearchIO.HmmerIO docstring test ... ok
Bio.SearchIO.HmmerIO._base docstring test ... ok
Bio.SearchIO.HmmerIO.hmmer2_text docstring test ... ok
Bio.SearchIO.HmmerIO.hmmer3_domtab docstring test ... ok
Bio.SearchIO.HmmerIO.hmmer3_tab docstring test ... ok
Bio.SearchIO.HmmerIO.hmmer3_text docstring test ... ok
Bio.SearchIO.InterproscanIO docstring test ... ok
Bio.SearchIO.InterproscanIO.interproscan_xml docstring test ... ok
Bio.SearchIO._index docstring test ... ok
Bio.SearchIO._model docstring test ... ok
Bio.SearchIO._model._base docstring test ... ok
Bio.SearchIO._model.hit docstring test ... ok
Bio.SearchIO._model.hsp docstring test ... ok
Bio.SearchIO._model.query docstring test ... ok
Bio.SearchIO._utils docstring test ... ok
Bio.Seq docstring test ... ok
Bio.SeqFeature docstring test ... ok
Bio.SeqIO docstring test ... ok
Bio.SeqIO.AbiIO docstring test ... ok
Bio.SeqIO.AceIO docstring test ... ok
Bio.SeqIO.FastaIO docstring test ... ok
Bio.SeqIO.GckIO docstring test ... ok
Bio.SeqIO.GfaIO docstring test ... ok
Bio.SeqIO.IgIO docstring test ... ok
Bio.SeqIO.InsdcIO docstring test ... ok
Bio.SeqIO.Interfaces docstring test ... ok
Bio.SeqIO.NibIO docstring test ... ok
Bio.SeqIO.PdbIO docstring test ... ok
Bio.SeqIO.PhdIO docstring test ... ok
Bio.SeqIO.PirIO docstring test ... ok
Bio.SeqIO.QualityIO docstring test ... ok
Bio.SeqIO.SeqXmlIO docstring test ... ok
Bio.SeqIO.SffIO docstring test ... ok
Bio.SeqIO.SnapGeneIO docstring test ... ok
Bio.SeqIO.SwissIO docstring test ... ok
Bio.SeqIO.TabIO docstring test ... ok
Bio.SeqIO.TwoBitIO docstring test ... ok
Bio.SeqIO.UniprotIO docstring test ... ok
Bio.SeqIO.XdnaIO docstring test ... ok
Bio.SeqIO._index docstring test ... ok
Bio.SeqIO._twoBitIO docstring test ... ok
Bio.SeqRecord docstring test ... ok
Bio.SeqUtils docstring test ... ok
Bio.SeqUtils.CheckSum docstring test ... ok
Bio.SeqUtils.IsoelectricPoint docstring test ... ok
Bio.SeqUtils.MeltingTemp docstring test ... ok
Bio.SeqUtils.ProtParam docstring test ... ok
Bio.SeqUtils.ProtParamData docstring test ... ok
Bio.SeqUtils.lcc docstring test ... ok
Bio.Sequencing docstring test ... ok
Bio.Sequencing.Ace docstring test ... ok
Bio.Sequencing.Applications docstring test ... ok
Bio.Sequencing.Applications._Novoalign docstring test ... ok
Bio.Sequencing.Applications._bwa docstring test ... ok
Bio.Sequencing.Applications._samtools docstring test ... ok
Bio.Sequencing.Phd docstring test ... ok
Bio.SwissProt docstring test ... ok
Bio.SwissProt.KeyWList docstring test ... ok
Bio.UniGene docstring test ... ok
Bio.UniProt.GOA docstring test ... ok
Bio._utils docstring test ... ok
Bio.bgzf docstring test ... ok
Bio.codonalign docstring test ... ok
Bio.codonalign.codonalignment docstring test ... ok
Bio.codonalign.codonseq docstring test ... ok
Bio.cpairwise2 docstring test ... ok
Bio.kNN docstring test ... ok
Bio.motifs docstring test ... ok
Bio.motifs._pwm docstring test ... ok
Bio.motifs.alignace docstring test ... ok
Bio.motifs.applications docstring test ... ok
Bio.motifs.applications._xxmotif docstring test ... ok
Bio.motifs.clusterbuster docstring test ... ok
Bio.motifs.jaspar docstring test ... ok
Bio.motifs.jaspar.db docstring test ... skipped, missing Python dependency
Bio.motifs.mast docstring test ... ok
Bio.motifs.matrix docstring test ... ok
Bio.motifs.meme docstring test ... ok
Bio.motifs.minimal docstring test ... ok
Bio.motifs.pfm docstring test ... ok
Bio.motifs.thresholds docstring test ... ok
Bio.motifs.transfac docstring test ... ok
Bio.motifs.xms docstring test ... ok
Bio.pairwise2 docstring test ... ok
Bio.phenotype docstring test ... ok
Bio.phenotype.phen_micro docstring test ... ok
Bio.phenotype.pm_fitting docstring test ... ok
BioSQL docstring test ... ok
BioSQL.BioSeq docstring test ... ok
BioSQL.BioSeqDatabase docstring test ... ok
BioSQL.DBUtils docstring test ... ok
BioSQL.Loader docstring test ... ok
----------------------------------------------------------------------
Ran 556 tests in 261.000 seconds

Feb 09 '24 04:02 jrom99

@cymon can you take a look at this Nexus optimization please?

Feb 09 '24 09:02 peterjc

> @cymon can you take a look at this Nexus optimization please?

Any updates?

Mar 06 '24 06:03 jrom99

Maybe @etal or @MarkusPiotrowski could take a look? I've not used Nexus files in a long time, but this seems reasonable and a worthy speedup.

Mar 06 '24 09:03 peterjc

Any updates?

Apr 24 '24 01:04 jrom99

@biopython/team-biopython if there are no objections, I propose to squash-and-merge this at the end of the week.

Apr 24 '24 12:04 peterjc

Thanks for your patience @jrom99 - merged now.

May 03 '24 23:05 peterjc