PyVCF icon indicating copy to clipboard operation
PyVCF copied to clipboard

Single threaded but uses pysam, so not single threaded

Open schultzm opened this issue 5 years ago • 0 comments

Hi, Thanks for the package. I'm using PyVCF to parse out some VCFs from snpEff. It's a simple task, reading in some text (i.e., the vcf) and building a table and printing to stdout. As far as I can tell, at the initial parsing step (vcf.Reader(open(sys.argv[1], 'r'))), PySam is being used. Hence, this apparently single-threaded action is really multithreaded under the hood. I would like to run this package using gnu parallel, but this is causing problems on our 72 core 320GB server since I have no way to control the upper CPU limit. Each process (i.e., one read of a VCF file) is using upwards of 500% CPU (5 cores). I didn't ask it to do that and I can't prevent PyVCF from doing this. Is this a PySam problem?

Thanks.

schultzm avatar May 18 '20 11:05 schultzm