Checklist of common setbacks
- Inconsistent chromosome naming conventions between the phenotype_per_chrom list and the plink_per_chrom_list; not clear how this happened (a naming-normalization sketch follows this list).
- File names are too long... we should really change the default naming convention of A+B.
- Memory issue: ~80GB for tensorQTL?
- SoS issue: we should highlight that -n does not work with -{j,q,c,s}.
- Singularity issue: singularity_bind needs to be set in both csg.yml and .bashrc.
- Documentation issue: should clearly document the parameter settings for the test MWE 6.1, e.g. PEER --N.
- Documentation issue: should highlight the phenotype_group requirement for leafcutter.
@hsun3163 I like this list. We should create a pitfalls page or Q&A page for the protocol paper. Let's keep it here for reference (don't close the ticket).
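For the chromosome-naming mismatch above, here is a minimal sketch of one way to reconcile the two lists, assuming they differ only in the presence of a "chr" prefix. The file names, separator, and two-column layout (chromosome ID, file path) are hypothetical, not the pipeline's actual format:

```python
import pandas as pd

def normalize_chrom(name: str) -> str:
    """Strip an optional 'chr' prefix so '1' and 'chr1' compare equal."""
    return name[3:] if name.lower().startswith("chr") else name

# Hypothetical two-column lists: chromosome ID, file path
pheno = pd.read_csv("phenotype_per_chrom.txt", sep="\t", names=["chrom", "path"])
plink = pd.read_csv("plink_per_chrom.txt", sep="\t", names=["chrom", "path"])

for df in (pheno, plink):
    df["chrom"] = df["chrom"].astype(str).map(normalize_chrom)

# Report chromosomes present in one list but not the other
only_pheno = set(pheno["chrom"]) - set(plink["chrom"])
only_plink = set(plink["chrom"]) - set(pheno["chrom"])
print("phenotype-only chromosomes:", sorted(only_pheno))
print("plink-only chromosomes:", sorted(only_plink))
```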
- sumstat_standardizer needs to be reworked to accommodate the gigantic methylation output. For 13GB of input and 14GB of output, the step required 109.219GB of memory. Since mQTL has roughly 4 times as many phenotypes, it would take ~440GB of memory for 1 chromosome, which is clearly not feasible.
The easiest way to solve this is to further partition the file and process it chunk by chunk (see the sketch below); otherwise this post could potentially help: https://towardsdatascience.com/optimize-memory-tips-in-python-3bbb44512937
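A minimal sketch of the per-chunk idea, assuming the summary statistics are a tab-delimited text file and the standardization can be applied row by row. The file names, column names, and the beta/se transform are hypothetical placeholders, not the actual sumstat_standardizer logic:

```python
import pandas as pd

IN_PATH = "sumstats.chr1.txt.gz"       # hypothetical input file
OUT_PATH = "sumstats.chr1.std.txt"     # hypothetical output file
CHUNK_ROWS = 1_000_000                 # tune so one chunk fits in memory

first = True
# Stream the file in fixed-size chunks instead of loading ~13GB at once
for chunk in pd.read_csv(IN_PATH, sep="\t", chunksize=CHUNK_ROWS):
    # Hypothetical row-wise standardization: beta/se -> z-score
    chunk["z"] = chunk["beta"] / chunk["se"]
    chunk.to_csv(OUT_PATH, sep="\t", index=False,
                 mode="w" if first else "a", header=first)
    first = False
```

Peak memory is then bounded by the chunk size rather than the full file, at the cost of writing the output incrementally.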
- Can't start a new thread in tensorQTL:
```
Mapping files: 100%|██████████| 3/3 [00:06<00:00, 2.17s/it]
Traceback (most recent call last):
  File "/mnt/vast/hpc/csg/snuc_pseudo_bulk/eight_celltypes_analysis/MWE/QTL_association/tmp60g786k5/singularity_run_59234.py", line 26, in <module>
    genotype_df = pr.load_genotypes()
  File "/opt/conda/lib/python3.8/site-packages/tensorqtl/genotypeio.py", line 222, in load_genotypes
    return pd.DataFrame(self.bed.compute(), index=self.bim['snp'], columns=self.fam['iid'])
  File "/opt/conda/lib/python3.8/site-packages/dask/base.py", line 315, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/dask/base.py", line 600, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/opt/conda/lib/python3.8/site-packages/dask/local.py", line 499, in get_async
    fire_tasks(chunksize)
  File "/opt/conda/lib/python3.8/site-packages/dask/local.py", line 494, in fire_tasks
    fut = submit(batch_execute_tasks, each_args)
  File "/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 188, in submit
    self._adjust_thread_count()
  File "/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 213, in _adjust_thread_count
    t.start()
  File "/opt/conda/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
```
This problem is due to the --monitor command; we need to get rid of it once the protocols have been monitored. It also prevents SoS from reporting the correct error message.
As it turns out, monitor.py does not cause the issue by itself, but it does greatly increase the chance of hitting it. A possible workaround is sketched below.
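The RuntimeError above is raised when Python cannot spawn another OS thread, which usually means the per-user thread/process limit was exhausted while dask was building its thread pool (made more likely when monitor.py is also running). A workaround sketch, assuming the genotype loading can be reproduced interactively inside the tensorQTL container (the plink prefix below is hypothetical): force dask onto its single-threaded scheduler so load_genotypes() computes in the calling thread instead of spawning workers.

```python
import dask

# Assumption: the user thread limit is nearly exhausted, so avoid
# creating a dask thread pool entirely.
dask.config.set(scheduler="synchronous")

from tensorqtl import genotypeio

# Hypothetical plink prefix; load_genotypes() now runs the dask graph
# in the current thread without starting new ones.
pr = genotypeio.PlinkReader("genotypes_chr21")
genotype_df = pr.load_genotypes()
```

Raising the per-user limit on the compute node (ulimit -u) or reducing the number of concurrent jobs per node are alternative mitigations; this sketch is not a confirmed fix for the pipeline itself.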