xqtl-protocol

Checklist for common setbacks

Open hsun3163 opened this issue 3 years ago • 4 comments

  1. Inconsistent chromosome naming convention between phenotype_per_chrom list and plink_per_chrom_list; it is unclear how this happened.
  2. File name too long... we should really change the default naming convention of A+B.
  3. Memory issue: ~80G for tensorQTL?
  4. sos issue: should highlight that -n does not work with -{j,q,c,s}.
  5. singularity issue: singularity_bind is needed in csg.yml and .bashrc.
  6. Documentation issue: should clear out the parameters set for testing the MWE, e.g. PEER --N.
  7. Documentation issue: should highlight the phenotype_group requirement for leafcutter.
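
For item 1, a quick consistency check before running the association step may help catch the mismatch early. The sketch below is only an illustration: `normalize_chrom` and `compare_chrom_lists` are hypothetical helpers, not part of the protocol, and it assumes the chromosome labels have already been read from the two lists into plain Python lists.

```python
def normalize_chrom(name: str) -> str:
    """Return a label with a single lowercase 'chr' prefix, e.g. '1' -> 'chr1'."""
    name = name.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return "chr" + name


def compare_chrom_lists(phenotype_chroms, genotype_chroms):
    """Report labels that differ only in convention, and labels missing from either list."""
    # Map the normalized label back to the label as it was written in each list.
    pheno = {normalize_chrom(c): c for c in phenotype_chroms}
    geno = {normalize_chrom(c): c for c in genotype_chroms}
    shared = sorted(set(pheno) & set(geno))
    return {
        "convention_mismatch": [(pheno[c], geno[c]) for c in shared if pheno[c] != geno[c]],
        "only_in_phenotype": sorted(pheno[c] for c in set(pheno) - set(geno)),
        "only_in_genotype": sorted(geno[c] for c in set(geno) - set(pheno)),
    }


# Toy labels standing in for the two per-chromosome lists (assumed, not real data).
report = compare_chrom_lists(["chr1", "chr2", "chrX"], ["1", "2", "3"])
print(report)
```

An empty `convention_mismatch` list would mean both lists use the same naming style for the chromosomes they share.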

hsun3163 avatar Sep 30 '22 14:09 hsun3163

@hsun3163 I like this list. We should create a pitfall page or Q&A page for the protocol paper. Let's keep it here for reference (don't close the ticket).

gaow avatar Sep 30 '22 15:09 gaow

  1. sumstat_standardizer needs to be renovated to accommodate the gigantic methylation output. For 13 GB of input and 14 GB of output, the step required 109.219 GB of memory. As mQTL has 4 times as many phenotypes, it would take ~440 GB of memory for 1 chromosome, which is clearly not feasible.

The easiest way to solve this is to further partition the file and process it chunk by chunk. Otherwise, this post could potentially help: https://towardsdatascience.com/optimize-memory-tips-in-python-3bbb44512937
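
A minimal sketch of the per-chunk idea, using only the Python standard library so that peak memory scales with the chunk size rather than with the whole file. This is not how sumstat_standardizer is implemented: the column names (`variant`, `beta`, `se`), the z-score step, and the function names are all assumptions made for illustration.

```python
import csv
import io
from itertools import islice


def iter_chunks(handle, chunk_size, delimiter="\t"):
    """Yield lists of at most chunk_size rows, each row a dict keyed by the header."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def standardize_chunk(rows):
    """Hypothetical per-chunk step: add a z-score column computed from beta and se."""
    for row in rows:
        row["z"] = float(row["beta"]) / float(row["se"])
    return rows


def process_in_chunks(in_handle, out_handle, chunk_size=100_000):
    """Stream the input chunk by chunk and append each processed chunk to the output."""
    writer = None
    n_rows = 0
    for chunk in iter_chunks(in_handle, chunk_size):
        out_rows = standardize_chunk(chunk)
        if writer is None:
            # Create the writer once, after the first chunk fixes the output columns.
            writer = csv.DictWriter(
                out_handle,
                fieldnames=list(out_rows[0].keys()),
                delimiter="\t",
                lineterminator="\n",
            )
            writer.writeheader()
        writer.writerows(out_rows)
        n_rows += len(out_rows)
    return n_rows


# Tiny in-memory stand-in for a summary statistics file (made-up values).
demo_in = io.StringIO("variant\tbeta\tse\nrs1\t0.2\t0.1\nrs2\t-0.3\t0.1\nrs3\t0.0\t0.5\n")
demo_out = io.StringIO()
n_processed = process_in_chunks(demo_in, demo_out, chunk_size=2)
print(n_processed)
```

With real files, the two `io.StringIO` objects would be replaced by open file handles, and only one chunk of rows is held in memory at a time.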

hsun3163 avatar Oct 03 '22 19:10 hsun3163

  1. Can't start new thread in tensorQTL
Mapping files: 100%|██████████| 3/3 [00:06<00:00,  2.17s/it]
Traceback (most recent call last):
  File "/mnt/vast/hpc/csg/snuc_pseudo_bulk/eight_celltypes_analysis/MWE/QTL_association/tmp60g786k5/singularity_run_59234.py", line 26, in <module>
    genotype_df = pr.load_genotypes()
  File "/opt/conda/lib/python3.8/site-packages/tensorqtl/genotypeio.py", line 222, in load_genotypes
    return pd.DataFrame(self.bed.compute(), index=self.bim['snp'], columns=self.fam['iid'])
  File "/opt/conda/lib/python3.8/site-packages/dask/base.py", line 315, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/dask/base.py", line 600, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/opt/conda/lib/python3.8/site-packages/dask/local.py", line 499, in get_async
    fire_tasks(chunksize)
  File "/opt/conda/lib/python3.8/site-packages/dask/local.py", line 494, in fire_tasks
    fut = submit(batch_execute_tasks, each_args)
  File "/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 188, in submit
    self._adjust_thread_count()
  File "/opt/conda/lib/python3.8/concurrent/futures/thread.py", line 213, in _adjust_thread_count
    t.start()
  File "/opt/conda/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

hsun3163 avatar Oct 06 '22 21:10 hsun3163

  1. Can't start new thread in tensorQTL

This problem is due to the --monitor commands; we need to remove them once the protocols no longer need to be monitored. They also prevent sos from giving the correct error message.


As it turns out, monitor.py does not cause the issue by itself, but it may greatly increase the chance of running into it.
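
One possible way to investigate further, offered only as a diagnostic sketch: on Linux, `RuntimeError: can't start new thread` is often seen when the per-user process/thread limit is reached, although memory pressure or container limits can produce it too. Whether that is what happened here is an assumption. The helper below (`thread_headroom` is a hypothetical name) uses only the standard library and is Unix-only.

```python
import resource
import threading


def thread_headroom():
    """Report the thread count of this process and the per-user process/thread limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "active_threads_in_this_process": threading.active_count(),
        # None means the limit is reported as unlimited.
        "nproc_soft_limit": None if soft == resource.RLIM_INFINITY else soft,
        "nproc_hard_limit": None if hard == resource.RLIM_INFINITY else hard,
    }


info = thread_headroom()
print(info)
```

Running this inside the container, with and without monitoring enabled, would show whether the job is close to the limit when the extra monitoring threads are present.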

hsun3163 avatar Oct 07 '22 16:10 hsun3163