Use hicSumMatrices to implement a scatter-gather technique for cloud deployment
Welcome to the HiCExplorer GitHub repository! Before opening the issue, please check that the following requirements are met:
- [x] Search whether this issue (or a similar issue) has been solved before using the search tab above. Link the previous issue if appropriate below.
- [x] Paste your HiCExplorer version (`hicInfo --version`) and your python version (`python --version`) below. 3.7.2
- [x] Have you checked our documentation on hicexplorer.readthedocs.io?
- [x] Do you use conda to install HiCExplorer?
- [x] Do you use the latest HiCExplorer release? If not, please install it via a conda environment: `conda create --name hicexplorer hicexplorer=3.6 python=3.8 -c bioconda -c conda-forge` and activate the environment: `conda activate hicexplorer`. Retry your command. You can exit a conda environment via `conda deactivate`. To learn more about conda and environments, please consider the following documentation.

Retry your command, is it solved now? If not please continue with the following:

- [x] Paste the full HiCExplorer command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).
- [x] Paste the output printed on screen from the command that produces the issue below (ignore if you simply spotted the issue in the code/documentation).

I am implementing a HiCExplorer pipeline in WDL via Cromwell for deployment on AWS.
One of the easiest ways to reduce running time in pipelines is usually to split the FASTQ files into chunks (scatter), run the entire pipeline on each chunk, and then merge the results (gather). If I do this with the HiCExplorer pipeline, can I run hicSumMatrices across the matrices generated for each chunk and get a matrix equivalent to the one I would get by running the whole pipeline on one pair of big FASTQ files?
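
To make the question concrete, here is a toy Python sketch of the equivalence I have in mind. It is not HiCExplorer code: `build_matrix`, `scatter`, and `gather` are hypothetical helpers, and the sketch assumes that every read pair is handled independently and adds exactly one count to one bin pair. Under that assumption the per-chunk matrices sum to the whole-sample matrix. Any step that looks across read pairs (duplicate removal, for example) or that rescales the counts (matrix correction, for example) would not satisfy that assumption, which is why I am asking.

```python
from collections import Counter


def build_matrix(read_pairs, bin_size):
    """Toy contact matrix: count read pairs per (bin_i, bin_j).

    Assumption: each read pair is processed independently of all others.
    """
    counts = Counter()
    for pos1, pos2 in read_pairs:
        # Store only the upper triangle so (i, j) and (j, i) are the same cell.
        i, j = sorted((pos1 // bin_size, pos2 // bin_size))
        counts[(i, j)] += 1
    return counts


def scatter(read_pairs, n_chunks):
    """Split the read pairs into n_chunks interleaved chunks."""
    return [read_pairs[k::n_chunks] for k in range(n_chunks)]


def gather(matrices):
    """Element-wise sum of the per-chunk matrices."""
    total = Counter()
    for matrix in matrices:
        total.update(matrix)  # Counter.update adds counts
    return total


# Made-up mapped positions of the two mates of each read pair.
read_pairs = [(120, 950), (130, 4020), (980, 110),
              (2500, 2600), (4100, 150), (2550, 2610)]
BIN_SIZE = 1000

whole = build_matrix(read_pairs, BIN_SIZE)
summed = gather(build_matrix(chunk, BIN_SIZE)
                for chunk in scatter(read_pairs, 3))

print(whole == summed)
```

In this toy case the two matrices are identical. What I would like to know is whether the same holds for the real matrices, and at which stage of the pipeline the summation would have to happen.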