
Documentation fix-up for an updated release

Open gaow opened this issue 3 years ago • 2 comments

@hsun3163 I'm raising this issue because problems were recently reported with obsolete or incomplete minimal working example (MWE) commands following recent changes. I'll keep a TODO list in this ticket. Please fix each page with an MWE (without paths specific to your computer, so I can readily copy-paste to reproduce; use symbolic links as necessary). Let's start with the modules, then move on to the protocol. Here are the pages to be updated:

  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/phenotype/phenotype_formatting.html
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/covariate/BiCV_factor.html
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/covariate/PEER_factor.html

gaow avatar May 11 '22 16:05 gaow

@hsun3163 Here is an example of a wrong MWE command that summarizes the pitfalls in your MWE sections:

nohup sos run pipeline/RNA_calling.ipynb rsem_call \
    --cwd ./ \
    --samples data/sample_fastq.list \
    --data-dir data \
    --STAR-index ./STAR_Index/ \
    --RSEM-index ./RSEM_Index/ \
    --container /mnt/mfs/statgen/container/rna_quantification.sif \
    --ref_flat ref_data/gtf_ref.flat \
    --riboIntervals  \
    --mem 16G -n &
  • Don't use nohup
  • Don't use an absolute path. It should be container/rna_quantification.sif
  • Reference data should be under reference_data/STAR_Index -- please be consistent with reference_data.ipynb
  • --ref_flat should follow the --ref-flat convention
  • Use reference_data, not ref_data, if I remember correctly from the previous MWE -- please be consistent.
  • Don't use -n
  • Don't use &
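
The pitfalls above could be checked mechanically before an MWE is committed. The following is only a rough sketch: `check_mwe` is a hypothetical helper, not part of xqtl-pipeline or sos, and its patterns are assumptions that cover just the seven points listed here.

```shell
# Hypothetical lint helper (not part of xqtl-pipeline or sos).
# Prints one message per pitfall found in an MWE command string
# and returns non-zero if any pitfall is present.
check_mwe() {
    mwe_cmd="$1"
    mwe_status=0

    # Helper: succeeds if the command text matches the extended regex in $1
    mwe_has() { printf '%s\n' "$mwe_cmd" | grep -qE -e "$1"; }

    if mwe_has '(^|[[:space:]])nohup([[:space:]]|$)'; then
        echo "uses nohup"; mwe_status=1
    fi
    # An argument that starts with "/" is treated as an absolute path
    if mwe_has '(^|[[:space:]])/[^[:space:]]'; then
        echo "uses an absolute path"; mwe_status=1
    fi
    # Long option names should use dashes, e.g. --ref-flat rather than --ref_flat
    if mwe_has '(^|[[:space:]])--[A-Za-z0-9-]*_'; then
        echo "option name contains an underscore"; mwe_status=1
    fi
    if mwe_has '(^|[[:space:]])ref_data/'; then
        echo "uses ref_data instead of reference_data"; mwe_status=1
    fi
    if mwe_has '(^|[[:space:]])-n([[:space:]]|$)'; then
        echo "uses -n"; mwe_status=1
    fi
    # A line ending in "&" sends the job to the background
    if mwe_has '&[[:space:]]*$'; then
        echo "runs in the background with &"; mwe_status=1
    fi
    return "$mwe_status"
}
```

For example, `check_mwe "$(cat mwe_command.txt)"` (with `mwe_command.txt` being a placeholder file holding the command) would list each pitfall it detects and exit non-zero if any is found.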

gaow avatar May 11 '22 16:05 gaow

Additional pipeline review:

  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/genotype/VCF_QC.html
    • [ ] There is a FIXME to handle
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/genotype/GWAS_QC.html
    • [ ] There is a FIXME to handle
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/genotype/PCA.html
    • [ ] We need to write reporter functions / packages for summarizing log files. Currently they are just bash code chunks. I will not change them until later.
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/genotype/genotype_formatting.html
    • [ ] We need to rework the genotype_per_region file and make it accept a region list with start and end information that is not based on the cis-window
    • [ ] We also need to include some sample code that produces RDS files for other pipelines to use
    • [ ] Convert LD using https://github.com/G2Lab/ldmat
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/phenotype/gene_annotation.html
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/phenotype/phenotype_formatting.html
    • [ ] Need to clean up further
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/phenotype/phenotype_imputation.html
    • [ ] Need to rework with Zining's updates
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/covariate/covariate_formatting.html
  • [x] https://cumc.github.io/xqtl-pipeline/code/data_preprocessing/covariate/covariate_hidden_factor.html

gaow avatar May 17 '22 22:05 gaow