The most appropriate way to estimate sequence coverage
Hi,
Thanks for a great tool! I have a question about the expected coverage that is passed as input when running fit_model_extra.py. When there is no reason to suspect a systematic increase or decrease in coverage, what is the most appropriate way to estimate it? E.g. the median or the arithmetic mean of the samtools depth output?
"## Run fit_model_extra.py to fit the model
docker run \
  -v ${INPUT_DIR}:${INPUT_DIR} \
  -v ${OUTPUT_DIR}:${OUTPUT_DIR} \
  mobinasri/flagger:v0.3.2 \
  python3 /home/programs/src/fit_gmm.py \
  --counts ${INPUT_DIR}/read_alignment.counts \
  --cov ${EXPECTED_COVERAGE} \
  --output ${OUTPUT_DIR}/read_alignment.table"
Kind regards, Andreas
Hi @andosl, thanks for using Flagger. The result should be robust to using either the median or the mean, as long as the two are not very different, but the median is the better choice. Flagger only uses the value you pass as the initial value when fitting the parameters with the EM algorithm, so the final output should not be sensitive to it.
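
For anyone who wants to compare the two estimates before choosing a value for `--cov`, here is a minimal sketch in Python. It is not part of Flagger. It assumes the standard three-column, tab-separated `samtools depth` output (chromosome, position, depth) for a single BAM file. The function name `coverage_summary` and the sample values are made up for illustration.

```python
import statistics
from typing import Iterable, Tuple


def coverage_summary(depth_lines: Iterable[str]) -> Tuple[float, float]:
    """Return (mean, median) depth from `samtools depth` style lines.

    Assumes tab-separated columns: chromosome, position, depth.
    Only the third column (depth of the first BAM file) is used.
    """
    depths = []
    for line in depth_lines:
        line = line.strip()
        # Skip blank lines and the optional header written by `samtools depth -H`.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        depths.append(int(fields[2]))
    if not depths:
        raise ValueError("no depth records found")
    return statistics.mean(depths), statistics.median(depths)


# Toy input (hypothetical values): four positions near 30x and one outlier.
sample_lines = [
    "chr1\t1\t30",
    "chr1\t2\t32",
    "chr1\t3\t31",
    "chr1\t4\t29",
    "chr1\t5\t150",
]

mean_cov, median_cov = coverage_summary(sample_lines)
print(f"mean={mean_cov:.1f} median={median_cov}")
```

In this toy input a single high-depth position pulls the mean well above the median, which is the situation where the two estimates would differ. For a real run, an open file handle holding the `samtools depth` output can be passed in place of `sample_lines`. Note that `samtools depth` omits zero-depth positions unless `-a` is given, which affects both estimates.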