Fasta file in `quickstart_training` config file conflicts with zenodo

Open evancofer opened this issue 6 years ago • 0 comments

In the yml configuration file for the `quickstart_training` tutorial, `male.hg19.fasta` is specified as the reference sequence, but the archive from zenodo (https://zenodo.org/record/1319886/files/selene_quickstart_tutorial.tar.gz) includes `hg19.fasta` instead. Interestingly, that file only contains chr1, and none of the chromosomes listed in the validation/test set of the configuration file.

I noticed that the `getting_started_with_selene` tutorial specifies `male.hg19.fasta` in its configuration file and also ships it in its zenodo archive (https://zenodo.org/record/1443558/files/selene_quickstart.tar.gz). Moreover, despite the `selene_quickstart.tar.gz` name, the directory inside that archive is `selene_quickstart_tutorial`, and most of the file names inside that directory are the same.

It looks like the reference to `male.hg19.fasta` in `quickstart_training` is an accident, or perhaps an incorrect archive version is being linked in the ipynb? Maybe this was brought on by confusion from the very similar archive names? Perhaps we should change the config file or update zenodo? If so, it might also be nice to change the path names so that the user doesn't need to move any files out of the zenodo archives once they're unpacked.
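For anyone who wants to confirm which chromosomes a given FASTA actually contains, here is a minimal sketch in plain Python. It is not part of selene; the helper names `fasta_sequence_names` and `missing_chromosomes` are made up for illustration, and it assumes an uncompressed FASTA whose record names are the first whitespace-delimited token after `>`.

```python
def fasta_sequence_names(fasta_path):
    """Collect record names from an uncompressed FASTA file.

    The name is taken as the first whitespace-delimited token
    after '>' on each header line (an assumption about the file layout).
    """
    names = set()
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def missing_chromosomes(fasta_path, required):
    """Return the required chromosome names that the FASTA does not contain."""
    present = fasta_sequence_names(fasta_path)
    return sorted(set(required) - present)
```

Calling something like `missing_chromosomes("hg19.fasta", ["chr8", "chr9"])` returns the names that are absent from the file. The chromosome names here are placeholders; substitute whatever the validation/test holdouts in the config file actually list.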

evancofer • Aug 17 '19 14:08