Errors while running xpore dataprep
Dear GoekeLab,
I am trying to run xpore on our institute's cluster. Everything goes well with the demo data, but I get this error/warning when running xpore dataprep on my own data. Do you have any idea what causes it and how to fix it?
Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS" (64)
/home/mycomputer/.local/lib/python3.7/site-packages/xpore-2.1-py3.7.egg/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/home/mycomputer/.local/lib/python3.7/site-packages/xpore-2.1-py3.7.egg/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
Best regards,
Jeremy
Hi Jeremy,
Thanks for reaching out! It would be great if you could share the command you used to run xpore dataprep. In the meantime, you can look into the following two things:
- After you see this error/warning, is xpore dataprep still generating the dataprep/data.json file (check whether it increases in size with ls -lh dataprep/data.json)? If yes, xpore dataprep is still running fine.
- What value did you pass to --n_processes? Is it larger than your environment variable "NUMEXPR_MAX_THREADS"? If yes, you might want to either change --n_processes to a smaller value or increase the value of NUMEXPR_MAX_THREADS.
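The comparison in the second point can be sketched in Python as below. This is only an illustration: the function name exceeds_numexpr_limit is hypothetical and not part of xpore or numexpr, and the sketch simply compares a requested process count with whatever NUMEXPR_MAX_THREADS is set to in the current environment.

```python
import os


def exceeds_numexpr_limit(n_processes, environ=None):
    """Return True if n_processes is larger than NUMEXPR_MAX_THREADS.

    Returns None when the variable is unset or is not an integer,
    because this simple check cannot decide in that case.
    The function name is hypothetical (not an xpore or numexpr API).
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("NUMEXPR_MAX_THREADS")
    try:
        limit = int(raw)
    except (TypeError, ValueError):
        # Unset (None) or a non-numeric value.
        return None
    return n_processes > limit


if __name__ == "__main__":
    # Show the values that matter on the node where xpore dataprep runs.
    print("NUMEXPR_MAX_THREADS =", os.environ.get("NUMEXPR_MAX_THREADS", "<unset>"))
    print("CPU count =", os.cpu_count())
    print("8 processes exceed the limit:", exceeds_numexpr_limit(8))
```

Running this inside the same cluster job as xpore dataprep matters, since the environment on a compute node may differ from the login node.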
Best wishes, Yuk Kei
Hi Yuk Kei,
I’m working with Jeremy on running xpore dataprep.
Here is the command I used for running xpore dataprep:
xpore dataprep \
--eventalign "eventalign_Araport11_GTF_genes_transposons-col0.txt" \
--gtf_or_gff "Araport11_GTF_genes_transposons_final_xpore.sorted.gtf" \
--transcript_fasta "Araport11_GTF_genes_transposons.fa" \
--out_dir dataprep \
--genome
After seeing the error/warning, xpore dataprep only generated the eventalign.index file. No other output files are generated when I try to run xpore dataprep.
Best, Erika
Hi Erika,
Thank you for the information! Would you mind showing me the head of eventalign_Araport11_GTF_genes_transposons-col0.txt, Araport11_GTF_genes_transposons_final_xpore.sorted.gtf, and Araport11_GTF_genes_transposons.fa, please? I suspect this might be due to a customized GTF file.
Thanks!
Best wishes, Yuk Kei
Hi Yuk Kei,
Here is the head for the eventalign.txt, GTF, and FASTA files.
eventalign_Araport11_GTF_genes_transposons-col0.txt:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
AT1G01020.2 426 TTCTG 29 t 429 78.67 1.821 0.00664 TTCTG 79.59 2.07 -0.36 29062 29082
AT1G01020.2 426 TTCTG 29 t 430 82.91 1.990 0.00332 TTCTG 79.59 2.07 1.32 29052 29062
AT1G01020.2 427 TCTGA 29 t 431 95.35 1.866 0.00232 TCTGA 91.37 2.85 1.15 29045 29052
AT1G01020.2 427 TCTGA 29 t 432 99.25 1.877 0.00631 TCTGA 91.37 2.85 2.27 29026 29045
AT1G01020.2 427 TCTGA 29 t 433 94.57 2.016 0.00266 TCTGA 91.37 2.85 0.92 29018 29026
AT1G01020.2 427 TCTGA 29 t 434 98.04 1.761 0.00797 TCTGA 91.37 2.85 1.92 28994 29018
AT1G01020.2 428 CTGAT 29 t 435 122.09 3.429 0.00730 CTGAT 111.64 4.49 1.91 28972 28994
AT1G01020.2 428 CTGAT 29 t 436 117.08 2.426 0.00299 CTGAT 111.64 4.49 0.99 28963 28972
AT1G01020.2 429 TGATT 29 t 437 136.43 6.966 0.00266 TGATT 127.73 5.10 1.40 28955 28963
Araport11_GTF_genes_transposons_final_xpore.sorted.gtf:
1 Araport11 transcript 3631 5899 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 3631 3913 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 3996 4276 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 4486 4605 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 4706 5095 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 5174 5326 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 5439 5899 . + . gene_id "AT1G01010"; transcript_id "AT1G01010.1";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.2";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.6";
1 Araport11 exon 6788 7069 . - . gene_id "AT1G01020"; transcript_id "AT1G01020.1";
Araport11_GTF_genes_transposons.fa:
>AT1G01010.1
AAATTATTAGATATACCAAACCAGAGAAAACAAATACATAATCGGAGAAATACAGATTACAGAGAGCGAG
AGAGATCGACGGCGAAGCTCTTTACCCGGAAACCATTGAAATCGGACGGTTTAGTGAAAATGGAGGATCA
AGTTGGGTTTGGGTTCCGTCCGAACGACGAGGAGCTCGTTGGTCACTATCTCCGTAACAAAATCGAAGGA
AACACTAGCCGCGACGTTGAAGTAGCCATCAGCGAGGTCAACATCTGTAGCTACGATCCTTGGAACTTGC
GCTTCCAGTCAAAGTACAAATCGAGAGATGCTATGTGGTACTTCTTCTCTCGTAGAGAAAACAACAAAGG
GAATCGACAGAGCAGGACAACGGTTTCTGGTAAATGGAAGCTTACCGGAGAATCTGTTGAGGTCAAGGAC
CAGTGGGGATTTTGTAGTGAGGGCTTTCGTGGTAAGATTGGTCATAAAAGGGTTTTGGTGTTCCTCGATG
GAAGATACCCTGACAAAACCAAATCTGATTGGGTTATCCACGAGTTCCACTACGACCTCTTACCAGAACA
TCAGAGGACATATGTCATCTGCAGACTTGAGTACAAGGGTGATGATGCGGACATTCTATCTGCTTATGCA
Thank you, Erika
Hi Erika,
Thank you for sharing the eventalign.txt, GTF, and FASTA files! Those should be compatible with xpore dataprep.
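For anyone who wants to double-check this on their own files, here is a minimal Python sketch. It assumes that the contig column of eventalign.txt should match the transcript_id values in the GTF and the record names in the FASTA. That assumption is based on the heads shown above, not on a documented xpore requirement, and all function names are hypothetical.

```python
import re

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def fasta_ids(lines):
    """Collect record names (first word after '>') from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def gtf_transcript_ids(lines):
    """Collect transcript_id attribute values from GTF lines."""
    ids = set()
    for line in lines:
        match = TRANSCRIPT_ID.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def eventalign_contigs(lines):
    """Collect the first column of eventalign lines, skipping the header."""
    iterator = iter(lines)
    next(iterator, None)  # the first line is the column header
    return {line.split()[0] for line in iterator if line.strip()}


def missing_contigs(eventalign, gtf, fasta):
    """Report eventalign contigs that are absent from the GTF or the FASTA."""
    contigs = eventalign_contigs(eventalign)
    return {
        "not_in_gtf": contigs - gtf_transcript_ids(gtf),
        "not_in_fasta": contigs - fasta_ids(fasta),
    }
```

Each function accepts any iterable of lines, so open file handles can be passed directly. Since eventalign.txt files are usually very large, running this on the first few thousand lines is probably enough for a sanity check.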
I think you should look into the first line of the error message: Error. nthreads cannot be larger than environment variable "NUMEXPR_MAX_THREADS". Contacting the cluster maintainers at your institute should help with this.
Thanks!
Best wishes, Yuk Kei
Hello Yuk Kei,
I am also trying to use xpore dataprep and have encountered the same problem: dataprep/eventalign.index is generated, but data.json, data.index, data.log and data.readcount are empty. I have no idea what causes this. May I ask for your help?
The command I used to run xpore dataprep is:
xpore dataprep \
--eventalign data/${file}/nanopolish/eventalign.txt \
--gtf_or_gff all.gtf \
--transcript_fasta ref.fa \
--out_dir data/${file}/dataprep \
--genome
I got this error:
/mycomputer/miniconda3/lib/python3.9/site-packages/xpore-2.1-py3.9.egg/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
pos_end += eventalign_result.loc[index]['line_length'].sum()
/mycomputer/miniconda3/lib/python3.9/site-packages/xpore-2.1-py3.9.egg/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
My eventalign.txt, GTF, and FASTA files all look like @erika-fukuhara's. Did you solve this problem, or do you have any suggestions?
Thank you! Jeffer
Hey,
I'm having the same problem. I ran xpore dataprep, but data.json, data.log, and the other output files are empty.
Do you know how we can fix it?
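As a first diagnostic when reporting this, it may help to state exactly which output files are missing or empty. The following is a minimal Python sketch, assuming the output file names mentioned earlier in this thread (eventalign.index, data.json, data.index, data.log, data.readcount). The function names are hypothetical and not part of xpore.

```python
import os

# Output file names as mentioned in this thread (an assumption, not an xpore spec).
EXPECTED = ("eventalign.index", "data.json", "data.index", "data.log", "data.readcount")


def output_sizes(out_dir, names=EXPECTED):
    """Map each expected file name to its size in bytes, or None if it is missing."""
    sizes = {}
    for name in names:
        path = os.path.join(out_dir, name)
        sizes[name] = os.path.getsize(path) if os.path.isfile(path) else None
    return sizes


def empty_or_missing(out_dir, names=EXPECTED):
    """Return the sorted names of files that are missing or have zero size."""
    sizes = output_sizes(out_dir, names)
    return sorted(name for name, size in sizes.items() if not size)


if __name__ == "__main__":
    # "dataprep" is the --out_dir used in the commands above.
    for name, size in output_sizes("dataprep").items():
        print(name, "missing" if size is None else f"{size} bytes")
```

Running it twice a few minutes apart also shows whether data.json is still growing, which is the check suggested earlier in the thread.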