rnacentral-webcode

JSON download does not contain as many sequences as advertised

Open blakesweeney opened this issue 9 years ago • 1 comments

I downloaded search results in JSON from http://rnacentral.org/export/results?job=fdb8a27d-bf41-443a-ad51-64919244db8f, which states there are 12405 sequences. After downloading the file with Safari, counting the entries at the command line with jq gives 12367.

blakesweeney Jan 11 '17 16:01

I looked into this a little, and it seems the difference arises because the JSON download is not species-specific (it uses IDs like URS00004C03FA), while the FASTA and ID downloads are (they use IDs like URS00004C03FA_637910). The same sequence found in several species is therefore counted once per species in the advertised total but only once in the JSON file.
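A rough sketch of how that collapse would shrink the count, assuming the species-specific IDs are just the sequence ID plus a `_<taxid>` suffix. The helper names `strip_taxid` and `count_ids` are made up for illustration, and every ID in the example except `URS00004C03FA_637910` is invented. This is not the exporter's actual code.

```python
def strip_taxid(identifier: str) -> str:
    """Drop the trailing _<taxid> suffix, e.g. URS00004C03FA_637910 -> URS00004C03FA."""
    return identifier.split("_", 1)[0]


def count_ids(species_specific_ids):
    """Return (number of species-specific IDs, number of IDs left after stripping the taxid)."""
    ids = set(species_specific_ids)
    collapsed = {strip_taxid(i) for i in ids}
    return len(ids), len(collapsed)


if __name__ == "__main__":
    # Only the first ID comes from this issue; the other two are invented examples.
    example = [
        "URS00004C03FA_637910",
        "URS00004C03FA_9606",
        "URS0000000001_9606",
    ]
    species_specific, sequence_only = count_ids(example)
    print(species_specific, sequence_only)
```

Running the same comparison on the IDs from the ID download and the JSON file would show whether this accounts for the whole gap between 12405 and 12367.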

blakesweeney Apr 05 '17 11:04