GEOparse

download fails because of content-length mismatch when content-encoding=gzip

Open: vttrifonov opened this issue 4 years ago • 1 comment

get_GEO('GPL16417') fails with

OSError: Download failed due to 'Downloaded size do not match the expected size for http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?targ=self&acc=GPL16417&form=text&view=full'. ID could be incorrect or the data might not be public yet.

The issue is that Downloader._download_http assumes that content-length is the size of the body before encoding. That does not hold when content-encoding=gzip, because content-length is then the compressed size (i.e. the size after encoding/compression).
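
To make the mismatch concrete, here is a minimal offline sketch using only the standard library. It does not call GEOparse or the network; the payload and the headers dict are made-up stand-ins for what a server sending content-encoding=gzip would return.

```python
import gzip

# Made-up stand-in for the decoded body of a GEO text record.
decoded_body = b"!Platform_title = example\n" * 1000

# What a server using Content-Encoding: gzip actually puts on the wire.
wire_body = gzip.compress(decoded_body)
headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(wire_body)),  # size after compression
}

# A client that decodes transparently ends up with the decompressed bytes.
received = gzip.decompress(wire_body)
expected = int(headers["Content-Length"])

decoded_matches = len(received) == expected  # the comparison that fails
wire_matches = len(wire_body) == expected    # the comparison that would hold

print(decoded_matches)  # False
print(wire_matches)     # True
```

Only the count of bytes on the wire agrees with content-length; the count of decoded bytes written to disk does not.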

It is not clear how to get the size of a chunk before decoding/decompression without reading the raw stream directly: it would be chunk_size for every chunk except the last. It might be best to drop the content-length enforcement.
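
A softer alternative to dropping the check entirely would be to enforce it only when the body was sent unencoded. The sketch below is one possible shape for that; size_check_passes is a hypothetical helper, not part of GEOparse, and it assumes the caller already has the response headers and the number of decoded bytes written.

```python
def size_check_passes(headers, decoded_size):
    """Hypothetical helper: compare sizes only when the body was sent unencoded."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    normalised = {key.lower(): value for key, value in headers.items()}
    encoding = normalised.get("content-encoding", "identity").strip().lower()
    length = normalised.get("content-length")
    if length is None or encoding not in ("", "identity"):
        # Content-Length is absent or counts encoded bytes, so there is
        # nothing meaningful to compare the decoded size against.
        return True
    return int(length) == decoded_size


# Plain response: the check is enforced.
print(size_check_passes({"Content-Length": "100"}, 100))
print(size_check_passes({"Content-Length": "100"}, 90))
# Gzip response: the check is skipped instead of failing spuriously.
print(size_check_passes({"Content-Encoding": "gzip", "Content-Length": "40"}, 100))
```

This keeps truncation detection for unencoded downloads while avoiding the false failure on gzip responses; whether that trade-off suits GEOparse is a call for the maintainers.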

vttrifonov, Jan 07 '22 23:01

You can try running os.environ['GEOPARSE_USE_HTTP_FOR_FTP'] = 'yes' before calling get_GEO('GPL16417').

bionewplayer, Aug 17 '23 09:08