FTPFileSystem open file cannot read from filezilla server
The ABOR command is sent when the ftp file download is finished, the server no longer responds and raises the TimeoutError
code:
def test1():
fs: FTPFileSystem = fsspec.filesystem('ftp', asynchronous=True, host='127.0.0.1', port=21, username='admin',
password='admin', timeout=10)
with fs.open(fs.ls('/realtime_update')[0]['name'], compression='gzip') as f:
f.read()
fsspec.implementations.ftp.FTPFile._fetch_range
# stop transfer, we got enough bytes for this block
self.fs.ftp.abort()
debug log
*get* '150 Starting data transfer.\n'
*resp* '150 Starting data transfer.'
*put urgent* b'ABOR\r\n'
*get* '226 Operation successful\n'
Traceback (most recent call last):
File "F:\project\python\external_ftp_pipeline\.venv\Lib\site-packages\fsspec\implementations\ftp.py", line 323, in _fetch_range
self.fs.ftp.retrbinary(
File "D:\Applications\scoop\apps\python\current\Lib\ftplib.py", line 441, in retrbinary
callback(data)
File "F:\project\python\external_ftp_pipeline\.venv\Lib\site-packages\fsspec\implementations\ftp.py", line 315, in callback
raise TransferDone
fsspec.implementations.ftp.TransferDone
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\project\python\external_ftp_pipeline\src\external_ftp\gongshang_ftp.py", line 154, in <module>
asyncio.run(main())
File "D:\Applications\scoop\apps\python\current\Lib\asyncio\runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\asyncio\base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "F:\project\python\external_ftp_pipeline\src\external_ftp\gongshang_ftp.py", line 136, in main
for tag, data in load_ftp_infos():
File "F:\project\python\external_ftp_pipeline\src\external_ftp\gongshang_ftp.py", line 127, in load_ftp_infos
for data in decode_lines(f):
File "F:\project\python\external_ftp_pipeline\src\external_ftp\gongshang_ftp.py", line 117, in decode_lines
for line in source:
File "D:\Applications\scoop\apps\python\current\Lib\gzip.py", line 399, in readline
return self._buffer.readline(size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\_compression.py", line 68, in readinto
data = self.read(len(byte_view))
^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\gzip.py", line 499, in read
if not self._read_gzip_header():
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\gzip.py", line 468, in _read_gzip_header
last_mtime = _read_gzip_header(self._fp)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\gzip.py", line 423, in _read_gzip_header
magic = fp.read(2)
^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\gzip.py", line 97, in read
self.file.read(size-self._length+read)
File "F:\project\python\external_ftp_pipeline\.venv\Lib\site-packages\fsspec\spec.py", line 1856, in read
out = self.cache._fetch(self.loc, self.loc + length)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\project\python\external_ftp_pipeline\.venv\Lib\site-packages\fsspec\caching.py", line 189, in _fetch
self.cache = self.fetcher(start, end) # new block replaces old
^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\project\python\external_ftp_pipeline\.venv\Lib\site-packages\fsspec\implementations\ftp.py", line 333, in _fetch_range
self.fs.ftp.getmultiline()
File "D:\Applications\scoop\apps\python\current\Lib\ftplib.py", line 230, in getmultiline
line = self.getline()
^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\ftplib.py", line 212, in getline
line = self.file.readline(self.maxline + 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Applications\scoop\apps\python\current\Lib\socket.py", line 706, in readinto
return self._sock.recv_into(b)
^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: timed out
Thanks for reporting.
Are you sure that ABOR is happening when a file transfer is completed? It is supposed to be for the case that only part of a file has been read. Since it appears to be gzip doing the read call, it may be that you merely need to increase the size of the files blocksize to bigger than the size of the file.
Unfortunately, FTP servers are very poorly standardised. I'm not sure that there's anything we can do at the client side about i.
After commenting the abor code, the program runs fine!
Forced by the poor standardization of ftp server, I finally chose to implement gzipreader on aioftp
After commenting the abor code, the program runs fine!
Can you please add logging to see whether the whole file is being downloaded multiple times?
@martindurant
unchanged code log
*cmd* 'TYPE A'
*resp* '200 Type set to A'
*cmd* 'PASV'
*resp* '227 Entering Passive Mode (127,0,0,1,159,68)'
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 0 - 2
*cmd* 'MLSD /realtime_update'
*resp* '150 Starting data transfer.'
*resp* '226 Operation successful'
*cmd* 'TYPE I'
*resp* '200 Type set to I'
*cmd* 'PASV'
*resp* '227 Entering Passive Mode (127,0,0,1,159,70)'
*cmd* 'REST 0'
*resp* '350 Restarting at 0'
*cmd* 'RETR /realtime_update/2023102516.New.txt.gz'
*resp* '150 Starting data transfer.'
Traceback (most recent call last):
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\site-packages\fsspec\implementations\ftp.py", line 323, in _fetch_range
self.fs.ftp.retrbinary(
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\ftplib.py", line 441, in retrbinary
callback(data)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\site-packages\fsspec\implementations\ftp.py", line 315, in callback
raise TransferDone
fsspec.implementations.ftp.TransferDone
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\project\python\fasttext\ftptest\main.py", line 14, in <module>
run()
File "F:\project\python\fasttext\ftptest\main.py", line 12, in run
f.read()
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\gzip.py", line 300, in read
return self._buffer.read(size)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\gzip.py", line 487, in read
if not self._read_gzip_header():
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\gzip.py", line 430, in _read_gzip_header
magic = self._fp.read(2)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\gzip.py", line 96, in read
self.file.read(size-self._length+read)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\site-packages\fsspec\spec.py", line 1856, in read
out = self.cache._fetch(self.loc, self.loc + length)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\site-packages\fsspec\caching.py", line 189, in _fetch
self.cache = self.fetcher(start, end) # new block replaces old
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\site-packages\fsspec\implementations\ftp.py", line 333, in _fetch_range
self.fs.ftp.getmultiline()
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\ftplib.py", line 230, in getmultiline
line = self.getline()
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\ftplib.py", line 212, in getline
line = self.file.readline(self.maxline + 1)
File "E:\ProgramData\Miniconda3\envs\fasttext\lib\socket.py", line 704, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
delete `self.fs.ftp.abort()` log
*cmd* 'TYPE A'
*resp* '200 Type set to A'
*cmd* 'PASV'
*resp* '227 Entering Passive Mode (127,0,0,1,158,221)'
*cmd* 'MLSD /realtime_update'
*resp* '150 Starting data transfer.'
*resp* '226 Operation successful'
*cmd* 'TYPE I'
*resp* '200 Type set to I'
*cmd* 'PASV'
*resp* '227 Entering Passive Mode (127,0,0,1,158,223)'
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 0 - 2
*cmd* 'REST 0'
*resp* '350 Restarting at 0'
*cmd* 'RETR /realtime_update/2023102516.New.txt.gz'
*resp* '150 Starting data transfer.'
*resp* '226 Operation successful'
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 2 - 10
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 10 - 8202
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 8202 - 11221
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 11221 - 14076
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 14076 - 16441
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 16441 - 18679
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 18679 - 20969
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 20969 - 23165
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 23165 - 25292
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 25292 - 27419
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 27419 - 30027
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 30027 - 32318
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 32318 - 34601
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 34601 - 36810
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 36810 - 39141
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 39141 - 41712
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 41712 - 44157
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 44157 - 46544
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 46544 - 48754
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 48754 - 50947
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 50947 - 53343
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 53343 - 55620
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 55620 - 57836
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 57836 - 60129
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 60129 - 62657
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 62657 - 65035
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 65035 - 67481
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 67481 - 69753
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 69753 - 72074
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 72074 - 74321
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 74321 - 76616
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 76616 - 78649
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 78649 - 80916
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 80916 - 83289
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 83289 - 85553
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 85553 - 87725
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 87725 - 89976
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 89976 - 92451
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 92451 - 94691
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 94691 - 96949
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 96949 - 99125
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 99125 - 101487
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 101487 - 103930
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 103930 - 106360
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 106360 - 108616
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 108616 - 110985
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 110985 - 113242
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 113242 - 115512
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 115512 - 117602
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 117602 - 119840
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 119840 - 122086
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 122086 - 124212
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 124212 - 126351
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 126351 - 128304
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 128304 - 130333
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 130333 - 132413
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 132413 - 134869
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 134869 - 137422
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 137422 - 139608
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 139608 - 142130
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 142130 - 144478
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 144478 - 146745
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 146745 - 148955
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 148955 - 151443
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 151443 - 153697
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 153697 - 155869
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 155869 - 158299
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 158299 - 160676
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 160676 - 162747
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 162747 - 164906
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 164906 - 167174
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 167174 - 169558
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 169558 - 171836
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 171836 - 174033
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 174033 - 176401
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 176401 - 178833
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 178833 - 181068
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 181068 - 183108
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 183108 - 185395
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 185395 - 187590
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 187590 - 190119
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 190119 - 192506
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 192506 - 194689
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 194689 - 196943
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 196943 - 199122
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 199122 - 201194
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 201194 - 203539
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 203539 - 205982
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 205982 - 208610
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 208610 - 211157
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 211157 - 213326
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 213326 - 215466
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 215466 - 217674
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 216799 - 219832
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 216799 - 221934
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 216799 - 223922
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 216799 - 216800
DEBUG:fsspec:<File-like object FTPFileSystem, /realtime_update/2023102516.New.txt.gz> read: 216799 - 216801
I guess ABOR coul be optional? I don't know how you could do random access without it, though.