When in Docker, the Python client code hangs when uploading a large file as multipart.
(posting this here as instructed by B2 support)
Hi,
I'm developing an application and trying B2 as a storage backend. I think I've hit a bug in the b2 CLI Python code when uploading files as multipart. I'm calling the b2 CLI's Python code directly from Python, but I see the same behaviour when using the tool from the command line (upload speeds drop as low as a few KB/s).
It seems to be a threading deadlock. I'm attaching a txt with some information that I hope helps debug it.
-- My code:
info = SqliteAccountInfo()
self.b2_api = B2Api(info, AuthInfoCache(info))
self.b2_api.authorize_account('production',
                              self.conf['account_id'],
                              self.conf['application_key'])

bucket = self.b2_api.get_bucket_by_name(media_file.collection.safe_name)

file_infos = {'src_last_modified_millis': str(int(os.path.getmtime(media_file.path) * 1000))}  # Each fileInfo is of the form "a=b".

with make_progress_listener(media_file.path, True) as progress_listener:
    file_info = bucket.upload_local_file(
        local_file=media_file.path,
        file_name=media_file.checksum,
        content_type=media_file.get_mime_type(),
        file_infos=file_infos,
        sha1_sum=media_file.checksum,
        min_part_size=(1024 * 1014 * 1024),  # Size in bytes, defaults to 100M, set to 1GB
        progress_listener=progress_listener
    )
response = file_info.as_dict()
log.info(response)
Trying with a 13MB file. I've also tried larger files (1GB, 500MB) with the same behaviour. Here's the debug output when min_part_size is None:
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> storage_manager.migrate_storage(sample_file, 'backblaze', False)
INFO:storage.locations:Migrating media file from minio_local to backblaze
DEBUG:b2.api:calling B2Api.create_bucket(self=<b2.api.B2Api object at 0x7f6d90f7bba8>, name='collection-1', type_='allPublic')
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_create_bucket HTTP/1.1" 400 100
DEBUG:storage.locations:download_file found file in local cache
DEBUG:b2.bucket:calling Bucket.upload_local_file(self=Bucket<50aedae9b486281c5595021c,collection-1,None>, local_file='/srv/data/cache/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f', file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_infos={'src_last_modified_millis': '1483117620156'}, sha1_sum='a3bb6417571f362ebc7f8566591ca4b78083545f', min_part_size=None, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7f6d8ccb93c8>)
DEBUG:b2.bucket:calling Bucket.upload(self=Bucket<50aedae9b486281c5595021c,collection-1,None>, upload_source=<b2.upload_source.UploadSourceLocalFile object at 0x7f6d8cc9f940>, file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_info={'src_last_modified_millis': '1483117620156'}, min_part_size=None, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7f6d8ccb93c8>)
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1036-13.backblaze.com
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-08.backblaze.com
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1031-15.backblaze.com
DEBUG:requests.packages.urllib3.connectionpool:https://pod-000-1031-15.backblaze.com:443 "POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001031_t0007 HTTP/1.1" 200 490
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.put_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c') (hidden args: upload_url, upload_auth_token)
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-17.backblaze.com
^C
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/srv/src/webapp/storage/locations.py", line 219, in migrate_storage
destination.upload_file(media_file)
File "/srv/src/webapp/storage/locations.py", line 165, in upload_file
progress_listener=progress_listener
File "/usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/site-packages/b2/bucket.py", line 291, in upload_local_file
progress_listener=progress_listener
File "/usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/site-packages/b2/bucket.py", line 339, in upload
return f.result()
File "/usr/local/lib/python3.5/concurrent/futures/_base.py", line 400, in result
self._condition.wait(timeout)
File "/usr/local/lib/python3.5/threading.py", line 293, in wait
waiter.acquire()
KeyboardInterrupt
This runs indefinitely until I hit Ctrl+C. The traceback is always the same across all the attempts I've made: it is always locked in waiter.acquire().
With min_part_size set very high (1GB) to ensure multipart upload is not used, it works:
INFO:storage.locations:Migrating media file from minio_local to backblaze
(0.001) SELECT "mediabroker_mediacollection"."id", "mediabroker_mediacollection"."name", "mediabroker_mediacollection"."type", "mediabroker_mediacollection"."shared", "mediabroker_mediacollection"."user_id", "mediabroker_mediacollection"."date_created", "mediabroker_mediacollection"."date_last_updated" FROM "mediabroker_mediacollection" WHERE "mediabroker_mediacollection"."id" = 1; args=(1,)
DEBUG:django.db.backends:(0.001) SELECT "mediabroker_mediacollection"."id", "mediabroker_mediacollection"."name", "mediabroker_mediacollection"."type", "mediabroker_mediacollection"."shared", "mediabroker_mediacollection"."user_id", "mediabroker_mediacollection"."date_created", "mediabroker_mediacollection"."date_last_updated" FROM "mediabroker_mediacollection" WHERE "mediabroker_mediacollection"."id" = 1; args=(1,)
DEBUG:b2.api:calling B2Api.create_bucket(self=<b2.api.B2Api object at 0x7fe903205ba8>, name='collection-1', type_='allPublic')
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_create_bucket HTTP/1.1" 400 100
DEBUG:storage.locations:download_file found file in local cache
DEBUG:b2.api:calling B2Api.list_buckets(self=<b2.api.B2Api object at 0x7fe903205ba8>)
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_list_buckets HTTP/1.1" 200 257
DEBUG:b2.account_info.sqlite_account_info:calling SqliteAccountInfo.refresh_entire_bucket_name_cache(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>) (hidden args: name_id_iterable)
DEBUG:b2.bucket:calling Bucket.upload_local_file(self=Bucket<50aedae9b486281c5595021c,collection-1,allPublic>, local_file='/srv/data/cache/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f', file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_infos={'src_last_modified_millis': '1483117620156'}, sha1_sum='a3bb6417571f362ebc7f8566591ca4b78083545f', min_part_size=1063256064, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7fe8fef14860>)
DEBUG:b2.bucket:calling Bucket.upload(self=Bucket<50aedae9b486281c5595021c,collection-1,allPublic>, upload_source=<b2.upload_source.UploadSourceLocalFile object at 0x7fe8fef2a940>, file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_info={'src_last_modified_millis': '1483117620156'}, min_part_size=1063256064, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7fe8fef14860>)
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-09.backblaze.com
DEBUG:requests.packages.urllib3.connectionpool:https://pod-000-1037-09.backblaze.com:443 "POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001037_t0058 HTTP/1.1" 200 490
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.put_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>, bucket_id='50aedae9b486281c5595021c') (hidden args: upload_url, upload_auth_token)
INFO:storage.locations:{'action': 'upload', 'uploadTimestamp': 1483878948000, 'fileName': 'a3bb6417571f362ebc7f8566591ca4b78083545f', 'size': 13343422, 'fileId': '4_z50aedae9b486281c5595021c_f10966612c0c706f1_d20170108_m123548_c001_v0001037_t0058'}
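In other words, forcing the single-call upload path just means keeping min_part_size above the size of the file. A minimal sketch, assuming the same objects as in the code at the top of the issue:

# Sketch: take the non-multipart upload path by keeping min_part_size
# well above the file being uploaded (here the ~13 MB file from the log above).
with make_progress_listener(media_file.path, True) as progress_listener:
    file_info = bucket.upload_local_file(
        local_file=media_file.path,
        file_name=media_file.checksum,
        content_type=media_file.get_mime_type(),
        file_infos=file_infos,
        sha1_sum=media_file.checksum,
        min_part_size=1024 * 1024 * 1024,  # 1 GiB, far larger than the file
        progress_listener=progress_listener
    )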
Can you please try to upload the same file with the console tool?
Could you please describe your use case?
How many times did you try this?
Have you tried this from different environments (network-wise)?
Hi @ppolewicz
Thanks for the prompt reply.
Can you please try to upload the same file with the console tool?
I did (same file). As I mentioned above: "I see the same behaviour when using the tool from the command line (upload speeds drop as low as a few KB/s)."
Could you please describe your use case?
Currently I'm just uploading a local file to Backblaze. My target is to work with files averaging 200-500 MB, so multipart upload is pretty important for me.
How many times did you try this?
Around 5 times in 3 different days.
Have you tried this from different environments (network-wise)?
Not really. I have a 50 Mbps fiber-optic (FO) connection, so I don't think this should be the problem.
Please try uploading with --threads 1. Does it work?
Could you please show us the logs of a b2 CLI upload using --debugLogs ConsoleTool flag? There is some environment information there that I'd like to see.
Regarding the use case, if possible, could you please describe what the application is going to be doing in its final state? We rarely have a chance to talk with a b2 CLI API user, so I'd like to use the opportunity to prioritize future development.
In the first post you said the problem occurred with a 13MB file too. Please confirm that this was really 13MB - b2 CLI code uses a different path to handle files smaller than 100 MB.
Hello,
Thanks for the prompt response.
In the first post you said the problem occurred with a 13MB file too. Please confirm that this was really 13MB - b2 CLI code uses a different path to handle files smaller than 100 MB.
Sorry, I was not clear here. I've tried:
- a 13MB file, both from the CLI and from Python code.
- a 500MB file with the default min_part_size (doesn't work) and with min_part_size changed to 1GB (works).
- a 1.2GB file with the default min_part_size (doesn't work).
Regarding the use case, if possible, could you please describe what the application is going to be doing in its final state? We rarely have a chance to talk with a b2 CLI API user, so I'd like to use the opportunity to prioritize future development.
Sure, no problem. Can we move this out of GitHub, though? pau.minoves /at/ gmail /dot/ com
Some environment info:
root@03f970c1e168:/srv/data/cache# uname -a
Linux 03f970c1e168 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
root@03f970c1e168:/srv/data/cache# pip freeze
amqp==2.1.4
b2==0.6.8
Babel==2.3.4
billiard==3.5.0.2
celery==4.0.1
certifi==2015.4.28
Django==1.10.2
django-appconf==1.0.2
django-avatar==3.1.0
django-bootstrap3==7.1.0
django-celery-beat==1.0.1
django-datatables-view==1.13.2
django-extensions==1.7.5
ffmpy==0.2.2
flower==0.9.1
funcsigs==1.0.2
kombu==4.0.2
logfury==0.1.2
minio==2.0.4
Pillow==3.4.2
psycopg2==2.6.2
pytz==2016.10
redis==2.10.5
requests==2.12.4
six==1.10.0
tornado==4.2
tqdm==4.10.0
urllib3==1.19.1
vine==1.1.3
The tests you requested are below.
A 1.9GB file with --debugLogs. It started at a decent 5 MB/s but then the speed dropped, so I had to kill it:
root@03f970c1e168:/srv/data/cache# b2 upload_file --debugLogs collection-1 collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c: 3%|███▊ | 67.0M/1.92G [07:18<1423:56:37, 361B/s]collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c: 4%|███▉ | 67.2M/1.92G [09:05<4:10:29, 123KB/s]
Interrupted. Shutting down...
A 1.9GB file with --debugLogs and --threads 1. In this case it didn't advance past this point in 5 minutes:
root@03f970c1e168:/srv/data/cache# b2 upload_file --debugLogs --threads 1 collection-1 collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c: 0%|▎ | 5.64M/1.92G [00:55<55:50:38, 9.51KB/s]collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c: 0%|▎ | 5.68M/1.92G [04:04<22:53:30, 23.2KB/s]
Interrupted. Shutting down...
root@03f970c1e168:/srv/data/cache#
A 13MB file. You can see it worked, but the transfer happened in bursts (note the low overall transfer speed); during a burst, the speed was good:
root@03f970c1e168:/srv/data/cache# b2 upload_file collection-1 collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f a3bb6417571f362ebc7f8566591ca4b78083545f
collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13.3M/13.3M [03:41<00:00, 60.3KB/s]
URL by file name: https://f001.backblazeb2.com/file/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f
URL by fileId: https://f001.backblazeb2.com/b2api/v1/b2_download_file_by_id?fileId=4_z50aedae9b486281c5595021c_f1165ea41f344d37c_d20170111_m085239_c001_v0001036_t0041
{
"action": "upload",
"fileId": "4_z50aedae9b486281c5595021c_f1165ea41f344d37c_d20170111_m085239_c001_v0001036_t0041",
"fileName": "a3bb6417571f362ebc7f8566591ca4b78083545f",
"size": 13343422,
"uploadTimestamp": 1484124759000
}
This looks like a very interesting environment issue. You are doing the same things others do and it works for everyone, but it is not working in your environment for some reason. We just need to find it.
Please rule out:
- CPU steal
- drops and errors on the network interface
- underlying source device being slow
- WAF or other "smart" traffic filters
If you don't know how to check for some of those, I can explain - just ask.
I definitely need some instructions to rule out those cases :)
Do you want the dmesg output of the virtual machine?
Ok, CPU steal first. Start top in a parallel terminal before starting the upload, set the refresh interval to 1s (press d, then 1, then Enter) and observe the steal (st) value during the test. As far as I know, values larger than 0.1% are a concern.
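If it is easier, the same thing can be sampled non-interactively; a rough sketch that reads the steal field from /proc/stat (standard Linux field order assumed):

# Sketch: sample aggregate CPU counters from /proc/stat before and after a short
# interval and compute the steal percentage over that interval.
# /proc/stat "cpu" fields: user nice system idle iowait irq softirq steal ...
import time

def read_cpu_fields():
    with open('/proc/stat') as f:
        return [int(x) for x in f.readline().split()[1:]]

before = read_cpu_fields()
time.sleep(5)
after = read_cpu_fields()

deltas = [b - a for a, b in zip(before, after)]
total = sum(deltas)
steal_pct = 100.0 * deltas[7] / total if total else 0.0  # index 7 is steal
print('steal over the interval: %.2f%%' % steal_pct)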
Use ifconfig from root to see stats of network interfaces before and after the test. If there is a difference between the values of network errors, duplicates, drops etc, then it is a concern. I've been told that good drop level is 0.1%.
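If you want exact numbers rather than eyeballing ifconfig, the same counters can be read from /proc/net/dev; a rough sketch (run it once before and once after the test and compare):

# Sketch: print per-interface RX/TX error and drop counters from /proc/net/dev.
with open('/proc/net/dev') as f:
    lines = f.readlines()[2:]  # skip the two header lines

for line in lines:
    iface, data = line.split(':', 1)
    fields = data.split()
    # receive: bytes packets errs drop ...  transmit: bytes packets errs drop ...
    print('%-12s rx_errs=%s rx_drop=%s tx_errs=%s tx_drop=%s' % (
        iface.strip(), fields[2], fields[3], fields[10], fields[11]))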
To test for slowness of the underlying device, you could try uploading a file from /dev/shm, which is basically guaranteed to be fast. It is a "ram disk" on Linux; you can Google a tutorial for it.
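For that test, creating a throw-away file on the ram disk is enough; a small sketch (path and size are just examples):

# Sketch: write a 200 MiB file of random data to /dev/shm and upload that
# instead of a file on the regular disk.
import os

path = '/dev/shm/b2_test_file'  # example path
with open(path, 'wb') as f:
    for _ in range(200):
        f.write(os.urandom(1024 * 1024))
print(path, os.path.getsize(path))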
Detection of WAF is not something I am familiar with. If it is a corporate environment, you can ask your network administrator whether there is any device or program between your vm and the Internet. Such applications often have a function to detect or prevent exfiltration of data - which would explain a lot :)
If you found something interesting in dmesg, then please share it.
By the way, would you be willing to help extend b2 CLI with additional features to make some hard problems debuggable?
Hi!
Thanks for the detailed explanations. I'm going through them below. You should know that I'm running b2 inside a Docker container on a virtual machine, both Debian Jessie. The host is macOS.
CPU: the Python process has been stable at 2-3% and the steal time has never moved from 0.0%. Disk: I run other HD-intensive processes without problems, so I think we can rule out device slowness. Network: I can confirm there is no WAF (application-level or otherwise), and ifconfig shows nothing strange IMHO:
Within the VM:
--- p/broad-data/doc/deb.ifconfig.before 2017-01-12 09:33:19.536000000 +0100
+++ p/broad-data/doc/deb.ifconfig.after 2017-01-12 09:35:53.920000000 +0100
@@ -2,10 +2,10 @@
inet addr:172.18.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:c8ff:fefd:aa98/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:49472 errors:0 dropped:0 overruns:0 frame:0
- TX packets:71265 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:55033 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:80829 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:208416676 (198.7 MiB) TX bytes:4838705 (4.6 MiB)
+ RX bytes:230599486 (219.9 MiB) TX bytes:5470040 (5.2 MiB)
docker0 Link encap:Ethernet HWaddr 02:42:c4:21:e7:5a
inet addr:172.17.42.1 Bcast:0.0.0.0 Mask:255.255.0.0
@@ -20,10 +20,10 @@
inet6 addr: fdb2:2c26:f4e4:0:21c:42ff:fe70:fafe/64 Scope:Global
inet6 addr: fe80::21c:42ff:fe70:fafe/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:87082 errors:0 dropped:0 overruns:0 frame:0
- TX packets:61386 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:97880 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:68011 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
- RX bytes:6080409 (5.7 MiB) TX bytes:216592324 (206.5 MiB)
+ RX bytes:6798926 (6.4 MiB) TX bytes:239808380 (228.6 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
@@ -45,34 +45,34 @@
veth04301c7 Link encap:Ethernet HWaddr 36:de:1b:48:3d:8a
inet6 addr: fe80::34de:1bff:fe48:3d8a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:761 errors:0 dropped:0 overruns:0 frame:0
- TX packets:662 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:919 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:789 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:125694 (122.7 KiB) TX bytes:293242 (286.3 KiB)
+ RX bytes:150748 (147.2 KiB) TX bytes:362023 (353.5 KiB)
veth1f71926 Link encap:Ethernet HWaddr aa:bc:f5:e4:9b:b5
inet6 addr: fe80::a8bc:f5ff:fee4:9bb5/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:300 errors:0 dropped:0 overruns:0 frame:0
- TX packets:365 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:5861 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:9929 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:28219 (27.5 KiB) TX bytes:366803 (358.2 KiB)
+ RX bytes:22288883 (21.2 MiB) TX bytes:998138 (974.7 KiB)
veth264345e Link encap:Ethernet HWaddr 32:84:86:33:00:a3
inet6 addr: fe80::3084:86ff:fe33:a3/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:1572 errors:0 dropped:0 overruns:0 frame:0
- TX packets:1770 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:1967 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:2204 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:409319 (399.7 KiB) TX bytes:412234 (402.5 KiB)
+ RX bytes:511549 (499.5 KiB) TX bytes:517990 (505.8 KiB)
veth346a7ba Link encap:Ethernet HWaddr 02:b9:91:48:a7:84
inet6 addr: fe80::b9:91ff:fe48:a784/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:2241 errors:0 dropped:0 overruns:0 frame:0
- TX packets:2346 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:2802 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:2899 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:686677 (670.5 KiB) TX bytes:535438 (522.8 KiB)
+ RX bytes:861214 (841.0 KiB) TX bytes:662722 (647.1 KiB)
veth9da60b2 Link encap:Ethernet HWaddr 02:18:62:de:d3:37
inet6 addr: fe80::18:62ff:fede:d337/64 Scope:Link
Within Docker:
--- p/broad-data/var/data/ifconfig.before 2017-01-12 09:32:14.000000000 +0100
+++ p/broad-data/var/data/ifconfig.after 2017-01-12 09:35:46.464000000 +0100
@@ -2,17 +2,17 @@
inet addr:172.18.0.6 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:acff:fe12:6/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:364 errors:0 dropped:0 overruns:0 frame:0
- TX packets:299 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:9929 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:5861 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:366761 (358.1 KiB) TX bytes:28177 (27.5 KiB)
+ RX bytes:998138 (974.7 KiB) TX bytes:22288883 (21.2 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
- RX packets:40 errors:0 dropped:0 overruns:0 frame:0
- TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
+ RX packets:120 errors:0 dropped:0 overruns:0 frame:0
+ TX packets:120 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
- RX bytes:3762 (3.6 KiB) TX bytes:3762 (3.6 KiB)
+ RX bytes:11382 (11.1 KiB) TX bytes:11382 (11.1 KiB)
The thing is that the behaviour is clearly bursty, so I don't think it is a matter of the environment being slow. Usually it starts fast at 3-5 MB/s and then it slows down, gradually dropping to a few KB/s. Sometimes there is a short 1-2 MB/s burst in the middle.
I tried this from within the VM (so, no Docker) and I got sustained 7-9 MB/s speeds. I guess this rules out a lot of things. I'll dive deeper into it later.
I did a test removing PYTHONUNBUFFERED 1 from the Dockerfile; the behaviour seems better, but it ends up hanging too.
To wrap up the issue: getting the client to work in Docker is not of interest for this project unless consulting work is paid.
Operation within Docker is definitely of interest to us. Unfortunately, we haven't had time to set up a Docker environment ourselves to dig into what is causing the problem.
What I would do next is to try to figure out where the code is slowing down, perhaps by instrumenting the code to time calls that block (disk and network I/O).
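As a first pass, a simple timing wrapper is usually enough; a rough sketch (the wrapped function is just a placeholder for whatever blocking call you want to measure):

# Sketch: log how long each call to a blocking function takes.
import functools
import logging
import time

log = logging.getLogger(__name__)

def timed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return fn(*args, **kwargs)
        finally:
            log.info('%s took %.3f s', fn.__name__, time.time() - start)
    return wrapper

# example: wrap a suspect method on an object (hypothetical name)
# some_stream.read = timed(some_stream.read)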
Hi Brian,
I tried to run it under python -m to instrument it, but that failed because the b2 executable has no .py extension. Using a symlink or a rename didn't work either. If you know how to run it against the b2 CLI, I'll be happy to post the results here.
Thanks.
If you check out the repository from GitHub, you can run it with "python -m" from there.
Ok! I managed to do it with the sources. Profiling output below.
No surprises. As mentioned, it blocks on waiter.acquire().
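For reference, print_dump.py is essentially just a few lines of pstats; a sketch of it (assuming the cProfile stats were written to out.log, which matches the header of the output below):

# print_dump.py (sketch): load the cProfile stats dump and print cumulative times.
import pstats

stats = pstats.Stats('out.log')
stats.sort_stats('cumulative').print_stats(60)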
root@684e47ba752a:~/p/B2_Command_Line_Tool/build/lib# python print_dump.py
Thu Feb 2 16:51:09 2017 out.log
32588 function calls (32179 primitive calls) in 873.669 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 873.665 873.665 /root/p/B2_Command_Line_Tool/build/lib/b2/console_tool.py:962(run_command)
1 0.000 0.000 873.663 873.663 /root/p/B2_Command_Line_Tool/build/lib/b2/console_tool.py:894(run)
7/2 0.000 0.000 873.662 436.831 /usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py:36(wrapper)
1 0.000 0.000 873.662 873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:301(upload_local_file)
1 0.000 0.000 873.662 873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:324(upload)
1 0.000 0.000 873.662 873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:409(_upload_large_file)
1 0.000 0.000 871.072 871.072 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:450(<listcomp>)
1 0.010 0.010 871.072 871.072 /root/p/B2_Command_Line_Tool/build/lib/b2/utils.py:51(interruptible_get_result)
870 0.033 0.000 871.062 1.001 /usr/local/lib/python3.5/concurrent/futures/_base.py:378(result)
881 0.031 0.000 871.059 0.989 /usr/local/lib/python3.5/threading.py:261(wait)
1819 871.016 0.479 871.016 0.479 {method 'acquire' of '_thread.lock' objects}
3 0.000 0.000 2.548 0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/session.py:29(wrapper)
3 0.000 0.000 2.546 0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:124(_post_json)
3 0.000 0.000 2.546 0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:270(post_json_return_json)
3 0.000 0.000 2.545 0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:230(post_content_return_json)
3 0.000 0.000 2.545 0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:86(_translate_and_retry)
3 0.000 0.000 2.545 0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:41(_translate_errors)
3 0.000 0.000 2.545 0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:252(do_post)
3 0.000 0.000 2.538 0.846 /usr/local/lib/python3.5/site-packages/requests/sessions.py:525(post)
3 0.000 0.000 2.538 0.846 /usr/local/lib/python3.5/site-packages/requests/sessions.py:411(request)
3 0.000 0.000 2.529 0.843 /usr/local/lib/python3.5/site-packages/requests/sessions.py:569(send)
3 0.000 0.000 2.528 0.843 /usr/local/lib/python3.5/site-packages/requests/adapters.py:375(send)
3 0.000 0.000 2.526 0.842 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:449(urlopen)
3 0.000 0.000 2.525 0.842 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:326(_make_request)
1 0.000 0.000 2.342 2.342 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:457(_find_unfinished_file)
10 0.000 0.000 2.101 0.210 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:263(list_unfinished_large_files)
1 0.000 0.000 2.100 2.100 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:371(list_unfinished_large_files)
3 0.000 0.000 1.615 0.538 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:827(_validate_conn)
1 0.000 0.000 1.615 1.615 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connection.py:279(connect)
1 0.000 0.000 1.331 1.331 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/util/ssl_.py:277(ssl_wrap_socket)
1 0.000 0.000 1.322 1.322 /usr/local/lib/python3.5/ssl.py:369(wrap_socket)
1 0.000 0.000 1.322 1.322 /usr/local/lib/python3.5/ssl.py:667(__init__)
1 0.000 0.000 1.322 1.322 /usr/local/lib/python3.5/ssl.py:981(do_handshake)
1 0.000 0.000 1.322 1.322 /usr/local/lib/python3.5/ssl.py:631(do_handshake)
1 1.322 1.322 1.322 1.322 {method 'do_handshake' of '_ssl._SSLSocket' objects}
3 0.000 0.000 0.909 0.303 /usr/local/lib/python3.5/http/client.py:1153(getresponse)
3 0.000 0.000 0.909 0.303 /usr/local/lib/python3.5/http/client.py:290(begin)
3 0.000 0.000 0.907 0.302 /usr/local/lib/python3.5/http/client.py:257(_read_status)
21 0.000 0.000 0.907 0.043 {method 'readline' of '_io.BufferedReader' objects}
3 0.000 0.000 0.907 0.302 /usr/local/lib/python3.5/socket.py:561(readinto)
3 0.000 0.000 0.907 0.302 /usr/local/lib/python3.5/ssl.py:918(recv_into)
3 0.000 0.000 0.907 0.302 /usr/local/lib/python3.5/ssl.py:783(read)
3 0.000 0.000 0.907 0.302 /usr/local/lib/python3.5/ssl.py:568(read)
3 0.907 0.302 0.907 0.302 {method 'read' of '_ssl._SSLSocket' objects}
1 0.000 0.000 0.283 0.283 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connection.py:124(_new_conn)
1 0.000 0.000 0.283 0.283 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/util/connection.py:51(create_connection)
1 0.000 0.000 0.241 0.241 /root/p/B2_Command_Line_Tool/build/lib/b2/api.py:210(list_parts)
1 0.000 0.000 0.240 0.240 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:361(list_parts)
1 0.000 0.000 0.206 0.206 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:280(start_large_file)
1 0.000 0.000 0.206 0.206 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:383(start_large_file)
1 0.191 0.191 0.191 0.191 {method 'connect' of '_socket.socket' objects}
1 0.000 0.000 0.091 0.091 /usr/local/lib/python3.5/socket.py:715(getaddrinfo)
1 0.091 0.091 0.091 0.091 {built-in method _socket.getaddrinfo}
1 0.000 0.000 0.041 0.041 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:436(<listcomp>)
19 0.000 0.000 0.040 0.002 /usr/local/lib/python3.5/concurrent/futures/thread.py:104(submit)
19 0.000 0.000 0.040 0.002 /usr/local/lib/python3.5/concurrent/futures/thread.py:117(_adjust_thread_count)
11 0.000 0.000 0.039 0.004 /usr/local/lib/python3.5/threading.py:826(start)
11 0.001 0.000 0.038 0.003 /usr/local/lib/python3.5/threading.py:531(wait)
1 0.008 0.008 0.008 0.008 {method 'load_verify_locations' of '_ssl._SSLContext' objects}
3 0.000 0.000 0.008 0.003 /usr/local/lib/python3.5/site-packages/requests/sessions.py:371(prepare_request)
3 0.000 0.000 0.007 0.002 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:328(_run_post_request_hooks)
Thanks for getting that info. It looks like uploading the first part of the file never finished.
This is the code that uploads a large file:
# Tell the executor to upload each of the parts
part_futures = [
    self.api.get_thread_pool().submit(
        self._upload_part,
        file_id,
        part_index + 1,  # part number
        part_range,
        upload_source,
        large_file_upload_state,
        finished_parts
    ) for (part_index, part_range) in enumerate(part_ranges)
]

# Collect the sha1 checksums of the parts as the uploads finish.
# If any of them raised an exception, that same exception will
# be raised here by result()
part_sha1_array = [interruptible_get_result(f)['contentSha1'] for f in part_futures]
And it's hanging in the call to interruptible_get_result:
def interruptible_get_result(future):
    """
    Waits for the result of a future in a way that can be interrupted
    by a KeyboardInterrupt.

    This is not necessary in Python 3, but is needed for Python 2.
    """
    while True:
        try:
            return future.result(timeout=1.0)
        except futures.TimeoutError:
            pass
There was only one call to interruptible_get_result, and 870 calls to future.result at roughly one per second (each timing out after 1.0 s), which accounts for the ~873 seconds of run time.
Now we're down to the real question: why aren't the uploads going?
If you have time, I'm really curious what happens with the HTTP call to upload the file. I think what I would try doing is uploading a file using curl. The docs have sample curl calls to upload a file.
The first step is to get your account auth token:
$ b2 get-account-info
{
    "accountAuthToken": "3_20170123184412_..._acct",
    "accountId": "...",
    "apiUrl": "https://api001.backblazeb2.com",
    "applicationKey": "...",
    "downloadUrl": "https://f001.backblazeb2.com"
}
Then call b2_get_upload_url:
# double quotes, so the shell expands the variables
curl \
    -H "Authorization: $ACCOUNT_AUTHORIZATION_TOKEN" \
    -d "{\"bucketId\": \"$BUCKET_ID\"}" \
    "$API_URL/b2api/v1/b2_get_upload_url"
And finally, b2_upload_file:
FILE_TO_UPLOAD=typing_test.txt
MIME_TYPE=text/plain
SHA1_OF_FILE=$(openssl dgst -sha1 $FILE_TO_UPLOAD | awk '{print $2;}')
UPLOAD_URL=... # from b2_get_upload_url call
UPLOAD_AUTHORIZATION_TOKEN=... # from b2_get_upload_url call
curl \
    -H "Authorization: $UPLOAD_AUTHORIZATION_TOKEN" \
    -H "X-Bz-File-Name: $FILE_TO_UPLOAD" \
    -H "Content-Type: $MIME_TYPE" \
    -H "X-Bz-Content-Sha1: $SHA1_OF_FILE" \
    -H "X-Bz-Info-Author: unknown" \
    --data-binary "@$FILE_TO_UPLOAD" \
    $UPLOAD_URL
Hi @bwbeach,
Sure, here you have it. Apparently it works at good speed.
root@98464a386713:/srv/src/webapp# curl -v -H "Authorization: X_upld" -H "X-Bz-File-Name: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c" -H "Content-Type: application/octet-stream" -H "X-Bz-Content-Sha1: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c" -H "X-Bz-Info-Author: unknown" --data-binary @/srv/data/cache/collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c https://pod-000-1026-02.backblaze.com/b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001026_t0004
* Hostname was NOT found in DNS cache
* Trying 162.244.62.37...
* Connected to pod-000-1026-02.backblaze.com (162.244.62.37) port 443 (#0)
* successfully set certificate verify locations:
* CAfile: none
CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server key exchange (12):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-SHA
* Server certificate:
* subject: OU=Domain Control Validated; CN=*.backblaze.com
* start date: 2016-01-22 16:13:38 GMT
* expire date: 2018-03-06 00:33:16 GMT
* subjectAltName: pod-000-1026-02.backblaze.com matched
* issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
* SSL certificate verify ok.
> POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001026_t0004 HTTP/1.1
> User-Agent: curl/7.38.0
> Host: pod-000-1026-02.backblaze.com
> Accept: */*
> Authorization: X_upld
> X-Bz-File-Name: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
> Content-Type: application/octet-stream
> X-Bz-Content-Sha1: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
> X-Bz-Info-Author: unknown
> Content-Length: 1916917357
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
* Server Apache-Coyote/1.1 is not blacklisted
< Server: Apache-Coyote/1.1
< Cache-Control: max-age=0, no-cache, no-store
< Content-Type: application/json;charset=UTF-8
< Content-Length: 473
< Date: Thu, 02 Feb 2017 20:58:14 GMT
<
{
"accountId": "0ea9468c552c",
"action": "upload",
"bucketId": "50aedae9b486281c5595021c",
"contentLength": 1916917357,
"contentSha1": "6d2f27f8a479798c60ea5e2ef3b0e010c999754c",
"contentType": "application/octet-stream",
"fileId": "4_z50aedae9b486281c5595021c_f102d7e77cbedbee0_d20170202_m204939_c001_v0001026_t0004",
"fileInfo": {
"author": "unknown"
},
"fileName": "6d2f27f8a479798c60ea5e2ef3b0e010c999754c",
"uploadTimestamp": 1486068579000
}
* Connection #0 to host pod-000-1026-02.backblaze.com left intact
Thanks for doing that. It may be time for me to set up a Docker environment and try it myself, to debug where the issue is. Probably won't get to it for a couple weeks, though. (This week is busy, and I'll be on vacation next week.)
My approach will be to reproduce the problem, then gradually simplify the code until getting to the essence of the problem. I think the first thing to do is to serialize the uploads, rather than running them in a thread pool.
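Roughly, that means replacing the part_futures list comprehension quoted above with a plain loop; a sketch only, keeping the same arguments:

# Sketch: upload the parts one at a time instead of submitting them to the thread pool.
part_sha1_array = []
for (part_index, part_range) in enumerate(part_ranges):
    part = self._upload_part(
        file_id,
        part_index + 1,  # part number
        part_range,
        upload_source,
        large_file_upload_state,
        finished_parts,
    )
    part_sha1_array.append(part['contentSha1'])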
Hello @bwbeach, have you been able to make any progress? Thanks.
I'm afraid I didn't get a chance to look at it, and then it slid off my task list. I'll try and find some time to look at it.
Closing as stale; please reopen if errors still occur with the current version.