B2_Command_Line_Tool icon indicating copy to clipboard operation
B2_Command_Line_Tool copied to clipboard

When in docker, python client code hanging when uploading as multipart large file.

Open pau-minoves opened this issue 9 years ago • 22 comments

(posting this here as instructed from B2 support)

Hi,

I'm developing an application and trying B2 as a storage backend. I think I've hit a bug with b2 CLI python code when uploading files as multipart. I'm using the code for the b2 python CLI directly from python but I see the same behaviour when using the tool from CLI (leading to upload speeds as low as few KB/s).

It seems to be a threading deadlock. I attach a txt with some information that I hope helps debug it.

-- My code:

info = SqliteAccountInfo()
self.b2_api = B2Api(info, AuthInfoCache(info))
self.b2_api.authorize_account('production',
                              self.conf['account_id'],

                              self.conf['application_key'])


bucket = self.b2_api.get_bucket_by_name(media_file.collection.safe_name)


file_infos = {'src_last_modified_millis': str(int(os.path.getmtime(media_file.path) * 1000))}  # Each fileInfo is of the form "a=b".



with make_progress_listener(media_file.path, True) as progress_listener:

    file_info = bucket.upload_local_file(

        local_file=media_file.path,

        file_name=media_file.checksum,

        content_type=media_file.get_mime_type(),

        file_infos=file_infos,

        sha1_sum=media_file.checksum,

        min_part_size=(1024 * 1014 * 1024),  # Size in bytes, defaults to 100M, set to 1GB

        progress_listener=progress_listener

    )

response = file_info.as_dict()
log.info(response)

Trying with a 13M file. I’ve also tried with larger files (1GB, 500MB) with the same behaviour. When min_part_size is None:

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> storage_manager.migrate_storage(sample_file, 'backblaze', False)

INFO:storage.locations:Migrating media file from minio_local to backblaze
DEBUG:b2.api:calling B2Api.create_bucket(self=<b2.api.B2Api object at 0x7f6d90f7bba8>, name='collection-1', type_='allPublic')
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_create_bucket HTTP/1.1" 400 100
DEBUG:storage.locations:download_file found file in local cache
DEBUG:b2.bucket:calling Bucket.upload_local_file(self=Bucket<50aedae9b486281c5595021c,collection-1,None>, local_file='/srv/data/cache/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f', file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_infos={'src_last_modified_millis': '1483117620156'}, sha1_sum='a3bb6417571f362ebc7f8566591ca4b78083545f', min_part_size=None, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7f6d8ccb93c8>)
DEBUG:b2.bucket:calling Bucket.upload(self=Bucket<50aedae9b486281c5595021c,collection-1,None>, upload_source=<b2.upload_source.UploadSourceLocalFile object at 0x7f6d8cc9f940>, file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_info={'src_last_modified_millis': '1483117620156'}, min_part_size=None, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7f6d8ccb93c8>)
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1036-13.backblaze.com
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-08.backblaze.com
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1031-15.backblaze.com
DEBUG:requests.packages.urllib3.connectionpool:https://pod-000-1031-15.backblaze.com:443 "POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001031_t0007 HTTP/1.1" 200 490
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.put_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c') (hidden args: upload_url, upload_auth_token)
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.clear_bucket_upload_data(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7f6d90f7b898>, bucket_id='50aedae9b486281c5595021c')
DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com
DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289
DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-17.backblaze.com
^C
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/srv/src/webapp/storage/locations.py", line 219, in migrate_storage
    destination.upload_file(media_file)
  File "/srv/src/webapp/storage/locations.py", line 165, in upload_file
    progress_listener=progress_listener
  File "/usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
    return function(*wrapee_args, **wrapee_kwargs)
  File "/usr/local/lib/python3.5/site-packages/b2/bucket.py", line 291, in upload_local_file
    progress_listener=progress_listener
  File "/usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
    return function(*wrapee_args, **wrapee_kwargs)
  File "/usr/local/lib/python3.5/site-packages/b2/bucket.py", line 339, in upload
    return f.result()
  File "/usr/local/lib/python3.5/concurrent/futures/_base.py", line 400, in result
    self._condition.wait(timeout)
  File "/usr/local/lib/python3.5/threading.py", line 293, in wait
    waiter.acquire()
KeyboardInterrupt

This runs indefinetly until I hit control+c. The traceback is always the same for all the tries I’ve done. Always locked in waiter.acquire().

With min_part_size very high (1GB) to ensure multi part upload is not used, it works.

INFO:storage.locations:Migrating media file from minio_local to backblaze (0.001) SELECT "mediabroker_mediacollection"."id", "mediabroker_mediacollection"."name", "mediabroker_mediacollection"."type", "mediabroker_mediacollection"."shared", "mediabroker_mediacollection"."user_id", "mediabroker_mediacollection"."date_created", "mediabroker_mediacollection"."date_last_updated" FROM "mediabroker_mediacollection" WHERE "mediabroker_mediacollection"."id" = 1; args=(1,) DEBUG:django.db.backends:(0.001) SELECT "mediabroker_mediacollection"."id", "mediabroker_mediacollection"."name", "mediabroker_mediacollection"."type", "mediabroker_mediacollection"."shared", "mediabroker_mediacollection"."user_id", "mediabroker_mediacollection"."date_created", "mediabroker_mediacollection"."date_last_updated" FROM "mediabroker_mediacollection" WHERE "mediabroker_mediacollection"."id" = 1; args=(1,) DEBUG:b2.api:calling B2Api.create_bucket(self=<b2.api.B2Api object at 0x7fe903205ba8>, name='collection-1', type_='allPublic') DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): api001.backblazeb2.com DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_create_bucket HTTP/1.1" 400 100 DEBUG:storage.locations:download_file found file in local cache DEBUG:b2.api:calling B2Api.list_buckets(self=<b2.api.B2Api object at 0x7fe903205ba8>) DEBUG:requests.packages.urllib3.connectionpool:Resetting dropped connection: api001.backblazeb2.com DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_list_buckets HTTP/1.1" 200 257 DEBUG:b2.account_info.sqlite_account_info:calling SqliteAccountInfo.refresh_entire_bucket_name_cache(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>) (hidden args: name_id_iterable) DEBUG:b2.bucket:calling Bucket.upload_local_file(self=Bucket<50aedae9b486281c5595021c,collection-1,allPublic>, local_file='/srv/data/cache/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f', file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_infos={'src_last_modified_millis': '1483117620156'}, sha1_sum='a3bb6417571f362ebc7f8566591ca4b78083545f', min_part_size=1063256064, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7fe8fef14860>) DEBUG:b2.bucket:calling Bucket.upload(self=Bucket<50aedae9b486281c5595021c,collection-1,allPublic>, upload_source=<b2.upload_source.UploadSourceLocalFile object at 0x7fe8fef2a940>, file_name='a3bb6417571f362ebc7f8566591ca4b78083545f', content_type='video/mpeg4-generic', file_info={'src_last_modified_millis': '1483117620156'}, min_part_size=1063256064, progress_listener=<b2.progress.DoNothingProgressListener object at 0x7fe8fef14860>) DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.take_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>, bucket_id='50aedae9b486281c5595021c') DEBUG:requests.packages.urllib3.connectionpool:https://api001.backblazeb2.com:443 "POST /b2api/v1/b2_get_upload_url HTTP/1.1" 200 289 DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): pod-000-1037-09.backblaze.com DEBUG:requests.packages.urllib3.connectionpool:https://pod-000-1037-09.backblaze.com:443 "POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001037_t0058 HTTP/1.1" 200 490 DEBUG:b2.account_info.upload_url_pool:calling UrlPoolAccountInfo.put_bucket_upload_url(self=<b2.account_info.sqlite_account_info.SqliteAccountInfo object at 0x7fe903205898>, bucket_id='50aedae9b486281c5595021c') (hidden args: upload_url, upload_auth_token) INFO:storage.locations:{'action': 'upload', 'uploadTimestamp': 1483878948000, 'fileName': 'a3bb6417571f362ebc7f8566591ca4b78083545f', 'size': 13343422, 'fileId': '4_z50aedae9b486281c5595021c_f10966612c0c706f1_d20170108_m123548_c001_v0001037_t0058'}

pau-minoves avatar Jan 10 '17 11:01 pau-minoves

Can you please try to upload the same file with the console tool?

Could you please describe your use case?

How many times did you try this?

Have you tried this from different environments (network-wise)?

ppolewicz avatar Jan 10 '17 11:01 ppolewicz

Hi @ppolewicz

Thanks for the prompt reply.

Can you please try to upload the same file with the console tool?

I did -> "when using the tool from CLI (leading to upload speeds as low as few KB/s).". Same file.

Could you please describe your use case?

Currently i'm just uploading a local file to backblaze. My target is to work with files that average 200-500 MB, so multipart upload is pretty important for me.

How many times did you try this?

Around 5 times in 3 different days.

Have you tried this from different environments (network-wise)?

Not really. I've a 50Mbps FO so I don't think this should be the problem.

pau-minoves avatar Jan 10 '17 12:01 pau-minoves

Please try uploading with --threads 1. Does it work?

ppolewicz avatar Jan 10 '17 15:01 ppolewicz

Could you please show us the logs of a b2 CLI upload using --debugLogs ConsoleTool flag? There is some environment information there that I'd like to see.

ppolewicz avatar Jan 10 '17 20:01 ppolewicz

Regarding the use case, if possible, could you please describe what the application is going to be doing in its final state? We rarely have a chance to talk with a b2 CLI API user, so I'd like to use the opportunity to prioritize future development.

ppolewicz avatar Jan 10 '17 20:01 ppolewicz

In the first post you said the problem occurred with a 13MB file too. Please confirm that this was really 13MB - b2 CLI code uses a different path to handle files smaller than 100 MB.

ppolewicz avatar Jan 10 '17 20:01 ppolewicz

Hello,

Thanks for the prompt response.

In the first post you said the problem occurred with a 13MB file too. Please confirm that this was really 13MB - b2 CLI code uses a different path to handle files smaller than 100 MB.

Sorry I was not clear here. I've tried

  1. a 13MB file both from CLI and python code.
  2. a 500MB file with default min_part_size (doesn't work) and changing the min_part_size to 1GB (works).
  3. a 1.2GB file with default min_part_size (doesn't work).

Regarding the use case, if possible, could you please describe what the application is going to be doing in its final state? We rarely have a chance to talk with a b2 CLI API user, so I'd like to use the opportunity to prioritize future development.

Sure, no problem. Can we move this out of github though? pau.minoves /at/ gmail /dot/ com

Some environment info:

root@03f970c1e168:/srv/data/cache# uname -a
Linux 03f970c1e168 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
root@03f970c1e168:/srv/data/cache# pip freeze
amqp==2.1.4
b2==0.6.8
Babel==2.3.4
billiard==3.5.0.2
celery==4.0.1
certifi==2015.4.28
Django==1.10.2
django-appconf==1.0.2
django-avatar==3.1.0
django-bootstrap3==7.1.0
django-celery-beat==1.0.1
django-datatables-view==1.13.2
django-extensions==1.7.5
ffmpy==0.2.2
flower==0.9.1
funcsigs==1.0.2
kombu==4.0.2
logfury==0.1.2
minio==2.0.4
Pillow==3.4.2
psycopg2==2.6.2
pytz==2016.10
redis==2.10.5
requests==2.12.4
six==1.10.0
tornado==4.2
tqdm==4.10.0
urllib3==1.19.1
vine==1.1.3

The test you request below:

A 1.9GB file with --debugLogs. It started with decent 5MB/s speed but then it dropped,m so I had to kill it:

root@03f970c1e168:/srv/data/cache# b2 upload_file --debugLogs collection-1 collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c  6d2f27f8a479798c60ea5e2ef3b0e010c999754c
collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c:   3%|███▊                                                                                                          | 67.0M/1.92G [07:18<1423:56:37, 361B/s]collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c:   4%|███▉                                                                                                            | 67.2M/1.92G [09:05<4:10:29, 123KB/s]

Interrupted.  Shutting down...

A 1.9GB file with --debugLogs and --threads 1. In this case it didn't advance from here in 5 minutes:

root@03f970c1e168:/srv/data/cache# b2 upload_file --debugLogs --threads 1 collection-1 collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c  6d2f27f8a479798c60ea5e2ef3b0e010c999754c
collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c:   0%|▎                                                                                                             | 5.64M/1.92G [00:55<55:50:38, 9.51KB/s]collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c:   0%|▎                                                                                                             | 5.68M/1.92G [04:04<22:53:30, 23.2KB/s]

Interrupted.  Shutting down...

root@03f970c1e168:/srv/data/cache#

a 13MB file. You can see it worked, but the transfer was by bursts (you can see the low transfer speed overall). When on a burst, the speed was good:

root@03f970c1e168:/srv/data/cache# b2 upload_file  collection-1 collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f a3bb6417571f362ebc7f8566591ca4b78083545f
collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13.3M/13.3M [03:41<00:00, 60.3KB/s]
URL by file name: https://f001.backblazeb2.com/file/collection-1/a3bb6417571f362ebc7f8566591ca4b78083545f
URL by fileId: https://f001.backblazeb2.com/b2api/v1/b2_download_file_by_id?fileId=4_z50aedae9b486281c5595021c_f1165ea41f344d37c_d20170111_m085239_c001_v0001036_t0041
{
  "action": "upload",
  "fileId": "4_z50aedae9b486281c5595021c_f1165ea41f344d37c_d20170111_m085239_c001_v0001036_t0041",
  "fileName": "a3bb6417571f362ebc7f8566591ca4b78083545f",
  "size": 13343422,
  "uploadTimestamp": 1484124759000
}

pau-minoves avatar Jan 11 '17 08:01 pau-minoves

This looks like a very interesting environment issue. You are doing the same things others do and it works for everyone, but it is not working in your environment for some reason. We just need to find it.

Pease rule out:

  • CPU steal
  • drops and errors on the network interface
  • underlying source device being slow
  • WAF or other "smart" traffic filters

If you don't know how to check for some of those, I can explain - just ask.

ppolewicz avatar Jan 11 '17 09:01 ppolewicz

I definitely need some instructions to rule out those cases :)

Do you want dmesg of the virt machine?

pau-minoves avatar Jan 11 '17 09:01 pau-minoves

Ok, CPU steal first. Start top in a parallel terminal prior to starting the upload, set the refresh frequency to 1s (d1<enter>) and observe the steal value during the test. As far as I know, values larger than 0.1% are a concern.

Use ifconfig from root to see stats of network interfaces before and after the test. If there is a difference between the values of network errors, duplicates, drops etc, then it is a concern. I've been told that good drop level is 0.1%.

To test slowness of underlying device, you could try uploading a file from a /dev/shm device, which is basically guaranteed to be fast. It is a "ram disk" for Linux. You need to Google a tutorial for it.

Detection of WAF is not something I am familiar with. If it is a corporate environment, you can ask your network administrator whether there is any device or program between your vm and the Internet. Such applications often have a function to detect or prevent exfiltration of data - which would explain a lot :)

If you found something interesting in dmesg, then please share it.

By the way, would you be willing to help extend b2 CLI with additional features to make some hard problems debuggable?

ppolewicz avatar Jan 11 '17 11:01 ppolewicz

Hi!

Thanks for the detailed explanations. I'm trying it below. You should know that I'm running b2 inside a docker container on a virtual machine, both debian jessie. The host is macos.

CPU: The python process has been stable at 2-3% and the steal time has never moved from 0.0%. Disk: I run other hd intensive processes without problem so I think we can rule out device slowness. Network: No WAF (application level or other kind) I can confirm. Ifconfig shows nothing strange IMHO:

Within the VM:

--- p/broad-data/doc/deb.ifconfig.before	2017-01-12 09:33:19.536000000 +0100
+++ p/broad-data/doc/deb.ifconfig.after	2017-01-12 09:35:53.920000000 +0100
@@ -2,10 +2,10 @@
           inet addr:172.18.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
           inet6 addr: fe80::42:c8ff:fefd:aa98/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:49472 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:71265 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:55033 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:80829 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:208416676 (198.7 MiB)  TX bytes:4838705 (4.6 MiB)
+          RX bytes:230599486 (219.9 MiB)  TX bytes:5470040 (5.2 MiB)

 docker0   Link encap:Ethernet  HWaddr 02:42:c4:21:e7:5a
           inet addr:172.17.42.1  Bcast:0.0.0.0  Mask:255.255.0.0
@@ -20,10 +20,10 @@
           inet6 addr: fdb2:2c26:f4e4:0:21c:42ff:fe70:fafe/64 Scope:Global
           inet6 addr: fe80::21c:42ff:fe70:fafe/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:87082 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:61386 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:97880 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:68011 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
-          RX bytes:6080409 (5.7 MiB)  TX bytes:216592324 (206.5 MiB)
+          RX bytes:6798926 (6.4 MiB)  TX bytes:239808380 (228.6 MiB)

 lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
@@ -45,34 +45,34 @@
 veth04301c7 Link encap:Ethernet  HWaddr 36:de:1b:48:3d:8a
           inet6 addr: fe80::34de:1bff:fe48:3d8a/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:761 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:662 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:919 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:789 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:125694 (122.7 KiB)  TX bytes:293242 (286.3 KiB)
+          RX bytes:150748 (147.2 KiB)  TX bytes:362023 (353.5 KiB)

 veth1f71926 Link encap:Ethernet  HWaddr aa:bc:f5:e4:9b:b5
           inet6 addr: fe80::a8bc:f5ff:fee4:9bb5/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:300 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:365 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:5861 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:9929 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:28219 (27.5 KiB)  TX bytes:366803 (358.2 KiB)
+          RX bytes:22288883 (21.2 MiB)  TX bytes:998138 (974.7 KiB)

 veth264345e Link encap:Ethernet  HWaddr 32:84:86:33:00:a3
           inet6 addr: fe80::3084:86ff:fe33:a3/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:1572 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:1770 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:1967 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:2204 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:409319 (399.7 KiB)  TX bytes:412234 (402.5 KiB)
+          RX bytes:511549 (499.5 KiB)  TX bytes:517990 (505.8 KiB)

 veth346a7ba Link encap:Ethernet  HWaddr 02:b9:91:48:a7:84
           inet6 addr: fe80::b9:91ff:fe48:a784/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:2241 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:2346 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:2802 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:2899 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:686677 (670.5 KiB)  TX bytes:535438 (522.8 KiB)
+          RX bytes:861214 (841.0 KiB)  TX bytes:662722 (647.1 KiB)

 veth9da60b2 Link encap:Ethernet  HWaddr 02:18:62:de:d3:37
           inet6 addr: fe80::18:62ff:fede:d337/64 Scope:Link

Within docker:

--- p/broad-data/var/data/ifconfig.before	2017-01-12 09:32:14.000000000 +0100
+++ p/broad-data/var/data/ifconfig.after	2017-01-12 09:35:46.464000000 +0100
@@ -2,17 +2,17 @@
           inet addr:172.18.0.6  Bcast:0.0.0.0  Mask:255.255.0.0
           inet6 addr: fe80::42:acff:fe12:6/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
-          RX packets:364 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:299 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:9929 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:5861 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:366761 (358.1 KiB)  TX bytes:28177 (27.5 KiB)
+          RX bytes:998138 (974.7 KiB)  TX bytes:22288883 (21.2 MiB)

 lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:65536  Metric:1
-          RX packets:40 errors:0 dropped:0 overruns:0 frame:0
-          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
+          RX packets:120 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:120 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
-          RX bytes:3762 (3.6 KiB)  TX bytes:3762 (3.6 KiB)
+          RX bytes:11382 (11.1 KiB)  TX bytes:11382 (11.1 KiB)

The thing is that the behaviour is clearly on bursts, so I don't think it is a matter of the environment being slow. Usually it goes fast to 3-5 MB/s and then there it slows, gradually going down to some KB/s. Sometimes there is a short 1-2MB/s burst in the middle.

I tried this from within the VM (so, no docker) and I got sustained 7-9 MB/s speeds. I guess this rules out a lot of things. I'll dive deeper into it later.

pau-minoves avatar Jan 12 '17 09:01 pau-minoves

I did a test removing PYTHONUNBUFFERED 1 from the Dockerfile and the behaviour seems better but ends hanging too.

pau-minoves avatar Jan 12 '17 09:01 pau-minoves

To wrap up the issue, having the client working on a docker is not of interest for this project unless consulting work is paid.

pau-minoves avatar Jan 30 '17 07:01 pau-minoves

Operation within Docker is definitely of interest to us. Unfortunately, we haven't had time to set up a Docker environment ourselves to dig into what is causing the problem.

What I would do next is to try to figure out where the code is slowing down, perhaps by instrumenting the code to time calls that block (disk and network I/O).

bwbeach avatar Jan 30 '17 16:01 bwbeach

Hi Brian,

I tried to run python -m to instrumentalize that, but failed as the b2 has no .py termination. Using a softlink or a rename didn't work. If you know how to run it with the b2 cli, I'll be happy to post results here.

Thanks.

pau-minoves avatar Jan 31 '17 08:01 pau-minoves

If you check out the repository from GitHub, you can run it with "python -m" from there.

bwbeach avatar Jan 31 '17 18:01 bwbeach

Ok! I managed to do it with the sources. Profiling output below.

No surprises. As mentioned, it blocks on waiter.acquire().

root@684e47ba752a:~/p/B2_Command_Line_Tool/build/lib# python print_dump.py
Thu Feb  2 16:51:09 2017    out.log

         32588 function calls (32179 primitive calls) in 873.669 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  873.665  873.665 /root/p/B2_Command_Line_Tool/build/lib/b2/console_tool.py:962(run_command)
        1    0.000    0.000  873.663  873.663 /root/p/B2_Command_Line_Tool/build/lib/b2/console_tool.py:894(run)
      7/2    0.000    0.000  873.662  436.831 /usr/local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py:36(wrapper)
        1    0.000    0.000  873.662  873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:301(upload_local_file)
        1    0.000    0.000  873.662  873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:324(upload)
        1    0.000    0.000  873.662  873.662 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:409(_upload_large_file)
        1    0.000    0.000  871.072  871.072 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:450(<listcomp>)
        1    0.010    0.010  871.072  871.072 /root/p/B2_Command_Line_Tool/build/lib/b2/utils.py:51(interruptible_get_result)
      870    0.033    0.000  871.062    1.001 /usr/local/lib/python3.5/concurrent/futures/_base.py:378(result)
      881    0.031    0.000  871.059    0.989 /usr/local/lib/python3.5/threading.py:261(wait)
     1819  871.016    0.479  871.016    0.479 {method 'acquire' of '_thread.lock' objects}
        3    0.000    0.000    2.548    0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/session.py:29(wrapper)
        3    0.000    0.000    2.546    0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:124(_post_json)
        3    0.000    0.000    2.546    0.849 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:270(post_json_return_json)
        3    0.000    0.000    2.545    0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:230(post_content_return_json)
        3    0.000    0.000    2.545    0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:86(_translate_and_retry)
        3    0.000    0.000    2.545    0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:41(_translate_errors)
        3    0.000    0.000    2.545    0.848 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:252(do_post)
        3    0.000    0.000    2.538    0.846 /usr/local/lib/python3.5/site-packages/requests/sessions.py:525(post)
        3    0.000    0.000    2.538    0.846 /usr/local/lib/python3.5/site-packages/requests/sessions.py:411(request)
        3    0.000    0.000    2.529    0.843 /usr/local/lib/python3.5/site-packages/requests/sessions.py:569(send)
        3    0.000    0.000    2.528    0.843 /usr/local/lib/python3.5/site-packages/requests/adapters.py:375(send)
        3    0.000    0.000    2.526    0.842 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:449(urlopen)
        3    0.000    0.000    2.525    0.842 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:326(_make_request)
        1    0.000    0.000    2.342    2.342 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:457(_find_unfinished_file)
       10    0.000    0.000    2.101    0.210 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:263(list_unfinished_large_files)
        1    0.000    0.000    2.100    2.100 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:371(list_unfinished_large_files)
        3    0.000    0.000    1.615    0.538 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:827(_validate_conn)
        1    0.000    0.000    1.615    1.615 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connection.py:279(connect)
        1    0.000    0.000    1.331    1.331 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/util/ssl_.py:277(ssl_wrap_socket)
        1    0.000    0.000    1.322    1.322 /usr/local/lib/python3.5/ssl.py:369(wrap_socket)
        1    0.000    0.000    1.322    1.322 /usr/local/lib/python3.5/ssl.py:667(__init__)
        1    0.000    0.000    1.322    1.322 /usr/local/lib/python3.5/ssl.py:981(do_handshake)
        1    0.000    0.000    1.322    1.322 /usr/local/lib/python3.5/ssl.py:631(do_handshake)
        1    1.322    1.322    1.322    1.322 {method 'do_handshake' of '_ssl._SSLSocket' objects}
        3    0.000    0.000    0.909    0.303 /usr/local/lib/python3.5/http/client.py:1153(getresponse)
        3    0.000    0.000    0.909    0.303 /usr/local/lib/python3.5/http/client.py:290(begin)
        3    0.000    0.000    0.907    0.302 /usr/local/lib/python3.5/http/client.py:257(_read_status)
       21    0.000    0.000    0.907    0.043 {method 'readline' of '_io.BufferedReader' objects}
        3    0.000    0.000    0.907    0.302 /usr/local/lib/python3.5/socket.py:561(readinto)
        3    0.000    0.000    0.907    0.302 /usr/local/lib/python3.5/ssl.py:918(recv_into)
        3    0.000    0.000    0.907    0.302 /usr/local/lib/python3.5/ssl.py:783(read)
        3    0.000    0.000    0.907    0.302 /usr/local/lib/python3.5/ssl.py:568(read)
        3    0.907    0.302    0.907    0.302 {method 'read' of '_ssl._SSLSocket' objects}
        1    0.000    0.000    0.283    0.283 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/connection.py:124(_new_conn)
        1    0.000    0.000    0.283    0.283 /usr/local/lib/python3.5/site-packages/requests/packages/urllib3/util/connection.py:51(create_connection)
        1    0.000    0.000    0.241    0.241 /root/p/B2_Command_Line_Tool/build/lib/b2/api.py:210(list_parts)
        1    0.000    0.000    0.240    0.240 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:361(list_parts)
        1    0.000    0.000    0.206    0.206 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:280(start_large_file)
        1    0.000    0.000    0.206    0.206 /root/p/B2_Command_Line_Tool/build/lib/b2/raw_api.py:383(start_large_file)
        1    0.191    0.191    0.191    0.191 {method 'connect' of '_socket.socket' objects}
        1    0.000    0.000    0.091    0.091 /usr/local/lib/python3.5/socket.py:715(getaddrinfo)
        1    0.091    0.091    0.091    0.091 {built-in method _socket.getaddrinfo}
        1    0.000    0.000    0.041    0.041 /root/p/B2_Command_Line_Tool/build/lib/b2/bucket.py:436(<listcomp>)
       19    0.000    0.000    0.040    0.002 /usr/local/lib/python3.5/concurrent/futures/thread.py:104(submit)
       19    0.000    0.000    0.040    0.002 /usr/local/lib/python3.5/concurrent/futures/thread.py:117(_adjust_thread_count)
       11    0.000    0.000    0.039    0.004 /usr/local/lib/python3.5/threading.py:826(start)
       11    0.001    0.000    0.038    0.003 /usr/local/lib/python3.5/threading.py:531(wait)
        1    0.008    0.008    0.008    0.008 {method 'load_verify_locations' of '_ssl._SSLContext' objects}
        3    0.000    0.000    0.008    0.003 /usr/local/lib/python3.5/site-packages/requests/sessions.py:371(prepare_request)
        3    0.000    0.000    0.007    0.002 /root/p/B2_Command_Line_Tool/build/lib/b2/b2http.py:328(_run_post_request_hooks)

pau-minoves avatar Feb 02 '17 17:02 pau-minoves

Thanks for getting that info. It looks like uploading the first part of the file never finished.

This is the code that uploads a large file:

        # Tell the executor to upload each of the parts
        part_futures = [
            self.api.get_thread_pool().submit(
                self._upload_part,
                file_id,
                part_index + 1,  # part number
                part_range,
                upload_source,
                large_file_upload_state,
                finished_parts
            ) for (part_index, part_range) in enumerate(part_ranges)
        ]

        # Collect the sha1 checksums of the parts as the uploads finish.
        # If any of them raised an exception, that same exception will
        # be raised here by result()
        part_sha1_array = [interruptible_get_result(f)['contentSha1'] for f in part_futures]

And it's hanging in the call to interruptible_get_result:

def interruptible_get_result(future):
    """
    Waits for the result of a future in a way that can be interrupted
    by a KeyboardInterrupt.

    This is not necessary in Python 3, but is needed for Python 2.
    """
    while True:
        try:
            return future.result(timeout=1.0)
        except futures.TimeoutError:
            pass

There was only one call to interruptible_get_result, and 870 calls to future.result, and one per second, which accounts for the 873 seconds of run time.

Now we're down to the real question: why aren't the uploads going?

If you have time, I'm really curious what happens with the HTTP call to upload the file. I think what I would try doing is uploading a file using curl. The docs have sample curl calls to upload a file.

The first step is to get your account auth token:

$ b2 get-account-info
{
    "accountAuthToken": "3_20170123184412_..._acct", 
    "accountId": "...", 
    "apiUrl": "https://api001.backblazeb2.com", 
    "applicationKey": "...", 
    "downloadUrl": "https://f001.backblazeb2.com"
}

Then call b2_get_upload_url:

curl \
    -H 'Authorization: $ACCOUNT_AUTHORIZATION_TOKEN' \
    -d '{"bucketId": "$BUCKET_ID"}' \
    $API_URL/b2api/v1/b2_get_upload_url

And finally, b2_upload_file:

FILE_TO_UPLOAD=typing_test.txt
MIME_TYPE=text/plain
SHA1_OF_FILE=$(openssl dgst -sha1 $FILE_TO_UPLOAD | awk '{print $2;}')
UPLOAD_URL=... # from b2_get_upload_url call
UPLOAD_AUTHORIZATION_TOKEN=... # from b2_get_upload_url call

curl \
    -H "Authorization: $UPLOAD_AUTHORIZATION_TOKEN" \
    -H "X-Bz-File-Name: $FILE_TO_UPLOAD" \
    -H "Content-Type: $MIME_TYPE" \
    -H "X-Bz-Content-Sha1: $SHA1_OF_FILE" \
    -H "X-Bz-Info-Author: unknown" \
    --data-binary "@$FILE_TO_UPLOAD" \
    $UPLOAD_URL

bwbeach avatar Feb 02 '17 17:02 bwbeach

Hi @bwbeach,

Sure, here you have it. Apparently it works at good speed.

root@98464a386713:/srv/src/webapp# curl  -v   -H "Authorization: X_upld"     -H "X-Bz-File-Name: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c"     -H "Content-Type: application/octet-stream"     -H "X-Bz-Content-Sha1: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c"     -H "X-Bz-Info-Author: unknown"     --data-binary @/srv/data/cache/collection-1/6d2f27f8a479798c60ea5e2ef3b0e010c999754c   https://pod-000-1026-02.backblaze.com/b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001026_t0004
* Hostname was NOT found in DNS cache
*   Trying 162.244.62.37...
* Connected to pod-000-1026-02.backblaze.com (162.244.62.37) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS handshake, Server key exchange (12):
* SSLv3, TLS handshake, Server finished (14):
* SSLv3, TLS handshake, Client key exchange (16):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSLv3, TLS change cipher, Client hello (1):
* SSLv3, TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-SHA
* Server certificate:
* 	 subject: OU=Domain Control Validated; CN=*.backblaze.com
* 	 start date: 2016-01-22 16:13:38 GMT
* 	 expire date: 2018-03-06 00:33:16 GMT
* 	 subjectAltName: pod-000-1026-02.backblaze.com matched
* 	 issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
* 	 SSL certificate verify ok.
> POST /b2api/v1/b2_upload_file/50aedae9b486281c5595021c/c001_v0001026_t0004 HTTP/1.1
> User-Agent: curl/7.38.0
> Host: pod-000-1026-02.backblaze.com
> Accept: */*
> Authorization: X_upld
> X-Bz-File-Name: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
> Content-Type: application/octet-stream
> X-Bz-Content-Sha1: 6d2f27f8a479798c60ea5e2ef3b0e010c999754c
> X-Bz-Info-Author: unknown
> Content-Length: 1916917357
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 200 OK
* Server Apache-Coyote/1.1 is not blacklisted
< Server: Apache-Coyote/1.1
< Cache-Control: max-age=0, no-cache, no-store
< Content-Type: application/json;charset=UTF-8
< Content-Length: 473
< Date: Thu, 02 Feb 2017 20:58:14 GMT
<
{
  "accountId": "0ea9468c552c",
  "action": "upload",
  "bucketId": "50aedae9b486281c5595021c",
  "contentLength": 1916917357,
  "contentSha1": "6d2f27f8a479798c60ea5e2ef3b0e010c999754c",
  "contentType": "application/octet-stream",
  "fileId": "4_z50aedae9b486281c5595021c_f102d7e77cbedbee0_d20170202_m204939_c001_v0001026_t0004",
  "fileInfo": {
    "author": "unknown"
  },
  "fileName": "6d2f27f8a479798c60ea5e2ef3b0e010c999754c",
  "uploadTimestamp": 1486068579000
}
* Connection #0 to host pod-000-1026-02.backblaze.com left intact

pau-minoves avatar Feb 02 '17 21:02 pau-minoves

Thanks for doing that. It may be time for me to set up a Docker environment and try it myself, to debug where the issue is. Probably won't get to it for a couple weeks, though. (This week is busy, and I'll be on vacation next week.)

My approach will be to reproduce the problem, then gradually simplify the code until getting to the essence of the problem. I think the first thing to do is to serialize the uploads, rather than running them in a thread pool.

bwbeach avatar Feb 02 '17 22:02 bwbeach

Hello @bwbeach could you make any advance? Thanks.

pau-minoves avatar May 09 '17 07:05 pau-minoves

I'm afraid I didn't get a chance to look at it, and then it slid off my task list. I'll try and find some time to look at it.

bwbeach avatar May 09 '17 11:05 bwbeach

Closing as stale; please reopen if errors still occur with the current version.

mjurbanski-reef avatar Nov 10 '23 08:11 mjurbanski-reef