ClientConnectorCertificateError on GET request to any blob
What happened: We are trying to read files from a Google Cloud Storage bucket, but every call fails with the error below.
What you expected to happen:
We can run any gcsfs API command
Minimal Complete Verifiable Example: Please note that this is a minimal example. Any other command (e.g. the code for opening a file) causes the same error.
import gcsfs
fs = gcsfs.GCSFileSystem(project='my-project')
fs.ls('my-bucket')
This code will cause an exception. Error traceback:
Traceback (most recent call last):
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 936, in _wrap_create_connection
return await self._loop.create_connection(*args, **kwargs) # type: ignore # noqa
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 981, in create_connection
ssl_handshake_timeout=ssl_handshake_timeout)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 1009, in _create_connection_transport
await waiter
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 530, in data_received
ssldata, appdata = self._sslpipe.feed_ssldata(data)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 189, in feed_ssldata
self._sslobj.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 774, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/path/to/my-project/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-10-daebfc8e4d60>", line 1, in <module>
fs.ls('bduk-dev-tmt')
File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 121, in wrapper
return maybe_sync(func, self, *args, **kwargs)
File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 100, in maybe_sync
return sync(loop, func, *args, **kwargs)
File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 71, in sync
raise exc.with_traceback(tb)
File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 55, in f
result[0] = await future
File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 808, in _ls
out = await self._list_objects(path)
File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 598, in _list_objects
items, prefixes = await self._do_list_objects(path)
File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 633, in _do_list_objects
json_out=True,
File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 494, in _call
timeout=self.requests_timeout,
File "/path/to/my-project/python3.7/site-packages/aiohttp/client.py", line 1012, in __aenter__
self._resp = await self._coro
File "/path/to/my-project/python3.7/site-packages/aiohttp/client.py", line 483, in _request
timeout=real_timeout
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 523, in connect
proto = await self._create_connection(req, traces, timeout)
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 859, in _create_connection
req, traces, timeout)
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 1004, in _create_direct_connection
raise last_exc
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 986, in _create_direct_connection
req=req, client_error=client_error)
File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 939, in _wrap_create_connection
req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host www.googleapis.com:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)')]
Anything else we need to know?:
It looks like this issue can be caused by this. But recompiling Python is not a practical solution. There should be a simpler solution or fix.
This issue also makes pandas.read_excel and pandas.read_csv fail, which makes it even more painful.
Environment:
- Dask version: We don't use Dask; gcsfs version is 0.7.1
- Python version: Python 3.7.4
- Operating System: MacOS Catalina 10.15.7
- Install method (conda, pip, source): pip
pip freeze:
aiohttp==3.6.2
appnope==0.1.0
argon2-cffi==20.1.0
async-generator==1.10
async-timeout==3.0.1
attrs==19.3.0
backcall==0.2.0
bleach==3.2.1
cachetools==4.1.1
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
Flask==1.1.2
fsspec==0.8.4
gcsfs==0.7.1
google-api-core==1.21.0
google-api-python-client==1.10.0
google-auth==1.19.2
google-auth-httplib2==0.0.4
google-auth-oauthlib==0.4.1
google-cloud-core==1.4.3
google-cloud-pubsub==1.7.0
google-cloud-storage==1.31.0
google-cloud-trace==0.23.0
google-crc32c==1.0.0
google-resumable-media==1.1.0
googleapis-common-protos==1.52.0
grpc-google-iam-v1==0.12.3
grpcio==1.30.0
httplib2==0.18.1
idna==2.9
importlib-metadata==2.0.0
ipykernel==5.3.4
ipython==7.18.1
ipython-genutils==0.2.0
itsdangerous==1.1.0
jedi==0.17.2
Jinja2==2.11.2
jsonschema==3.2.0
jupyter-client==6.1.7
jupyter-core==4.6.3
jupyterlab-pygments==0.1.2
MarkupSafe==1.1.1
mistune==0.8.4
multidict==4.7.6
nbclient==0.5.0
nbconvert==6.0.7
nbformat==5.0.8
nest-asyncio==1.4.1
notebook==6.1.4
numpy==1.19.2
oauthlib==3.1.0
opencensus==0.7.9
opencensus-context==0.1.1
packaging==20.4
pandas==1.1.2
pandocfilters==1.4.2
parso==0.7.1
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.8.0
prompt-toolkit==3.0.8
protobuf==3.12.2
ptyprocess==0.6.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
Pygments==2.7.1
pyparsing==2.4.7
pyrsistent==0.17.3
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
pyzmq==19.0.2
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
Send2Trash==1.5.0
six==1.15.0
terminado==0.9.1
testpath==0.4.4
tornado==6.0.4
traitlets==5.0.4
typing-extensions==3.7.4.3
uritemplate==3.0.1
urllib3==1.25.9
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
wrapt==1.12.1
xlrd==1.2.0
yarl==1.5.1
zipp==3.3.0
Do you succeed with other calls, such as connecting and listing a bucket? Do google's own python APIs work for you?
To me, an SSL error suggests that you may be behind some complex firewall or proxy. It seems unlikely to me that GCS requires some special weak cipher to be compiled into Python; other people are connecting just fine.
Hi @martindurant,
| Do you succeed with other calls, such as connecting and listing a bucket?
I used the fs.ls() call in this example. I also tried to read a file blob with fs.open() and got the same error, so I'm pretty sure this error is common to every HTTP call.
| Do google's own python APIs work for you?
Since this issue prevents us from reading spreadsheets directly with pandas, we read them manually with Google's official google.cloud.storage module and pass them to pandas as BytesIO objects. In other words, we can read any GCS file that way without issues, so it doesn't look like a general gateway issue.
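For reference, the workaround described above could look roughly like the sketch below. The helper name `blob_to_buffer` is hypothetical, and `client` is assumed to be a `google.cloud.storage.Client`; `download_as_bytes` is assumed to be available in the installed google-cloud-storage version.

```python
import io


def blob_to_buffer(client, bucket_name, blob_name):
    """Download one object and return its content as an in-memory binary buffer.

    `client` is assumed to be a google.cloud.storage.Client (or any object
    exposing the same bucket().blob().download_as_bytes() chain).
    """
    blob = client.bucket(bucket_name).blob(blob_name)
    # The whole object is held in memory, so this only suits files that fit in RAM.
    return io.BytesIO(blob.download_as_bytes())
```

With `client = storage.Client(project='my-project')`, the returned buffer can be passed straight to `pandas.read_csv(...)` or `pandas.read_excel(...)`. This bypasses gcsfs and aiohttp entirely, so it sidesteps the error rather than fixing it.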
Perhaps with a combination of pdb and logging you can figure out exactly what call the google API is making, and then, why the gcsfs via aiohttp is different. This error is coming from pretty deep within python.
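As a starting point for the logging part, something like the following sketch turns on debug output for the libraries involved. The logger names are assumptions based on the package names in the pip freeze above; not every library necessarily logs under them.

```python
import logging


def enable_debug_logging(names=("gcsfs", "fsspec", "aiohttp", "urllib3", "google")):
    """Set DEBUG level on the named loggers and return them.

    The default names are guesses derived from package names, not a
    documented list.
    """
    # Root logger stays at INFO so unrelated libraries do not flood the output.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    loggers = [logging.getLogger(name) for name in names]
    for logger in loggers:
        logger.setLevel(logging.DEBUG)
    return loggers
```

Calling `enable_debug_logging()` before the failing `fs.ls('my-bucket')` call, and again before the working google.cloud.storage call, may make the two request paths easier to compare.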
Note that I don't see cryptography or pyopenssl (or any ssl) in your installed packages.
Please also check any environment variables or configuration you might have relating to certificate trust stores.
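A quick way to inspect the trust-store configuration is sketched below, using only the standard library. The function name `describe_trust_store` is hypothetical. One possible explanation, not confirmed in this thread, is that requests (used by Google's client) ships its own CA bundle via certifi, while aiohttp relies on the default context of Python's `ssl` module.

```python
import os
import ssl


def describe_trust_store():
    """Collect the certificate locations Python's ssl module uses by default."""
    paths = ssl.get_default_verify_paths()
    info = {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        # Environment variables that commonly override the trust store.
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        "SSL_CERT_DIR": os.environ.get("SSL_CERT_DIR"),
        "REQUESTS_CA_BUNDLE": os.environ.get("REQUESTS_CA_BUNDLE"),
    }
    # CA certificates already loaded into a default context. Certificates in a
    # capath directory are loaded lazily, so a low count is a hint, not proof.
    context = ssl.create_default_context()
    info["loaded_ca_certs"] = len(context.get_ca_certs())
    return info
```

If `cafile` is `None` and `loaded_ca_certs` is 0, the interpreter has no usable CA bundle, which would be consistent with the "unable to get local issuer certificate" message.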
(ping)
Hi @martindurant, sorry, I have been pretty busy lately. I'm going to step through this with a debugger and update you. For now, let me share some thoughts:
- If I don't use async mode, it should not use asyncio, in my opinion.
- All HTTP-related libs work without any third-party SSL and/or encryption libs.
If I don't use async mode, it should not use asyncio
Making that distinction would mean writing two separate implementations, with double the code. Even if you don't use asyncio directly, you might still appreciate the concurrent bulk operations it provides.
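To illustrate the idea, here is a minimal sketch of a blocking function backed by a single async implementation. It is not the fsspec code (which, as the traceback shows, goes through its own `sync` helper); it uses `asyncio.run` for brevity, and `_fetch` is a made-up stand-in for a network request.

```python
import asyncio


async def _fetch(name):
    """Pretend network call; the sleep stands in for I/O latency."""
    await asyncio.sleep(0.01)
    return name.upper()


async def _fetch_many(names):
    # All requests are in flight at once; gather returns results in input order.
    return await asyncio.gather(*(_fetch(name) for name in names))


def fetch_many(names):
    """Blocking facade: callers never touch asyncio themselves."""
    return asyncio.run(_fetch_many(names))
```

A caller just writes `fetch_many(["a", "b", "c"])` and gets the concurrency without any async code of their own, which is the benefit of keeping one implementation.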
https://github.com/aio-libs/aiohttp/issues/5375#issuecomment-791034670 solved the problem for me.
@lorabit110 , do you know which version that is released in?