typedb-driver-python

GRPC memory leak from opening session using empty with block

Open jmsfltchr opened this issue 4 years ago • 2 comments

Description

A basic session open/close example in the docs broke after upgrading the client from 2.2.0 to 2.4.0. The breakage comes from gRPC, seemingly because a resource wasn't closed properly. It's very suspicious that this only breaks for an empty with block, so I have named the issue accordingly.

Environment

  1. OS (where TypeDB server runs): Factory (not reproducible locally on Mac OS 10)
  2. TypeDB version (and platform): TypeDB 2.5
  3. TypeDB client-python version: client-python 2.5
  4. Python version: 3.6

Reproducible Steps

Steps to create the smallest reproducible scenario:

Run the docs tests in Factory, and most of the time the Python tests fail in this way:

INFO: Found 1 target...
[0 / 1] [Prepa] BazelWorkspaceStatusAction stable-status.txt
Target //test/example/python:social-network up-to-date:
  bazel-bin/test/example/python/social_network_quickstart_query.py
  bazel-bin/test/example/python/social_network_python_client_a.py
  bazel-bin/test/example/python/social_network_python_client_b.py
  bazel-bin/test/example/python/social_network_python_client_c.py
  bazel-bin/test/example/python/social_network_python_client_d.py
  bazel-bin/test/example/python/social-network
INFO: Elapsed time: 0.842s, Critical Path: 0.51s
INFO: 5 processes: 1 remote cache hit, 4 internal.
INFO: Build completed successfully, 5 total actions
INFO: Running command line: external/bazel_tools/tools/test/test-setup.sh test/example/python/social-network
INFO: Build completed successfully, 5 total actions
exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //test/example/python:social-network
-----------------------------------------------------------------------------
test_social_network_python_client_a (__main__.SocialNetworkTest) ... ok
test_social_network_python_client_b (__main__.SocialNetworkTest) ... E1026 09:05:48.592246686    4855 metadata.cc:253]            WARNING: 1 metadata elements were leaked
E1026 09:05:48.592321591    4855 metadata.cc:260]            mdelem 'user-agent' = 'grpc-python/1.38.0 grpc-c/16.0.0 (linux; chttp2)'
E1026 09:05:48.592333692    4855 metadata.cc:253]            WARNING: 1 metadata elements were leaked
E1026 09:05:48.592339292    4855 metadata.cc:260]            mdelem ':authority' = 'localhost:1729'

Investigating, test_social_network_python_client_b seems to be giving a warning of a leak. Although the log says "WARNING", I presume this is our culprit, given the lack of subsequent output. The user-agent is grpc-python, and the error codes seem consistent with issues reported on grpc: https://github.com/grpc/grpc/issues/7121
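To make it easier to compare leak reports across runs, here is a minimal sketch that pulls the leaked metadata elements out of captured log text. It assumes the `mdelem '<key>' = '<value>'` line shape seen in the log above; the names `MDELEM_RE` and `leaked_metadata` are hypothetical helpers, not part of gRPC or the TypeDB client.

```python
import re

# Assumption: leaked elements are reported in the form seen in the log above:
#   ... metadata.cc:260]   mdelem 'user-agent' = 'grpc-python/1.38.0 ...'
MDELEM_RE = re.compile(r"mdelem '(?P<key>[^']*)' = '(?P<value>[^']*)'")


def leaked_metadata(log_text):
    """Return a list of (key, value) pairs, one per leaked metadata element."""
    return [(m.group("key"), m.group("value")) for m in MDELEM_RE.finditer(log_text)]
```

Feeding it the stderr of a failing run should return one pair per `mdelem` line, which can then be diffed between client versions.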

The snippet of the docs being tested, social_network_python_client_b, is trivial and is as follows:

from typedb.client import *

with TypeDB.core_client("localhost:1729") as client:
    with client.session("social_network", SessionType.DATA) as session:
        ## session is open
        pass
    ## session is closed
## client is closed

It looks like this has been broken since commit fb4fbfed0c97f35ccd7984e637f6f6b4c2fcf023, which upgraded the python client dependency from 2.2.0 to 2.4.0.

Note that this issue is non-deterministic: I just re-ran the tests on Factory and they passed. I was also not able to reproduce this locally.
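Since the failure is intermittent, one way to estimate how often it occurs is to rerun the snippet repeatedly and count the runs whose output contains the leak warning. The following is a rough sketch using only the standard library; `count_leaky_runs` and `LEAK_MARKER` are hypothetical names, and the marker string is taken from the log above rather than from any documented gRPC interface.

```python
import subprocess

# Assumption: a leaky run prints this text, as in the Factory log above.
LEAK_MARKER = "metadata elements were leaked"


def count_leaky_runs(command, runs=20):
    """Run `command` (a list of arguments) `runs` times; count runs that print the leak warning."""
    leaky = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # gRPC core logs to stderr; merge it into stdout
            universal_newlines=True,   # decode output as text (works on Python 3.6)
        )
        if LEAK_MARKER in result.stdout:
            leaky += 1
    return leaky

# Hypothetical usage, with a TypeDB server running on localhost:1729:
#   count_leaky_runs(["python", "social_network_python_client_b.py"], runs=50)
```

A non-zero count on one machine and zero on another would at least confirm that the problem depends on the environment rather than on the snippet itself.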

Expected Output

Session should close with no effect!

Actual Output

Memory leak and error from GRPC.

jmsfltchr, Nov 01 '21 15:11

See https://github.com/vaticle/docs/issues/545 .

I'll close that issue as a duplicate, since this one goes into more detail. My comments on it were:

I've tried a few fixes - removing --action_env=PATH, upgrading client-python, downgrading client-python, and using pip to install client-python instead of Bazel.

Nothing works. This error started occurring randomly on Fri 16 Jul with no code changes whatsoever, as verified by rerunning an older workflow in Grabl.

I've verified that all client-python tests still pass, so the client itself is (probably) not broken.

alexjpwalker, Nov 01 '21 16:11

In case it helps, I'll add my system details for when I experienced this, as they are different from the original poster's.

Environment

  • OS (where TypeDB server runs): Ubuntu 20.04
  • TypeDB version (and platform): TypeDB 2.11.1
  • TypeDB client-python version: 2.11.1
  • Python version: 3.9.12

Further details about circumstances when I experienced this are here: https://github.com/vaticle/typedb-client-python/issues/265

suciokhan, Oct 14 '22 12:10