Out of memory error
PGSync version: 2.1.9
Postgres version: 12
Elasticsearch version: 7.16.1
Redis version: latest
Python version: 3.7
Problem Description:
When running a sync with a 6GB database, I get the following error:
pgsync_1 | (Background on this error at: https://sqlalche.me/e/14/e3q8)
pgsync_1 | Traceback (most recent call last):
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1815, in _execute_context
pgsync_1 | result = context._setup_result_proxy()
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 1414, in _setup_result_proxy
pgsync_1 | self.cursor, self.execution_options
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/cursor.py", line 1030, in __init__
pgsync_1 | self._rowbuffer = collections.deque(dbapi_cursor.fetchmany(1))
pgsync_1 | psycopg2.errors.ProgramLimitExceeded: out of memory
pgsync_1 | DETAIL: Cannot enlarge string buffer containing 1073738399 bytes by 6633 more bytes.
pgsync_1 |
pgsync_1 |
pgsync_1 | The above exception was the direct cause of the following exception:
pgsync_1 |
pgsync_1 | Traceback (most recent call last):
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/pgsync/sync.py", line 875, in sync
pgsync_1 | self.es.bulk(self.index, docs)
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py", line 133, in bulk
pgsync_1 | raise_on_error=raise_on_error,
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 484, in parallel_bulk
pgsync_1 | actions, chunk_size, max_chunk_bytes, client.transport.serializer
pgsync_1 | File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 748, in next
pgsync_1 | raise value
pgsync_1 | File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
pgsync_1 | result = (True, func(*args, **kwds))
pgsync_1 | File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 140, in _helper_reraises_exception
pgsync_1 | raise ex
pgsync_1 | File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 292, in _guarded_task_generation
pgsync_1 | for i, x in enumerate(iterable):
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
pgsync_1 | for action, data in actions:
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/pgsync/sync.py", line 831, in _sync
pgsync_1 | self.fetchmany(node._subquery)
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/pgsync/base.py", line 882, in fetchmany
pgsync_1 | statement.select()
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1286, in execute
pgsync_1 | return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 326, in _execute_on_connection
pgsync_1 | self, multiparams, params, execution_options
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1488, in _execute_clauseelement
pgsync_1 | cache_hit=cache_hit,
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1843, in _execute_context
pgsync_1 | e, statement, parameters, cursor, context
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2024, in _handle_dbapi_exceptionpgsync_1 | sqlalchemy_exception, with_traceback=exc_info[2], from_=e
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
pgsync_1 | raise exception
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1815, in _execute_context
pgsync_1 | result = context._setup_result_proxy()
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 1414, in _setup_result_proxy
pgsync_1 | self.cursor, self.execution_options
pgsync_1 | File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/cursor.py", line 1030, in __init__
pgsync_1 | self._rowbuffer = collections.deque(dbapi_cursor.fetchmany(1))
pgsync_1 | sqlalchemy.exc.OperationalError: (psycopg2.errors.ProgramLimitExceeded) out of memory
pgsync_1 | DETAIL: Cannot enlarge string buffer containing 1073738399 bytes by 6633 more bytes.
{
  "database": "my-db",
  "index": "index1",
  "mapping": {
    "Attributes": {
      "type": "nested"
    }
  },
  "nodes": {
    "table": "table1",
    "schema": "public",
    "columns": [
      "Id",
      "Name"
    ],
    "children": [
      {
        "table": "Attributes",
        "columns": [
          "Id",
          "Value"
        ],
        "relationship": {
          "variant": "object",
          "type": "one_to_many"
        },
        "children": [
          {
            "table": "AttributeTypes",
            "columns": [
              "Id",
              "Name"
            ],
            "relationship": {
              "variant": "object",
              "type": "one_to_one"
            }
          }
        ]
      }
    ]
  }
}
Thanks
Does reducing the value of QUERY_CHUNK_SIZE have any effect? The default is 10000.
QUERY_CHUNK_SIZE has no effect, regardless of the value (100, 1000, ...).
- Can you check the value of work_mem on the Postgres database: show work_mem;
- Also, can you try increasing this value?
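For reference, a minimal way to do both from psql; the 64MB figure below is only an example value, not a recommendation:

-- Check the current setting
SHOW work_mem;

-- Raise it for the current session only, while testing a sync
SET work_mem = '64MB';

-- Or persist it server-wide (requires superuser) and reload the configuration
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();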
work_mem has the default value (4MB).
@rere16 As mentioned by @toluaina, try to increase your server's work_mem. Even though it's the default, 4MB is quite low for modern hardware. Please see the work_mem section of the Postgres tuning guide.
Also, a question: how many records are in your Attributes and AttributeTypes tables?
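For example, something like the following would answer that from psql. The foreign key column name ("table1_id") is a guess based on the schema above and may need adjusting:

-- Row counts for the child tables
SELECT count(*) FROM "Attributes";
SELECT count(*) FROM "AttributeTypes";

-- Largest one_to_many fan-outs, i.e. the parent rows whose nested documents
-- will be biggest (assumes the FK back to table1 is named "table1_id")
SELECT "table1_id",
       count(*) AS attribute_count,
       pg_size_pretty(sum(pg_column_size("Value"))::bigint) AS approx_value_size
FROM "Attributes"
GROUP BY "table1_id"
ORDER BY sum(pg_column_size("Value")) DESC
LIMIT 10;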
Increasing the value of work_mem has no effect (tested with 64MB, 128MB, and 512MB).
The table Attributes has millions of records and AttributeTypes thousands.
This looks like a client-side error. Does the host running pgsync have sufficient memory to run the sync?