Out of memory error

rere16 opened this issue 4 years ago • 7 comments

PGSync version: 2.1.9

Postgres version: 12

Elasticsearch version: 7.16.1

Redis version: latest

Python version: 3.7

Problem Description:

When running a sync with a 6 GB database, I get the following error:

pgsync_1         | (Background on this error at: https://sqlalche.me/e/14/e3q8)
pgsync_1         | Traceback (most recent call last):
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1815, in _execute_context
pgsync_1         |     result = context._setup_result_proxy()
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 1414, in _setup_result_proxy
pgsync_1         |     self.cursor, self.execution_options
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/cursor.py", line 1030, in __init__
pgsync_1         |     self._rowbuffer = collections.deque(dbapi_cursor.fetchmany(1))
pgsync_1         | psycopg2.errors.ProgramLimitExceeded: out of memory
pgsync_1         | DETAIL:  Cannot enlarge string buffer containing 1073738399 bytes by 6633 more bytes.
pgsync_1         |
pgsync_1         |
pgsync_1         | The above exception was the direct cause of the following exception:
pgsync_1         |
pgsync_1         | Traceback (most recent call last):
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/pgsync/sync.py", line 875, in sync
pgsync_1         |     self.es.bulk(self.index, docs)
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/pgsync/elastichelper.py", line 133, in bulk
pgsync_1         |     raise_on_error=raise_on_error,
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 484, in parallel_bulk
pgsync_1         |     actions, chunk_size, max_chunk_bytes, client.transport.serializer
pgsync_1         |   File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 748, in next
pgsync_1         |     raise value
pgsync_1         |   File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 121, in worker
pgsync_1         |     result = (True, func(*args, **kwds))
pgsync_1         |   File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 140, in _helper_reraises_exception
pgsync_1         |     raise ex
pgsync_1         |   File "/usr/local/lib/python3.7/multiprocessing/pool.py", line 292, in _guarded_task_generation
pgsync_1         |     for i, x in enumerate(iterable):
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
pgsync_1         |     for action, data in actions:
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/pgsync/sync.py", line 831, in _sync
pgsync_1         |     self.fetchmany(node._subquery)
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/pgsync/base.py", line 882, in fetchmany
pgsync_1         |     statement.select()
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1286, in execute
pgsync_1         |     return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 326, in _execute_on_connection
pgsync_1         |     self, multiparams, params, execution_options
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1488, in _execute_clauseelement
pgsync_1         |     cache_hit=cache_hit,
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1843, in _execute_context
pgsync_1         |     e, statement, parameters, cursor, context
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2024, in _handle_dbapi_exception
pgsync_1         |     sqlalchemy_exception, with_traceback=exc_info[2], from_=e
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
pgsync_1         |     raise exception
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1815, in _execute_context
pgsync_1         |     result = context._setup_result_proxy()
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 1414, in _setup_result_proxy
pgsync_1         |     self.cursor, self.execution_options
pgsync_1         |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/cursor.py", line 1030, in __init__
pgsync_1         |     self._rowbuffer = collections.deque(dbapi_cursor.fetchmany(1))
pgsync_1         | sqlalchemy.exc.OperationalError: (psycopg2.errors.ProgramLimitExceeded) out of memory
pgsync_1         | DETAIL:  Cannot enlarge string buffer containing 1073738399 bytes by 6633 more bytes.
My schema:

{
  "database": "my-db",
  "index": "index1",
  "mapping": {
    "Attributes": {
      "type": "nested"
    }
  },
  "nodes": {
    "table": "table1",
    "schema": "public",
    "columns": [
      "Id",
      "Name"
    ],
    "children": [
      {
        "table": "Attributes",
        "columns": [
          "Id",
          "Value"
        ],
        "relationship": {
          "variant": "object",
          "type": "one_to_many"
        },
        "children": [
          {
            "table": "AttributeTypes",
            "columns": [
              "Id",
              "Name"
            ],
            "relationship": {
              "variant": "object",
              "type": "one_to_one"
            }
          }
        ]
      }
    ]
  }
}

Thanks

rere16 avatar Dec 24 '21 10:12 rere16

Does reducing the value of QUERY_CHUNK_SIZE have any effect? The default is 10000; you could try something like the snippet below.
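For example, if you run pgsync directly (a sketch; the schema file name is illustrative, and in a docker-compose setup you would instead set QUERY_CHUNK_SIZE under the pgsync service's environment):

# Run pgsync with a smaller fetch batch size
QUERY_CHUNK_SIZE=1000 pgsync --config schema.json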

toluaina avatar Jan 04 '22 09:01 toluaina

Changing QUERY_CHUNK_SIZE has no effect, whatever the value (100, 1000, ...).

rere16 avatar Jan 04 '22 10:01 rere16

  • Can you check the value of work_mem on the Postgres database (show work_mem;)?
  • Can you also try increasing this value? See the example below.
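For example (standard Postgres commands; ALTER SYSTEM requires superuser privileges):

-- Check the current value
SHOW work_mem;

-- Raise it for the whole cluster, then reload the configuration
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();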

toluaina avatar Jan 07 '22 13:01 toluaina

work_mem has the default value (4MB).

rere16 avatar Jan 08 '22 10:01 rere16

work_mem has the default value (4MB).

@rere16 As mentioned by @toluaina, try increasing your server's work_mem. Even though it's the default, 4 MB is quite low for modern hardware. Please see the work_mem section of the Postgres tuning guide.

Also, a question: how many records are in your Attributes and AttributeTypes tables? (You can count them with something like the query below.)
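For example (a sketch: the foreign-key column name table1_id is a guess and will likely differ in your schema):

-- Total row counts
SELECT COUNT(*) FROM "Attributes";
SELECT COUNT(*) FROM "AttributeTypes";

-- Parents with the most Attributes rows; "table1_id" is a
-- hypothetical foreign-key column name
SELECT "table1_id", COUNT(*) AS attribute_count
FROM "Attributes"
GROUP BY "table1_id"
ORDER BY attribute_count DESC
LIMIT 10;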

voyc-jean avatar Jan 10 '22 12:01 voyc-jean

Increasing the value of work_mem has no effect (tested with 64 MB, 128 MB, and 512 MB).

The Attributes table has millions of records, and AttributeTypes has thousands.

rere16 avatar Jan 12 '22 19:01 rere16

This looks like a client-side error. Does the host running PGSync have sufficient memory? (See the check below.)
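For example (assuming the docker-compose setup the pgsync_1 log prefix suggests; the container name may differ):

# Memory usage of the running pgsync container
docker stats --no-stream pgsync_1

# Free memory on the host itself
free -h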

toluaina avatar Jan 16 '22 19:01 toluaina