
Databricks SQL ExecuteStatement Failing During Chunked Updates to a Single Cell (Large JSON Field)

Open sonali-t opened this issue 7 months ago • 0 comments

We’re encountering issues with Databricks SQL when attempting to update a row that contains a large JSON array field. Originally, we tried inserting the entire JSON directly into the column, but this failed due to request size limitations. To address that, we redesigned the approach to:

  • Split the full JSON into smaller chunks (~100 items)
  • Append each chunk incrementally to the same row/column using separate UPDATE statements
  • Commit each chunk using a new thread and session to ensure SQLAlchemy thread safety (a minimal sketch of this flow is included below)

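For context, the append flow looks roughly like the sketch below. This is a minimal reconstruction rather than the exact production code: the table name `dmt_table`, key column `id`, the connection URL placeholders, and the use of `CONCAT` to append text are assumptions; only the `dmt_data` column, the ~100-item chunking, and the one-thread-and-session-per-chunk pattern come from the description above.

```python
# Minimal sketch of the chunked-append approach (assumed names/URL).
import json
import threading

from sqlalchemy import create_engine, text

# Hypothetical Databricks SQLAlchemy URL; real credentials go here.
ENGINE_URL = (
    "databricks://token:<access_token>@<server-hostname>"
    "?http_path=<http-path>&catalog=<catalog>&schema=<schema>"
)
engine = create_engine(ENGINE_URL)


def append_chunk(row_id: int, chunk: list) -> None:
    """Append one serialized chunk to dmt_data using its own connection."""
    payload = json.dumps(chunk)
    # Each thread opens its own connection so no SQLAlchemy state is shared.
    with engine.connect() as conn:
        conn.execute(
            text(
                "UPDATE dmt_table "
                "SET dmt_data = CONCAT(COALESCE(dmt_data, ''), :payload) "
                "WHERE id = :row_id"
            ),
            {"payload": payload, "row_id": row_id},
        )
        conn.commit()


def chunked(items: list, size: int = 100):
    """Yield successive slices of roughly `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upload(row_id: int, full_json: list) -> None:
    # One thread (and one session) per chunk, committed independently.
    for chunk in chunked(full_json, 100):
        worker = threading.Thread(target=append_chunk, args=(row_id, chunk))
        worker.start()
        worker.join()  # join immediately so appends stay in order
```

Each chunk is only a few KB of JSON on the wire, yet the failure below still appears once the accumulated `dmt_data` value gets large.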
Despite chunking, the request eventually fails when the dmt_data field grows large enough (presumably ~1–2MB compressed). The SQL API returns:

```
(databricks.sql.exc.RequestError) Error during request to server.
ExecuteStatement command can only be retried for codes 429 and 503
```

This suggests that each UPDATE's request body is still exceeding Databricks SQL's internal limits, even though we are only appending small pieces at a time.

What We're Looking For:

  • Confirmation of the exact request body size limit for INSERT/UPDATE operations over Databricks SQL
  • Recommended practice for incrementally updating a single JSON column that grows over time

sonali-t · Jun 10 '25 12:06