DataflowTemplates

The writing is slow

Open ascarhon-atahujaev opened this issue 6 years ago • 2 comments

Can you please tell me why writing with the BigqueryToSpanner library is so slow? I get about 20 records per second. Should I generate the template with additional options to improve the library's performance?

ascarhon-atahujaev, Sep 09 '19 05:09

20 records per second is certainly slow. In my case, inserting 1 million records takes about 10 minutes for the whole Dataflow job. Do you know which step of the Dataflow job is the bottleneck? One possible cause is that write processing on the Spanner side is the bottleneck. You can check this from the CPU utilization shown in the Spanner console. If it is higher than 80%, you need to increase the number of Spanner nodes.

orfeon, Sep 10 '19 04:09
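
To put the two figures above side by side, here is a minimal Python sketch of the arithmetic. The helper name `records_per_second` is made up for illustration; the inputs are only the numbers quoted in this thread (20 records per second, and 1 million records in about 10 minutes).

```python
def records_per_second(records: int, minutes: float) -> float:
    """Average throughput for a job that wrote `records` rows in `minutes`."""
    return records / (minutes * 60)


# Rate reported in the original question.
reported_slow = 20.0

# Rate implied by "1 million records in about 10 minutes" (whole Dataflow job).
reference = records_per_second(1_000_000, 10)

# How many times faster the reference job is than the slow one.
slowdown = reference / reported_slow

print(f"reference: {reference:.0f} records/s, about {slowdown:.0f}x the slow job")
```

A gap of this size points to something wrong with the job or the data, not to a missing tuning option.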

Solved! The problem was incorrect data I was trying to write into Spanner (in my case, a duplicate composite primary key). Now, with correct primary keys, it loads about 3 million records into Spanner in 8 minutes.

ascarhon-atahujaev, Sep 12 '19 03:09
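
Since the root cause here was duplicate composite primary keys in the source data, a check like the one below can be run on a sample of rows before starting the load. This is a minimal sketch, not part of DataflowTemplates: the function `find_duplicate_keys` and the columns `customer_id` and `order_id` are hypothetical and stand in for whatever columns make up the Spanner primary key.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping, Sequence, Tuple


def find_duplicate_keys(
    rows: Iterable[Mapping[str, Any]],
    key_columns: Sequence[str],
) -> Dict[Tuple[Any, ...], int]:
    """Return each composite key that occurs more than once, with its count."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical sample rows; (customer_id, order_id) is the assumed primary key.
rows = [
    {"customer_id": 1, "order_id": 10, "amount": 5.0},
    {"customer_id": 1, "order_id": 11, "amount": 7.5},
    {"customer_id": 1, "order_id": 10, "amount": 9.0},  # repeats key (1, 10)
]

duplicates = find_duplicate_keys(rows, ["customer_id", "order_id"])
print(duplicates)
```

An empty result means the sampled rows have unique keys. For a full table, the same check would more likely be done in the source database with a grouped count over the key columns.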