Why worksheet.update_cells work is too slow?
I try do worksheet.update_cells 1, 1, array, where array.count 1000 rows and 15 columns.
It operation work is too slow, 30 seconds.
You write, what this method do not upload data to server (Google Drive).
After worksheet.update_cells I call worksheet.save.
And this method work more 1.5 minutes.
Why?
sorry for my english...
log.debug :start_data if DEBUG
worksheet.update_cells 2, 1, orders
log.debug :stop_data if DEBUG
log.debug :start_save if DEBUG
worksheet.save
log.debug :stop_save if DEBUG
log.info orders_count: orders.count if DEBUG
↓
D, [2016-11-27T19:00:02.584455 #10110] DEBUG -- : :start_data
D, [2016-11-27T19:00:33.389481 #10110] DEBUG -- : :stop_data
D, [2016-11-27T19:00:33.389635 #10110] DEBUG -- : :start_save
D, [2016-11-27T19:01:32.302826 #10110] DEBUG -- : :stop_save
I, [2016-11-27T19:01:32.303079 #10110] INFO -- : {:orders_count=>1203}
Thanks for the report.
It may be related to #170, though not exactly sure without seeing profiling result for this case...
I proposed some ideas in https://github.com/gimite/google-drive-ruby/issues/170#issuecomment-153735846.
Just noticed that writing empty strings to cells (@ws[x, y] = '') is extremely slow (1 row/s vs. 1000+ rows/s). Using @ws[x, y] = string.presence || '-' as a workaround for now.
This may explain why changing the parser (#170) sometimes gives performance improvements.
Yes, as explained in https://github.com/gimite/google-drive-ruby/issues/170#issuecomment-153735846 , putting an empty value clears the cache, causing heavy recalculation. I wrote some ideas in https://github.com/gimite/google-drive-ruby/issues/170#issuecomment-153735846 .
Thanks for the great gem!
Doing something like this:
num_rows = ws.num_rows
num_cols = ws.num_cols
(1..num_rows).each do |row_number|
(1..num_cols).each do |column_number|
ws[row_number, column_number] = nil
end
end
ws.save
speeds up the first part with ws[] = nil. However, ws.save gives this error:
HTTPClient::KeepAliveDisconnected: Broken pipe
Doing this:
(1..ws.num_rows).each do |row_number|
(1..ws.num_cols).each do |column_number|
ws[row_number, column_number] = nil
end
end
ws.save
goes very slow on the ws[] = nil line, and I never got to ws.save on a sheet with 45,000 rows.
Hmm you may workaround the HTTPClient error by calling save() occasionally during the iteration?