dsbulk
dsbulk unload hangs when -maxConcurrentFiles (write concurrency) is greater than 1
dsbulk version: 1.10.0
I'm unloading 10,000,000 rows from a C* table using a LIMIT query:
dsbulk unload -query "SELECT col1, col2 FROM keyspace.table LIMIT 10000000" -maxRecords 1000000 -header false -verbosity high --connector.csv.compression gzip -url table.csv.gz
The command runs with a read concurrency of 1 and a write concurrency of 4. Checking the logs, I didn't find the usual "Operation UNLOAD_20230216-042948-286777 closed." line, and the dsbulk process still shows up in ps aux:
total | failed | rows/s | p50ms | p99ms | p999ms
10,000,000 | 0 | 97,745 | 50.37 | 167.77 | 289.41
Operation UNLOAD_20230216-042948-286777 completed successfully in 1 minute and 45 seconds.
Operation UNLOAD_20230216-042948-286777 closing.
Done writing file:/app/table.csv.gz/output-000011.csv.gz
Done writing file:/app/table.csv.gz/output-000009.csv.gz
Done writing file:/app/table.csv.gz/output-000010.csv.gz
Done writing file:/app/table.csv.gz/output-000012.csv.gz
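One way to confirm this symptom from the log alone is to check whether a "closing." line was ever followed by the matching "closed." line. A minimal sketch, assuming the log output has been saved to a file; the function name unload_state and the log file path are hypothetical and not part of dsbulk:

```shell
# Report how far a dsbulk operation got, based on its log lines.
# Usage: unload_state <log-file> <operation-id>
unload_state() {
  log_file="$1"
  op_id="$2"
  if grep -qF "Operation ${op_id} closed." "$log_file"; then
    # The final line was logged: the operation shut down cleanly.
    echo "closed"
  elif grep -qF "Operation ${op_id} closing." "$log_file"; then
    # "closing." was logged but "closed." never followed:
    # this is the symptom described in this report.
    echo "stuck-closing"
  else
    echo "running-or-unknown"
  fi
}
```

Called as unload_state unload.log UNLOAD_20230216-042948-286777 on a file containing the log lines above, it reports stuck-closing.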
This bug does not occur with dsbulk version 1.9.1, or when -maxConcurrentFiles is set to 1.
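For reference, the workaround is the same command with the write concurrency pinned to 1. A sketch using only the flags already shown in this report; it is wrapped in a hypothetical function, unload_single_writer, so that nothing runs until it is called:

```shell
# Same unload as above, but limited to a single concurrent output file.
unload_single_writer() {
  dsbulk unload \
    -query "SELECT col1, col2 FROM keyspace.table LIMIT 10000000" \
    -maxRecords 1000000 \
    -header false \
    -verbosity high \
    --connector.csv.compression gzip \
    -maxConcurrentFiles 1 \
    -url table.csv.gz
}
```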
same issue.