Parallelize `predict` subcommand
Predicting from the CLI is painfully slow since it runs only on a single thread, so here's a quick fix.
Caveat: this essentially preloads the whole CSV into memory. It is still a net improvement though since prediction on a single core becomes a problem far earlier (for a CSV which is less than 100MB it's already unusably slow) than consuming too much memory does.
@koute thanks for the PR! We decided to make prediction single threaded for exactly the reason you identified: to avoid buffering the whole CSV. I agree that it is a good idea to provide the option to parallelize predictions, but I think we should put it behind a command line argument, --parallel, that explains the tradeoff. Would you be willing to update this PR with that change?
Okay, now it doesn't read the whole CSV but will still run in parallel. (:
Speed-wise it's the same; the old version with preloading the CSV:
real 0m11.739s
user 10m29.132s
sys 0m14.350s
the new version without preloading the CSV:
real 0m11.425s
user 9m53.702s
sys 0m51.725s
Is this acceptable or do you still want me to add an argument to disable parallelization?