
remove use of parallel iterators except in batch methods

Open epwalsh opened this issue 5 years ago • 3 comments

This is an alternative to #306. It simply removes the use of parallel iterators except within the batch methods (encode_batch, decode_batch). The result would be that the non-batch versions of encode/decode would be safe to use before and after forking.
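The shape of the change can be sketched with std threads. This is a minimal, hypothetical stand-in, not the actual `tokenizers` code: `encode`/`encode_batch` here mirror the method names above, but the bodies are placeholders. The point is only that the non-batch path uses a plain sequential iterator (so it stays fork-safe), while parallelism is confined to the batch method (requires Rust 1.63+ for `thread::scope`):

```rust
use std::thread;

// Hypothetical stand-in for the non-batch path: a plain sequential
// iterator, with no thread pool touched, so it is safe around fork().
fn encode(input: &str) -> Vec<String> {
    input.split_whitespace().map(|t| t.to_lowercase()).collect()
}

// Hypothetical stand-in for the batch path: parallelism is confined
// here, with one scoped thread per input (the real code uses rayon).
fn encode_batch(inputs: &[&str]) -> Vec<Vec<String>> {
    thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|input| s.spawn(move || encode(input)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    println!("{:?}", encode("Hello World"));
    println!("{:?}", encode_batch(&["Hello World", "Foo Bar"]));
}
```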

Surprisingly, this actually improves performance across the board in the encode benchmarks, by a huge margin in some cases. So this could be a good thing not just for Python safety, but for performance in general.

[Image: encode benchmark results]

epwalsh avatar Jun 17 '20 22:06 epwalsh

#187

epwalsh avatar Jun 17 '20 22:06 epwalsh

We should probably do some more benchmarks for this. This is indeed surprising, but I guess it is highly dependent on the different use cases, and might not reflect reality.

I was thinking about maybe having some way to limit the use of the parallel iterator to cases where there is enough work to process. Maybe using the same mechanism that we added in #311, while providing a minimum size to activate it, for example. What do you think?
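That idea could be sketched like this, using only std threads since this is just an illustration (the `MIN_PARALLEL_LEN` constant and `maybe_parallel_map` helper are hypothetical names, not part of the library, and the real cutoff would come from benchmarks):

```rust
use std::thread;

// Hypothetical cutoff below which threading overhead outweighs the gains;
// the real value would need to be determined by benchmarking.
const MIN_PARALLEL_LEN: usize = 8;

// Map `f` over `items`, going parallel only when there is enough work.
fn maybe_parallel_map<T, U, F>(items: Vec<T>, f: F) -> Vec<U>
where
    T: Send,
    U: Send,
    F: Fn(T) -> U + Sync,
{
    if items.len() < MIN_PARALLEL_LEN {
        // Small batch: plain sequential iterator, no thread overhead.
        items.into_iter().map(&f).collect()
    } else {
        // Large batch: fan out across scoped threads. A real implementation
        // would chunk the work rather than spawn one thread per item.
        let f = &f;
        thread::scope(|s| {
            let handles: Vec<_> = items
                .into_iter()
                .map(|item| s.spawn(move || f(item)))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    }
}

fn main() {
    // 3 items stays sequential; 20 items takes the parallel path.
    println!("{:?}", maybe_parallel_map(vec![1, 2, 3], |x: i32| x + 1));
    println!("{:?}", maybe_parallel_map((0..20).collect(), |x: i32| x * 2));
}
```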

n1t0 avatar Jun 29 '20 16:06 n1t0

I agree. I guess we could add some benchmarks that vary the size of the input sequence to see if there's an obvious cutoff where parallelization helps.
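A benchmark sweep like that could look something like the following sketch. Everything here is hypothetical: `encode_one` is a synthetic stand-in for per-sequence work, the parallel path uses naive std scoped threads rather than rayon, and the batch sizes are arbitrary; the only goal is to show the shape of a sequential-vs-parallel crossover search:

```rust
use std::thread;
use std::time::Instant;

// Synthetic stand-in for the per-sequence work done by encode.
fn encode_one(seed: u64) -> u64 {
    (0..10_000u64).fold(seed, |acc, i| acc.wrapping_mul(31).wrapping_add(i))
}

// Time the sequential and the naively parallel version for one batch size.
fn bench(size: usize) -> (f64, f64) {
    let inputs: Vec<u64> = (0..size as u64).collect();

    let start = Instant::now();
    let _seq: Vec<u64> = inputs.iter().map(|&n| encode_one(n)).collect();
    let seq_ms = start.elapsed().as_secs_f64() * 1e3;

    let start = Instant::now();
    let _par: Vec<u64> = thread::scope(|s| {
        let handles: Vec<_> = inputs
            .iter()
            .map(|&n| s.spawn(move || encode_one(n)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    let par_ms = start.elapsed().as_secs_f64() * 1e3;

    (seq_ms, par_ms)
}

fn main() {
    // Sweep batch sizes to look for the crossover point where
    // parallelization starts to pay off.
    for &size in &[1, 8, 64, 256] {
        let (seq, par) = bench(size);
        println!("size={size:>4}  sequential={seq:.3}ms  parallel={par:.3}ms");
    }
}
```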

epwalsh avatar Jun 29 '20 17:06 epwalsh