# Batch processing does not speed up `en_core_web_trf`

## How to reproduce the behaviour
```python
import spacy

spacy.prefer_gpu()
nlp = spacy.load(
    "en_core_web_trf",
    disable=["tagger", "ner", "lemmatizer", "textcat"],
)

node = """Some really long string, 3000 characters"""
# simulating 96 pretty long docs (~75,000 characters each)
nodes = [node * 25] * 96
```
Then run each of the lines below separately and time it:
```python
# 1 min 7.5 s
[list(doc.sents) for doc in nlp.pipe(nodes, batch_size=96)]

# 1 min 7.3 s
[list(doc.sents) for doc in nlp.pipe(nodes, batch_size=32)]

# 1 min 8.2 s
[list(doc.sents) for doc in nlp.pipe(nodes, batch_size=1)]
```
Running the same thing with `en_core_web_lg` yields substantial gains from batching: the largest batch size runs in roughly a quarter of the time of `batch_size=1`.
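For reference, this is roughly how I collected the timings (a minimal sketch, not my exact script). The stand-in `lambda` below replaces the real `nlp.pipe` call so the snippet runs without a model or GPU; with spaCy loaded you would pass e.g. `lambda texts, bs: [list(d.sents) for d in nlp.pipe(texts, batch_size=bs)]` instead:

```python
import time

def time_batch_sizes(process, texts, batch_sizes):
    """Run process(texts, batch_size) for each batch size and
    return a dict mapping batch size -> elapsed seconds."""
    timings = {}
    for size in batch_sizes:
        start = time.perf_counter()
        process(texts, size)
        timings[size] = time.perf_counter() - start
    return timings

# Stand-in workload so the harness is runnable on its own;
# swap in the nlp.pipe call to reproduce the numbers above.
texts = ["some text"] * 96
timings = time_batch_sizes(lambda t, bs: [s.upper() for s in t], texts, [96, 32, 1])
print(timings)
```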
## Your Environment
Using a single RTX A6000
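To rule out a silent CPU fallback (which could also produce identical timings across batch sizes), I verified that the GPU is actually allocated; `spacy.prefer_gpu()` returns a bool rather than raising, so a quick check looks like:

```python
import spacy

# prefer_gpu() returns True only when a GPU was actually allocated;
# False means the pipeline would silently run on CPU.
gpu_active = spacy.prefer_gpu()
print("GPU active:", gpu_active)
```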
Output of `python -m spacy info --markdown`:
Info about spaCy
- spaCy version: 3.7.4
- Platform: Linux-5.15.0-94-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Pipelines: en_core_web_lg (3.7.1), en_core_web_trf (3.7.3), en_core_web_sm (3.7.1), de_core_news_sm (3.7.0)
## Expected Behavior
My understanding from the documentation and this issue is that we should expect significant gains from batching, as we observe with `en_core_web_lg`. With `en_core_web_trf`, however, batching yields no measurable speedup.
I'm wondering whether this is a bug, or whether we simply should not expect batching to improve performance for a transformer + parser pipeline. Thanks for this awesome package, and in advance for your help!